About the role
We are hiring a Research Engineer III to build and maintain ML pipelines.
- Join our team building production ML infrastructure for enterprise-scale machine learning pipelines.
Key Responsibilities
- Build and maintain Apache Airflow DAGs for ML pipeline orchestration
- Develop SageMaker training jobs for NLP models (NeMo, PyTorch)
- Implement MLflow tracking and model registry integrations
- Write infrastructure-as-code using Terraform (AWS S3, IAM, VPC)
- Create comprehensive tests for ML pipeline components
- Follow spec-driven development practices with Code
- Contribute to ML observability and evaluation frameworks
Requirements
- Experience with PyTorch, transformers, or other ML libraries
- Familiarity with ML model evaluation and experimentation
- Interest in ML/AI infrastructure and operations
- Strong problem-solving and debugging skills
- Comfortable with Linux/command-line environments
- Knowledge of AWS services (S3, SageMaker, IAM)
- Exposure to Apache Airflow or workflow orchestration
- Understanding of CI/CD, testing, or infrastructure-as-code
Tech stack
Airflow, SageMaker, PyTorch, MLflow, Terraform, IAM, CI/CD, Linux
Match insights
Tech: Airflow, SageMaker, PyTorch, MLflow, Terraform
Level: Senior