Senior Staff Machine Learning Engineer, Data & Eval

Airbnb · Short-term Rentals company
United States · Lead
Data & AI

About the role

Lead ML evaluation and the data flywheel for GenAI systems at Airbnb.

  • Set technical direction and lead execution for the ML evaluation and data flywheel powering CSxAI products.

Key Responsibilities

  • Define evaluation strategy and success metrics for GenAI systems.
  • Build and scale evaluation frameworks with strong controls for bias, drift, and reliability.
  • Design the data flywheel: instrumentation, feedback collection, data quality checks, labeling strategy, dataset versioning, and governance.

Requirements

  • PhD in Computer Science, Mathematics, Statistics, or a related technical field (or equivalent practical experience).
  • 10+ years building, testing, and shipping ML/AI systems end-to-end, including 2+ years of experience with GenAI/LLM systems in production.
  • 5+ years leading large, ambiguous technical initiatives as a senior IC, influencing roadmap and engineering/science direction across teams.

Tech stack

Python, TensorFlow, PyTorch, scikit-learn, Hugging Face, LLMs, Airflow, Databricks, AWS, Google Cloud, Azure

Match insights

Tech: Python, TensorFlow, PyTorch, scikit-learn, Hugging Face
Level: Lead
