Senior Staff Machine Learning Engineer, Data & Eval
Airbnb
Short-term Rentals company
United States
Lead
Data & AI
About the role
Lead ML evaluation and the data flywheel for GenAI systems at Airbnb.
- Set technical direction and lead execution for the ML evaluation and data flywheel powering CSxAI products.
Key Responsibilities
- Define evaluation strategy and success metrics for GenAI systems.
- Build and scale evaluation frameworks with strong controls for bias, drift, and reliability.
- Design the data flywheel: instrumentation, feedback collection, data quality checks, labeling strategy, dataset versioning, and governance.
Requirements
- PhD in Computer Science, Mathematics, Statistics, or a related technical field (or equivalent practical experience).
- 10+ years building, testing, and shipping ML/AI systems end-to-end, including 2+ years of experience with GenAI/LLM systems in production.
- 5+ years leading large, ambiguous technical initiatives as a senior IC, influencing roadmap and engineering/science direction across teams.
Tech stack
Python, TensorFlow, PyTorch, scikit-learn, Hugging Face, LLMs, Airflow, Databricks, AWS, Google Cloud, Azure