About the role
Staff Applied AI Scientist to scale AI product measurement and improvement.
Join our team to scale the effective measurement and improvement of our AI products in production.
Key Responsibilities
- Design and run sampling, LLM-as-a-judge, and labelling systems.
- Build LLM-powered analysis to recommend product improvements.
- Own the full feedback loop: prompt engineering, evaluation at scale, data labelling, and continuous improvement.
Requirements
- Proven experience analysing the performance of AI or data products in production.
- Hands-on LLM evaluation in production: LLM-as-a-judge, eval datasets, human-in-the-loop labelling, scoring against thresholds.
- Observability for LLM and agentic systems (traces, sampling, prompt management, production monitoring).
Tech stack
Python, LLMs, OpenAI API, Anthropic API, Gemini API, LangChain
Match insights
Tech: Python, LLMs, OpenAI API, Anthropic API, Gemini API
Level: Lead