About the role
Design and run evaluation systems for AI product quality and LLM-powered analysis.
- Join Culture Amp to improve AI product quality in production.
Key Responsibilities
- Design and run evaluation systems over production traces.
- Build LLM-powered analysis for product improvement.
- Own the full feedback loop for prompt engineering and continuous improvement.
Requirements
- Proven experience in AI product performance analysis.
- Hands-on experience with LLM evaluation in production.
- Experience with observability for LLM and agentic systems.
Tech stack
Python, LLMs, OpenAI API, Anthropic API, Gemini API, LangChain
Match insights
Tech: Python, LLMs, OpenAI API, Anthropic API, Gemini API
Level: Lead