Software Engineer, Safeguards Evals
Anthropic
Generative AI, company
San Francisco, United States
$320,000 - $485,000 USD
Senior
Data & AI
About the role
Builds evaluation infrastructure for AI safety systems, measuring agent performance and driving improvements.
- This role builds the evaluation infrastructure that answers questions about the effectiveness of Anthropic's AI safety systems.
- You'll sit at the intersection of applied ML research and engineering, designing experiments to measure how well an investigative agent performs across harm areas.
Key Responsibilities
- Build and own the evaluation harness for an agentic investigation system.
- Construct high-quality eval datasets representing real-world misuse.
- Measure agent performance end-to-end and drive improvements.
Requirements
- Proficiency in Python and comfort working across the stack.
- Experience building and maintaining data pipelines.
- Experience working with LLMs and a working understanding of their capabilities and failure modes.
Tech stack
Python, LLMs, Anthropic API