Applied AI Software Engineer
Canvas Medical
Healthcare Software company
San Francisco, United States
Senior
Data & AI
About the role
Lead evaluations for AI agents in development and post-deployment.
We're hiring an Applied AI Software Engineer to lead evaluations of agents in development and of the post-deployment fleet of agents that operate in Canvas to automate work for our customers.
Key Responsibilities
- Design and execute large-scale evaluation plans for LLM-based agents performing clinical documentation, scheduling, billing, communications, and general workflow automation tasks.
- Build end-to-end test harnesses that validate model behavior under different configurations (prompt templates, context sources, tool availability, etc.).
- Partner with clinicians to define accurate expected outcomes (gold standards) for performance comparisons in domains of clinical consequence, and with subject matter experts in non-clinical domains.
Requirements
- 5+ years of experience in applied machine learning or AI engineering, with a focus on evaluation and benchmarking.
- Proficiency with foundation model APIs and experience orchestrating complex agent behaviors via prompts or tools.
- Experience designing and running high-throughput evaluation pipelines, ideally including human-in-the-loop or expert-labeled benchmarks.
Tech stack
Python, OpenAI API, Anthropic API, Gemini API, LangChain, Hugging Face, LLMs, Pandas, NumPy, SQL, PostgreSQL, MongoDB, Redis, Airflow, AWS, Docker, Kubernetes, CI/CD, Git, Jira, Confluence, Postman, REST API, gRPC, Slack, Asana, Linear, Excel, PowerPoint