Staff Site Reliability Engineer
EarnIn
Earned Wage company
Mountain View, United States
$252,000 - $308,000
Lead
Software Engineering
About the role
Lead AI-first reliability engineering, focusing on incident response, alert triage, and resilient services.
- Lead EarnIn's shift to AI-first reliability engineering, defining AI-driven incident response, alert triage, and SLO-driven resilience across services.
- This hybrid Mountain View role requires in-office work two days a week and offers a salary range of $252,000-$308,000 plus equity and benefits.
Key Responsibilities
- Define reliability strategy, SLIs/SLOs, and error budgets; use AI to surface trends and predict risks.
- Lead high-severity incidents and build AI-driven alert correlation, triage, and postmortems.
- Develop AI agents for runbook automation and context gathering from observability tools during pages.
- Partner with product engineers to embed AI-assisted operations and architect resilient services on AWS.
Requirements
- 7+ years in SRE/Software/Infrastructure engineering with cross-org influence.
- Proven production application of AI/LLMs to operational workflows.
- Strong software engineering skills (Python or Go) and observability experience (Datadog/OpenTelemetry).
- Infrastructure-as-code and Kubernetes/Terraform proficiency; experience with AWS services listed.
Tech stack
Python, Go, Datadog, Kubernetes, Terraform, AWS