This role is no longer accepting applications via Rocketlist.
Founding Site Reliability Engineer
Relevance AI
AI Agent company
San Francisco, United States
Mid
Software Engineering
About the role
Seeking a Founding Site Reliability Engineer to establish and scale the SRE discipline in a fast-growing AI company.
- We’re looking for a Founding Site Reliability Engineer to join us as our first SRE hire in San Francisco.
- You’ll own the reliability, scalability, and security of our platform as we power tens of thousands of multi-agent workloads across multiple regions.
Key Responsibilities
- Own SRE, establishing best practices, tooling, and culture.
- Tackle reliability challenges unique to multi-agent orchestration at enterprise scale.
- Guarantee >99.9% uptime of production systems, ensuring reliability at global scale.
Requirements
- 5+ years in SRE/DevOps/Infrastructure roles, with experience in enterprise SaaS environments.
- Deep AWS expertise (EC2, ECS/EKS, Lambda, RDS, VPC, IAM).
- Proven track record with Infrastructure as Code (Terraform, Kubernetes/EKS, CDK, or CloudFormation).
Tech stack
AWS, Kubernetes, EKS, Terraform, GitHub Actions, Postgres, Mongo, Prometheus, Grafana, CloudWatch, PagerDuty, BetterStack
Match insights
Tech: AWS, Kubernetes, EKS, Terraform, GitHub Actions
Level: Mid