About the role
Help scale our platform with reliability, observability, and operational excellence.
We're looking for an experienced Site Reliability Engineer (SRE) to help us scale our platform with reliability, observability, and operational excellence at the core.
Key Responsibilities
- Design, build, and maintain scalable infrastructure to support real-time analytics and machine learning workloads
- Improve system reliability and performance through automation, observability, and proactive capacity planning
- Own and evolve CI/CD pipelines, deployment automation, rollback mechanisms, and config management
Requirements
- 8+ years of experience in SRE, DevOps, or infrastructure engineering roles
- 5+ years of experience with datacenter operations and/or system and network administration
- Experience with containerization (Docker) and orchestration (Kubernetes)
Tech stack
Docker, Kubernetes, CI/CD, GitHub Actions, ArgoCD, Ansible, Prometheus, Grafana, Datadog, Terraform, Bash, Python, Airflow
Match insights
Tech: Docker, Kubernetes, CI/CD, GitHub Actions, ArgoCD
Level: Senior