Site Reliability Engineer
Duvo.ai
Retail Automation company
Prague, Czech Republic
Senior
Software Engineering
About the role
Ensure highly available, scalable infrastructure and reliable production services.
Join Duvo.ai as a Site Reliability Engineer to ensure highly available, scalable infrastructure and reliable production services for the company's platform.
Key Responsibilities
- Design, build and operate cloud infrastructure and CI/CD pipelines for production services.
- Maintain and scale Kubernetes clusters and container orchestration.
- Implement monitoring, alerting and incident response with tools like Prometheus and Grafana.
- Automate infrastructure provisioning using Terraform and infrastructure-as-code practices.
- Collaborate with engineering teams to improve reliability and performance.
Requirements
- Strong experience with Kubernetes and Docker containerization.
- Experience operating services in AWS and using IaC (Terraform).
- Proficiency with monitoring/alerting and observability tools (Prometheus, Grafana, logging).
- Familiarity with CI/CD, Linux administration and Git-based workflows.
Tech stack
Kubernetes, Docker, Terraform, Prometheus, Grafana, CI/CD, AWS, Linux, Git