Staff Infrastructure Engineer - Models
Tenstorrent
DeepTech company
Belgrade, Serbia
Lead
Software Engineering
About the role
Design and operate Kubernetes-native applications for large-scale AI infrastructure.
Design, build, and operate Kubernetes-native applications and platform services to run large-scale AI training and inference workloads, improving reliability and operational maturity across internal and customer environments.
Key Responsibilities
- Design, build and operate Kubernetes-native applications, services and workloads for large-scale AI infrastructure.
- Develop operators, controllers, APIs and automation to deploy, scale, monitor and operate complex workloads.
- Define workload patterns for inference, training, CI/CD and development workflows.
- Improve reliability, observability and operational maturity of Kubernetes applications.
- Collaborate with SRE, infrastructure, deployment and engineering teams to support environments.
Requirements
- Strong experience designing and running production workloads on Kubernetes.
- Deep understanding of workload orchestration, scaling, reliability and production debugging on Kubernetes.
- Experience building platform services, operators or controllers using Go or Python.
- Collaborative work style across engineering, infrastructure and SRE teams.
Tech stack
Kubernetes, Go, Python