Sr. Software Engineer (Data Center Automation)
xAI
Generative AI company
Memphis
Senior
Software Engineering
About the role
Enhance reliability in multi-data center environments through automation and observability.
- •We are seeking a highly skilled Sr. Software Engineer to join our team in managing and enhancing reliability across a multi-data center environment.
Key Responsibilities
- •Design, develop, and deploy scalable code and services to automate reliability workflows.
- •Implement and maintain observability tools and practices to provide real-time insights into system health across multiple data centers.
- •Collaborate with cross-functional teams to identify reliability bottlenecks and automate solutions for fault tolerance, disaster recovery, capacity planning, and physical/environmental risk mitigation.
Requirements
- •Strong coding abilities with hands-on data center experience.
- •Ability to build scalable reliability services, optimize system performance, and minimize downtime.
- •Close partnership with facility operations to address physical infrastructure impacts.
Tech stack
Python, Node.js, Kubernetes, CI/CD, Airflow