Senior Reliability Operations Engineer

Serve Robotics
Robotic delivery company
Penang, Malaysia
Senior
Software Engineering

About the role

Lead operational reliability for robotic and cloud systems in Malaysia.

  • The Senior Reliability Operations Engineer leads operational reliability by region, owning incident response, escalations, and Tier 2 support for robotic and cloud systems.

Key Responsibilities

  • Respond to escalations from Tier 1 support, using runbooks, metrics, logs, and system diagnostics to investigate and remediate issues or determine when escalation to Tier 3 is necessary.
  • Develop and update runbooks, workflows, and operational documentation to ensure consistent and reliable responses to recurring issues, collaborating with product teams to expand coverage over time.
  • Write, maintain, and enhance automation scripts and tools that streamline common remediation steps, improve response times, and reduce manual operational overhead.

Requirements

  • Bachelor’s degree in Computer Science, Information Technology, Engineering, or equivalent practical experience.
  • 5+ years of professional experience in Reliability Operations, Site Reliability Engineering, DevOps, IT Operations, or a related technical support function.
  • Strong proficiency with Linux, including navigating systems, reviewing logs, and performing diagnostics.
  • Experience writing, executing, and maintaining runbooks, automations, and operational workflows.

Tech stack

Linux, Grafana, Prometheus, Google Cloud, Jira, PagerDuty
