Senior Software Engineer, Site Reliability

HarveyAI & company
Bangalore
Senior
Software Engineering

About the role

Ensure the reliability, scalability, and performance of our legal AI platform.

Key Responsibilities

  • Design, implement, and manage monitoring, alerting, and infrastructure resources across 50+ global regions
  • Lead incident management processes, including postmortems, root cause analyses, and driving actionable improvements
  • Automate operational tasks and workflows, building tools and processes for capacity planning, graceful rollouts, and safe data access to maintain high reliability and reduce manual intervention
  • Collaborate across teams to drive reliability, security, and compliance throughout the software lifecycle
  • Optimize infrastructure costs through strategic capacity planning and build-versus-buy decisions while maintaining system performance, reliability, and functionality

Requirements

  • 3+ years of experience in Site Reliability Engineering or similar roles supporting production environments
  • Expertise in infrastructure as code (IaC) tools (Pulumi, Terraform, CloudFormation, etc.)
  • Deep familiarity with observability tools (Datadog, Sentry, etc.) and incident response practices (PagerDuty, IncidentIO, etc.)
  • Proficiency with cloud infrastructure platforms (Azure, GCP, AWS, etc.)
Tech stack

Terraform, Datadog, PagerDuty, AWS, Azure, Google Cloud
