Staff Site Reliability & DevOps Engineer - Observability

Brandwatch (Social Media company)

Remote, Lead

Software Engineering

About the role

Design, operate, and evolve observability platforms using Grafana and Prometheus.

• This role focuses on designing, operating, and evolving observability platforms with a strong emphasis on metrics, logging, and alerting, primarily using Grafana and Prometheus.
• You will ensure production systems are observable, reliable, and operable at scale, working closely with platform, infrastructure, and application teams.

Key Responsibilities

• Design, build, and operate observability platforms based on Grafana and Prometheus.
• Define and maintain metrics standards, dashboards, alerts, and SLOs.
• Support incident response by providing actionable telemetry and post-incident analysis.
• Automate observability configuration using infrastructure as code.

Requirements

• Strong experience with Prometheus and Grafana.
• Solid Linux and networking fundamentals.
• Experience running observability stacks in Kubernetes environments.
• Infrastructure as code experience (Terraform preferred).

View original posting for full requirements →

Tech stack

Prometheus, Grafana, Linux, Kubernetes, Terraform

Match insights

Tech: Prometheus, Grafana, Linux, Kubernetes, Terraform

Level: Lead
