ML Platform Engineer

Foxglove
Robotics Data company
San Francisco, United States
Senior
Data & AI

About the role

Design, deploy, and scale the ML data platform for Foxglove.

  • We're looking for an ML Platform Engineer with deep infrastructure instincts to help design, deploy, and scale the systems that power Foxglove's data platform.
  • This is a platform-first role: you'll own the infrastructure layer that makes ML possible in production, not just the models that run on top of it.

Key Responsibilities

  • Design, deploy, and operate production inference infrastructure, including model serving, autoscaling, load balancing, and cost optimization across cloud environments
  • Own the platform architecture for embedding and retrieval pipelines that power semantic search over multimodal robotics data (image, video, point cloud, and timeseries)
  • Build and maintain the training and evaluation infrastructure that enables rapid iteration on model performance, including job orchestration, experiment tracking, and dataset versioning
  • Drive cloud infrastructure decisions (AWS/GCP) that directly impact latency, throughput, reliability, and cost at scale
  • Define platform abstractions and internal tooling that let product engineers ship ML-powered features without needing to manage infrastructure themselves
  • Evaluate, integrate, and operationalize third-party ML infrastructure components; establish clear build vs. buy frameworks for the team

Requirements

  • Deep, hands-on experience owning production ML infrastructure: inference serving, model optimization (e.g., vLLM, Triton, TorchServe), orchestration, and cloud cost management
  • Strong foundation in distributed systems and cloud infrastructure (AWS/GCP); you think in terms of system reliability, failure modes, and operational burden, not just model accuracy
  • Experience architecting and operating retrieval systems at scale, including vector databases (e.g., Pinecone, Lance, turbopuffer, pgvector) and embedding pipelines over large, heterogeneous datasets
  • A platform engineer's mindset: you build systems that other engineers depend on, and you take that responsibility seriously
  • Proven ability to operate with high ownership; you can make hard infrastructure tradeoffs independently and move fast without breaking things
  • Strong communication skills; you can explain infrastructure tradeoffs clearly to both ML and non-ML engineers

Tech stack

AWS, Google Cloud, Kubernetes, CI/CD, Docker, Elasticsearch, PostgreSQL, MongoDB, Redis, Apache Kafka, TensorFlow, PyTorch, scikit-learn, Hugging Face, LLMs, MLflow, Kubeflow, SageMaker, Vertex AI, ONNX, JAX, Pandas, NumPy, Spark, Airflow, dbt, Databricks, Fivetran, Dagster, Prefect

Match insights

Tech: AWS, Google Cloud, Kubernetes, CI/CD, Docker
Level: Senior
