We are looking for a DevOps Engineer to design, build, and operate the infrastructure behind our LLM platform. You will be responsible for keeping our ML infrastructure reliable, scalable, and efficient, from data pipelines to training and inference.
In this role, you will develop and maintain CI/CD pipelines, orchestration workflows, and observability tooling for distributed ML workloads across GPU, TPU, and CPU environments.
This is a DevOps-first role with strong exposure to ML infrastructure. You will work closely with ML Engineers and Data Engineers, while focusing on building a robust, automated, and production-grade platform that accelerates model development and delivery.