MLOps Engineer (SRE) - Remote Eligible

Design and implement observability solutions for machine learning models in production environments
Remote
Mid-Level
2 weeks ago
Metova

A technology company specializing in mobile app development, cybersecurity solutions, and digital transformation services for businesses.

MLOps Engineer (SRE)

A leading company in Mexico specializing in accounting software is looking for a highly skilled MLOps Engineer (SRE) to join the team.

Requirements:

  • 4+ years of experience as an SRE, DevOps, or Platform Engineer working on ML projects.
  • Fluent technical English.
  • Experience with orchestrators such as Airflow or Kubeflow, or with experiment tracking tools such as MLflow or Weights & Biases.
  • Nice to have: experience in high-transaction environments such as banking, accounting, payroll, or logistics.

Knowledge and Skills:

  • Knowledge of model monitoring frameworks such as Evidently, Arize AI, WhyLabs, or similar.
  • Proficiency in Prometheus, Grafana, ELK/EFK, OpenTelemetry, or Datadog.
  • Proficiency in Kubernetes, Docker, Helm, and infrastructure automation tools (Terraform, Pulumi).
  • Solid fundamentals in CI/CD for ML pipelines (testing, validation, rollback).

Responsibilities:

  • Design and operate observability solutions for ML models in production (monitoring, alerts, traceability).
  • Develop dashboards and metrics to evaluate model performance, cost, and stability.
  • Implement tooling for structured logging, drift monitoring, data quality checks, and inference error tracking.
  • Collaborate with data science and product teams to detect and mitigate incidents related to models in production.
  • Apply SRE practices such as chaos engineering, stress testing, staging testing, and continuous integration.