As a Machine Learning Engineer focused on Inference and Serving at Yobi, you'll design, optimize, and operate the systems that bring our Behavioral AI models to life in real time. You'll work at the core of our production environment, turning trained models into performant, reliable, and continuously improving services that power our open-web and CTV products.
This is an applied ML systems role—equal parts engineering depth, deployment craft, and model intuition. You'll shape how models are packaged, versioned, rolled out, and observed across environments, ensuring every prediction is fast, accurate, and accountable.
Deep expertise in model deployment. You've built or scaled production ML serving systems—handling versioning, rollouts, rollback strategies, and live experimentation.
Low-latency mindset. You understand what makes inference fast: model graph optimization, quantization, caching, batching, and efficient feature retrieval.
Systems fluency. You write robust, high-performance code in Go, Rust, C++, or Java, and are comfortable bridging to Python for model integration and analysis.
Operational maturity. You treat inference as a living system—monitoring drift, tracking model lineage, and ensuring observability from input to outcome.
Infrastructure intuition. You know how to make serving systems reproducible and portable without over-engineering them, whether that's through custom runtime design, model registries, or lightweight orchestration.
Applied ML understanding. You can reason about model performance, interpret trade-offs, and work with researchers to make models more deployable.
We prioritize attitude, culture, and general technical fit over matching perfectly into one of our job descriptions. If our mission and work resonate with you, we encourage you to apply. Tell us how you can help drive our products forward, even if you don't feel like a perfect fit for any one listing.