Los Gatos, California, United States of America
Netflix is one of the world's leading entertainment services, with over 300 million paid memberships in over 190 countries enjoying TV series, films and games across a wide variety of genres and languages. Members can play, pause and resume watching as much as they want, anytime, anywhere, and can change their plans at any time.
The Model Development & Management (MDM) team builds and evolves the unified developer experience—SDKs, frameworks, and libraries—that powers end-to-end model creation at Netflix. We focus on maximizing practitioner velocity while making infrastructure complexity invisible, integrating tightly with data/feature, training, serving, and evaluation pillars. Our portfolio-with-paved-paths strategy (Metaflow and other libraries exposed through one opinionated SDK) supports teams from a single data scientist to 100+ MLEs and model scales from ~10M to 100B+ parameters—spanning classic personalization, content understanding, and multimodal GenAI.
We are looking for an experienced ML/AI infrastructure engineering leader to manage MDM and drive the next generation of Netflix's model development platform! You will lead the team to architect, build, test, and launch a cohesive SDK and set of opinionated templates that let practitioners scaffold projects, configure and execute runs (from laptop to tightly coupled multi-node GPU training), track experiments and lineage, package models with evaluation hooks, and promote them confidently. Your work will enable partners across content, studio, consumer, ads, and games to develop and iterate on large-scale models (including LLMs, recommenders, computer vision, and foundation models) throughout the full lifecycle, from early research and experimentation to productization and ongoing optimization. Success will be measured by concrete developer-experience KPIs such as time to first successful remote run, run success rate (excluding failures in user code), mean time to actionable diagnosis, adoption of paved paths, and template reuse.
We are a highly collaborative team. You will operate cross-functionally with Training Platform and Offline Inference, Serving Systems, Feature/Data Infrastructure, and MLP Tooling to deliver a seamless, consistent experience end-to-end. To thrive here, you bring a strong ML infrastructure background (SDK/CLI design, packaging and environments, experiment tracking/lineage, observability), excellent product taste for developer experience, and the judgment to balance paved-path simplicity with power-user control. You'll design for extensibility as the space evolves, keep interfaces stable with clear deprecation policies, and prioritize measurable outcomes that lift practitioner velocity across Netflix.
At Netflix, we carefully consider various compensation factors to determine your personal top of market. We rely on market indicators to determine compensation and consider your specific job, skills, and experience to get it right. These considerations can cause your compensation to vary and will also depend on your location.
The overall market range for roles in this area of Netflix is typically $190,000 - $920,000. This market range is based on total compensation (vs. only base salary), which is in line with our compensation philosophy.
Inclusion is a Netflix value and we strive to host a meaningful interview experience for all candidates. If you want an accommodation/adjustment for a disability or any other reason during the hiring process, please send a request to your recruiting partner.
We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.