
Software Engineer in Systems

Build systems to distribute work across massive GPU clusters efficiently.
San Francisco Bay Area
Senior
$360,000 - 530,000 USD / year
3 months ago

✨ About The Role

- Design and build distributed systems used to train next-generation models
- Focus on building systems to distribute work across massive GPU clusters efficiently
- Design and implement methods to make the training stack more efficient and scale up to next-generation supercomputers
- Implement methods to robustly train models in the presence of hardware failures
- Build tooling to enhance understanding of problems in the largest training jobs

⚡ Requirements

- Experienced software engineer with a background in high performance computing and low-level systems
- Passionate about building stable and highly efficient distributed systems
- Enjoys delving into low-level details about performance optimization
- Thrives in designing and implementing methods to make training stacks more efficient and scalable
- Comfortable working on massive GPU clusters and designing systems to distribute work efficiently

Engineering
About OpenAI
Building artificial general intelligence