
Software Engineer, Networking

Design and implement custom networking collectives tightly integrated into the training stack.
San Francisco Bay Area
Senior
$360,000 - $530,000 USD / year
5 months ago

✨ About The Role

- Design and implement custom networking collectives that are integrated into the training stack
- Collaborate with ML researchers to ensure efficient collective operations in C++ and CUDA
- Work on simulations to inform future supercomputer network designs
- Ensure that the largest training jobs take full advantage of different network transports used in supercomputers
- Contribute to the AI research progress at OpenAI by incorporating learnings from the entire research organization into the training platform

⚡ Requirements

- Ideal candidate has experience in writing distributed algorithms using RDMA and is comfortable with low-level performance-sensitive CPU and/or GPU code
- Strong background in network simulation techniques is preferred
- Ability to collaborate closely with ML researchers to design and implement efficient collective operations in C++ and CUDA
- Experience with custom networking collectives and network transports used in supercomputers is a plus
- Thrives in a fast-paced environment and enjoys working on novel collective communication techniques
Engineering
About OpenAI
Building artificial general intelligence