✨ About The Role
- Responsible for scaling up critical inference infrastructure to efficiently serve customer requests for state-of-the-art AI models
- Collaborate with cross-functional teams to optimize performance, latency, throughput, and efficiency of deployed models
- Design and implement solutions to address bottlenecks and sources of instability in production distributed systems
- Optimize code and Azure VMs to maximize hardware utilization for AI models
- Contribute to OpenAI's mission of deploying broadly beneficial Artificial General Intelligence (AGI) through responsible and impactful work
⚡ Requirements
- Experienced engineer with a background in modern ML architectures and performance optimization techniques
- Ability to work collaboratively with machine learning researchers, engineers, and product managers to bring new technologies into production
- Strong expertise in core HPC technologies such as InfiniBand, MPI, and CUDA
- Proven track record of owning problems end-to-end and delivering solutions to high-priority issues
- Humble attitude, eagerness to help colleagues, and a commitment to team success