✨ About The Role
- The role involves optimizing large AI models for high-volume, low-latency production environments.
- The engineer will collaborate with machine learning researchers, engineers, and product managers to deploy the latest technologies.
- Responsibilities include introducing new techniques and tools to enhance model performance and efficiency.
- The engineer will build tools to identify bottlenecks and design solutions to address them.
- Code optimization and efficient utilization of Azure VMs are key tasks in this position.
⚡ Requirements
- The ideal candidate has a strong understanding of modern machine learning architectures and optimization techniques, particularly for inference.
- They should possess at least 3 years of professional software engineering experience.
- Familiarity with PyTorch, NVIDIA GPUs, and related software stacks is essential.
- The candidate should have experience in architecting, observing, and debugging production distributed systems.
- A humble attitude and eagerness to assist colleagues are important traits for success in this role.