What You Do At AMD Changes Everything
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
The Role
We are a core algorithm team at AMD, dedicated to end-to-end AI workload optimization on AMD platforms. We are seeking talented engineers specializing in multimodal foundation models, with a focus on Vision-Language Models (VLMs), Vision-Language-Action Models (VLAs), and World Action Models (WAMs). In this role, you will drive model training, compression, quantization, inference optimization, and efficient deployment—enabling next-generation embodied AI and multimodal agents to achieve peak performance on AMD hardware platforms.
Key Responsibilities
- Optimize training strategies, fine-tuning, and alignment for multimodal models (VLM / VLA / WAM) on AMD platforms.
- Enhance the action prediction, world-state modeling, and long-horizon planning capabilities of WAM/VLA models for embodied intelligence scenarios (e.g., robotics, simulation-based interaction).
- Design and implement model optimization techniques including quantization (PTQ/QAT), pruning, knowledge distillation, operator fusion, and KV cache optimization to improve inference latency, throughput, and energy efficiency.
- Collaborate closely with compiler, driver, and system software teams to deeply integrate models into AMD's software stack.
- Stay at the forefront of research in World Models, action generation, and multimodal agents—and explore novel architectures for AMD's heterogeneous compute platforms.
Qualifications
- Master's or PhD in Computer Science, Artificial Intelligence, Robotics, Electrical Engineering, or a related field.
- Hands-on experience with VLMs, VLAs, or WAMs is highly preferred, especially in robotics decision-making, simulated environment training, or action sequence generation.
- Proficiency in PyTorch; familiarity with multimodal and embodied AI frameworks.
- Familiarity with simulation platforms such as Isaac Gym, LIBERO, MuJoCo, or RoboTwin.
- Strong software engineering skills and ability to deliver full-cycle solutions—from research prototyping to production deployment.
Preferred Qualifications
- Contributions to open-source projects in multimodal agents, world models, or robotics (e.g., OpenVLA, DROID, ACT).
- Publication record at top-tier conferences (e.g., CVPR, ICRA, CoRL, NeurIPS, ICLR) in multimodal learning or embodied AI.
- Strong background in model optimization: quantization, sparsity, kernel fusion, dynamic batching, etc.
- Experience with the AMD ROCm ecosystem or with performance tuning for heterogeneous computing.
- Understanding of GPU/accelerator architecture; experience with CUDA or HIP is a plus.
What We Offer
- Access to cutting-edge AMD compute resources.
- Unique opportunity to shape full-stack co-design across algorithms, compilers, and hardware.
- A collaborative, globally distributed team of world-class AI systems and robotics researchers.