Mistral AI is seeking Applied Scientists and Research Engineers focused on model efficiency and edge deployment. You will research and build ultra-efficient models and toolchains for on-device inference across CPUs, GPUs, NPUs, and specialized accelerators. Your work will enable Mistral models to run privately, reliably, and fast on mobile, desktop, and embedded devices.
• Run pre-training and post-training, and deploy state-of-the-art models on clusters with thousands of GPUs. You don't panic when you see OOM errors or when NCCL refuses to communicate.
• Design and evaluate quantization, pruning, distillation, and sparsity methods for LLMs and multimodal models.
• Build deployment stacks, optimize kernels and memory layouts.
• Run large-scale experiments to balance accuracy, latency, throughput, and power under tight memory constraints; profile and fix bandwidth/compute bottlenecks.
• Develop tooling for calibration data generation, mixed-precision training, quant-aware finetuning, structured/unstructured sparsity, and compilation passes.
• Manage research projects and communications with client research teams.
• You are fluent in English and have excellent communication skills. You are at ease explaining complex technical concepts to both technical and non-technical audiences.
• You're not afraid of contributing to a big codebase and can find your way around independently with little guidance.
• You have a deep understanding of quantization trade-offs, hardware constraints, and compiler stacks.
• You're an expert in PyTorch or JAX and write production-grade Python; strong C++/CUDA or low-level performance skills are a plus.
• You don't need roadmaps: you just do. You don't need a manager: you just ship.
• You're low-ego, collaborative, and eager to learn.
• You have a track record of success through personal projects, professional projects or in academia.
• You hold a PhD or master's degree in a relevant field (e.g., Mathematics, Physics, Machine Learning) — but if you're an exceptional candidate from a different background, you should apply.
• Have contributed to a large codebase used by many (open source or in the industry).
• Have a track record of publications in top academic journals or conferences.
• Contributions to open-source inference/compiler stacks.
• Love improving existing code by fixing typing issues, adding tests and improving CI pipelines.
• Have experience optimizing inference on edge devices.
We have local offices in Paris, London, Marseille, Singapore and Palo Alto.
France
Competitive cash salary and equity
Food: Daily lunch vouchers
Sport: Monthly contribution to a Gympass subscription
Transportation: Monthly contribution to a mobility pass
Health : Full health insurance for you and your family
Parental : Generous parental leave policy
Visa sponsorship
UK
Competitive cash salary and equity
Insurance
Transportation: Reimbursement of office parking charges, or £90/month for public transport
Sport: £90/month reimbursement for gym membership
Meal voucher: £200 monthly allowance for your meals
Pension plan: SmartPension (5% employee and 3% employer contributions)
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.