Design and implement large-scale, production-grade AI systems that integrate LLMs and Generative AI into real-world applications.
Build frameworks that support Retrieval-Augmented Generation (RAG), agentic workflows, and multi-step reasoning at scale.
Ensure models and agents are production-ready with strong observability, monitoring, and performance optimization.
Architect distributed, fault-tolerant systems capable of supporting high-throughput AI workloads.
Lead the design of modular, extensible, and reusable components to accelerate AI adoption across teams.
Build MVPs quickly, validate assumptions, and iterate toward scalable long-term solutions.
Partner with product and platform teams to integrate AI into customer-facing and enterprise-grade applications.
Define and enforce standards for APIs, services, and infrastructure that enable seamless AI adoption.
Balance functional requirements with non-functional goals such as reliability, latency, and security.
Drive technical strategy for AI initiatives and guide teams in best practices for AI-driven software development.
Mentor engineers across software and AI domains to elevate overall technical expertise.
Contribute to thought leadership in AI engineering through internal frameworks, design patterns, and reusable components.
12+ years of experience in software engineering (backend, distributed systems, large-scale platforms), with 2+ years applying Generative AI/LLMs in production.
Proven expertise in distributed computing, cloud-native architectures (GCP, Azure, or AWS), and systems that prioritize scalability and fault tolerance.
Strong coding skills in Python (preferred) and at least one system-level language (Java, Go, or C++).
Experience with ML/AI frameworks (PyTorch, TensorFlow, Hugging Face) is a plus, particularly when applied to building systems rather than only training models.
Deep knowledge of RAG pipelines, vector databases, and real-time data integration.
Familiarity with resilience engineering: disaster recovery, failover, monitoring, and high availability.
Exposure to multi-modal AI (text, image, video) and optimization techniques (quantization, distillation) is advantageous.
Strong grounding in system design, performance engineering, and design patterns.
Track record of delivering AI-powered production systems at scale, not just research or prototypes.
Equal Opportunity Employer
Walmart, Inc. is an Equal Opportunity Employer – By Choice. We believe we are best equipped to help our associates, customers, and the communities we serve live better when we really know them. That means understanding, respecting, and valuing unique styles, experiences, identities, ideas, and opinions – while being inclusive of all people.