This role is for a hands-on ML Engineer who can design, train, and productionize the models powering search relevance, retrieval, personalization, and LLM-based conversational experiences at massive scale.
You will work closely with backend, platform, and catalog enrichment teams to deliver high-quality ML components under tight performance and latency constraints.
Build and improve search ranking, retrieval, and query understanding models.
Develop ML components for Conversational Search.
Design and optimize embedding models, vector stores, and similarity search systems.
Build personalized ranking and recommendation models using deep learning.
Work on large-scale ML systems optimized for strict latency and performance targets.
Implement ML pipeline best practices (versioning, monitoring, A/B testing, observability).
Collaborate with platform teams to integrate ML services across search, recommendations, and conversational agents.
Develop caching strategies (prompt cache, vector cache, similarity caching) to hit strict SLA targets; the sketch after this list illustrates the general pattern.
Contribute to the long-term roadmap: foundational retrieval models, multi-objective optimization, and user lifecycle modeling.
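To make the embedding-search and similarity-caching responsibilities above concrete, here is a minimal, illustrative sketch of cosine-similarity retrieval fronted by a similarity cache. The embed function, corpus, and 0.92 threshold are placeholder assumptions rather than the team's actual stack; a production system would use a real embedding model and an approximate-nearest-neighbour index (e.g. FAISS or HNSW) behind the same interface.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding: a pseudo-random unit vector per string.
    A real system would call the production embedding model instead."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

class VectorIndex:
    """Brute-force cosine-similarity search over unit vectors."""

    def __init__(self, docs: list[str]):
        self.docs = docs
        self.matrix = np.stack([embed(d) for d in docs])  # (n_docs, dim)

    def search(self, query_vec: np.ndarray, k: int = 3) -> list[tuple[str, float]]:
        scores = self.matrix @ query_vec          # cosine similarity (rows are normalized)
        top = np.argsort(-scores)[:k]
        return [(self.docs[i], float(scores[i])) for i in top]

class SimilarityCache:
    """Reuses a cached result when a new query embedding is close enough
    (>= threshold) to a previously answered query."""

    def __init__(self, threshold: float = 0.92):
        self.threshold = threshold
        self.query_vecs: list[np.ndarray] = []
        self.results: list[list[tuple[str, float]]] = []

    def lookup(self, query_vec: np.ndarray):
        if not self.query_vecs:
            return None
        sims = np.stack(self.query_vecs) @ query_vec
        best = int(np.argmax(sims))
        return self.results[best] if sims[best] >= self.threshold else None

    def store(self, query_vec: np.ndarray, result) -> None:
        self.query_vecs.append(query_vec)
        self.results.append(result)

index = VectorIndex(["trail running shoes", "wireless earbuds", "waterproof jacket"])
cache = SimilarityCache()

def cached_search(query: str, k: int = 3):
    q = embed(query)
    hit = cache.lookup(q)        # serve from cache when a near-duplicate query was seen
    if hit is not None:
        return hit
    result = index.search(q, k)  # otherwise fall back to the index and cache the answer
    cache.store(q, result)
    return result

print(cached_search("running shoes for trails"))
```

The cache sits in front of the index so repeated or near-duplicate queries can be answered without re-running retrieval, which is one way such systems keep tail latency within SLA targets.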