ML Engineer

Develop end-to-end pipelines for evaluating large language model outputs and performance
Groningen, Netherlands
Mid-Level
2 days ago
Springer Nature

A global publisher specializing in academic research, educational content, and scientific, technical, and medical literature.

ML Engineer

At Springer Nature AI Labs (SNAIL), we're shaping the future of scientific publishing through responsible, human-centred AI. Our team is at the forefront of integrating advanced AI technologies to optimize processes and enhance the user experience for researchers and academics worldwide. We value a collaborative work environment where ideas flourish and innovation is encouraged. With our curiosity-driven, impact-first culture, we focus on delivering AI innovation at scale, always with integrity and in close collaboration across functions. Our commitment to long-term growth ensures that our people are nurtured and developed to reach their full potential.

As an ML Engineer focused on LLM evaluation, you will design and build both qualitative and quantitative frameworks for assessing large language model outputs, optimize prompts and workflows, and collaborate with cross-functional teams to ensure our generative AI solutions meet rigorous standards of quality, reliability and ethics.

What You'll Do

  • Develop Evaluation Frameworks: Architect end-to-end pipelines that combine automated metrics (BLEU, ROUGE, BERTScore, custom error rates) with human-in-the-loop assessments.
  • Quantitative Analysis: Implement statistical and machine-learning methods to measure LLM performance (accuracy, relevance, bias, fairness, robustness) and analyze trends across releases.
  • Qualitative Assessment: Design annotation guidelines, recruit/train reviewers, and lead structured reviews of model outputs for coherence, factuality and style.
  • Prompt Engineering & Optimization: Use tools like DSPy to craft, test and automatically refine prompts; analyze A/B tests to maximize response quality and task success.
  • Custom Tooling: Build reusable Python libraries and dashboards for monitoring LLM behaviour, automating evaluation workflows and integrating with our CI/CD workflows.
  • Collaboration & Reporting: Partner with research, product and MLOps teams to translate user needs into evaluation requirements; present findings, drive data-backed decisions and iterate on model improvements.
  • Best Practices & Ethics: Champion documentation, version control, testing standards and fairness audits. Stay up to date on responsible AI guidelines and industry benchmarks.

Must-Have Qualifications

  • Education: MSc or higher in CS, Engineering, Data Science or a related field.
  • AI/ML Expertise: Deep knowledge of ML algorithms, with a focus on NLP and transformers.
  • GenAI Expertise: 1+ years of experience evaluating, optimizing, and productionizing GenAI products.
  • Software/Cloud: 3+ years of production experience with Python; experience with Docker, Kubernetes, FastAPI; hands-on experience with any major cloud provider (GCP/Azure/AWS).
  • MLOps: Familiarity with CI/CD for models, monitoring, versioning, pipelines (e.g. Kubeflow).
  • Communication: Business-fluent English; able to translate complex concepts for diverse stakeholders.

By joining Springer Nature, you will actively contribute to the development and implementation of AI solutions that drive the future of scientific publishing. As a leader, you will guide your team to innovate and grow, pushing the boundaries of what's possible in AI. Join us as we pioneer the future of scientific publishing through artificial intelligence.
