
ML Engineer — LLM Evaluation - Remote Eligible

Own LLM evaluation processes to improve model safety and real-world applicability.
Remote · London · New York
Mid-Level
7 months ago

✨ About The Role

- The role involves owning LLM evaluation processes and methods, focusing on generating benchmarks that reflect real-world usage and safety vulnerabilities.
- The candidate will be responsible for generating high-quality synthetic data, curating labels, and conducting rigorous benchmarking.
- Delivering robust, scalable, and reproducible production code is a key responsibility.
- The position requires developing innovative methods for benchmarking LLMs to assess harmlessness and helpfulness.
- The candidate will have opportunities to co-author papers, patents, and presentations with the research team.

⚡ Requirements

- The ideal candidate will have domain knowledge in LLM evaluation and data curation techniques.
- Extensive experience in designing and implementing LLM benchmarks is essential, along with comfort leading end-to-end projects.
- Adaptability and flexibility are crucial, as the candidate must be able to shift focus based on new findings in the community.
- A strong motivation to work on safe and responsible AI is important for success in this role.
- Previous research or projects in benchmarking LLMs will be highly regarded.

About DynamoFL
The most secure solution for enterprise AI