Meta's Reality Labs Wearables division is developing future products in augmented reality and virtual reality. Our work in conversational AI, computer vision, advanced optics, eye tracking, machine learning, and reasoning will enable and empower consumers and businesses in new ways. As a Linguistic Engineer, you will use your linguistic analysis and natural language processing (NLP) expertise to identify, assess, and mitigate risks related to language-based vulnerabilities, adversarial content, and safety threats in Meta's products. You will help deliver datasets, models, and knowledge that power the ML systems across all components in the multimodal assistant product, such as ASR/TTS, NLU, NLG, Dialog, and LLMs. The Linguistic Engineer delivers datasets and designs and produces evaluation metrics across languages and areas that power the voice assistant product. The ideal candidate has demonstrated technical, analytical, and collaboration skills, with experience in building datasets for ML applications and a focus on red-teaming and safety. Experience as an ML practitioner is useful but not required, as the role focuses on the datasets and provides an opportunity to interface and work directly with ML teams as part of the greater voice assistant organization.
Build adversarial datasets, data pipelines, and models for ML applications
Directly support product development with rules, prompts, and data patches
Evaluate the safety and quality of models and product experiences and close the feedback loop
Work with project stakeholders
Identify best practices and improve procedures across data systems
Drive and deliver projects from conceptualization through launch and beyond with continual improvement and support
Design and conduct product experiments
Solve complex problems and embrace ambiguity to drive innovative and impactful solutions
Manage and prioritize multiple work streams
Collaborate seamlessly with cross-functional teams
Experience as a data scientist, software engineer, computational linguist, or in a similar role
Experience with programming and data analysis with languages and platforms such as Python, SQL, PHP/Hack
Experience with text analysis, scripting, relational databases, NoSQL databases, or similar
Experience shipping multiple products across various platforms
Degree in Linguistics, Computational Linguistics, Computer Science, Data Science, Information Systems, or related fields, or equivalent experience
Willingness to develop, create and evaluate adversarial and potentially offensive content
Experience with data tools, pipelines, and analytics is needed for this role
Experience collaborating with cross-functional teams to design and execute red-teaming exercises, develop robust safety protocols, and enhance detection systems for harmful or manipulative language
2+ years of work experience as a data scientist, software engineer, or computational linguist, with experience in machine learning and knowledge graph integrations with lexicons or ontologies
Practical knowledge of the relationship between data and machine learning models
Experience with larger scripting projects that involve combining language data from different sources, computing complex metrics over large datasets, and so on
Experience designing and conducting data experiments
Experience with version control, unit tests, and other programming best practices
Professional working experience with additional languages
Advanced coursework and/or research in Linguistics, Computer Science, Data Science, Computational Linguistics, Information Systems, or related fields
Multilingual, with an interest in NLP and/or conversational AI systems and in being at the cutting edge of their development