Site Reliability Engineer, Research Platform, SRE

Design and implement solutions to ensure the scalability of the research platform infrastructure.
San Francisco Bay Area
Senior
$310,000 - $465,000 USD / year
2 months ago

✨ About The Role

- Responsible for ensuring the reliability, scalability, and performance of systems as the company continues to expand
- Collaborate with researchers, data scientists, and platform developers to specify requirements and design solutions for the research platform
- Implement fault-tolerant and resilient design patterns to minimize service disruptions and build automation tools to improve system reliability
- Develop and maintain monitoring systems to proactively identify issues and anomalies in the production environment
- Participate in an on-call rotation to respond to critical incidents and ensure 24/7 system availability

- Experienced reliability engineer with a track record of accelerating engineering reliability in a fast-paced, rapidly scaling company
- Proficient in cloud infrastructure, specifically Azure, and experienced in collaborating with cross-functional teams to ensure reliability and scalability
- Skilled in utilizing Infrastructure as Code (IaC) principles to automate infrastructure provisioning and configuration management
- Strong problem-solving and troubleshooting skills, with excellent communication and collaboration abilities
- Dedicated to creating a diverse, equitable, and inclusive culture while empowering colleagues with excellent tooling and systems

Engineering
About OpenAI
Building artificial general intelligence