✨ About The Role
- Responsible for maintaining reliable, secure, scalable, and highly available infrastructure and applications for over 75,000 service professionals
- Collaborate with product and engineering teams to support an infrastructure platform that is reliable, scalable, secure, and reduces manual toil
- Drive and shape incident management practices across engineering and improve monitoring and alerting infrastructure
- Spread SRE culture throughout the organization and understand industry and company-wide trends to assess and develop new technologies
- Own problems from end to end, managing complexity, engaging directly with stakeholders, and approaching situations with a bias to action
⚡ Requirements
- Experienced professional with 5-8 years in cloud technologies, production engineering, or site reliability engineering roles
- Skilled in infrastructure-as-code principles, IP networking, DNS, load balancing, and cloud-first monitoring
- Proficient in container technology using Docker and Kubernetes, with the ability to write high-quality code in languages like TypeScript, Ruby, or Python
- Outcome-oriented individual who can execute projects from start to finish and is familiar with on-call rotations
- Strong collaborator who can work with product and engineering teams to ensure optimal application performance and scalability