✨ About The Role
- Responsible for maintaining reliable, secure, scalable, and highly available infrastructure and applications for over 70,000 Service Professionals
- Collaborate with Product and Engineering teams to support a reliable and scalable infrastructure platform
- Drive operational excellence, scale AWS footprint, and shape incident management practices
- Improve monitoring and alerting platforms, spread SRE culture, and assess new technologies
- Engage directly with stakeholders to manage complexity, ensure optimal application performance, and build highly resilient systems
⚡ Requirements
- Experienced Site Reliability Engineer with 5-8 years of relevant industry experience in cloud technologies and infrastructure-as-code principles
- Proficient in working with AWS, GCP, or Azure cloud platforms and designing, building, and maintaining production-grade services
- Skilled in IP networking, DNS, CDN, load balancing, HTTP, firewalls, and container technologies like Docker and Kubernetes
- Strong ability to write high-quality code in languages like TypeScript, Ruby, or Python and execute projects from start to finish
- Proven track record in incident management, on-call rotations, and monitoring, logging, and alerting infrastructure