The Resiliency Services team is seeking a Site Reliability Engineer II to help drive the reliability, scalability, and operational excellence of our Azure-based solutions. Our team owns and operates several critical services within the AGC, including Azure Automation (streamlining and automating cloud operations), Azure Backup (secure, scalable data protection), Azure Site Recovery (disaster recovery and business continuity), Azure Migrate (cloud migration planning and execution), and the Learn Documents (comprehensive technical documentation and training resources). We are a geographically distributed, collaborative group with in-person coverage at Reston, Elkridge, and Annapolis Junction, and we pride ourselves on fostering a fun, supportive, and high-performing team environment.
We are looking for an individual who is quality-focused, proactive, and passionate about reliability. The ideal candidate identifies issues and drives solutions, communicates clearly, and thrives as a team player. You’ll have the opportunity to work across a diverse set of Azure services, ensuring they meet the highest standards for resiliency and customer experience. If you enjoy solving problems, collaborating with talented colleagues, and making a real impact, you’ll find our team both rewarding and enjoyable to work with.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.