Senior System Reliability Engineer, AI Platform - Remote Eligible

Design and scale governance and reliability for the enterprise AI automation platform across the organization
Remote
Senior
6 days ago
Veeam Software

Delivers data backup, recovery, and intelligent data management solutions for virtual, physical, cloud, and hybrid IT environments.

Senior System Reliability Engineer, AI Platform

Veeam, the #1 global market leader in data resilience, believes businesses should control all their data whenever and wherever they need it. Veeam provides data resilience through data backup, data recovery, data portability, data security, and data intelligence. Based in Seattle, Veeam protects over 550,000 customers worldwide who trust Veeam to keep their businesses running. Join us as we move forward together, growing, learning, and making a real impact for some of the world's biggest brands. The future of data resilience is here - go fearlessly forward with us.

About the Role

As a Senior AI Platform & Reliability Engineer, you will play a key role in ensuring the stability, security, and scalability of Veeam's intelligent automation ecosystem. You'll partner with engineering, security, and platform teams to build governance frameworks and resilient infrastructure that support both professional and citizen developers. Your work will help enable safe, scalable AI-driven automation across the organization while maintaining strong operational reliability and compliance.

What You'll Do

  • Design and implement governance frameworks for automation and AI platforms across the enterprise
  • Manage deployment, scaling, and reliability of automation tools and supporting infrastructure
  • Build monitoring, observability, and alerting systems to ensure platform health and performance
  • Support automated incident response and recovery workflows to improve platform resilience
  • Drive lifecycle management practices to enable smooth promotion of automation assets to production
  • Ensure secure integrations, identity management, and compliance with company security standards
  • Collaborate with engineering and security teams to improve platform reliability and operational efficiency

Technologies You'll Work With

  • Microsoft Azure
  • Microsoft Power Platform
  • Copilot Studio
  • Microsoft Foundry
  • Automation platforms such as n8n, Zapier, or similar tools
  • Observability and monitoring platforms
  • Identity and security tooling

What You'll Bring

  • 7+ years of experience in site reliability engineering, cloud architecture, or systems engineering
  • Strong expertise in Azure cloud services and enterprise cloud environments
  • Hands-on experience with Microsoft Power Platform, Copilot Studio, or similar automation ecosystems
  • Experience managing and scaling automation platforms in complex environments
  • Proficiency in scripting or automation using modern tools or languages
  • Ability to design secure, scalable platform architectures
  • Strong collaboration skills and experience working across engineering and security teams

Bonus Skills

  • Experience building governance frameworks or platform operating models
  • Familiarity with enterprise observability, incident automation, or self-healing systems
  • Knowledge of identity architecture, DLP strategies, or tenant-level security models
  • Experience supporting citizen developer platforms or internal automation programs
  • Exposure to AI platform operations or enterprise automation strategy

What You'll Get

  • Two weeks of paid vacation, 12 statutory holidays, plus 4 extra global VeeaMe Days for self-care and 24 paid volunteer hours annually through Veeam Cares
  • Paid parental leave: 8 days for fathers, 122 days for birthing parents, 92 days for adoptive parents
  • Medical, dental, and vision coverage fully funded through INS Premium for employees and dependents
  • Mental health support, therapy sessions, and virtual care via our Employee Assistance Program
  • Retirement and social security contributions through Costa Rica's statutory programs
  • Life insurance equal to 24x monthly salary, plus disability and funeral coverage
  • Daily cafeteria subsidy
  • Fertility, adoption, and surrogacy support
  • Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O'Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning

Please note: The position is based in San Jose, Costa Rica. If the applicant is permanently located outside of Costa Rica, Veeam reserves the right to decline the application. All applications must be submitted in English.
