Senior Site Reliability Engineer - GCP & Container Platforms

Develop automated self-healing systems to ensure platform resilience and high availability
Chandler, Arizona, United States
Senior
yesterday

Senior Site Reliability Engineer

We are seeking a Senior Site Reliability Engineer to help develop our platform operations across Windows, Linux, and cloud-native environments. This role is central to our transformation from app-specific support to platform-wide reliability engineering. You will bring deep expertise in Google Cloud Platform, container orchestration, and automation, enabling scalable, secure, and resilient infrastructure that supports diverse applications across our enterprise.

Key Responsibilities:

  • Ensure high availability, performance, and security of production systems across Windows, Linux, and Google Cloud Platform environments.
  • Engineer and support containerized workloads using Kubernetes and Docker, enabling scalable microservices architectures.
  • Lead infrastructure provisioning and configuration using Terraform, Ansible, and Google Cloud Platform-native tools.
  • Develop automation scripts and pipelines to eliminate manual toil and accelerate incident response.
  • Implement observability frameworks using SLIs/SLOs, Prometheus, Grafana, and Google Cloud Operations Suite.
  • Drive proactive monitoring, alerting, and telemetry across hybrid environments.
  • Lead incident response, root cause analysis, and postmortems.
  • Build self-healing systems and automated remediation workflows using Google Cloud Platform-native services and scripting.
  • Collaborate with InfoSec to enforce hardening standards, manage vulnerabilities, and support compliance initiatives.
  • Integrate security into CI/CD pipelines and container platforms using IAM, encryption, and policy enforcement.
  • Partner with developers, application owners, and infrastructure teams to deliver reliable, cloud-native platforms.
  • Document configurations, runbooks, and operational procedures to enable cross-team reuse and transparency.

Required Qualifications:

  • 4+ years of Technology Infrastructure Engineering and Solutions experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education.
  • 4+ years of experience in Windows Server administration and production support.
  • Strong scripting skills in PowerShell, Python, or Shell.
  • Hands-on experience with Google Cloud Platform services, including GKE, IAM, Cloud Functions, and Cloud Monitoring.
  • Proficiency in container technologies: Docker and Kubernetes.
  • Familiarity with Linux system administration and hybrid cloud environments.
  • Experience with infrastructure-as-code tools: Terraform, Ansible.
  • Strong understanding of Active Directory, DNS, DHCP, and Windows security principles.

Desired Qualifications:

  • Security certifications (e.g., CISSP, Security+, GCP Professional Cloud Security Engineer).
  • Experience with CI/CD tools (e.g., GitLab CI and Jenkins).
  • Familiarity with ITIL practices and change management.
  • Exposure to ServiceNow, load balancers, certificate management, and endpoint protection tools.

Job Expectations:

  • Ability to work on-site at one of the listed locations under a hybrid schedule.
  • Ability to work outside normal business hours, including nights and weekends, on a limited/rotational basis.

Engineering
About Arizona Staffing