
Site Reliability Engineer

Own the SRE program and drive reliability improvements across Steampunk's cloud-native services.
McLean, Virginia, United States
Senior
$125,000 – $200,000 USD / year
Posted 20 hours ago
Steampunk

Provides human-centered digital transformation and consulting services for government agencies, focusing on design, engineering, and agile delivery.

Site Reliability Engineer (SRE)

As a Site Reliability Engineer (SRE), you will help design, build, and operate reliable, secure, and observable cloud-native systems that support mission-critical applications and services. You will blend software engineering, DevOps practices, and infrastructure expertise to improve system reliability, performance, and operational excellence across the platform.

Responsibilities

  • Establish development tools and infrastructure for automation.
  • Understand stakeholder needs and convey them to developers.
  • Automate and improve development, testing, deployment, and release processes.
  • Test and review code written by others and analyze the results.
  • Own and improve the reliability, availability, and performance of production systems and services.
  • Define, implement, and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
  • Perform capacity planning, scalability analysis, and performance tuning for applications and infrastructure.
  • Participate in on-call rotations, incident response, and post-incident reviews to drive long-term improvements.
  • Design and implement infrastructure-as-code (IaC) to provision and manage cloud resources (e.g., AWS, Azure, GCP).
  • Build and maintain CI/CD pipelines to ensure reliable, repeatable delivery of application and infrastructure changes.
  • Engineer resilient architectures using concepts such as auto-scaling, blue/green deployments, canary releases, and self-healing patterns.
  • Collaborate with security and platform teams to ensure infrastructure adheres to compliance, security, and governance requirements.
  • Collaborate with application development teams to design reliable, observable, and operable services from the outset.
  • Contribute to application code, tooling, and frameworks that enhance reliability, resilience, and performance.
  • Act as an individual contributor and mentor more junior team members.
  • Present regular status updates and provide cross-training to other DevOps team members.

Qualifications

Required

  • Ability to obtain a U.S. government Security Clearance.
  • BS Degree in an IT field with 10 years of experience OR BS in a non-IT field and 12 years of related IT experience.
  • 3 years of experience with one or more cloud platforms (i.e., AWS, Azure, or GCP).
  • 3 years of experience with Git SCM providers such as GitHub, GitLab, or Bitbucket.
  • 3 years of experience with at least one programming language (e.g., Python, Go, Java, or JavaScript) for tooling, automation, or application development.
  • Hands-on experience working with AWS in production environments.
  • Hands-on experience designing, deploying, and operating Kubernetes-based systems (e.g., EKS, AKS, GKE).
  • Experience with DevOps practices and tools, including CI/CD pipelines (e.g., GitHub Actions, GitLab CI, Jenkins, Azure DevOps).
  • Hands-on experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation, Pulumi) to manage cloud resources.
  • Experience configuring and managing containerization and orchestration platforms.
  • Experience implementing monitoring, logging, and tracing solutions (e.g., CloudWatch, Prometheus, Grafana, Datadog, New Relic, Elastic, OpenTelemetry).
  • Familiarity with networking fundamentals (DNS, load balancing, routing, TLS) and their impact on reliability and performance.
  • Experience with incident management, on-call operations, and production support practices.
  • Certification(s) such as:
    • Cloud certifications (e.g., AWS DevOps Engineer, AWS SysOps Administrator, Azure Administrator/DevOps Engineer, GCP Professional Cloud DevOps Engineer).
    • Kubernetes certifications (e.g., CKA, CKAD).

Preferred

  • Hands-on experience with Drupal and Azure.
  • Experience implementing automated testing frameworks, including Selenium.
  • Excellent written and verbal communication, interpersonal, and collaboration skills.
  • Experience documenting the as-is state of an environment, performing a gap analysis, and producing artifacts that articulate options and recommendations.
  • Experience designing and implementing SLOs, SLIs, and error budgets in production environments.
  • Experience with chaos engineering, game days, and resilience testing.
  • Local to Washington, DC metro area and available to be onsite 2 days a week.
  • NIH experience.

About Steampunk

Steampunk relies on several factors to determine salary, including but not limited to geographic location, contractual requirements, education, knowledge, skills, competencies, and experience. The projected compensation range for this position is $125,000 to $200,000. The estimate displayed represents a typical annual salary range for this position. Annual salary is just one aspect of Steampunk's total compensation package for employees. Learn more about additional Steampunk benefits here.

Identity Statement

As part of the application process, you are expected to be on camera during interviews and assessments. We reserve the right to take your picture to verify your identity and prevent fraud.

Steampunk is a Change Agent in the Federal contracting industry, bringing new thinking to clients in the Homeland, Federal Civilian, Health, and DoD sectors. Through our Human-Centered delivery methodology, we are fundamentally changing the expectations our Federal clients have for true shared accountability in solving their toughest mission challenges. As an employee-owned company, we focus on investing in our employees to enable them to do the greatest work of their careers, and on rewarding them for outstanding contributions to our growth. If you want to learn more about our story, visit http://www.steampunk.com.

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, or any other characteristic protected by law. Steampunk participates in the E-Verify program.
