Principal Site Reliability Engineer

Design and implement scalable, secure Kubernetes infrastructure supporting multi-tenant applications
New York
Senior
$220,000 – $290,000 USD / year
yesterday
SecurityScorecard

A platform providing cybersecurity ratings and continuous monitoring for risk assessment and mitigation across business ecosystems.

Principal Site Reliability Engineer

As a Principal Site Reliability Engineer, you will play a strategic and technical leadership role in shaping the reliability, scalability, and velocity of our engineering platform. Your primary focus will be advancing our Kubernetes-based infrastructure and CI/CD systems to support high-scale, high-availability services. You will partner with engineering leaders across the organization to define and drive platform-wide initiatives that enable fast, safe, and repeatable deployments, and foster a culture of reliability and operational excellence.

Key Responsibilities

  • Lead the design and evolution of Kubernetes-based infrastructure to support multi-tenant, high-scale applications with strong isolation, resilience, and security.
  • Architect and optimize CI/CD pipelines to support fast and reliable build, test, and deploy cycles across a polyglot environment.
  • Establish and evangelize best practices for GitOps, canary deployments, rollback strategies, and progressive delivery.
  • Define and implement scalable Infrastructure as Code (IaC) patterns using tools such as Terraform, Helm, and Crossplane.
  • Drive the adoption of automated testing throughout the delivery lifecycle—unit, integration, load, and chaos testing—to ensure high confidence in production changes.
  • Guide teams in designing for observability, SLOs, and alerting, ensuring actionable signals and minimizing alert fatigue.
  • Partner with security, compliance, and development teams to ensure infrastructure and delivery systems meet modern security and governance standards.
  • Lead incident response retrospectives and foster a blameless culture of continuous improvement.
  • Mentor and influence senior engineers across multiple teams, helping to up-level platform reliability capabilities organization-wide.

Qualifications

  • 8+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure roles, with 2+ years in a technical leadership or principal capacity.
  • Deep expertise with Kubernetes internals (controllers, networking, autoscaling, operators, etc.) and production-grade clusters on cloud providers (EKS, GKE, or AKS).
  • Proven experience designing and scaling CI/CD systems using tools such as GitHub Actions, Argo CD, Tekton, Spinnaker, or similar.
  • Strong proficiency in Terraform and modern IaC practices.
  • Advanced knowledge of automated testing strategies, including performance, load, and failure testing.
  • Proficient in one or more programming/scripting languages (Python, Go, Bash, etc.).
  • Deep experience with monitoring and observability stacks such as Prometheus, Grafana, OpenTelemetry, and Datadog.
  • Strong communicator with the ability to align technical initiatives to business objectives and influence across engineering teams.

Nice-to-Have

  • Experience implementing multi-cluster or multi-region Kubernetes strategies.
  • Exposure to chaos engineering and building resilient distributed systems.
  • Familiarity with compliance frameworks (SOC 2, HIPAA, etc.) as they relate to infrastructure and deployment.
  • Contributions to open-source Kubernetes tooling or SRE frameworks.
  • Familiarity with JVM- or Node-based application stacks.

Benefits: Depending on the country, we offer a competitive salary, stock options, health benefits, unlimited PTO, parental leave, tuition reimbursement, and much more. The estimated total compensation range for this position is $220,000 - $290,000 (base plus bonus). Actual compensation for the position is based on a variety of factors, including, but not limited to, affordability, skills, qualifications, and experience, and may vary from the range. In addition to base salary, employees may also be eligible for annual performance-based incentive compensation awards and equity, among other company benefits.

Engineering