Okta is The World's Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our flexible and neutral products, Okta Platform and Auth0 Platform, provide secure access, authentication, and automation, placing identity at the core of business security and growth. At Okta, we celebrate a variety of perspectives and experiences. We are not looking for someone who checks every single box; we're looking for lifelong learners and people who can make us better with their unique experiences. Join our team! We're building a world where Identity belongs to you.
We're searching for a Principal Site Reliability Engineer (SRE) with a profound passion for observability to join our team. This isn't just a hands-on role; you'll be a thought leader, shaping the strategy and execution of our observability services (logs, metrics, and tracing), both within the Observability team and across the broader organization. We're looking for someone who can help us see clearly when things get cloudy! Your expertise in Kubernetes will be crucial as we undergo a significant replatforming initiative. You will guide the design, implementation, and operation of our advanced observability capabilities on the new platform. A cornerstone of this role is your exceptional ability to manage and influence stakeholders, ensuring their needs are met, expectations are managed, and they're delighted with the insights our observability services provide. We believe that our important stakeholders deserve metric-ulous attention.
9+ years of experience as a site reliability or platform engineer, preferably in a fast-scaling environment, with a significant and demonstrable track record in leading observability initiatives.
2+ years of experience designing, scaling, and operating observability solutions for applications within a Kubernetes environment. You'll be adept at leveraging Kubernetes capabilities to gain insights into workload performance and health.
Familiarity with large-scale containerized deployments, both microservice and monolithic, coupled with a deep understanding of their unique observability challenges and solutions.
A proactive and tenacious mindset: always willing to go the extra mile to identify a problem and drive its resolution, especially when it pertains to improving system visibility and reliability.
A strong passion for mentoring and encouraging the development of engineering peers, leading by example in adopting and promoting robust observa