Senior Site Reliability Engineer (Cloud Security Posture Management)

At Palo Alto Networks® everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we're looking for innovators who are as committed to shaping the future of cybersecurity as we are.

We take our mission of protecting the digital way of life seriously. We are relentless in protecting our customers, and we believe that the unique ideas of every member of our team contribute to our collective success. Our values were crowdsourced by employees and are brought to life through each of us every day - from disruptive innovation and collaboration to execution, and from showing up for each other with integrity to creating an environment where we all feel included.

As a member of our team, you will be shaping the future of cybersecurity. We work fast, value ongoing learning, and we respect each employee as a unique individual. Knowing we all have different needs, our development and personal wellbeing programs are designed to give you choice in how you are supported. This includes our FLEXBenefits wellbeing spending account with over 1,000 eligible items selected by employees, our mental and financial health resources, and our personalized learning opportunities - just to name a few!

At Palo Alto Networks, we believe in the power of collaboration and value in-person interactions. This is why our employees generally work full time from our office with flexibility offered where needed. This setup fosters casual conversations, problem-solving, and trusted relationships. Our goal is to create an environment where we all win with precision.

Your Career

The Cortex team builds and delivers the industry's most advanced SecOps platform, consisting of XSIAM, XSOAR, and XPANSE. As a member of the Cortex DevOps team, you will operate and maintain a large-scale GCP environment, including the design, implementation, and continuous enhancement of our comprehensive observability systems. To succeed in this role, you will need deep knowledge of modern observability and monitoring tools and practices, including experience managing high-cardinality metrics, implementing tracing, and operationalizing large-scale logging solutions. You will collaborate closely with our engineering teams to develop innovative solutions that provide clear and actionable insights into our systems' performance and health.

Your Impact

As a Senior SRE with the Cortex Cloud Security Posture Management team, you will:

Cloud Expertise - Use your expertise in monitoring cloud platforms, particularly GCP, to optimize our infrastructure with cloud-native technologies
Incident Management - Leverage incident management processes to ensure efficient resolution of system issues and minimal impact on services
Automation - Automate complex monitoring and alerting tasks by building tools for cloud operations, such as automated remediation of known issues and auto-scaling
CI/CD - Develop and maintain application deployment tooling built on tools such as Terraform and Helm
Continuously Improve - Stay up-to-date with cutting-edge technologies, evaluate their potential impact on our operations, and implement them when appropriate
On-Call - Work with our DevOps team to provide follow-the-sun operational coverage for the production environment of our SaaS product
Collaborate - Work with our Engineering team to influence the operability of the product and ensure the reliability and availability of our services

Your Experience

Incident and Alerts Management - Clear understanding of incident and alerts management in Site Reliability Engineering
DevOps/SRE Expertise - 4+ years of experience as a DevOps/SRE engineer with a passion for technology and a strong motivation for high reliability at the service level
Cloud Proficiency - High proficiency in either Google Cloud Platform or Amazon Web Services
Kubernetes and Docker - High proficiency with Docker for containerization and Kubernetes for container orchestration
Scripting and Automation - High proficiency in Python programming and Linux shell commands; experience with Terraform for infrastructure as code
Security - Strong grasp of security concepts and best practices
Observability - Experience with observability and incident response tools
Communication Skills - Effective communication and interpersonal skills, with the ability to work and coordinate between multiple teams
Troubleshooting - Ability to effectively troubleshoot and address emerging and complex problems
Independence - Ability to operate independently, make decisions, take action, and take responsibility
