Site Reliability Engineer II

Build and extend internal automation tools to improve server provisioning and monitoring
Poland
Senior
yesterday

Site Reliability Engineer

Are you passionate about cutting edge technology?

Does solving some of the Internet's most difficult content delivery challenges interest you?

Join our highly skilled Site Reliability team

Our team designs, develops, and manages applications and infrastructure that support Akamai's Compute products and services. We do this while keeping Akamai's mission at the forefront of what we do: to make life better for billions of people, billions of times a day.

Partner with the best

As a Site Reliability Engineer, you will specialize in creating solutions that improve automation and efficiency across all internal teams. You will be expected to drive automation and operational excellence, and to support our customer-facing applications and infrastructure. This will require creative thinking combined with deep domain expertise in areas such as Linux systems administration, configuration management, and performance tuning.

As a Site Reliability Engineer, you will be responsible for:

  • Designing, building and extending internal automation tools written in Bash and Python, handling system provisioning, configuration and lifecycle operations
  • Qualifying and benchmarking servers using a wide range of performance monitoring and profiling tools, ensuring hardware meets reliability and efficiency standards
  • Investigating incidents as well as performing root-cause analysis
  • Developing and deploying hardware monitoring solutions, defining meaningful SLIs and SLOs to drive observability and performance accountability
  • Assembling and maintaining custom Linux images (e.g. via Packer) optimized for diverse hardware and operational contexts
  • Automating configuration management using SaltStack, ensuring consistent and reproducible environments
  • Packaging, distributing and maintaining internal Debian packages, enabling standardized software delivery across environments
  • Participating in an on-call rotation, providing support and incident response outside business hours

Do what you love

To be successful in this role you will:

  • Have expertise in Linux system internals and kernel interactions
  • Have good command of Bash and Python for automation, diagnostics and tool development
  • Possess experience with SaltStack (or other configuration management tools like Ansible, Chef, Puppet, etc.)
  • Demonstrate solid grasp of networking fundamentals and display the ability to build and maintain Debian packages
  • Be familiar with observability/monitoring ecosystems such as Prometheus, Grafana, etc.
  • Exhibit knowledge of Git and version control workflows
  • Be comfortable participating in an on-call rotation outside business hours

Build your career at Akamai

Our ability to shape digital life today relies on developing exceptional people like you. The kind that can turn impossible into possible. We're doing everything we can to make Akamai a great place to work. A place where you can learn, grow and have a meaningful impact.

With our company moving so fast, it's important that you're able to build new skills, explore new roles, and try out different opportunities. There are so many different ways to build your career at Akamai, and we want to support you as much as possible. We have all kinds of development opportunities available, from programs such as GROW and Mentoring, to internal events like the APEX Expo and tools such as LinkedIn Learning, all to help you expand your knowledge and experience here.

Learn more

Not sure whether this job is the right match for you, or want to learn more about it before you apply? Schedule a 15-minute exploratory call with the recruiter, who will be happy to share more details.

Site Reliability Engineer II
Poland
Engineering
About Akamai Technologies