
Service Reliability Engineer, GI Application Management

Build automated systems to ensure high availability of critical insurance applications
Charlotte
Senior
yesterday
AIG

A global insurance organization providing property casualty insurance, life insurance, retirement products, and other financial services.

Site Reliability Engineer (SRE)

As a Site Reliability Engineer (SRE), you will apply software engineering principles to IT operations to keep systems robust and scalable. The core mission is to build resilient, efficient, and rapidly evolving IT infrastructure through a data-driven approach. SREs prioritize automation, monitoring, and incident management to minimize outages and speed recovery. You will bridge the gap between development and operations teams, fostering collaboration and shared ownership of reliability. The work includes defining and meeting Service Level Objectives (SLOs), managing error budgets, and conducting blameless postmortems for continuous improvement. Ultimately, you will balance the speed of software development with system stability to ensure a seamless user experience.

Key responsibilities include:

  • Maintain continuous uptime and accessibility of critical business applications and services.
  • Respond to and resolve incidents and outages promptly.
  • Automate repetitive, manual tasks (toil) to improve efficiency and reduce human error.
  • Establish and maintain robust monitoring and alerting systems.
  • Analyze usage patterns and forecast resource needs.
  • Conduct blameless postmortem reviews after major incidents.
  • Act as a bridge between development and operations teams.
  • Establish clear, measurable targets for system performance and reliability.

What you'll need to succeed:

  • Bachelor's degree in a related field and 3+ years of relevant technology experience.
  • Solid grasp of core technical areas such as programming, system administration, networking, databases, and cloud computing platforms.
  • Practical experience running production systems.
  • Proficiency in scripting languages and Infrastructure as Code tools.
  • Skill in implementing comprehensive monitoring solutions.
  • Ability to quickly diagnose and resolve system incidents.
  • Ability to rely on data to understand system behavior.
  • Excellent communication skills.
  • Proactive approach to learning new technologies.

Ready to take your career to the next level?

The position is eligible for a bonus in accordance with the terms of the applicable incentive plan. We're proud to offer a range of competitive benefits.

Veterans encouraged to apply

At AIG, we value in-person collaboration as a vital part of our culture.

Enjoy benefits that take care of what matters.

Reimagining insurance to make a bigger difference to the world.

Welcome to a culture of inclusion.
