Production Support Engineer III - Azure

Manage and automate incident resolution processes for mission-critical systems
Raleigh, North Carolina, United States
Senior
yesterday
North Carolina Staffing

Provides recruitment and employment services for state government agencies and public sector positions across North Carolina.

Production Support Engineer III

The Production Support Engineer III is responsible for ensuring the operational integrity, availability, and performance of mission-critical systems. This role involves managing technical incidents, troubleshooting recurring issues, and implementing permanent solutions to maintain system stability. The Engineer will collaborate with cross-functional teams to resolve incidents efficiently and improve system resiliency through proactive monitoring and automation.

Essential Duties And Responsibilities

Following is a summary of the essential functions for this job. Other duties may be performed, both major and minor, which are not mentioned below. Specific activities may change from time to time.

  1. Identify, triage, and resolve medium-to-high priority incidents with minimal supervision, limiting the impact on business operations.
  2. Collaborate with development teams, business partners, and other stakeholders to diagnose and resolve technical issues, implementing long-term fixes to prevent incident recurrence.
  3. Use monitoring tools (e.g., Splunk, Dynatrace, CloudWatch) to detect performance issues and execute corrective actions promptly.
  4. Enhance system observability to proactively detect issues and improve overall system performance and stability.
  5. Develop and maintain automation scripts to streamline routine production support tasks, reducing manual interventions.
  6. Implement automation strategies to improve production stability and minimize downtime.
  7. Maintain clear and detailed documentation of troubleshooting procedures, contributing to the shared knowledge base.
  8. Help improve the incident, problem, and change management processes, following ITIL best practices.
  9. Participate in root cause analysis and suggest process improvements to enhance system stability and performance.
  10. Collaborate with cross-functional teams in resolving recurring production support issues and optimizing workflows.
  11. Actively mentor junior support engineers, fostering technical growth within the team.
  12. Escalate complex or unresolved issues to senior engineers or technical experts when necessary.
  13. Automate and streamline software delivery and operations for new and existing software applications, using capabilities and tools across the DevOps lifecycle including: Infrastructure as Code; Agile and DevOps Lifecycle Management; Source Code Management; Build Orchestration; Build Management; Artifact Repository Management; Behavior Driven Development; Test Driven Development; Automated Testing including Unit Testing, Integration Testing, Functional Testing, Smoke Testing, Regression Testing, Stress Testing, and Performance Testing; Static Code Analysis; Load and Performance Testing; Artifact Scanning; Database Schema Management, Orchestration and Recovery; Compliance Automation and Audit Trails; Configuration Management; Containers; Application Release Automation; Deployment Strategies and Patterns including Blue/Green Deployment, Canary Releases, and Rolling Releases; Logging and Log Analytics; and Performance Monitoring and Management.

Qualifications

Required Qualifications

The requirements listed below are representative of the knowledge, skill and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

  • Bachelor's degree in Computer Science, Information Systems, Engineering, or a related discipline.
  • Six to ten years of experience in production support or related technical roles.
  • Experience with incident management, triage, and production support functions for both on-premises and cloud environments.
  • Proficiency with IT Service Management (ITSM) tools such as ServiceNow, and familiarity with incident, problem, and change management processes.
  • Strong experience with monitoring tools such as Dynatrace, Splunk, or CloudWatch for proactive issue detection and troubleshooting.
  • Understanding of infrastructure, application technology stacks, and the software development lifecycle.
  • Strong analytical and problem-solving skills with a focus on root cause analysis.
  • Ability to work independently, handle medium-to-complex issues, and escalate critical problems to senior staff as needed.

Preferred Qualifications

  • Experience in DevSecOps and support of CI/CD pipelines.
  • Experience supporting Agile teams and processes.
  • Financial services industry experience.
  • Familiarity with Site Reliability Engineering (SRE) practices.

Other Job Requirements / Working Conditions

  • Sitting: Constantly (more than 50% of the time).
  • Standing: Frequently (25% - 50% of the time).
  • Walking: Frequently (25% - 50% of the time).
  • Visual / Audio / Speaking: Able to access and interpret client information received from the computer and able to hear and speak with individuals in person and on the phone.
  • Manual Dexterity / Keyboarding: Able to work standard office equipment, including PC keyboard and mouse, copy/fax machines, and printers.
  • Availability: Able to work all hours scheduled, including overtime as directed by manager/supervisor and required by business need.
  • Travel: Minimal and up to 10%.
