Oracle Cloud Infrastructure (OCI) provides mission-critical cloud services to enterprises worldwide. The Network Reliability Engineering (NRE) Automation, Reporting, and Tooling team builds solutions that boost the productivity and efficiency of the Global Network Operations Center (GNOC). Our tooling gives the GNOC and NRE teams observability, automation, and actionable insights at hyperscale. As a Principal Software Engineer, you will design, build, and deliver scalable automation frameworks and AI-based platforms that drive operational excellence across OCI's global network. This includes building AI agents that accelerate issue resolution for the GNOC team, as well as developing robust tools that provide intelligent data insights, enable natural language search in systems like Jira and data lakes, reduce operational toil, and keep OCI's network running smoothly and securely. You are passionate about developing software that solves real-world operational challenges, you thrive in a fast-paced team, and you are comfortable working with complex distributed systems. You value simplicity, scalability, and collaboration.
Design, implement, test, and deploy large-scale automation, reporting, and productivity tools for OCI's global network operations. Lead the design and development of intelligent systems using Large Language Models (LLMs). Develop AI agents that enable natural language querying of Jira data and provide context-aware answers to user questions about Jira issues, projects, and metrics. Collaborate with GNOC and NRE engineers to gather requirements and deliver impactful solutions. Build and maintain observability dashboards and data pipelines that drive decision-making and root cause analysis. Develop auto-remediation, orchestration, and workflow automation services for operational tasks. Ensure high availability, reliability, and performance of developed solutions in production environments. Participate in code reviews, mentor peers, and help build a culture of engineering excellence. Own and drive multiple technical projects and priorities in an agile, collaborative environment.
8-10 years of experience in software engineering, automation development, or similar roles. Bachelor's degree in computer science and engineering or a related engineering field. Strong coding skills in Java, Python, or a comparable programming language. Experience developing context-aware, intelligent systems leveraging LLMs for real-world operational workflows. Experience with distributed systems, microservices, and cloud-native technologies. Hands-on expertise with Linux environments and scripting languages. Proficiency with data modeling, data analysis, and reporting frameworks (e.g., SQL, Spark, Prometheus, Grafana). Understanding of network operations or large-scale IT infrastructure. Excellent problem-solving, organizational, and communication skills.
Experience developing automation and orchestration tools for network or cloud operations. Background in creating dashboards, alerts, and real-time reporting platforms. Familiarity with workflow automation (e.g., Apache Airflow), CI/CD pipelines, or infrastructure as code. Previous experience supporting or building tools for NOC, GNOC, or SRE teams. Knowledge of cloud platforms, REST APIs, and service-oriented architecture. Familiarity with agile methodologies and DevOps practices. Experience with ticketing and version control systems (e.g., Jira, Git).
Disclaimer: Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates. Range and benefit information provided in this posting is specific to the stated locations only. US: Hiring Range in USD from $96,800 to $223,400 per annum. May be eligible for bonus and equity. Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle's differing products, industries and lines of business. Candidates are typically placed into the range based on the preceding factors as well as internal peer equity. Oracle US offers a comprehensive benefits package which includes the following: medical, dental, and vision insurance, including expert medical opinion; short-term disability and long-term disability; life insurance and AD&D supplemental life insurance (Employee/Spouse/Child); health care and dependent care Flexible Spending Accounts; pre-tax commuter and parking benefits; 401(k) Savings and Investment Plan with company match; paid time off: flexible vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation. 11 paid holidays; paid sick leave: 72 hours of paid sick leave upon date of hire, refreshed each calendar year, with unused balance carrying over each year up to a maximum cap of 112 hours; paid parental leave; adoption assistance; Employee Stock Purchase Plan; financial planning and group legal; voluntary benefits including auto, homeowner and pet insurance.