Sr Software Engineer - Network Reliability Engineering - AI/ML

Develop AI-powered automation tools to improve network reliability and operational efficiency
Helena, Montana, United States
Senior
$79,200 – 178,100 USD / year
yesterday
+Veterans Staffing

A platform dedicated to connecting U.S. military veterans with employment opportunities and career development resources.

Oracle Cloud Infrastructure Sr Software Developer

Oracle Cloud Infrastructure (OCI) provides mission-critical cloud services to enterprises worldwide. The Network Reliability Engineering (NRE) Automation, Reporting, and Tooling team builds innovative solutions that boost the productivity and efficiency of the Global Network Operations Center (GNOC). Our tooling empowers the GNOC and Network Reliability Engineering (NRE) teams with observability, automation, and actionable insights at hyperscale. As a Sr Software Developer, you will design, build, and deliver scalable automation frameworks and advanced platforms leveraging AI/ML to drive operational excellence across OCI's global network.

This includes building data driven by network events (such as failures), hybrid classification, and both model training and inference. You are passionate about developing software that solves real-world operational challenges, thrive in a fast-paced team, and are comfortable working with complex distributed systems. You value simplicity, scalability, and collaboration.

Responsibilities

  • Architect, build, and support distributed systems for process control and execution based on Product Requirement Documents (PRDs).
  • Develop and sustain DevOps tooling, new product process integrations, and automated testing.
  • Develop ML in Python 3; build backend services in Go (Golang); create command-line interface (CLI) tools in Rust or Python 3; and integrate with other services as needed using Go, Python 3, or C.
  • Build and maintain schemas/models to ensure every platform and service write is captured for monitoring, debugging and compliance.
  • Build and maintain dashboards that monitor the quality and effectiveness of service execution for the "process as code" your team delivers.
  • Build automated systems that route code failures to the appropriate on-call engineers and service owners.
  • Ensure high availability, reliability, and performance of developed solutions in production environments.
  • Support serverless workflow development for workflows that call the above-mentioned services to support our GNOC, GNRE, and onsite operations and hardware support teams.
  • Participate in code reviews, mentor peers, and help build a culture of engineering excellence.
  • Operate in an Extreme Programming (XP) asynchronous environment (chat/tasks) without daily standups, and keep work visible by continuously updating task and ticket states in Jira.

Required Qualifications

  • 3–5 years of experience in process as code, software engineering, automation development, or similar roles.
  • Bachelor's degree in computer science, engineering, or a related engineering field.
  • Strong coding skills in Go and Python 3.
  • Experience with distributed systems, microservices, and cloud-native technologies.
  • Proficiency in Linux environments and scripting languages.
  • Proficiency with database creation, maintenance, and coding using SQL and Go or Python 3 libraries.
  • Understanding of network operations or large-scale IT infrastructure.
  • Excellent problem-solving, organizational, and communication skills.
  • Experience using AI coding assistants or AI-powered tools to help accelerate software development, including code generation, code review, or debugging.

Preferred Qualifications

  • Process engineering experience (control systems, proportional-integral-derivative (PID) controllers, statistical process control (SPC)).
  • Proficiency with data modeling, data analysis, and reporting frameworks (e.g., SQL, Spark, Prometheus, Grafana, etc.).
  • Experience with C, C++, Java, or Rust.
  • Experience developing automation and tools for network or at-scale cloud operations.
  • Background in creating dashboards, alerts, and real-time reporting platforms.
  • Familiarity with workflow automation (e.g., Apache Airflow), CI/CD pipelines, or infrastructure as code.
  • Previous experience supporting or building tools for (any) hyperscale or at-scale cloud network, compute, or storage operations.
  • Knowledge of REST APIs, remote procedure calls (RPCs), and service-oriented architectures (SOA).
  • Familiarity with Extreme Programming (XP), Agile, and DevOps processes.
  • Experience with ticketing and version control systems (e.g., Jira, Git).

Disclaimer: Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates. Range and benefit information provided in this posting are specific to the stated locations only.

US: Hiring Range in USD from $79,200 to $178,100 per annum. May be eligible for bonus and equity.

Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle's differing products, industries and lines of business. Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.

Oracle US offers a comprehensive benefits package which includes the following:

  1. Medical, dental, and vision insurance, including expert medical opinion
  2. Short term disability and long term disability
  3. Life insurance and AD&D
  4. Supplemental life insurance (Employee/Spouse/Child)
  5. Health care and dependent care Flexible Spending Accounts
  6. Pre-tax commuter and parking benefits
  7. 401(k) Savings and Investment Plan with company match
  8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
  9. 11 paid holidays
  10. Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
  11. Paid parental leave
  12. Adoption assistance
  13. Employee Stock Purchase Plan
  14. Financial planning and group legal
  15. Voluntary benefits including auto, homeowner and pet insurance

The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.

Career Level - IC3
