Building on our Cloud momentum, Oracle has formed a new organization: Oracle Health Data & Analytics Platform. This team will focus on product development and product strategy for Oracle Health while building out a complete platform supporting modernized, automated healthcare. This is a net new line of business, constructed with an entrepreneurial spirit that promotes an energetic and creative environment. We are unencumbered and will need your contribution to make it a world-class engineering center with a focus on excellence. Oracle Health Data & Analytics Platform has a rare opportunity to play a critical role in how Oracle Health products impact and disrupt the healthcare industry by transforming how healthcare and technology intersect.
This role provides technical leadership for the core data platforms behind Oracle Health's Data & Analytics Platform. As a Principal Site Reliability Engineer (SRE), you will own shared, mission-critical systems used by multiple products and teams. You will lead the design and operation of large-scale, stateful distributed platforms, including Hadoop ecosystem components (HDFS, YARN, HBase) deployed on Oracle Big Data Service (BDS), Kafka, and Storm. These multi-tenant platforms are deployed and operated through Ansible- and Terraform-based automation and require strong architectural ownership to manage scale, change, and broad blast radius.
Own the end-to-end reliability, scalability, and operability of shared data platforms. Define platform standards, architectural direction, and operational guardrails. Influence cross-team technical decisions, and drive long-term platform evolution and reliability strategy across the data ecosystem.
Lead platform architecture and design reviews. Clearly articulate system behavior, dependencies, and failure modes. Make principled trade-offs between reliability, performance, cost, and complexity. Provide guidance and guardrails that enable downstream teams to use platforms safely and effectively.
Establish capacity models, scaling strategies, and operational best practices. Design platforms that behave predictably under load, failure, and change. Own platform lifecycle events: upgrades, expansions, decommissioning, and recovery.
Operate and evolve stateful distributed systems where data placement, replication, and recovery are critical. Reason about failure modes such as backpressure, rebalancing, region movement, replication lag, and rolling upgrades.
Operate and maintain Kerberized platforms, including authentication, authorization, and secure service-to-service communication. Treat security as a first-class architectural concern.
Design and evolve an Ansible- and Terraform-driven automation framework. Treat automation as production software: versioned, reviewed, tested, and improved. Eliminate operational toil by encoding reliability and safety into the platform.
Serve as the ultimate escalation point for complex or ambiguous incidents. Focus on eliminating entire classes of failure, not just resolving individual issues.
Represent SRE and platform engineering in high-visibility and sensitive forums. Communicate clearly with engineering leadership and partner teams.
The team operates within the Oracle Health Data & Analytics Platform, supporting one of Oracle Health's core products, HealtheIntent. We operate the big data and streaming infrastructure that enables downstream teams to deliver reliable customer-facing solutions at scale, while continuously improving operability and efficiency.
8+ years operating large-scale, customer-facing distributed platforms. Deep experience with HDFS, YARN, HBase, Kafka, Storm, or similar systems. Strong background in Linux, networking, and distributed system troubleshooting. Infrastructure-as-Code using Ansible and Terraform. Scripting and automation using Python, Ruby, and Bash. Hands-on experience operating Kerberized environments. Proven ability to define and document technical architecture for complex systems. Demonstrated ownership of shared platforms with broad blast radius and multiple downstream consumers. Experience designing observability and capacity models for distributed platforms.
U.S. Citizenship and eligibility for a Federal Security Clearance. 10+ years of technical experience relevant to this position. Ability to communicate effectively and build rapport with team members. BS or MS in Computer Science, or equivalent.
Certain US customer or client-facing roles may be required to comply with applicable requirements, such as immunization and occupational health mandates.
US: Hiring range in USD from $86,400 to $199,500 per annum. May be eligible for bonus and equity. Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions, and locations, as well as to reflect Oracle's differing products, industries, and lines of business. Candidates are typically placed into the range based on the preceding factors as well as internal peer equity. Oracle US offers a comprehensive benefits package.
The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.
Career Level: IC4
As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. We've partnered with industry leaders in almost every sector, and we continue to thrive after 40+ years of change by operating with integrity. We know that true innovation starts when everyone is empowered to contribute. That's why we're committed to growing an inclusive workforce that promotes opportunities for all. Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing accommodation-request_mb@oracle.com or by calling +1 888 404 2494 in the United States.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.