Technology Product Owner – AI Operations & Resilience Engineering
Fuel your passion for AI and Engineering and transform outcomes for CBA's people and customers!
Join Australia's largest bank as we lead the world in AI innovation and ambition.
Let's revolutionise our engineering community with cutting-edge AI tools and capabilities!
About the Team
The CIO for Technology team ensures CommBank has world-class engineering capability and is at the forefront of technology, using innovative and emerging technologies to help our customers and support our Group Strategy. Focused on innovation, best practices, and collaboration, the team seeks a Technology Product Owner with strong engineering leadership experience to work with our global engineering community and deliver AI-assisted tools and capabilities that enhance productivity and resilience across the bank.
About AI Powered Engineering
Through partnerships with leading LLM and AI Developer Platforms like Anthropic, we're providing over 7,000 engineers with world-class AI-powered engineering tools and capabilities. Our mission is to unlock the potential of these tools and help engineers be more productive, less burdened by repetitive tasks, and more focused on innovation.
We are the team behind Project Coral, an agentic AI solution.
What You'll Do
Bring your engineering leadership experience to shape AIOps resilience capabilities across the Challenge and Run (testing and production) phases of our full-cycle engineering model:
- Own the product vision and roadmap for AIOps resilience capabilities (e.g., intelligent alerting, anomaly detection, predictive capacity/latency analytics, incident copilots, automated remediation, runbook orchestration, chaos/continuous resilience testing).
- Lead a cross-functional squad with an Engineering Lead to deliver incremental value: define OKRs, prioritise backlogs, plan releases, and communicate outcomes with transparent, executive-ready reporting.
- Partner with SRE, platform, cloud, and application teams to instrument SLIs/SLOs, reduce MTTR/MTTD, and industrialise practices like error budgeting, capacity management, change risk scoring, and failure-mode analysis.
- Integrate AIOps with core operational tooling and processes (e.g., observability stacks, CMDB, ITSM/ITIL workflows such as ServiceNow; on-call and incident tooling like PagerDuty/Opsgenie; runbook and automation platforms).
- Translate operational pain points into clear requirements and acceptance criteria; embed telemetry, feedback loops, and robust product analytics to validate value realisation (noise reduction, toil elimination, stability uplift).
- Navigate complex stakeholder landscapes—engineering, cyber, risk, compliance, and business operations—aligning priorities, managing dependencies, and balancing innovation with control obligations.
- Embed Responsible AI guardrails in operational use cases (explainability, human-in-the-loop for remediation, access controls, monitoring and evaluation of AI models).
- Drive adoption, enablement, and change management at scale—training, communications, playbooks, and communities of practice to uplift operational maturity and behaviour.
- Remain current on AI and reliability trends (LLMs for summarisation/RCA, RAG over runbooks, causal/seasonal anomaly detection, event correlation, pattern mining, chaos engineering, resilience testing) and apply them pragmatically to real environments.
About You
- Former engineer or strong engineering background with hands-on experience in software, platform, or site reliability engineering.
- Proven leadership in engineering teams or technical squads, with the ability to coach and influence engineering practices.
- Demonstrated product ownership experience in technology operations, SRE, or platform enablement products in large/regulated organisations.
- Deep understanding of Challenge and Run phases: resilience engineering, operational excellence, and production reliability.
- Comfortable engaging with modern observability and AIOps ecosystems (Prometheus/Grafana, Splunk/Elastic, Datadog, Dynatrace/New Relic, OpenTelemetry; PagerDuty/Opsgenie; Kubernetes; public cloud; automation/runbooks).
- Skilled in Agile delivery, product discovery, and OKR-driven prioritisation; adept with Jira and Confluence; excellent storytelling and executive communication.
- Experience working within risk, security, privacy, and compliance frameworks; able to align with operational resilience standards and regulatory expectations.
- Curious and resilient, with a bias to action: comfortable challenging assumptions, asking the hard questions, and unblocking delivery in a complex, matrixed environment.
If you've led engineering teams and understand what it takes to keep systems resilient and reliable at scale, this role is for you. Apply today!
We support our people with the flexibility to balance where work is done, with at least half of your time each month spent connecting in the office.