Sr AI Observability Engineer

Build an AI-powered observability platform to monitor and troubleshoot large-scale distributed systems
Cupertino, California, United States
Senior
$212,000 – $318,400 USD / year
yesterday
Apple

Designs and sells consumer electronics, software, and digital services, including smartphones, computers, wearables, and media platforms.

Do you want to build the future of AI-enabled observability at Apple? We're looking for an experienced AI observability engineer to design and build AI observability solutions that support Apple Intelligence, Search, and the AI infrastructure behind Apple's intelligent products. We're at the forefront of building AI-first observability services, blending AI, cloud-first engineering, and industry standards to deliver smart, scalable solutions. Your work will directly impact the experience of billions of users on their favorite Apple devices. If you are a seasoned principal or senior software engineer with a proven track record of building AI-enabled observability solutions and a deep passion for observability, AI, cloud-native technologies, and large-scale distributed systems, we want to talk with you.

Description

We're pioneering the next generation of AI-powered observability solutions. While we innovate to build new solutions, we also leverage industry-standard open-source technologies. In this role, you will collaborate with a team of engineers to lead the design and development of user-facing observability features for AI/ML products and infrastructure. You will also provide technical guidance, share observability best practices and know-how, leverage AI pipelines, and mentor the team to develop and deliver best-in-class features and a delightful experience for all users.

Minimum Qualifications

• 7+ years of experience building ML pipelines and portable workflows, and tuning models, to deploy ML models and LLMs in production for customer-facing features

• 7+ years of software engineering experience and a strong background in computer science: distributed systems, algorithms and data structures, APIs, and highly scalable, reliable systems and microservices

• Demonstrated experience using LLMs and ML models for AIOps and model observability

• Demonstrated experience using LLMs, ML frameworks (e.g. TensorFlow, PyTorch), and libraries such as scikit-learn, NumPy, LangChain, MLflow, and Kubeflow

• Demonstrated experience delivering well-architected, reliable, highly scalable cloud-native distributed systems for data management, observability, or analytics services

• Strong software engineering experience in design, development and testing in cloud-native environments

• Strong coding skills in Python, Go, JavaScript, Java

• Demonstrated experience building large-scale microservices using public cloud infrastructure and/or private cloud environments

• Experience developing intelligent detection and resolution features for incident management, automated remediation and root cause analysis

• Excellent verbal and written communication skills and strong problem-solving skills

• Excellent interpersonal skills for collaborating with teams, stakeholders, and open-source contributors

Preferred Qualifications

• Knowledge of current generative AI research and techniques in the following areas: MCP, RAG systems, and agentic AI (multi-agent orchestration, tool calling)

• Hands-on experience with agentic AI frameworks (e.g. LangGraph, AutoGen, CrewAI) for building multi-step reasoning and tool-using agents

• Demonstrated experience building observability systems for metrics, distributed tracing, logs, and profiling, and building observability data collection using OpenTelemetry

• Demonstrated proficiency in AWS services such as EKS and native Kubernetes, storage such as S3, networking, database and observability services

• Experience with large-scale observability visualization systems and knowledge of popular visualization tools such as Grafana, Datadog, and ELK

• Proficiency using cloud-native software development tools including coding, CI/CD and testing frameworks

• Experience building large-scale incident management, alert management, and notification systems

• Active contributions to open-source projects are a plus

Pay & Benefits

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $212,000 and $318,400, and your base pay will depend on your skills, qualifications, experience, and location. Apple employees also have the opportunity to become an Apple shareholder through participation in Apple's discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple's Employee Stock Purchase Plan. You'll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics.

Apple accepts applications to this posting on an ongoing basis.

Engineering