Software Engineer III, AI/ML, GPU Inference, Optimization

Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google's needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve.

The Cloud ML Compute Services (CMCS) team sits within the Cloud organization and is chartered to build cross-Google alignment toward a unified central infrastructure that hosts all of Google's ML needs, for both internal and external use cases. The CMCS Inference team is part of CMCS and focuses on inference workloads and the serving infrastructure. In this role, you will optimize machine learning models for large-scale inference workloads, drawing on experience with large-scale machine learning (ML) optimization techniques for improving latency and throughput. You will also have experience with accelerators (TPUs or GPUs) or high-performance computing (HPC).

The ML, Systems, & Cloud AI (MSCA) organization at Google designs, implements, and manages the hardware, software, machine learning, and systems infrastructure for all Google services (Search, YouTube, etc.) and Google Cloud. Our end users are Googlers, Cloud customers and the billions of people who use Google services around the world. We prioritize security, efficiency, and reliability across everything we do - from developing our latest TPUs to running a global network, while driving towards shaping the future of hyperscale computing. Our global impact spans software and hardware, including Google Cloud's Vertex AI, the leading AI platform for bringing Gemini models to enterprise customers.

The US base salary range for this full-time position is $141,000-$202,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process. Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits.

Responsibilities:

Write product or system development code.
Collaborate with peers and stakeholders through design and code reviews to ensure best practices amongst available technologies.
Contribute to existing documentation or educational content and adapt content based on product/program updates and user feedback.
Triage product or system issues and debug/track/resolve by analyzing the sources of issues and the impact on hardware, network, or service operations and quality.
Implement solutions in one or more specialized ML areas, utilize ML infrastructure, and contribute to model improvement and data processing.
