
Senior Software Engineer

Design and implement real-time data pipelines processing over a million events per second
United States
Senior
$143,625 – 299,375 USD / year
yesterday
Yahoo!

An internet service provider offering a search engine, email, news, and a variety of other online media and communication services.

Data Engineer

Yahoo serves as a trusted guide for hundreds of millions of people globally, helping them achieve their goals online through our portfolio of iconic products. For advertisers, Yahoo Advertising offers omnichannel solutions and powerful data to engage with our brands and deliver results.

The ideal candidate will have strong distributed systems knowledge and AI/ML experience to design, build, and optimize scalable data pipelines and infrastructure that power advanced analytics and machine learning solutions. In this role, you will collaborate closely with software engineers, product owners, and business stakeholders to prepare and transform large datasets (real-time pipelines), support end-to-end development and deployment, and ensure robust, efficient, and secure data flows. You will leverage your expertise in cloud platforms, big data tools, and machine learning frameworks to drive innovation and deliver actionable insights that advance our organization's AI initiatives and business objectives.

Yahoo's Central Data team manages massive-scale (100+ petabyte) data systems to glean insights on Yahoo! products and to improve the experience for its 1B+ user base. The team provides the foundations for user engagement data collection and processing for all of Yahoo's users, as well as operational excellence, anomaly detection, and governance across the organization. Your work will directly influence product changes, and you will work on a team of talented and motivated engineers to improve the user experience on popular Yahoo! sites and apps like Yahoo Mail & Homepage, Yahoo Sports, Yahoo Finance, Yahoo News, and many other new products.

A lot about you:

  • Apply software engineering expertise to build high-performance, scalable data warehouses.
  • Be excited to learn and take ownership of large-scale projects spanning many tech stacks and environments.
  • Design, build, and launch efficient and reliable data pipelines to move and transform data on the scale of multiple petabytes using the latest technologies.
  • Build real-time analytics and ingestion pipelines capable of processing more than a million events per second and providing insights at sub-second latency.
  • Interact with product owners and end users to understand and solve new business requirements as they emerge.
  • Design and audit processes for ensuring the delivery of high-quality data through rigorous QA checks.
  • Have excellent data modeling skills to understand the nuances of various dimension and metric types in warehouses.
  • Design workflows to ingest, load and present new data sets for users.
  • Provide active support, including an on-call rotation for production pipelines (typically a couple of times each quarter).
  • Define and manage SLAs for all data sets in allocated areas of ownership.
  • Work with the production engineering / infrastructure team to drive resolution of production issues.

Required skills:

  • BS/MS in Computer Science and/or Mathematics/Statistics
  • 4+ years of experience in relevant software development, with at least 2 years of professional Java and/or Python experience.
  • 2+ years of experience in the Big Data pipeline and analytics space, with experience across technology stacks.
  • 2+ years of experience in custom ETL design, implementation, and maintenance using Big Data stack environments (Hadoop, MapReduce, Pig, Hive, AWS EMR, Apache Beam, Google Cloud Platform Dataflow, BigQuery).

Preferred experience:

  • Experience or familiarity with some of the following tools: Kafka, Storm, Streaming (Spark, Dataflow), Elasticsearch.
  • Design, build, and maintain scalable data pipelines and ETL processes to support machine learning and AI initiatives on Google Cloud Platform (GCP).
  • Implement and optimize data storage solutions using GCP services such as BigQuery, Cloud Storage, and Dataflow.
  • Ensure data quality, integrity, and security throughout the data lifecycle.
  • Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver actionable insights.
  • Monitor, troubleshoot, and maintain the health and performance of cloud-based data infrastructure.
  • Automate manual processes and repetitive tasks to improve efficiency and reduce errors.
  • Apply data governance and compliance best practices to protect sensitive information and meet regulatory standards.
  • Stay current with new GCP features, tools, and best practices to continuously enhance data management capabilities.
  • Document solutions, processes, and architectural decisions to facilitate knowledge sharing and maintainability.
  • Experience working with MapReduce or any other parallel data processing system.
  • Experience with schema design and dimensional data modeling.
  • Comfortable writing complex SQL queries.
  • Strong data mindset with a deep appreciation for analyzing data to identify product gaps and enhancements to improve user engagement and revenue growth.
  • Excellent communication skills, with the ability to tell insightful stories using data and to manage communication with internal teams and stakeholders.

At Yahoo, we offer flexible hybrid work options that our employees love! While most roles don't require regular office attendance, you may occasionally be asked to attend in-person events or team sessions. You'll always get notice to make arrangements. Your recruiter will let you know if a specific job requires regular attendance at a Yahoo office or facility. If you have any questions about how this applies to the role, just ask the recruiter!

Yahoo is proud to be an equal opportunity workplace. All qualified applicants will receive consideration for employment without regard to, and will not be discriminated against based on, age, race, gender, color, religion, national origin, sexual orientation, gender identity, veteran status, disability, or any other protected category. Yahoo will consider for employment qualified applicants with criminal histories in a manner consistent with applicable law. Yahoo is dedicated to providing an accessible environment for all candidates during the application process and for employees during their employment. If you need accessibility assistance and/or a reasonable accommodation due to a disability, please submit a request via the Accommodation Request Form or call +1.866.772.3182.

The compensation for this position ranges from $143,625.00 - $299,375.00/yr and will vary depending on factors such as your location, skills and experience. The compensation package may also include incentive compensation opportunities in the form of discretionary annual bonus or commissions. Our comprehensive benefits include healthcare, a great 401k, backup childcare, education stipends and much more.

Engineering