We are looking for a Data Engineer to push the boundaries of machine learning and simulation at Amazon scale. As a member of People Experience Technology and Finance, you will help drive AI/ML forecasting and prediction improvements by delivering state-of-the-art system and model optimization techniques in collaboration with multiple Amazon science and engineering teams. Candidates should be passionate about technology, innovation, and customer experience, and ready to make a lasting impact on AI-driven solutions and business intelligence. You'll work with talented scientists, engineers, and product and finance managers to innovate on behalf of our customers. If you're fired up about being part of a dynamic, driven team, then this is your moment to join us on this exciting journey and change the world of distributed simulations for population dynamics forecasting and expense planning. This is a high-impact, high-visibility role in which you will lead the development of applications used by planners and decision makers across Amazon.
Key job responsibilities:
A day in the life:
As a member of the Finance Forecasting team, you'll play a key role in solving one of the world's most complex technical challenges in data engineering. You will use large-scale compute platforms to build big datasets used in distributed systems for machine learning and statistical analysis. Our Data Engineer needs to gather and understand data requirements, build and maintain big data sources that prepare data for machine learning models, data scientists, and business intelligence engineers, and work with software engineers to deliver high-quality data ingestion and transformation solutions.
Successful candidates should come from a strong data engineering background. You need experience with structured data and the ability to analyze and transform it using a variety of tools. Your analytical skills and knowledge of schemas, metadata, and data structures in the analytical data world will be essential. As a data engineer, you will design and develop highly scalable ETLs with EMR, Spark, and PyTorch-based applications, as well as support them on Glue ETL or Redshift. Knowledge of big data architecture and design is a must.