Responsibilities:
- Develop and maintain ETL pipelines to support data processing, analytics, and visualization needs.
- Collaborate with cross-functional teams to understand business requirements and deliver data-driven solutions.
- Integrate multiple disparate data sources into a cohesive data platform and dashboards.
- Develop and implement validation processes to ensure data accuracy and consistency.
- Use PySpark, Spark, and SQL to optimize data processing and query performance.
- Design and implement scalable, fault-tolerant data processing systems on Amazon Web Services (AWS).
- Work independently and be comfortable walking through code with the team lead.
- Document all processes, code, and work; ask questions and raise issues when roadblocks are encountered.

Qualifications:
- Extensive experience developing API services for data consumption.
- Working knowledge of AWS infrastructure, with hands-on experience in Glue, Redshift, and Amazon DynamoDB or other NoSQL or columnar databases.
- Extensive knowledge of data quality management and reference data management.
- Experience with data stewardship processes and data governance is an advantage.
- Ability to multitask and work in an agile environment is a must.