Job Description:
• Design, build, and maintain scalable data pipelines using PySpark and Python
• Develop and optimize complex SQL queries for data processing
• Implement and manage ETL/ELT workflows, ensuring data quality and reliability
• Collaborate with business and product teams to translate requirements into data solutions
• Build and maintain data warehouse systems
• Handle large-scale data processing using Hadoop and Big Data technologies
• Perform performance tuning and optimization of data pipelines

Required Skills:
• Strong hands-on experience with PySpark and Python
• Advanced proficiency in SQL
• Solid experience with ETL processes and data warehousing
• Familiarity with the Hadoop ecosystem and Big Data technologies
• Experience working with large datasets in distributed environments
• Good communication skills and business understanding

Good to Have:
• Experience with Apache Airflow
• Exposure to cloud platforms (AWS, GCP, Azure)
• Knowledge of data lakes and modern data architectures
• Experience with streaming tools (Kafka, Spark Streaming)