Company Name: Confidential
Job Title: Lead Data Engineer with Python
Qualification: Any graduate degree
Experience: 10 to 13 years
Must Have Skills:
- Strong proficiency in Python for data engineering, ETL pipelines, and analytics on Databricks.
- Hands-on experience with Apache Spark (PySpark), including data transformations, joins, aggregations, and performance optimization.
- Expertise in Databricks environment setup and management, including clusters, jobs, notebooks, and workspace administration.
- Experience integrating Databricks with cloud platforms (Azure, AWS, or GCP) for data ingestion and processing.
- Knowledge of Delta Lake and data lake architecture for building reliable, scalable data pipelines.
Good to Have Skills:
- Familiarity with CI/CD pipelines for Databricks (e.g., Azure DevOps, GitHub Actions, or Jenkins).
- Experience with SQL and data warehousing concepts for analytics and reporting.
- Understanding of machine learning workflows in Databricks using MLflow or scikit-learn.
- Knowledge of data governance and security practices (e.g., Unity Catalog, data lineage).
- Proficiency in REST APIs or other integration frameworks for connecting external systems with Databricks.
Roles and Responsibilities:
- Design, develop, and maintain ETL/data pipelines on Databricks using Python and PySpark.
- Collaborate with data engineers, analysts, and stakeholders to ensure data availability, quality, and reliability.
- Optimize Spark jobs and cluster configurations for performance and cost efficiency.
- Implement best practices for version control, testing, and deployment of Databricks notebooks and workflows.
- Monitor, troubleshoot, and support production workloads within the Databricks environment.
Location: Hyderabad
CTC: 28 LPA
Notice Period: Immediate to 15 days
Shift Timings: 1 PM to 10 PM
Mode of Interview: Virtual
Mode of Work: Work from office, 5 days a week
Mode of Hire: Permanent