Company Name: Confidential
Job Title: Data Engineer
Qualification: Any graduate degree
Experience: 5 to 8 years
Must Have Skills:
- Exposure to AWS RDS, DynamoDB, or other database services.
- Experience with Spark Streaming, Kafka, or Structured Streaming (a streaming sketch follows this list).
- Knowledge of CloudFormation, Terraform, or CDK for infrastructure as code.
- Experience with Python testing frameworks (pytest/unittest), async programming, and Flask/FastAPI.
- Familiarity with Databricks, Hive, Airflow, and CI/CD pipelines.
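For reference, a minimal sketch of the kind of Kafka-to-S3 Structured Streaming job implied by the skills above. The broker address, topic name, schema, and bucket paths are illustrative assumptions, not details from this posting, and the spark-sql-kafka connector package is assumed to be on the classpath.

```python
# Minimal Structured Streaming sketch: read JSON events from Kafka, land as Parquet on S3.
# Broker, topic, schema, and paths below are assumptions for illustration only.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

# Assumed event schema.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
])

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker address
    .option("subscribe", "events")                     # assumed topic name
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3a://example-bucket/events/")    # assumed output path
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/events/")
    .start()
)
query.awaitTermination()
```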
Good to Have Skills:
- Strong hands-on experience with AWS services: EC2, S3, IAM, Lambda, CloudWatch.
- Proficiency in Python for scripting, data processing, and API integration (a boto3 sketch follows this list).
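As a sketch of the Python-for-AWS scripting mentioned above, the snippet below lists objects under an S3 prefix and downloads one file with boto3. The bucket name, prefix, and key are hypothetical.

```python
# Minimal boto3 sketch: enumerate objects under a prefix and fetch one locally.
# Bucket, prefix, and key names are assumptions for illustration only.
import boto3

s3 = boto3.client("s3")

resp = s3.list_objects_v2(Bucket="example-data-bucket", Prefix="raw/2024/")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])

s3.download_file("example-data-bucket", "raw/2024/sample.csv", "/tmp/sample.csv")
```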
Roles and Responsibilities:
- Design, develop, and deploy scalable data pipelines using PySpark on cloud platforms such as AWS EMR or Databricks (a batch pipeline sketch follows this list).
- Build and manage data lake/storage solutions on Amazon S3, and orchestrate workflows with AWS Lambda, Step Functions, or Apache Airflow.
- Write efficient, reusable, and modular Python code for automation, transformation, and integration tasks.
- Optimize the performance of distributed data processing jobs by tuning Spark configurations and applying best practices (partitioning, caching, etc.).
- Monitor, log, and troubleshoot pipelines using CloudWatch and CloudTrail, and implement alerting strategies.
- Secure cloud resources with appropriate IAM roles and policies, and ensure data privacy and compliance standards are met.
- Work closely with cross-functional teams, including data scientists, analysts, DevOps, and product teams.
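A short batch PySpark sketch of the pipeline and tuning work described above, showing repartitioning on an aggregation key and caching a reused DataFrame. Input/output paths and column names are assumptions, not details from this posting.

```python
# Sketch of a batch PySpark transform illustrating partition tuning and caching.
# Paths and column names below are assumptions for illustration only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-orders-sketch").getOrCreate()

orders = spark.read.parquet("s3a://example-bucket/raw/orders/")  # assumed input path

# Repartition on the aggregation key to spread the shuffle evenly,
# and cache because the frame would typically feed several downstream writes.
orders = orders.repartition(200, "customer_id").cache()

daily = (
    orders.groupBy("customer_id", F.to_date("order_ts").alias("order_date"))
    .agg(F.sum("amount").alias("daily_spend"))
)

daily.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://example-bucket/curated/daily_spend/"                  # assumed output path
)
```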
Location: Hyderabad, Noida
CTC Range: 22 LPA
Notice Period: Immediate
Mode of Interview: Virtual