Senior Data Engineer

Architect and optimize large-scale data pipelines on AWS using Spark and Databricks
Dubai
Senior
3 weeks ago
Washmen

A laundry and dry cleaning service provider offering pickup and delivery in the UAE.

Senior Data Engineer

We're seeking a self-sufficient Senior Data Engineer to build and scale the data infrastructure supporting our product, engineering, and analytics teams. You'll architect data pipelines, optimize our data platform, and ensure teams have reliable, high-quality data to drive business decisions. This is a hands-on role for someone who can own the entire data engineering stack, from ingestion to transformation to orchestration. You'll work independently to solve complex data challenges and build scalable solutions.

Core Responsibilities

Data Pipeline Development & Optimization

Design, build, and maintain scalable data pipelines using Spark and Databricks

Develop ETL/ELT workflows to process large volumes of customer behavior data

Optimize Spark jobs for performance, cost efficiency, and reliability

Build real-time and batch data processing solutions

Implement data quality checks and monitoring throughout pipelines (see the sketch after this list)

Ensure data freshness and SLA compliance for analytics workloads
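To give candidates a concrete flavor of this work, here is a minimal PySpark sketch of a daily batch transform with an inline data quality gate. It is a sketch under assumptions: the S3 paths and columns (event_id, event_ts, customer_id) are invented placeholders, not Washmen's actual schema.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events-daily-batch").getOrCreate()

# Illustrative paths and columns only; not the real schema.
raw = spark.read.parquet("s3://example-data-lake/raw/events/")

cleaned = (
    raw.filter(F.col("event_ts").isNotNull() & F.col("customer_id").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
       .dropDuplicates(["event_id"])
)

# Quality gate: fail loudly if too many rows were dropped, so the
# orchestrator's alerting and retry logic can take over.
raw_count, clean_count = raw.count(), cleaned.count()
if raw_count > 0 and clean_count / raw_count < 0.95:
    raise ValueError(f"quality check failed: kept {clean_count}/{raw_count} rows")

# Partition by date so downstream analytical queries prune efficiently.
cleaned.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-data-lake/curated/events/"
)
```

Failing loudly on a quality regression, instead of silently writing bad data, is what lets the scheduling and alerting layer described under Orchestration & Automation do its job.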

AWS Data Infrastructure

Architect and manage data infrastructure on AWS (S3, Glue, EMR, Redshift)

Design and implement data lake architecture with proper partitioning and optimization

Configure and optimize AWS Glue for ETL jobs and data cataloging (a short sketch follows this list)

Migrate existing Glue jobs to zero-ETL integrations where appropriate

Implement security best practices for data access and governance

Monitor and optimize cloud costs related to data infrastructure
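As one small example of the cataloging side, here is a hedged boto3 sketch that refreshes the Glue Data Catalog after new partitions land; the crawler name and region are hypothetical, not existing Washmen resources.

```python
import boto3

glue = boto3.client("glue", region_name="me-central-1")  # region is illustrative

CRAWLER = "events-curated-crawler"  # hypothetical crawler name


def refresh_catalog() -> None:
    """Re-crawl the curated prefix so new partitions appear in the
    Glue Data Catalog (and hence in Athena and Redshift Spectrum)."""
    state = glue.get_crawler(Name=CRAWLER)["Crawler"]["State"]
    if state == "READY":  # avoid starting a crawler that is already running
        glue.start_crawler(Name=CRAWLER)


if __name__ == "__main__":
    refresh_catalog()
```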

Data Modeling & Architecture

Design and implement dimensional data models for analytics

Build star/snowflake schemas optimized for analytical queries

Create data marts for specific business domains (retention, campaigns, product)

Ensure data model scalability and maintainability

Document data lineage, dependencies, and business logic

Implement slowly changing dimensions and historical tracking (sketched below)
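
For instance, slowly changing dimensions map naturally onto Delta Lake on Databricks. Below is a hedged two-step SCD Type 2 sketch in Spark SQL; the tables and columns (dim_customer, stg_customer, address, valid_from, valid_to, is_current) are invented for illustration, and spark is the SparkSession a Databricks notebook predefines.

```python
# Step 1: close out current dimension rows whose tracked attribute changed.
spark.sql("""
    MERGE INTO dim_customer AS d
    USING stg_customer AS s
      ON d.customer_id = s.customer_id AND d.is_current = true
    WHEN MATCHED AND d.address <> s.address THEN
      UPDATE SET d.is_current = false, d.valid_to = current_date()
""")

# Step 2: insert a fresh current row for new and changed customers
# (changed customers no longer have a current row after step 1).
spark.sql("""
    INSERT INTO dim_customer
    SELECT s.customer_id,
           s.address,
           current_date()     AS valid_from,
           CAST(NULL AS DATE) AS valid_to,
           true               AS is_current
    FROM stg_customer s
    LEFT JOIN dim_customer d
      ON d.customer_id = s.customer_id AND d.is_current = true
    WHERE d.customer_id IS NULL
""")
```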

Orchestration & Automation

Build and maintain workflow orchestration using Airflow or similar tools (see the DAG sketch after this list)

Implement scheduling, monitoring, and alerting for data pipelines

Create automated data quality validation frameworks

Design retry logic and error handling for production pipelines

Build CI/CD pipelines for data workflows

Automate infrastructure provisioning using Infrastructure as Code
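To illustrate the orchestration layer, here is a minimal Airflow DAG sketch with daily scheduling, retries with backoff, and an alerting hook. The DAG id, schedule, and email address are placeholders, and the Python callable stands in for a real Databricks job submission.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_events_batch(**_):
    """Placeholder for submitting the Spark/Databricks batch job."""
    print("submit the Databricks job here")


default_args = {
    "owner": "data-eng",
    "retries": 3,                          # retry transient failures
    "retry_delay": timedelta(minutes=10),
    "retry_exponential_backoff": True,
    "email_on_failure": True,              # alerting hook (needs SMTP config)
    "email": ["data-alerts@example.com"],  # illustrative address
}

with DAG(
    dag_id="events_daily",
    start_date=datetime(2024, 1, 1),
    schedule="0 3 * * *",  # daily at 03:00 (Airflow 2.4+ keyword)
    catchup=False,
    default_args=default_args,
) as dag:
    PythonOperator(task_id="events_batch", python_callable=run_events_batch)
```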

Cross-Functional Collaboration

Partner with the Senior Data Analyst to understand analytics requirements

Work with the Growth Director and team to enable data-driven decision making

Support the CRM Lead with data needs for campaign execution

Collaborate with Product and Engineering on event tracking and instrumentation

Document technical specifications and best practices for the team

Work closely with all squads and establish data contracts with engineers to land data optimally.

Required Qualifications

Must-Have Technical Skills

Apache Spark: Expert-level proficiency in PySpark/Spark SQL for large-scale data processing - this is non-negotiable

Databricks: Strong hands-on experience building and optimizing pipelines on the Databricks platform - this is non-negotiable

AWS: Deep knowledge of AWS data services (S3, Glue, EMR, Redshift, Athena) - this is non-negotiable

Data Modeling: Proven experience designing dimensional models and data warehouses - this is non-negotiable

Orchestration: Strong experience with workflow orchestration tools (Airflow, Prefect, or similar) - this is non-negotiable

SQL: Advanced SQL skills for complex queries and optimization

Python: Strong programming skills for data engineering tasks

Experience

6-10 years in data engineering with focus on building scalable data platforms

Proven track record architecting and implementing data infrastructure from scratch

Experience processing large volumes of event data (billions of records)

Background in high-growth tech companies or consumer-facing products

Experience with mobile/web analytics data preferred

Technical Requirements

Expert in Apache Spark (PySpark and Spark SQL) with performance tuning experience

Deep hands-on experience with Databricks (clusters, jobs, notebooks, Delta Lake)

Strong AWS expertise: S3, Glue, EMR, Redshift, Athena, Lambda, CloudWatch

Proficiency with orchestration tools: Airflow, Prefect, Step Functions, or similar

Advanced data modeling skills: dimensional modeling, normalization, denormalization

Experience with data formats: Parquet, Avro, ORC, Delta Lake

Version control with Git and CI/CD practices

Infrastructure as Code: Terraform, CloudFormation, or similar

Understanding of data streaming technologies (Kafka, Kinesis) is a plus

Core Competencies

Self-sufficient: You figure things out independently without constant guidance

Problem solver: You diagnose and fix complex data pipeline issues autonomously

Performance-focused: You optimize for speed, cost, and reliability

Quality-driven: You build robust, maintainable, and well-documented solutions

Ownership mindset: You take end-to-end responsibility for your work

Collaborative: You work well with analysts and business stakeholders while operating independently

What We Offer

Competitive salary based on experience

Ownership of critical data infrastructure and architecture decisions

Work with modern data stack and cutting-edge AWS technologies

Direct impact on business decisions through data platform improvements

Comprehensive health benefits
