Senior Data Engineer (PySpark) - Technology - Remote Eligible

Maintain and optimize scalable data pipelines on AWS cloud infrastructure
Remote
Senior
Posted 18 hours ago
Truelogic

A digital solutions provider specializing in software development, web design, and digital marketing services.

Data Project Manager

This role is responsible for maintaining and optimizing data projects to ensure high performance, reliability, and scalability. It supports the data services initiative by strengthening data-driven capabilities through the integration of reliable data sources and improved client engagement.

Responsibilities

  • Provide maintenance and technical support for existing data projects to ensure performance, reliability, and scalability.
  • Evaluate and onboard potential data providers, ensuring alignment with business and technical requirements.
  • Assess prospective clients interested in data services, identifying opportunities for integration and value creation.
  • Collaborate with cross-functional teams to enhance data quality, streamline operations, and support ongoing data strategy initiatives.
  • Contribute to the continuous improvement of data workflows and service delivery processes.

Qualifications and Job Requirements

  • Strong proficiency in PySpark and experience with distributed computing frameworks.
  • Solid understanding of SQL (Presto, Athena), Python, and shell scripting.
  • Hands-on experience working with the AWS cloud ecosystem, including S3, EC2, EMR, Glue, Athena, and related services.
  • Proven ability to build and maintain robust ETL pipelines using Hive Query Language, Presto SQL, and shell scripts to integrate datasets from diverse sources for predictive modeling.
  • Comfortable working in a fast-paced, agile environment with limited documentation.
  • Strong organizational and communication skills, with the ability to collaborate effectively across teams.

Nice to Have

  • Experience working with Graph Databases.
  • Background in building scalable and distributed ETL pipelines on AWS, ideally using PySpark.
  • Familiarity with Apache Airflow for data orchestration and workflow management (training can be provided).
  • Experience in performance tuning and optimization of Spark jobs.
  • Knowledge of designing resilient big data schemas on the Apache Hadoop platform (at petabyte scale) to support predictive modeling and analytics workflows that enhance marketing ROI.

What We Offer

  • 100% Remote Work: Enjoy the freedom to work from the location that helps you thrive. All it takes is a laptop and a reliable internet connection.
  • Highly Competitive USD Pay: Earn excellent, market-leading compensation in USD that goes beyond typical market offerings.
  • Paid Time Off: We value your well-being. Our paid time off policies ensure you have the chance to unwind and recharge when needed.
  • Work with Autonomy: Enjoy the freedom to manage your time as long as the work gets done. Focus on results, not the clock.
  • Work with Top American Companies: Grow your expertise working on innovative, high-impact projects with industry-leading U.S. companies.

Why You'll Like Working Here

  • A Culture That Values You: We prioritize well-being and work-life balance, offering engagement activities and fostering dynamic teams to ensure you thrive both personally and professionally.
  • Diverse, Global Network: Connect with over 600 professionals in 25+ countries, expand your network, and collaborate with a multicultural team from Latin America.
  • Team Up with Skilled Professionals: Join forces with senior talent. All of our team members are seasoned experts, ensuring you're working with the best in your field.