This role is 100% virtual/work-from-home, but can also be based onsite at our Corporate Headquarters.
Key Responsibilities:
Pipeline Reliability: Design, implement, and manage monitoring, alerting, and logging systems for all production data pipelines to ensure high availability and reliability within the Databricks environment.
CI/CD Implementation: Develop and maintain Continuous Integration/Continuous Delivery (CI/CD) pipelines for data code and infrastructure using tools such as Git and Databricks Asset Bundles (or similar), ensuring rapid and safe deployment of changes.
Proactive Resolution: Identify, diagnose, and resolve workflow issues, performance bottlenecks, and operational failures, applying a software engineering mindset to infrastructure and operations.
Integration Development: Develop and optimize complex, scalable data integration solutions to ingest and deliver data from various enterprise source systems.
Code Quality: Set and maintain high standards for clean code, sound architecture, and robust testing of integration components, treating data pipelines as production-grade software (see the illustrative sketch after this list).
Collaboration: Work directly with Senior Data Engineers and Architects to translate designs into stable, well-engineered, and operational data services.
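For illustration only, the following is a minimal sketch of the kind of production-grade testing described above. The transform function, field names, and expected values are hypothetical and are not taken from this posting; they simply show the style of robust, tested Python code the role involves.

# Illustrative only: a hypothetical integration transform and a pytest-style unit test.
from datetime import date

def normalize_order(record: dict) -> dict:
    """Hypothetical transform: trims identifiers, parses the order date,
    and converts the amount from cents to dollars."""
    return {
        "order_id": record["order_id"].strip(),
        "order_date": date.fromisoformat(record["order_date"]),
        "amount_usd": record["amount_cents"] / 100,
    }

def test_normalize_order_handles_padded_ids_and_cents():
    raw = {"order_id": "  A-1001 ", "order_date": "2024-05-01", "amount_cents": 2599}
    assert normalize_order(raw) == {
        "order_id": "A-1001",
        "order_date": date(2024, 5, 1),
        "amount_usd": 25.99,
    }

Tests of this kind would typically run inside the CI/CD pipeline on every change before deployment.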
Basic Qualifications:
Must have, at minimum, a Bachelor's degree in Computer Science, Software Engineering, or a related technical field.
Must have a minimum of 8 years of experience in roles focused on software development, DevOps, and data engineering.
Proven experience building data integrations from diverse sources such as SQL and Oracle databases and REST APIs.
Advanced proficiency in Python, with a strong background in writing robust, tested application code.
Hands-on experience with Databricks, Apache Spark, and AWS cloud services (e.g., S3, Kafka, Batch), and with implementing Infrastructure as Code.
Demonstrated experience designing and implementing CI/CD practices for data or software systems.
Preferred Qualifications:
Experience implementing data governance and permissions at scale within modern tools and technologies.
Deep understanding of networking, security, and performance tuning with Databricks and AWS.
Experience integrating data from specific enterprise systems (e.g., SAP, MES, PLM).