Join our team as an Engineering Expert to lead cutting-edge projects focused on optimizing our data product runtime architecture within SAP Business Data Cloud. Your expertise will drive innovation and strategic decision-making.
Responsibilities:
• Analyze and refine Spark workloads to enhance performance and resource efficiency, including tuning configuration parameters, re-engineering data processing pipelines, and ensuring optimal execution strategies (an illustrative sketch follows this list).
• Implement monitoring solutions to detect and diagnose performance bottlenecks or failures in Spark applications. Solve complex runtime issues related to resource allocation, execution errors, or data inconsistencies.
• Collaborate with teams across the organization to design and build a framework that allows Business Data Cloud workloads to scale elastically across distributed data environments.
• Produce documentation on best practices for Spark optimization and conduct training sessions or workshops to enhance the broader team's expertise in performance tuning and support strategies.
• Lead AI-driven automation initiatives to improve workload scalability and efficiency.
• Mentor engineering teams and lead cross-functional global collaborations.
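To give candidates a concrete feel for the tuning work described above, here is a minimal, illustrative PySpark sketch. It is not SAP Business Data Cloud code; the paths, table names, and configuration values are assumptions chosen purely for demonstration.

```python
# Illustrative only: generic PySpark tuning sketch, not SAP Business Data Cloud code.
# Paths, table names, and configuration values are assumptions for demonstration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("workload-tuning-sketch")
    # Adaptive Query Execution lets Spark coalesce shuffle partitions
    # and adjust join strategies at runtime.
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
    # Baseline shuffle parallelism; tune to cluster size and data volume.
    .config("spark.sql.shuffle.partitions", "400")
    # Executor sizing is workload-dependent; these values are placeholders.
    .config("spark.executor.memory", "8g")
    .config("spark.executor.cores", "4")
    .getOrCreate()
)

# Hypothetical inputs: a large fact table and a small dimension table.
events = spark.read.parquet("/data/events")        # assumed path
regions = spark.read.parquet("/data/regions")      # assumed path

# Broadcast the small table to avoid a shuffle-heavy sort-merge join.
daily_counts = (
    events
    .join(F.broadcast(regions), on="region_id", how="inner")
    .groupBy("region_name", F.to_date("event_ts").alias("day"))
    .count()
)

daily_counts.write.mode("overwrite").parquet("/data/out/daily_counts")  # assumed path
```

The broadcast hint removes the most expensive shuffle for the small dimension table, while Adaptive Query Execution coalesces the remaining shuffle partitions at runtime; in practice, executor sizes and partition counts would be tuned against actual workload telemetry.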
Qualifications:
• Deep familiarity with principles of distributed computing, including concurrency, fault tolerance, and network latency, which are essential for optimizing distributed data processing.
• Comprehensive knowledge of Apache Spark architecture, including core components such as the driver, executors, and cluster manager, as well as Spark's execution model.
• Expertise in profiling and tuning Spark applications, including optimizing resource allocation, parallelism, and shuffle operations to reduce execution time and improve efficiency (a brief diagnostic sketch follows this list).
• Skilled in writing efficient SQL queries and transformations using Spark SQL and DataFrames, optimizing operations to reduce computation overhead.
• Strong debugging skills to identify and resolve runtime issues, optimize code paths, and rectify configuration or environment-related problems.
• Experience with tools such as Spark's web UI, Ganglia, Grafana, or Prometheus for monitoring application status and diagnosing performance bottlenecks.
• Solid understanding of AI and ML trends and their applications to distributed data query execution.
• Excellent leadership, mentorship, and communication skills.
• Strategic vision and analytical depth to anticipate industry advancements.
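As a companion to the profiling and Spark SQL qualifications above, the following is a hedged diagnostic sketch. The table names, paths, and column names are hypothetical; the pattern simply shows how a shuffle-heavy query plan might be inspected with adaptive skew-join mitigation enabled (Spark 3.x).

```python
# Illustrative only: a generic diagnostic sketch, not SAP-specific code.
# Paths, table names, and column names are assumptions for demonstration.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("profiling-sketch")
    # Adaptive Query Execution can split skewed shuffle partitions at runtime (Spark 3.x).
    .config("spark.sql.adaptive.enabled", "true")
    .config("spark.sql.adaptive.skewJoin.enabled", "true")
    .getOrCreate()
)

# Hypothetical inputs registered as temporary views for Spark SQL.
spark.read.parquet("/data/orders").createOrReplaceTempView("orders")        # assumed path
spark.read.parquet("/data/customers").createOrReplaceTempView("customers")  # assumed path

revenue = spark.sql("""
    SELECT c.country, SUM(o.amount) AS total_amount
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
    GROUP BY c.country
""")

# Inspect the physical plan: look for Exchange (shuffle) operators,
# unexpected sort-merge joins, or scans that miss partition pruning.
revenue.explain("formatted")

# Trigger execution; per-stage and per-query metrics then appear
# in the Spark web UI (Stages and SQL tabs) for bottleneck diagnosis.
revenue.show(5)
```

Reading the formatted plan before execution, then correlating it with the Spark web UI metrics after execution, is the basic loop behind the bottleneck diagnosis described in this posting.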