We're seeking a Senior Systems Development Engineer to join the Unified Workcell Compute (UWC) team. This is a high-ownership, high-impact role where you'll architect and build foundational systems that manage Amazon's edge device fleet - over a million devices across thousands of locations worldwide. You'll work at the intersection of cloud infrastructure, device management, robotics systems, and operational excellence, solving complex technical problems that enable Amazon's robotics and fulfillment operations to scale globally.
As a Senior SysDE, you'll be a technical leader who defines technical strategy, drives architectural decisions, and builds systems that enable robotics and automation teams to deploy and manage their edge compute solutions with the same ease as deploying to AWS. You'll work with ambiguity, translating undefined business problems into concrete technical solutions while balancing short-term tactical needs with long-term strategic vision. This role requires deep technical expertise across multiple domains - Linux systems, AWS services, IoT platforms, robotics compute infrastructure, and large-scale distributed systems - combined with the ability to influence and mentor engineers across the organization.
A day in the life
Your day might start by investigating a critical issue where 5,000 robotics devices across multiple fulfillment centers are experiencing intermittent kernel panics during high-load operations. You dive deep into kernel logs, memory dumps, and device telemetry, correlating the failures with a recent driver update for NVIDIA GPU systems. You quickly develop a Python- or Rust-based diagnostic tool to capture more granular system metrics and work with your team to roll back the problematic driver version while engineering a proper fix that addresses the underlying memory management issue.
Mid-morning, you're troubleshooting why a new OS image isn't booting correctly on ARM-based manipulation robotics devices. You boot into a recovery environment, examine the initramfs, trace through systemd unit dependencies, and discover a race condition in the device initialization sequence. You modify the Yocto recipe to fix the boot ordering, test across multiple hardware variants, and document the pattern for other teams building custom images. You then lead a quick sync with an Amazon Robotics team to help them debug why their software components are failing to deploy - walking through IoT certificate validation, network connectivity from the edge device, and AWS IAM permissions until you identify a misconfigured security group.
After lunch, you're deep in code review for a new credential rotation service, providing written feedback on error handling patterns, memory safety, and how to better structure the state machine for resilience. You spend time tuning a Linux system configuration that's causing performance bottlenecks on AI perception systems, adjusting system parameters to drive high-performance compute workloads at scale. You mentor a mid-level engineer who's struggling with a complex Yocto build failure, helping them understand layer dependencies and BitBake recipe inheritance while teaching them debugging techniques they can apply independently.
The afternoon includes responding to an urgent page where devices in a specific building can't connect to AWS IoT Core. You systematically eliminate possibilities - checking DNS resolution, testing TLS handshakes, examining certificate chains, and analyzing network packet captures - until you discover a misconfigured firewall rule blocking MQTT traffic. You implement a monitoring enhancement to detect this class of issue proactively across all sites. You then write a technical design document proposing improvements to UWC's device provisioning workflow that will reduce provisioning time from 20 minutes to under 10 minutes by parallelizing certificate generation and optimizing the Linux boot sequence. You end your day reviewing system metrics across the fleet, identifying devices with degraded disk I/O that need proactive maintenance, and ensuring your team is unblocked for tomorrow's work.
About the team
The Unified Workcell Compute (UWC) team is at the forefront of Amazon's robotics and automation efforts, building and operating the foundational device management platform for Amazon's on-premises edge compute fleet. Our services manage over a million robotic devices across thousands of locations worldwide - from the latest NVIDIA GPU offerings supporting AI perception efforts to bleeding-edge manipulation robotics systems, industrial PCs, thin clients, Drive Units, and embedded devices across Amazon's global fulfillment network.
Our mission is to enable robotics solution teams to deploy to Operations buildings with the same self-service, ownership, and accountability as deploying to AWS cloud. We're revolutionizing Amazon's logistics and fulfillment operations by pushing the boundaries of what's possible in automation and compute management at unprecedented scale.
We're a team of builders who value automation, operational excellence, and customer obsession. We own a critical technology ecosystem that powers device provisioning, software distribution, credential management, and fleet operations for robotics workcells and fulfillment systems. Our work directly impacts millions of customer orders and enables Amazon's promise of fast, reliable delivery. We're solving problems that few organizations face, building systems that have never existed before, and defining the future of edge compute management for robotics at Amazon scale. We foster a culture that encourages personal and professional growth, empowering our team members to continually expand their skills and knowledge. Work-life balance is a priority for us, and we strive to create an environment where our team can thrive both professionally and personally.