✨ About The Role
- This role involves owning the availability, performance, and scalability of Sift's primary online storage systems and infrastructure.
- The engineer will design and build immutable infrastructure and fault-tolerant systems that are resilient and self-healing.
- Responsibilities include implementing multi-region deployments and optimizing local development and testing workflows.
- The position requires developing tools for monitoring, detecting faults, and automatically repairing distributed systems.
- The engineer will also provide design support to internal teams on using data stores effectively and optimizing production workloads.
⚡ Requirements
- The ideal candidate has over 8 years of experience in software engineering, particularly in infrastructure or Site Reliability Engineering (SRE) roles.
- A strong programming background in a language such as Java, Scala, or Python is essential.
- The candidate should have experience designing and implementing distributed systems and managing cloud infrastructure on platforms like AWS or GCP.
- Expertise in building infrastructure as code and automating provisioning processes using tools like Terraform is crucial.
- The successful individual will have a collaborative mindset and a passion for creating self-healing systems through proactive monitoring and alerting.