In Digital Resiliency Engineering (DRE), we combine software and systems engineering to build and operate large-scale and distributed systems designed and/or built by the Singapore Government. We ensure Government services are reliable, meet expected performance, and satisfy customer needs.
If you have a strong DevOps, infrastructure engineering, and/or SRE background, have experience operating mission-critical production technology infrastructure at scale, and are looking for opportunities to work with a team of practitioners and leading industry experts, we welcome you to join us.
In this role, you will build central services for observability and automation of infrastructure services. You will join a rotation with other engineers to provide rapid response to major incidents impacting critical Government services. You will provide technical leadership for the team and work closely with technical leads to operate highly available solutions. You will also guide other team members on managing the availability and performance of mission-critical services, building automation and monitoring solutions to prevent problem recurrence, and building automated responses for non-exceptional service conditions.
In addition, you will manage the execution of project priorities, deadlines, and deliverables, and lead the design of major components, systems, and features to improve the availability, scalability, latency, and efficiency of services designed and built by the Government.
Key Responsibilities: