Platform/Site Reliability Engineer

We're looking for an experienced Platform/Site Reliability Engineer to help evolve and expand our engineering foundation. In this role, you'll ensure our systems remain robust, scalable, and efficient, while creating the tooling and automation that empower our development teams to move faster and more effectively.

This position is central to shaping our platform roadmap, driving best practices, and implementing solutions that support both developer experience and operational excellence.

Key Responsibilities

Infrastructure & DevOps

Architect, build, and maintain resilient infrastructure that supports diverse engineering initiatives.
Guide adoption of scalable patterns to improve reliability and cost efficiency.

Deployment & Release Management

Refine CI/CD pipelines with AWS CDK to accelerate safe and automated delivery.
Develop tooling for deployments and database migrations to reduce friction in release processes.
Enhance visibility into delivery cycles and streamline rollout workflows.

Reliability & Observability

Design and support monitoring frameworks, log aggregation, and alerting systems.
Proactively identify and resolve issues to maintain uptime and service quality.

Internal Developer Experience

Build productivity tools that shorten feedback loops and automate repetitive tasks.
Champion practices that improve engineering velocity across teams.

Security & Governance

Embed strong security practices into infrastructure and operational processes.
Support compliance initiatives across frameworks and regulations such as SOC, ISO, and GDPR.

We're Looking For

7+ years of total professional experience, including 5+ years in reliability, infrastructure, or platform roles. Startup experience is a plus.
Strong background in AWS, with deep knowledge of container-based services (Fargate, Kubernetes).
Proven success improving CI/CD workflows with AWS CDK, including automation for deployments and migrations.
Familiarity with modern observability platforms (e.g., Datadog, Prometheus, Grafana).
Solid expertise in designing systems for high availability and horizontal scalability.
Strong coding and scripting skills in languages such as Python, Bash, or TypeScript.
Understanding of infrastructure security best practices and regulatory compliance requirements.
Collaborative mindset, able to partner effectively across engineering teams.

Our Technology Environment

Infrastructure: AWS (Fargate, Redis, PostgreSQL, SQS, CDK), GitHub, Retool
Backend: Django REST Framework, Celery
Frontend: Next.js, Tailwind CSS
AI/LLM Tools: OpenAI, Claude, AWS Bedrock

Senior Platform & Reliability Engineer

Calliere Group
