SRE Engineer - Remote Eligible

Implement comprehensive monitoring and automation to enhance platform reliability
Remote
Senior
4 days ago
Ansira

A marketing agency specializing in brand-to-local activation and channel partner marketing programs.

Site Reliability Engineer

This SRE will ensure the reliability, performance, and scalability of our MarTech SaaS platform that serves millions of users running thousands of marketing campaigns daily. They'll be responsible for monitoring systems, responding to incidents, and implementing automation to improve platform reliability.

About Ansira

Ansira is a leading marketing technology company dedicated to helping brands connect with customers and grow their businesses. Our platform integrates internal and external teams across channels, markets, and regions to deliver impactful brand-to-local growth strategies. At Ansira, we empower companies by optimizing marketing performance through AI-powered technology, growing partner ecosystems, cultivating brand loyalty, and ensuring profitable client growth. We serve a variety of industries, including financial services, retail, automotive, and technology.

About the Role

Join our growing organization as a Site Reliability Engineer and help ensure the reliability and performance of our SaaS platform that serves millions of users executing thousands of marketing campaigns every day. You'll be joining a lean, high-impact team where your work directly influences the experience of our customers and the success of their marketing efforts. This is a remote-first position where you'll play a crucial role in maintaining and improving the reliability, scalability, and performance of our mission-critical systems.

What You'll Do

  • Monitor & Alert: Design, implement, and maintain comprehensive monitoring and alerting systems using tools such as Prometheus, Grafana, and DataDog to ensure early detection of issues and optimal system performance.
  • Incident Response: Lead incident response efforts, conduct root cause analyses, and implement preventive measures to reduce future occurrences.
  • Automation: Build and maintain automation tools and processes to reduce manual work, improve deployment reliability, and enhance system resilience.
  • Reliability Engineering: Identify and implement reliability improvements across our platform, working closely with development teams to embed best practices.
  • Capacity Planning: Monitor system performance trends and plan for scaling needs to support our growing user base and campaign volume.
  • Documentation: Create and maintain runbooks, procedures, and system documentation to support the team and improve knowledge sharing.

What We're Looking For

Required:

  • 3+ years of hands-on experience in site reliability engineering, DevOps, or similar roles, with a focus on monitoring and reliability improvements.
  • Strong knowledge of SRE best practices including SLIs/SLOs, error budgets, and reliability engineering principles.
  • Cloud platform experience with services like Compute Engine, Kubernetes, Cloud SQL, and related infrastructure components.
  • Expertise with DataDog or a similar tool for monitoring, alerting, and observability.
  • Backend development experience with Java, PHP, and/or Node.js to understand and troubleshoot application-level issues.
  • Incident management skills including on-call experience, troubleshooting under pressure, and post-incident review processes.
  • Automation mindset with experience in scripting and Infrastructure as Code principles.

Preferred:

  • SaaS platform experience, particularly in high-volume environments serving millions of users.
  • MarTech or AdTech industry background with understanding of campaign management systems.
  • Experience scaling systems that handle thousands of concurrent operations.
  • CI/CD pipeline experience and deployment automation.
  • Security best practices knowledge for cloud environments.

What We Offer

  • Remote-first culture with flexible working arrangements.
  • High-impact role in a small, collaborative team where your contributions directly matter.
  • Growth opportunities as we scale our platform and expand our engineering team.
  • Competitive compensation and benefits package.
  • Learning budget for professional development and certifications.
  • Modern tech stack with opportunities to work with cutting-edge solutions.

Our Environment

You'll be working with systems that process millions of user interactions daily across thousands of active marketing campaigns. Our platform operates at significant scale, requiring robust monitoring, quick incident response, and continuous reliability improvements. As part of a small cross-functional team, you'll have the opportunity to make a substantial impact on both our technical infrastructure and our growing engineering culture.

Ready to Apply?

We're looking for someone who thrives in a fast-paced environment, enjoys solving complex technical challenges, and wants to help build reliable systems that power successful marketing campaigns for our customers. Please submit your resume with a short note covering:

  • Your relevant SRE/reliability engineering experience.
  • Examples of monitoring and automation improvements you've implemented.
  • Why you're interested in joining a MarTech company.