Senior Software Engineer - Alerting Systems And Metrics Analysis
B2B SaaS data observability software. Cribl does differently. What does that mean? It means we are a serious company that doesn't take itself too seriously, and we're looking for people who love to get stuff done and laugh a bit along the way. We're growing rapidly and looking for collaborative, curious, and motivated team members who are passionate about putting customers first. As a remote-first company, we believe in empowering our employees to do their best work, wherever they are. As the data engine for IT and Security, Cribl is trusted by many of the biggest names in the most demanding industries to solve their most pressing data needs. Ready to do the best work of your career? Join the herd and unlock your opportunity.
In this role, you will work closely with Product, Operations, and other business functions while collaborating with your direct team to own and deliver end-to-end features and functionality for our alerting and observability platform. As a Senior Software Engineer specializing in alerting systems and metrics analysis, you will bring your experience and expertise to help your team build intelligent, responsive alerting capabilities. You will have the opportunity to tackle complex observability challenges, owning the design, implementation, and rollout of alerting infrastructure with the support of your team.
As An Active Member Of Our Team, You Will...
- Design and build sophisticated alerting systems that enable proactive monitoring and incident detection across distributed systems
- Develop query-based alert rules and expressions using PromQL, SQL, and other query languages to surface meaningful insights
- Create intelligent alert routing, deduplication, and correlation mechanisms to reduce noise and improve signal quality
- Build scalable backend services for alert evaluation, notification delivery, and alert management workflows
- Optimize time-series data storage and query performance for high-volume metrics and telemetry data
- Develop intuitive interfaces for alert configuration, visualization, and management using React and modern frontend technologies
- Collaborate with cross-functional teams to understand monitoring requirements and deliver comprehensive alerting solutions
- Mentor and guide engineers on best practices for observability and alerting architecture
If You've Got It - We Want It...
- Strong proficiency in TypeScript/Node.js with a proven track record of building production-grade services
- Experience with query languages for metrics and monitoring (PromQL, SQL, or similar) and the ability to write complex queries for data analysis
- Hands-on experience building or maintaining alerting systems, including rule evaluation engines and notification pipelines
- Experience with time-series databases and columnar storage systems (ClickHouse experience is a plus)
- Frontend development skills with React and modern JavaScript frameworks for building data visualization and management interfaces
- Strong understanding of distributed systems, data structures, and algorithms
- Experience with observability concepts including metrics, logs, traces, and their correlation
- Ability to work independently with minimal supervision and a track record of learning quickly
- Dedication to writing clean, maintainable, and well-tested code
- Experience with the Prometheus ecosystem, including Alertmanager
- Background in building rule engines or expression evaluation systems
- Experience with notification systems and integrations (PagerDuty, Slack, webhooks, etc.)
- Familiarity with observability tools like Grafana, EL