Senior Data Engineer
At Hive, we're all about creating moments that matter and helping event marketers connect with their biggest fans. Our platform powers marketing for 1,500+ iconic events, festivals, venues, and promoters across North America. We help them grow their customer base and sell out shows using intelligent, automated, and personalized digital marketing tools.
Hive integrates with 25+ platforms (like Ticketmaster) to provide rich customer data in real time, enabling event marketers to engage their audiences with precision and impact.
Hive's R&D Data team is responsible for how we store and query production data at scale. We're not focused solely on BI or dashboards — we build the systems that power Hive's products and make data accessible, reliable, and performant.
As a Senior Data Engineer, you'll play a vital role in evolving our data platform, which directly determines what our customers can do, how fast our product moves, and how confidently leadership can make bets. You'll own outcomes, not tickets. If a business metric is off and it touches data, that's yours to care about.
What you'll get up to:
- Build our Data Platform: Design and own a cloud-native big data platform handling audience data for millions of attendees and billions of interactions a year. You're not just building pipelines — you're building the infrastructure that determines the quality of every insight, recommendation, and decision Hive's customers make.
- Build our ML Platform: Design and own the infrastructure that takes models from experiment to production — feature stores, training pipelines, model serving, and monitoring. You switch hats between data engineering and ML engineering, ensuring reliable, low-latency access to the features and infrastructure we need to build and ship models confidently. When a model degrades in production, you're the one who built the observability to catch it before the customer does.
- Own the Full Pipeline — and Its Business Impact: From Change Data Capture through validation, transformation, and denormalization — you drive the stack end to end. But you also understand what breaks for a customer when a pipeline is late, a metric drifts, or a model gets stale data. You connect the technical dots to the business dots.
- Treat Data as a Product: You don't ship pipelines — you ship data products that internal teams and customers depend on like a production API. You define SLAs, obsess over data health, and build for discoverability.
- Build and Leverage Agentic Systems: You bring an agentic engineering mindset to everything — both how you work and what you build. You use AI coding agents (e.g. Claude Code) as a force multiplier. And you build LLM-powered pipelines and autonomous agents that enrich, classify, and act on audience data at scale.
Our Tech Stack
- AWS Services: DMS, RDS, Kinesis, Glue, Redshift
- Programming: Python and Django
- Data Stores: ClickHouse, MongoDB, Elasticsearch, Snowflake
- Orchestration: Airflow or Dagster
What you bring:
- 8+ years of hands-on data engineering experience, with a proven track record of designing, building, and operating large-scale distributed data systems in production — high-throughput event streams, real SLAs, and real consequences when things fail
- Strong foundations in distributed systems principles — partitioning strategies, consistency models, backpressure handling, fault tolerance, and capacity planning at 10x the volume you designed for
- End-to-end ML engineering experience: feature engineering and feature store design, training pipeline orchestration, model deployment and serving infrastructure, and production monitoring including drift detection and retraining triggers
- Experience applying LLMs and agentic systems in production data or ML contexts — whether enriching pipelines, automating classification, or building autonomous workflow components
- A product and commercial orientation — you consistently frame technical decisions in terms of customer impact and business outcomes, and you have the stakeholder communication skills to make that case to non-technical audiences
Who you are:
- Comfortable operating independently and making progress in ambiguous, fast-changing environments
- Biased toward action: you're willing to make decisions with imperfect information and iterate quickly, communicating clearly with teams across product and engineering
- Skilled at troubleshooting complex systems and building durable solutions when things break
- Excited to shape the future of Hive's data infrastructure and team in a high-growth, fast-paced company
Nice to haves:
- History of owning or re-architecting a data platform end-to-end in a fast-growing environment.
- Background in SaaS or event-driven products where data systems directly power user-facing features.
Compensation/Benefits Package
- Meaningful salary and equity: you're rewarded based on impact.
- Work fully remote from the comfort of your home.
- Flexible work hours: minimal meetings and no 9-5
- Health & Dental coverage with Parental Leave top-ups in addition to EI benefits
- Unlimited vacation/PTO: so you can be happy and healthy!
About Hive.co
Hive.co is a marketing platform for event marketers. We help brands personalize and automate their email and SMS campaigns, empowering them to sell out events so they can focus on making those events unforgettable.
By integrating with ticketing partners like Ticketmaster and e-commerce partners like Shopify, we enable brands to access and act on all their customer data, so they can easily segment their list in thousands of ways, and send more customized, timely email campaigns that land in inboxes.
We started our company inside a University of Waterloo computer lab in early 2014, graduated from Y Combinator that summer (S14 batch) and have been growing ever since. Originally based in Kitchener, our team is now 100% remote and located all across Canada! We strive to provide an online work environment that allows team members to have a strong work life balance while still feeling connected to their team and Hive's mission.