Established in 2009, ClickHouse leads the industry with its open-source column-oriented database system, driven by the vision of becoming the fastest OLAP database in the world. It lets users generate real-time analytical reports with SQL queries, with an emphasis on speed as data volumes grow. Enterprises around the world, including Lyft, Sony, IBM, GitLab, Twilio, HubSpot, and many more, rely on ClickHouse Cloud. ClickHouse is available as open source or in the cloud on AWS, GCP, Azure, and Alibaba.
The Product Metrics team owns the collection, storage, and serving of metrics from customers' ClickHouse instances. As part of the team, you will design, build, operate, and maintain components of a petabyte-scale platform that stores trillions of records and processes millions of new events every second, and you will own its reliability, performance, and availability. Our stack is built in Go, runs on Kubernetes, and, of course, stores its data in ClickHouse. The team gathers and processes data for the internal billing and accounting system as well as for customer-facing dashboards that give our customers immediate insights and analytics. The Product Metrics system is designed to prioritize delivery guarantees, precision, and accuracy when handling large data volumes. Its primary goal is to provide data that is both timely and accurate, supporting reliable operational decisions and improving the overall customer experience.
Take an active part in determining the roadmap for the Product Metrics team
Work closely with the team to deliver new features, then iterate on and improve them
Design, build, operate, and maintain business-critical petabyte-scale systems
Be responsible for the performance, reliability, availability and cost-efficiency of the Product Metrics systems
Mentor and support other team members, participate in design discussions and collaborate with the team
Be part of the on-call rotation and take ownership of the services you're running
You demonstrate strong initiative and a preference for action, along with a high level of responsibility, ownership, and accountability
You prioritize customer needs, ensuring that our products are designed with the user in mind
You are able to take on complex challenges and break them down to achieve short feedback loops: analyze, design, and build modular solutions, deliver MVPs, gather data and feedback, and then progress iteratively
You have a strong problem-solving mindset and solid production debugging skills
You have excellent communication skills and the ability to work well within a team and across engineering teams in a fully remote environment
You thrive in a fast-paced environment and see yourself as a partner to the business, with the shared goal of moving it forward
5+ years of relevant software development industry experience building and operating scalable, fault-tolerant, distributed systems
Solid experience with at least one programming language. We use Go, but familiarity with Python, C, C++, Rust, or a similar language translates well
Experience with at least one of the major Cloud Service Providers such as AWS, GCP or Azure
Experience with storing, shipping, and retrieving large volumes of data efficiently using technologies such as ClickHouse
Experience with technologies such as Kubernetes, Helm, ArgoCD, and Temporal, as well as infrastructure-as-code tools such as Terraform
Experience with ClickHouse
Experience writing Kubernetes operators or controllers