As a Data Engineer, you will build and evolve the data backbone of an AI-first product spanning document intelligence, time-series IoT data, and agentic AI systems. This is a highly hands-on, end-to-end role for someone who thrives in early-stage ambiguity, is comfortable working close to models and customers, and wants to directly shape how data and AI are used in production. This role will initially focus on the power industry, where your work will directly support the reliable operation of power plants used by millions of people every day.
You will design, implement, and operate data systems across the full lifecycle, from raw ingestion to AI-driven outputs used by customers in the real world. You’ll work directly with customers and internal stakeholders to understand real problems, translate them into technical solutions, and iterate quickly. You’ll build pipelines that support document processing, sensor data, and ML workflows, contribute to feature engineering and model experimentation when needed, and own systems in production. You’ll make pragmatic architectural decisions, improve reliability over time, and help define best practices as the team and product scale.
Technologies We Use
Python, SQL
Cloud-native data and GenAI tooling (GCP, VertexAI, etc.)
Streaming and messaging systems
Distributed processing frameworks
Data warehouses, lakes, and object storage
Time-series and NoSQL databases
ML and AI tooling (feature stores, vector databases, model pipelines)
Docker, Kubernetes, and infrastructure-as-code tools
About You
You’re a builder who likes owning problems end to end, from understanding customer needs to shipping and maintaining production systems. You’re comfortable crossing boundaries between data engineering, ML, and backend work, and you’re energized by fast feedback loops and real-world impact. You prefer pragmatic solutions over perfect ones, communicate clearly with technical and non-technical stakeholders, and enjoy helping shape both the product and the engineering culture.
Required Skills
7+ years of experience in data engineering, backend engineering, or adjacent roles
Strong Python skills, proficient with ML packages and distributed backends
Experience building production data pipelines and systems from scratch
Comfort working with both structured and unstructured data
Experience operating systems in production and owning reliability
Ability to work directly with customers or end users to understand requirements
Strong problem-solving skills in ambiguous, fast-moving environments

Preferred Skills
Experience contributing to ML workflows (feature engineering, training pipelines, evaluation, or inference)
Experience with agentic AI systems or similar orchestration frameworks for enterprise-grade reasoning and/or automation
Experience with document processing, NLP, or vector search
Experience with time-series or IoT data
Startup or early-stage product experience
Experience making architectural tradeoffs under real-world constraints