Infrastructure Engineer With A Focus In Machine Learning (ML Ops)

Monarch is a powerful, all-in-one personal finance platform designed to help make the complexity of finances feel simple again. Since launching in 2021, we've become the top-recommended personal finance app by users and experts. Our goal? To take the stress out of finances so our members can focus on what truly matters.

We are a team of doers led by experienced entrepreneurs who are passionate about helping our members reach their financial goals. We are hyper-focused on building a product people love and continuing to evolve based on user feedback.

As a fully remote company (even before COVID!), we welcome applicants from almost anywhere. Our team collaborates synchronously mostly between 9 AM and 2 PM PT and embraces asynchronous work to stay connected across time zones.

Join us on our mission to transform lives by simplifying money, together.

The Role

Monarch is seeking an Infrastructure Engineer with a focus in Machine Learning (ML Ops) to join our Infrastructure and Platform Engineering team during a period of hyper-growth. Reporting directly to the Head of Software Infrastructure, you will design, build, and maintain the infrastructure that powers our machine learning and AI workloads. You'll play a critical role in enabling our ML engineers to train, deploy, and monitor models at scale, while ensuring our infrastructure is secure, cost-effective, and optimized for performance.

This is a hands-on role where you'll contribute to both the evolution of our ML infrastructure and the implementation of new AI capabilities, directly shaping how Monarch leverages machine learning to deliver exceptional customer experiences.

What You'll Do

Maintain, improve, and scale cloud infrastructure that supports both traditional applications and ML/AI workloads.
Partner with ML engineers to design and deploy specialized resources for model training, inference, and data pipelines.
Implement automated infrastructure solutions using Terraform/OpenTofu to accelerate environment provisioning and resource management.
Introduce and integrate modern AI infrastructure capabilities, including vector databases, model observability tools, and GPU/accelerator workloads.
Provide technical guidance on ML workload architecture, security, and performance optimization, aligning with Monarch's culture of continuous improvement.

A Partnership with AI Engineering

You Own: The core cloud infrastructure (IaC), networking, secrets management, Kubernetes/GPU orchestration, and shared platform services.
AI Eng Owns: The LLM runtime, retrieval architecture (vector stores, indexing), evaluation frameworks, safety guardrails, prompt/model versioning, AI observability, and cost/latency optimization.
Together You Own: SLAs/SLOs, rollout strategies, incident response protocols, and capacity planning for all AI services.

What You'll Bring

4+ years of professional experience with cloud infrastructure (AWS or GCP preferred).
2+ years of professional experience deploying and managing ML workloads in the cloud.
Proficiency in Python for automation, scripting, and tooling.
Advanced hands-on experience with Infrastructure-as-Code tools (Terraform or OpenTofu) in production environments.
Strong problem-solving skills, ability to work autonomously, and a collaborative mindset.
Experience in cloud networking and security best practices for data-intensive workloads.
Clear verbal and written communication, cross-functional collaboration, analytical thinking, the ability to manage multiple priorities, and a self-motivated, proactive approach.
Experience working in a high-growth startup or fast-paced environment.

Nice to Haves

Background in vector search and retrieval-augmented generation (RAG) architectures.
Experience with GPU/accelerator optimization (CUDA, TensorRT, ONNX Runtime).
Familiarity with model serving platforms (Seldon Core, KServe, BentoML, NVIDIA Triton).
Exposure to feature stores (Feast, Tecton) and real-time ML data serving.
Contributions to ML Ops open-source projects or AI engineering communities.
Experience in fintech or data-rich SaaS environments.

Typical Process

Recruiter Video Call
Hiring Manager Video Call
Take-Home Assignment
Virtual "onsite" consisting of 3 rounds
Reference Checks
Offer!

Benefits

Work wherever you want! As a fully remote company with no central office, we want you to work wherever you are happiest and most productive, whether that's at home, in a co-working space, or elsewhere.
Competitive cash and equity compensation in a hyper-growth, early-stage company.
Stipend to set up your ideal working environment.
Competitive benefit plans based on your location (e.g., in the US we offer medical, dental, and vision benefits and the ability to contribute to a 401(k) plan).
Unlimited PTO.
3-day weekend every month! We take off the "First Friday" every month to focus on rest, recuperation, or just having fun!

We are an equal opportunity employer and value diversity. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
