Staff Infrastructure Engineer - Models

Build and operate Kubernetes-native infrastructure services for AI workloads at scale
Belgrade, Central Serbia, Serbia
Senior
yesterday
Tenstorrent

Designs high-performance AI and RISC-V processors and systems for data centers, edge computing, and machine learning workloads.

Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

Our AI Software Infrastructure team builds the Kubernetes-native applications, services, and platform tooling that power large-scale AI workloads across internal and customer-facing environments. In this role, you will design and operate the systems that make complex inference, training, CI/CD, and development workflows easier to deploy, scale, monitor, and support in production. If you enjoy building reliable backend and platform software, working close to infrastructure and automation, and helping raise the operational maturity of high-performance systems, this is where it all comes together.

This role is hybrid, based out of Belgrade, Serbia.

We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.

Who You Are

  • Strong backend, infrastructure, or platform engineer with deep experience designing and running production workloads on Kubernetes.
  • Strong understanding of Kubernetes-native application design, workload orchestration, scaling, reliability and production debugging.
  • Experience building platform services, APIs, automation, operators, or controllers using Go or Python.
  • Collaborative and adaptable, able to work across engineering, infrastructure, SRE and deployment teams.
  • Experience with AI, ML, HPC, training, or inference workloads is a strong plus.

What We Need

  • Design, build and operate Kubernetes-native applications, services and workloads for large-scale AI infrastructure.
  • Develop operators, controllers, APIs and automation that make complex workloads easier to deploy, scale, monitor and operate.
  • Define workload patterns for inference, training, CI/CD, internal development workflows and platform services.
  • Improve reliability, observability and operational maturity of applications running on Kubernetes.
  • Partner with SRE, infrastructure, deployment and engineering teams to support internal and customer-facing environments.

What You Will Learn

  • How large-scale AI workloads are designed, deployed and operated on custom accelerator hardware.
  • How inference, training, CI/CD and platform workloads behave at scale.
  • How to build applications and platform services that run reliably across different Kubernetes environments.
  • How internal infrastructure platforms evolve into production-grade systems used by engineering teams and customers.
  • How to influence platform direction, define best practices and raise the Kubernetes maturity of the broader engineering organization.

Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.
