At TensorWave, we're leading the charge in AI compute, building a versatile cloud platform that's driving the next generation of AI innovation. We're focused on creating a foundation that empowers cutting-edge advancements in intelligent computing, pushing the boundaries of what's possible in the AI landscape.
We are seeking an exceptional Machine Learning Engineer who has made training and AI workload scheduling a specialty. This is a senior-level role for someone who has significant experience managing distributed machine learning workloads at scale using Slurm and/or Kubernetes.
As a technical visionary and hands-on expert, you will lead the evolution of our managed Slurm and Kubernetes offerings, as well as internal health checking and cluster automation.
Must-Have:
Nice-to-Have: