Senior AI Infrastructure Engineer

4 weeks ago


San Francisco, California, United States Together AI Full time
As a Senior AI Infrastructure Engineer, you will be responsible for building the next-generation, highly available, global, multi-cloud PaaS platform with open-source technologies to enable and accelerate Together AI's rapid growth.

This system spans many diverse environments (Kubernetes, VMs, bare metal compute, and edge deployments) and provides a cohesive and reliable abstraction for running AI workloads in them. You will get to be a technology thought leader, evangelize new, cutting-edge technologies, and solve complex problems.

To be successful, you'll need to be deeply technical and possess excellent communication, collaboration, and diplomacy skills. You have experience practicing infrastructure-as-code, including using tools like Terraform and Ansible. You have strong software development fundamentals and skills. In addition, you have strong systems knowledge and troubleshooting abilities.

Requirements

5+ years of professional software development experience and proficiency in at least one backend programming language (Golang desired)
Demonstrated experience with high-performance or distributed cloud microservice architectures, ideally including building and operating them at global scale across multiple cloud providers such as AWS, Azure, or GCP
Excellent understanding of low-level operating system concepts, including multi-threading, memory management, networking and storage, performance, and scale
Pragmatic, methodical, well-organized, detail-oriented, and self-starting
Experience with Kubernetes and containerization, VPNs, AI workloads, and blockchain-based protocols a plus
GPU programming, NCCL, CUDA knowledge a plus
Experience with PyTorch or TensorFlow a plus
5+ years of experience writing high-performance, well-tested, production-quality code

Responsibilities

Perform architecture and research work for decentralized AI workloads
Work on the core, open-source Together AI platform
Create services, tools, and developer documentation
Create testing frameworks for robustness and fault-tolerance

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.