AI Systems Infrastructure Specialist

2 weeks ago


San Francisco, California, United States The Rundown AI, Inc. Full time

About The Rundown AI, Inc.

Company Overview

The Horizons team at The Rundown AI, Inc. leads the development of our company's reinforcement learning research and advancements in AI systems. We've made significant contributions to all Claude models, with substantial impacts on the autonomy and coding capabilities of Claude 3.5 and 3.7 Sonnet.

About the Role

As an Infrastructure & Runtime Engineer on the Horizons team, you will design and implement the foundational systems that enable our AI research. You'll work closely with researchers and engineers to develop robust infrastructure for large language model training, focusing on code execution environments, data pipelines, and performance optimization.

Key Responsibilities
  • Design high-performance data pipelines for processing large-scale code datasets with an emphasis on reliability and reproducibility
  • Build and maintain secure sandboxed execution environments using virtualization technologies like gVisor and Firecracker
  • Develop infrastructure for reinforcement learning training environments, balancing security requirements with performance needs
  • Optimize resource utilization across our distributed computing infrastructure through profiling, benchmarking, and systems-level improvements

Requirements
  • Proficiency in Python and async/concurrent programming with frameworks like Trio
  • Experience with container technologies and virtualization systems
  • Strong systems programming skills and an understanding of performance optimization
  • Familiarity with data pipeline development and ETL processes


  • San Francisco, California, United States The Rundown AI, Inc. Full time

    About the Role: The Rundown AI, Inc. is seeking an AI Infrastructure Specialist to join our Data Encodings and Tokenization team. As a key member of our team, you'll play a crucial role in developing and optimizing the encodings and tokenization systems used throughout our Finetuning workflows. This position requires a strong understanding of machine learning...


  • San Francisco, California, United States WaveForms AI Full time

    Job title: Software Engineer, AI Infrastructure (Training + Inference) / Member of Technical Staff. Who We Are: WaveForms AI is an Audio Large Language Models (LLMs) company building the future of audio intelligence through advanced research and products. Our models will transform human-AI interactions, making them more natural, engaging, and immersive. Role...


  • San Francisco, California, United States Together AI Full time

    About Together AI: We are a research-driven artificial intelligence company. Our mission is to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. Our team has made significant contributions to open-source research, models, and datasets that advance the frontier of AI. We invite you to join our...


  • San Francisco, California, United States The Rundown AI, Inc. Full time

    About the Role: The Rundown AI, Inc. is seeking a highly skilled Machine Learning Systems Engineer to join its Model Evaluations team. As a member of this team, you will be responsible for designing, building, and maintaining scalable systems that enable researchers to effectively evaluate models and conduct inference tasks critical to the organization's...


  • San Francisco, California, United States Waveforms AI, Inc Full time

    Job title: Software Engineer, AI Infrastructure (Training + Inference) / Member of Technical Staff. Who We Are: WaveForms AI is an Audio Large Language Models (LLMs) company building the future of audio intelligence through advanced research and products. Our models will transform human-AI interactions, making them more natural, engaging, and immersive. Role...


  • San Francisco, California, United States Together AI Full time

    Company Overview: Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society. Our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. Role Summary: We are seeking a Distributed ML Systems Engineer to...


  • San Francisco, California, United States Together AI Full time

    As a Senior AI Infrastructure Engineer, you will be responsible for building the next generation, highly available, global, multi-cloud PaaS platform with open-source technologies to enable and accelerate Together AI's rapid growth. This system spans many diverse environments (Kubernetes, VMs, bare metal compute, and edge deployments) and provides a cohesive...


  • San Francisco, California, United States Together AI Full time

    About the Role: As a Senior AI Infrastructure Engineer, you will be responsible for building the next generation, highly available, global, multi-cloud PaaS platform with open-source technologies to enable and accelerate Together AI's rapid growth. This system spans many diverse environments (Kubernetes, VMs, bare metal compute, and edge deployments) and...


  • San Francisco, California, United States beBee Careers Full time

    Job Overview: We are looking for an experienced AI Infrastructure Specialist to join our team. As a key member of our infrastructure development team, you will be responsible for designing and implementing scalable and efficient solutions for our AI/ML infrastructure. About the Role: You will work closely with our ML developers, data engineers, and DevOps...