GPU Infrastructure Project Lead

2 weeks ago


San Francisco, California, United States Together AI Full time
Responsibilities:
• Monitor and manage GPU hardware inventory across multiple decentralized data centers
• Track the lifecycle of GPUs, including acquisition, deployment, usage, maintenance, and decommissioning
• Develop and maintain a system to log and track all GPU outages or malfunctions, including root cause analysis, downtime duration, and replacement cycles
• Generate reports on utilization, availability, and performance trends, and recommend improvements

Together AI is a research-driven artificial intelligence company that believes open and transparent AI systems will drive innovation and create the best outcomes for society. We are committed to significantly lowering the cost of modern AI systems by co-designing software, hardware, algorithms, and models. If you are passionate about building the next generation of AI infrastructure, we invite you to join our team.

We offer a competitive salary range of $111,000 - $165,000, plus startup equity, health insurance, and other benefits. The ideal candidate will have at least 3 years of experience in technical program management, inventory management, and/or data center operations/project management.
  • GPU Platform Lead

    1 week ago


    San Francisco, California, United States OpenAI Full time

    OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. Our team runs the GPU fleet that serves the models backing ChatGPT and the API. We build automation to provision and manage one of the largest cutting-edge GPU inference fleets in the world, exposing it as a singular...


  • San Francisco, California, United States OpenAI Full time

    Infrastructure Engineering Leadership: We are seeking an experienced engineering manager to lead our GPU platform team. As a key member of our infrastructure team, you will be responsible for building and scaling one of the largest inference fleets in the world. You will collaborate closely with product and infrastructure teams to help ship reliable products...


  • San Francisco, California, United States OpenAI Full time

    About the Role: We are seeking an experienced engineering manager to join our GPU platform team. You will be responsible for building and scaling our large-scale inference fleet, collaborating closely with product and infrastructure teams to deliver reliable products quickly. Key Responsibilities: Develop and implement strategies for scaling our inference...


  • San Francisco, California, United States FluidStack Full time

    Company Overview: FluidStack is a cutting-edge organization in the field of AI infrastructure, building and operating GPU supercomputers for top AI labs, governments, and enterprises. Our customers include Mistral, Poolside, Black Forest Labs, Meta, and more. The Job Description: We are seeking an experienced Head of Computing Infrastructure to lead deployments...

  • GPU Program Director

    2 weeks ago


    San Francisco, California, United States Together AI Full time

    About Together AI: We're a research-driven artificial intelligence company dedicated to creating transparent and open AI systems that drive innovation and benefit society. Our goal is to lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We've contributed to leading open-source research, models, and datasets to...


  • San Francisco, California, United States Voltage Park, Inc. Full time

    At Voltage Park, we strive to push the boundaries of innovation in the field of Machine Learning Infrastructure. Our goal is to provide seamless access to compute resources, empowering organizations to unlock new possibilities. We are seeking a highly skilled GPU Engineer Specialist to join our team. As the technical expert, you will design and craft tailored...


  • San Francisco, California, United States Together AI Full time

    About the Job: We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure. You will be responsible for managing GPU hardware inventory, developing and maintaining a system to log and track GPU outages, and continuously seeking opportunities to improve GPU tracking processes and systems. About...


  • San Francisco, California, United States Coastal Carbon Full time

    Role Summary: We're seeking an AI Infrastructure Specialist to help run large-scale experiments and manage infrastructure for foundation models and large machine learning models efficiently on GPUs. The ideal candidate will have experience with scalable training-inference pipelines, strong expertise in distributed computation infrastructure of current-generation...

  • Hardware Project Lead

    2 weeks ago


    San Francisco, California, United States Together AI Full time

    Overview: We are seeking an experienced professional to join our hardware team as a Project Manager. The successful candidate will be responsible for managing GPU resources across multiple decentralized data centers. This critical role involves developing and implementing strategies to optimize resource utilization, reduce downtime, and enhance overall system...


  • San Francisco, California, United States FluidStack Full time

    About FluidStack: FluidStack is a pioneering organization in the field of AI infrastructure, building and operating GPU supercomputers for top AI labs, governments, and enterprises. Our customers include Mistral, Poolside, Black Forest Labs, Meta, and more. The Role: We are seeking an experienced Infrastructure Director to lead deployments of 10,000+ GPU...

  • GPU Design Lead

    3 weeks ago


    San Francisco, California, United States Vivid Technology Inc Full time

    This is an opportunity to join a cutting-edge start-up that's redefining automotive perception technology. Fresh off a funding round, they're well-capitalized. They are looking to hire an experienced SoC Design Lead to spearhead the development of high-performance, low-power System-on-Chip (SoC) solutions, with a strong focus on DSPs and GPUs. This is a...

  • GPU Design Lead

    1 week ago


    San Francisco, California, United States Vivid Technology Inc Full time

    This is an opportunity to join a cutting-edge start-up that's redefining automotive perception technology. Fresh off a funding round, they're well-capitalized. They are looking to hire an experienced SoC Design Lead to spearhead the development of high-performance, low-power System-on-Chip (SoC) solutions, with a strong focus on DSPs and GPUs. This is a...


  • San Francisco, California, United States Together AI Full time

    Role: As a Systems Research Engineer specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the performance and efficiency of our AI systems....


  • San Francisco, California, United States Together AI Full time

    About the Role: As a Systems Research Engineer specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the performance and efficiency of our AI...


  • San Francisco, California, United States Jobleads-US Full time

    Systems Research Engineer, GPU Programming. Role: As a Systems Research Engineer specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the...

  • GPU Design Lead

    3 weeks ago


    San Francisco, California, United States Vivid Technology Full time

    This is an opportunity to join a cutting-edge start-up that's redefining automotive perception technology. Fresh off a funding round, they're well-capitalized. They are looking to hire an experienced SoC Design Lead to spearhead the development of high-performance, low-power System-on-Chip (SoC) solutions, with a strong focus on DSPs and GPUs. This is a unique...


  • San Francisco, California, United States OpenAI Full time

    Our team is responsible for running the GPU fleet that serves the models backing ChatGPT and the API. We build automation to provision and manage one of the largest cutting-edge GPU inference fleets in the world, exposing it as a singular platform for other OpenAI teams to seamlessly run production applied AI workloads. We seek to learn from deployment and...


  • San Francisco, California, United States FluidStack Full time

    About FluidStack: FluidStack is accelerating the trend of AI infrastructure development, building and operating GPU supercomputers for top AI labs, governments, and enterprises. Our customers include Mistral, Poolside, Black Forest Labs, Meta, and more. The Opportunity: We are looking for an experienced professional to join our team as a GPU Cluster Deployment...


  • San Francisco, California, United States OpenAI Full time

    Engineering Manager, GPU Platform (Research, San Francisco). About the Team: Our team runs the GPU fleet that serves the models backing ChatGPT and the API. We build automation to provision and manage one of the largest cutting-edge GPU inference fleets in the world, exposing it as a singular platform for other OpenAI...


  • San Francisco, California, United States Together AI Full time

    Job Description: As a key member of Together AI's hardware team, you will be responsible for optimizing and scaling our decentralized GPU resources. This critical role involves ensuring the efficient operation of thousands of GPUs distributed across multiple data centers. Your expertise will enable cutting-edge AI advancements that democratize access to AI...