GPU Infrastructure Project Lead
2 weeks ago
• Monitor and manage GPU hardware inventory across multiple decentralized data centers
• Track the lifecycle of GPUs, including acquisition, deployment, usage, maintenance, and decommissioning
• Develop and maintain a system to log and track all GPU outages or malfunctions, including root cause analysis, downtime duration, and replacement cycles
• Generate reports on utilization, availability, and performance trends, and recommend improvements
Together AI is a research-driven artificial intelligence company that believes open and transparent AI systems will drive innovation and create the best outcomes for society. We are committed to significantly lowering the cost of modern AI systems by co-designing software, hardware, algorithms, and models. If you are passionate about building the next generation of AI infrastructure, we invite you to join our team.
We offer a competitive salary range of $111,000 - $165,000, plus startup equity, health insurance, and other benefits. The ideal candidate will have at least 3 years of experience in technical program management, inventory management, and/or data center operations/project management.
-
GPU Platform Lead
1 week ago
San Francisco, California, United States OpenAI Full time
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. Our team runs the GPU fleet that serves the models backing ChatGPT and the API. We build automation to provision and manage one of the largest cutting-edge GPU inference fleets in the world, exposing it as a singular...
-
GPU Infrastructure Management Lead
6 days ago
San Francisco, California, United States OpenAI Full time
Infrastructure Engineering Leadership: We are seeking an experienced engineering manager to lead our GPU platform team. As a key member of our infrastructure team, you will be responsible for building and scaling one of the largest inference fleets in the world. You will collaborate closely with product and infrastructure teams to help ship reliable products...
-
Senior Infrastructure Manager
1 week ago
San Francisco, California, United States OpenAI Full time
About the Role: We are seeking an experienced engineering manager to join our GPU platform team. You will be responsible for building and scaling our large-scale inference fleet, collaborating closely with product and infrastructure teams to deliver reliable products quickly.
Key Responsibilities: Develop and implement strategies for scaling our inference...
-
GPU Cluster Deployment Manager
2 days ago
San Francisco, California, United States FluidStack Full time
Company Overview: FluidStack is a cutting-edge organization in the field of AI infrastructure, building and operating GPU supercomputers for top AI labs, governments, and enterprises. Our customers include Mistral, Poolside, Black Forest Labs, Meta, and more.
The Job Description: We are seeking an experienced Head of Computing Infrastructure to lead deployments...
-
GPU Program Director
2 weeks ago
San Francisco, California, United States Together AI Full time
About Together AI: We're a research-driven artificial intelligence company dedicated to creating transparent and open AI systems that drive innovation and benefit society. Our goal is to lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We've contributed to leading open-source research, models, and datasets to...
-
GPU Engineer Specialist
2 weeks ago
San Francisco, California, United States Voltage Park, Inc. Full time
At Voltage Park, we strive to push the boundaries of innovation in the field of Machine Learning Infrastructure. Our goal is to provide seamless access to compute resources, empowering organizations to unlock new possibilities. We are seeking a highly skilled GPU Engineer Specialist to join our team. As the technical expert, you will design and craft tailored...
-
Project Manager for AI Infrastructure
2 weeks ago
San Francisco, California, United States Together AI Full time
About the Job: We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure. You will be responsible for managing GPU hardware inventory, developing and maintaining a system to log and track GPU outages, and continuously seeking opportunities to improve GPU tracking processes and systems.
About...
-
GPU Optimization Engineer
5 days ago
San Francisco, California, United States Coastal Carbon Full time
Role Summary: We're seeking an AI Infrastructure Specialist to help run large-scale experiments and manage infrastructure for foundation models and large machine learning models efficiently on GPUs. The ideal candidate will have experience with scalable training-inference pipelines, strong expertise in distributed computation infrastructure of current-generation...
-
Hardware Project Lead
2 weeks ago
San Francisco, California, United States Together AI Full time
Overview: We are seeking an experienced professional to join our hardware team as a Project Manager. The successful candidate will be responsible for managing GPU resources across multiple decentralized data centers. This critical role involves developing and implementing strategies to optimize resource utilization, reduce downtime, and enhance overall system...
-
Infrastructure Director
2 days ago
San Francisco, California, United States FluidStack Full time
About FluidStack: FluidStack is a pioneering organization in the field of AI infrastructure, building and operating GPU supercomputers for top AI labs, governments, and enterprises. Our customers include Mistral, Poolside, Black Forest Labs, Meta, and more.
The Role: We are seeking an experienced Infrastructure Director to lead deployments of 10,000+ GPU...
-
GPU Design Lead
3 weeks ago
San Francisco, California, United States Vivid Technology Inc Full time
This is an opportunity to join a cutting-edge start-up that's redefining automotive perception technology. Fresh off a funding round, they're well-capitalized. They are looking to hire an experienced SoC Design Lead to spearhead the development of high-performance, low-power System-on-Chip (SoC) solutions, with a strong focus on DSPs and GPUs. This is a...
-
GPU Design Lead
1 week ago
San Francisco, California, United States Vivid Technology Inc Full time
This is an opportunity to join a cutting-edge start-up that's redefining automotive perception technology. Fresh off a funding round, they're well-capitalized. They are looking to hire an experienced SoC Design Lead to spearhead the development of high-performance, low-power System-on-Chip (SoC) solutions, with a strong focus on DSPs and GPUs. This is a...
-
Systems Research Engineer, GPU Programming
3 weeks ago
San Francisco, California, United States Together AI Full time
Role: As a Systems Research Engineer specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the performance and efficiency of our AI systems....
-
Systems Research Engineer, GPU Programming
2 weeks ago
San Francisco, California, United States Together AI Full time
About the Role: As a Systems Research Engineer specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the performance and efficiency of our AI...
-
Systems Research Engineer, GPU Programming
2 weeks ago
San Francisco, California, United States Jobleads-US Full time
Role: As a Systems Research Engineer specialized in GPU Programming, you will play a crucial role in developing and optimizing GPU-accelerated kernels and algorithms for ML/AI applications. Working closely with the modeling and algorithm team, you will co-design GPU kernels and model architecture to enhance the...
-
GPU Design Lead
3 weeks ago
San Francisco, California, United States Vivid Technology Full time
This is an opportunity to join a cutting-edge start-up that's redefining automotive perception technology. Fresh off a funding round, they're well-capitalized. They are looking to hire an experienced SoC Design Lead to spearhead the development of high-performance, low-power System-on-Chip (SoC) solutions, with a strong focus on DSPs and GPUs. This is a unique...
-
Engineering Manager for AI Infrastructure
1 week ago
San Francisco, California, United States OpenAI Full time
Our team is responsible for running the GPU fleet that serves the models backing ChatGPT and the API. We build automation to provision and manage one of the largest cutting-edge GPU inference fleets in the world, exposing it as a singular platform for other OpenAI teams to seamlessly run production applied AI workloads. We seek to learn from deployment and...
-
Head of Computing Infrastructure
2 days ago
San Francisco, California, United States FluidStack Full time
About FluidStack: FluidStack is accelerating the trend of AI infrastructure development, building and operating GPU supercomputers for top AI labs, governments, and enterprises. Our customers include Mistral, Poolside, Black Forest Labs, Meta, and more.
The Opportunity: We are looking for an experienced professional to join our team as a GPU Cluster Deployment...
-
Engineering Manager, GPU Platform
3 weeks ago
San Francisco, California, United States OpenAI Full time
Research - San Francisco
About the Team: Our team runs the GPU fleet that serves the models backing ChatGPT and the API. We build automation to provision and manage one of the largest cutting-edge GPU inference fleets in the world, exposing it as a singular platform for other OpenAI...
-
AI Infrastructure Optimizer
2 weeks ago
San Francisco, California, United States Together AI Full time
Job Description: As a key member of Together AI's hardware team, you will be responsible for optimizing and scaling our decentralized GPU resources. This critical role involves ensuring the efficient operation of thousands of GPUs distributed across multiple data centers. Your expertise will enable cutting-edge AI advancements that democratize access to AI...