Senior AI Performance Optimization Specialist

4 days ago

San Francisco, California, United States Genmo Full time

We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI.

As a Deep Learning Performance Engineer at Genmo, you will play a critical role in optimizing the performance of our large generative AI models.

Your expertise will ensure that our models run efficiently on clusters, leveraging advanced techniques and tools to enhance their performance.

This role is perfect for someone with a deep understanding of deep learning performance bottlenecks, kernel optimization, and distributed training strategies.

Key Responsibilities:

Analyze and optimize the performance of massively parallel and distributed systems
Implement and fine-tune distributed training strategies for multi-GPU and multi-node environments
Implement high-performance CUDA, Triton, C++ and PyTorch code
Profile model performance and identify bottlenecks using tools like NVIDIA Nsight Systems, PyTorch Profiler, and TensorFlow Profiler
Develop and maintain benchmarking suites for continuous performance monitoring

Qualifications:

Master's or PhD in Computer Science, Electrical Engineering, or a related field
5+ years of experience in optimizing deep learning models, preferably in a production environment
Must have
- Strong programming skills in Python and C++. Experience training large models using Python and PyTorch and/or TensorFlow, including their distributed training frameworks.
- Proven track record of optimizing large-scale models (10B+ parameters)
- Deep understanding of GPU architecture and CUDA programming
- Experience across the entire development pipeline, from data processing, preparation, and data loading to training and inference.
- Experience optimizing and deploying inference workloads for throughput and latency across the stack (inputs, model inference, outputs, parallel processing, etc.)
- Demonstrated expertise in high-performance computing using NVIDIA Triton and CUDA
- Demonstrated ability to significantly improve model inference and training speeds through low-level optimizations
Ideal candidates will have
- Knowledge of distributed inference systems for handling high-volume workloads
- Strong background in linear algebra, optimization, and machine learning algorithms
- Experience with generative AI models (GANs, Diffusion Models, Transformers)
- Knowledge of hardware-aware neural architecture design
- Experience with high-performance computing (HPC) environments
- Contributions to relevant open source projects or research publications

Genmo is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law. Genmo, Inc. is an E-Verify company and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.

Senior AI Engineer

2 weeks ago

San Francisco, California, United States Perplexity AI Full time

Job Overview: Perplexity AI is seeking a talented Senior AI Engineer to join our team and contribute to the development of our conversational answer engine. As a key member of our engineering team, you will be responsible for designing and implementing large-scale machine learning systems that drive user engagement. Key Responsibilities: Design and develop...
Senior Inference Optimization Engineer

2 weeks ago

San Francisco, California, United States Liquid AI Full time

About the Role: We're seeking a highly skilled engineer to join our team at Liquid AI, where you'll play a critical role in optimizing inference stacks for our AI models. As a key member of our team, you'll be responsible for taking our models and delivering highly optimized inference stacks that leverage existing frameworks like ggml, vllm, and DeepSpeed to...
AI Performance Optimization Engineer

5 days ago

San Francisco, California, United States Databricks Full time

We are seeking a highly skilled AI Performance Optimization Engineer to join our team at Databricks. The ideal candidate will have hands-on experience with deep learning frameworks, high-performance linear algebra libraries, and distributed systems development. Key Responsibilities: Explore and analyze performance bottlenecks in ML training and...
Senior AI Infrastructure Specialist

1 week ago

San Francisco, California, United States Acceler8 Talent Full time

Introduction: We're seeking a highly skilled Senior AI Infrastructure Specialist to join our pioneering team at the forefront of AI and ML technology. Our team is dedicated to revolutionizing user experiences by innovating at every level, from user interfaces down to the most efficient models. About the Company: Our company thrives on the belief that a small,...
Senior AI Infrastructure Architect

1 week ago

San Francisco, California, United States Together AI Full time

About the Role: We are seeking a highly skilled Senior AI Infrastructure Engineer to join our team at Together AI. As a key member of our infrastructure team, you will be responsible for designing and building the next generation of our AI platform, leveraging open-source technologies to enable and accelerate our growth. Key Responsibilities: Design and...
Senior Engineering Manager

2 weeks ago

San Francisco, California, United States Scale AI Full time

Senior Engineering Manager - Generative AI: At Scale AI, we're revolutionizing the way organizations build and deploy AI. As a Senior Engineering Manager for our Generative AI team, you'll be responsible for leading a high-performing engineering team to develop and deploy cutting-edge AI models. Your expertise in software development and AI/ML will drive...
Senior Performance Optimization Engineer

2 weeks ago

San Francisco, California, United States Databricks Full time

About the Role: We are seeking a highly skilled Senior Performance Optimization Engineer to join our team at Databricks. As a key member of our GenAI team, you will be responsible for exploring and analyzing performance bottlenecks in ML training and inference, designing and implementing libraries and methods to overcome these bottlenecks, and building tools...
Senior AI Infrastructure Engineer

5 days ago

San Francisco, California, United States Together AI Full time

Job Summary: We are seeking a highly skilled Senior AI Infrastructure Engineer to join our team at Together AI. As a key member of our infrastructure team, you will be responsible for designing, building, and maintaining our next-generation AI platform. Key Responsibilities: Design and implement highly available AI infrastructure solutions. Develop and maintain our...
Senior Inference Optimization Engineer

4 days ago

San Francisco, California, United States Liquid AI Full time

At Liquid AI, we're seeking a highly skilled engineer to optimize inference stacks tailored to various hardware platforms. The ideal candidate has extensive experience in CUDA, C++, and Triton, as well as a deep understanding of GPU, CPU, and NPU architectures. They should be self-motivated, capable of working independently, and driven by a passion for...
Performance Optimization Specialist

4 weeks ago

San Jose, California, United States IBM Full time

About the Role: We are seeking a highly skilled Performance Optimization Specialist to join our team at IBM. As a key member of our team, you will be responsible for optimizing the performance of our high-volume data repositories under heavy loads, both read and update. Key Responsibilities: Test and analyze performance best practices in high-volume data...
Software Engineer, AI/ML Infrastructure Specialist

5 days ago

San Francisco, California, United States Together AI Full time

Job Responsibilities: Infrastructure Development: Identify and resolve infrastructure gaps to ensure reliable, efficient, and scalable AI/ML solutions. AI/ML Solutions: Develop advanced AI/ML infrastructure solutions to enhance the efficiency of our ML teams, leveraging expertise in distributed systems and large-scale data processing. System Design: Design and...
Senior Inference Optimization Engineer

2 weeks ago

San Francisco, California, United States Liquid AI Full time

Job Title: Member of Technical Staff. At Liquid AI, we're seeking a highly skilled engineer to optimize inference stacks for our models across various device types, including GPUs, CPUs, and NPUs. Key Responsibilities: Collaborate with ML Teams: Work with machine learning staff to effectively interface with our technical team. Hardware Awareness: Understand...
Senior Inference Optimization Engineer

1 month ago

San Francisco, California, United States Liquid AI Full time

Optimize Inference Stacks for Liquid AI: As we prepare to deploy our models across various device types, including GPUs, CPUs, and NPUs, we're seeking an expert who can optimize inference stacks tailored to each platform. We're looking for someone who can take our models, dive deep into the task, and return with a highly optimized inference stack, leveraging...
AI Infrastructure Architect

1 week ago

San Francisco, California, United States Together AI Full time

AI Infrastructure Expertise: Design and implement high-performance AI/ML infrastructure, ensuring scalability, availability, and efficient resource utilization. Automation and Optimization: Develop and deploy automation tools, monitoring solutions, and operational strategies to streamline infrastructure management and reduce manual tasks. Collaboration and...
AI Researcher

2 weeks ago

San Francisco, California, United States Together AI Full time

About the Role: We are seeking a highly skilled AI Researcher to join our team at Together AI. As a key member of our research team, you will be responsible for pushing the boundaries of foundation model research and developing novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that significantly improve over...
AI Developer Relations Specialist

1 week ago

San Francisco, California, United States Eidon AI Full time

About the Role: Eidon AI is seeking a highly skilled AI Developer Relations Specialist to engage with the AI developer community, fostering adoption and collaboration within our decentralized AI data marketplace. This role requires a deep technical understanding of the AI landscape, exceptional communication skills, and a passion for the AI space. The successful...
Senior Software Engineer

2 weeks ago

San Francisco, California, United States Altana AI Full time

About the Role: We are seeking a highly skilled Senior Software Engineer to join our team at Altana AI. As a key member of our engineering team, you will be responsible for designing and developing our cloud-based AI platform. Key Responsibilities: Ingest product requirements and define technical requirements for our AI platform. Design and develop scalable,...
AI Research Scientist

2 weeks ago

San Francisco, California, United States Perplexity AI Full time

Unlock the Power of AI Research: Perplexity AI is seeking a talented AI Research Engineer to drive innovation in our answer engine. As a key member of our team, you will be responsible for pushing the boundaries of AI-powered search and developing cutting-edge solutions. Key Responsibilities: Design and implement novel AI algorithms to improve search accuracy...
Enterprise AI Solutions Specialist

3 days ago

San Francisco, California, United States Refuel AI Full time

About the Role: At Refuel AI, we're revolutionizing the way enterprises tackle data quality challenges. As an Enterprise AI Solutions Specialist, you'll play a crucial role in leading the technical execution of high-priority post-sales engagements and POCs. Your expertise in LLMs and Refuel's capabilities will enable you to build technical guides and...
AI Ecosystem Developer

1 month ago

San Francisco, California, United States Mistral AI Full time

About Mistral AI: Mistral AI is a pioneering company in the field of artificial intelligence, dedicated to making AI ubiquitous and open. Our team is passionate about AI and strives to create innovative solutions that benefit society. Role Summary: We are seeking a highly skilled Developer Relations Engineer to join our team. This role will involve actively...
