Software Engineer, AI Inference Codesign

4 days ago

Palo Alto, CA, United States Tesla Full time

What to Expect

The AI inference co-design team's goal is to take research models and make them run efficiently on our AI ASIC to power real-time inference for the Autopilot and Optimus programs. This unique role lies at the intersection of AI research, compiler development, kernel optimization, math, and HW design. You will work extensively with AI engineers to come up with novel techniques to quantize models, improve precision, and explore non-standard alternative architectures. You will develop optimized micro kernels using a cutting-edge MLIR compiler and solve the performance bottlenecks that stand in the way of the real-time latency needed for self-driving and humanoid robots. You will work closely with the HW team and bring state-of-the-art HW architecture techniques to our next-generation HW SoCs.

What You'll Do

Optimize bottlenecks in the inference flow, make precision/performance tradeoff decisions and figure out novel techniques to improve hardware utilization and throughput
Implement/improve highly performant micro kernels for Tesla's AI ASIC
Work with HW teams to shape the next generation of inference hardware, balancing performance with versatility
Bring up core compiler features for current and future versions of hardware to help with design and performance verification
Research and implement state-of-the-art machine learning techniques to achieve high performance on our hardware
Experiment with numerical methods and alternative architectures
Collaborate with the compiler infrastructure team on programmability and performance

What You'll Bring

Degree in Engineering, Computer Science, or equivalent experience and evidence of exceptional ability
Proficiency with Python and C++, including modern C++ (14/17/20)
Experience with AI networks, such as CNNs, transformers, and diffusion model architectures, and their performance characteristics
Understanding of GPU, SIMD, multithreading and/or other accelerators with vectorized instructions
Exposure to computer architecture and chip architecture/micro-architecture
Specialized experience in one or more of the following machine learning/deep learning domains: model compression, hardware-aware model optimization, hardware accelerator architecture, GPU/ASIC architecture, machine learning compilers, high-performance computing, performance optimization, numerics, and SW/HW co-design

Compensation and Benefits

Along with competitive pay, as a full-time Tesla employee, you are eligible for the following benefits at day 1 of hire:

Aetna PPO and HSA plans > 2 medical plan options with $0 payroll deduction
Family-building, fertility, adoption and surrogacy benefits
Dental (including orthodontic coverage) and vision plans, both have options with a $0 paycheck contribution
Company-paid Health Savings Account (HSA) contribution when enrolled in the High Deductible Aetna medical plan with HSA
Healthcare and Dependent Care Flexible Spending Accounts (FSA)
401(k) with employer match, Employee Stock Purchase Plans, and other financial benefits
Company-paid Basic Life, AD&D, short-term and long-term disability insurance
Employee Assistance Program
Sick and Vacation time (Flex time for salary positions), and Paid Holidays
Back-up childcare and parenting support resources
Voluntary benefits to include: critical illness, hospital indemnity, accident insurance, theft & legal services, and pet insurance
Weight Loss and Tobacco Cessation Programs
Tesla Babies program
Commuter benefits
Employee discounts and perks program

Expected Compensation $132,000 - $330,000/annual salary + cash and stock awards + benefits

Pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. The total compensation package for this position may also include other elements dependent on the position offered. Details of participation in these benefit plans will be provided if an employee receives an offer of employment.

