Reliability Solutions Engineer
4 days ago
**Job Overview**
Luma AI is seeking a highly skilled Reliability Solutions Engineer to join our team. As a key member of our Infrastructure and Research teams, you will be responsible for ensuring the health and reliability of our GPU clusters.
We are looking for someone with a strong background in cloud infrastructure, containerization, and programming/scripting languages. Experience with Kubernetes, Terraform, or CloudFormation is a plus.
The ideal candidate will have excellent problem-solving skills, strong communication and collaboration abilities, and a passion for building scalable and fault-tolerant systems.
**Salary:** $180,000 - $250,000 per year (based on location and experience)
**Benefits:**
- A sizable grant of Luma's equity
Responsibilities:
- Collaborate with researchers and engineers to specify requirements for GPU infrastructure
- Work with cloud providers to scale up/down, maintain, and monitor our GPUs
- Design and implement solutions to ensure scalability and reliability
- Implement monitoring systems to proactively identify issues
- Participate in an on-call rotation to respond to critical incidents
-
Reliable Mechanical Solutions Specialist
5 days ago
Palo Alto, California, United States Tesla Full time
Job Overview
Tesla is seeking a talented Mechanical Design Engineer to develop innovative and reliable mechanical solutions for our next-generation computer systems. This role requires strong expertise in mechanical design, analysis, and simulation, as well as excellent communication and collaboration skills.
Key Responsibilities
Design and develop mechanical...
-
Reliability Engineering Professional
5 days ago
Palo Alto, California, United States Tesla Full time
**About the Role:**
Tesla is looking for a highly motivated Reliability Engineering Professional to join our team. As a key member of our engineering group, you will play a crucial role in ensuring the reliability of our innovative products.
This position offers an exciting opportunity to contribute to the development of cutting-edge technology and shape the...
-
Palo Alto, California, United States Tesla Full time
Job Description
We are seeking an experienced Electronics Reliability Specialist to join our team at Tesla. As a key member of our energy storage and electronics reliability team, you will play a critical role in enhancing the reliability of our innovative energy solutions.
You will be responsible for conducting in-depth failure analysis and investigating the...
-
Reliability Engineer for Distributed Systems
2 weeks ago
Palo Alto, California, United States Tesla Full time
Company Overview
Tesla is a leading electric vehicle manufacturer accelerating the world's transition to sustainable energy. Our mission-critical systems enable our engineers to design and develop innovative solutions.
Job Summary
We are seeking a highly skilled Site Reliability Engineer to join our Design Technology Operations team. This position will be...
-
Palo Alto, California, United States Wing Inflatables, Inc. Full time
About Wing:
We are a technology company pushing the boundaries of drone delivery. Our mission is to create a scalable and sustainable solution for last mile logistics.
Our team is dedicated to designing and building highly automated delivery drones, which transport small packages directly from businesses to homes on-demand, in minutes. We operate our aircraft...
-
Palo Alto, California, United States Wing Inflatables, Inc. Full time
About Wing:
We are a pioneer in drone delivery technology, offering a safe, fast, and sustainable solution for last mile logistics. Our mission is to create the preferred means of delivery for the planet by building a workforce that's representative of the global communities we serve.
Our Design for Excellence (DFX) team in Palo Alto, California, is seeking a...
-
Site Reliability Infrastructure Engineer
5 days ago
Palo Alto, California, United States Assured Full time
About Assured
At Assured, we modernize insurance by providing software solutions to large insurers. We empower them to win in a technology-driven world with self-service claim filing software and backend fraud detection.
Job Overview
We are looking for a Site Reliability Engineer to join our team. The ideal candidate will have experience working in a...
-
Reliability Expert
1 week ago
Palo Alto, California, United States Wing Aviation Full time
About Wing Aviation
We're revolutionizing last-mile logistics with drone delivery. Our technology is designed to be easy to integrate into existing networks, offering a scalable solution for businesses worldwide.
Job Overview
We're seeking a Reliability Engineer to join our Hardware Reliability team in Palo Alto, CA.
Responsibilities:
Define and execute design...
-
Site Reliability Engineering Manager
5 days ago
Palo Alto, California, United States Plume Full time
About the Job
The Technical Manager will lead a team of Site Reliability Engineers, providing technical guidance and oversight. Key responsibilities include:
- Supervise a team of Site Reliability Engineers who provide first-line support to Customer Clouds.
- Attend and conduct customer meetings for Project and Roadmap specification.
- Manage growth and performance of...
-
Design for Reliability Expert
5 days ago
Palo Alto, California, United States Tesla Full time
We are looking for an exceptional Mechanical Reliability Engineer to join our Design for Reliability team at Tesla. As a key member of this team, you will be responsible for designing reliability into the mechanical components and sub-systems of our Tesla Bot.
What You'll Do
Assess Product Risks and Identify Failure Modes: Work in cross-functional settings to...
-
Palo Alto, California, United States InDepth Engineering Solutions, LLC Full time
About InDepth Engineering Solutions, LLC
We are a leading provider of cutting-edge autonomy hardware solutions, pushing the boundaries of innovation in the automotive industry. Our team is dedicated to delivering high-quality test software and validation frameworks that ensure the reliability and performance of our SoC solutions.
-
Palo Alto, California, United States Tesla Full time
Job Description:
We are seeking an experienced Electrical Engineer to join our team at Tesla, focusing on designing and developing innovative power electronics solutions for our powertrains and energy products. The ideal candidate will have a strong understanding of power converter topologies, high-voltage power semiconductor selection, magnetics, and...
-
AI/ML System Reliability Expert
5 days ago
Palo Alto, California, United States Luma AI Full time
**Job Description**
We are seeking a highly skilled AI/ML System Reliability Expert to join our team at Luma AI. As a key member of our Infrastructure and Research teams, you will be responsible for ensuring the health and reliability of our GPU clusters.
The ideal candidate will have a strong background in AI/ML system reliability, cloud infrastructure, and...
-
Palo Alto, California, United States Amazon Full time
Overview
Amazon Advertising is dedicated to driving measurable outcomes for brand advertisers, agencies, authors, and entrepreneurs. Our ad solutions leverage Amazon's innovations and insights to find, attract, and engage intended audiences throughout their daily journeys.
Salary Range
The base pay for this position ranges from $151,300/year in our lowest...
-
Technical Site Reliability Engineering Leader
2 weeks ago
Palo Alto, California, United States Plume Full time
About the Company
Plume is a leader in the smart home and small business market, delivering services to over 50 million locations globally. Our software-defined network platform allows CSPs to decouple their service offerings from hardware and rapidly curate and deliver new services over a multi-vendor, open-platform architecture.
We're looking for a seasoned...
-
Palo Alto, California, United States oilandgas Full time
Job Description:
In this challenging role as a Senior Reliability Specialist, you will play a pivotal part in driving exceptional reliability into Tesla's energy systems.
Responsibilities:
- Develop and communicate reliability targets for site, product, subsystem, and components to ensure seamless integration.
- Create Fault Trees and reliability block diagrams to...
-
Electronics Reliability Specialist
5 days ago
Palo Alto, California, United States Tesla Full time
Job Overview
In this role as a Senior Electronics Reliability Engineer, you will play a key part in enhancing the reliability of our innovative Energy and Charging products. You will be responsible for conducting in-depth failure analysis and investigating the underlying mechanisms of electronic failures within our Industrial Energy, Residential Energy,...
-
Palo Alto, California, United States Tesla Full time
Job Summary
We are seeking an experienced Industrialization Engineer to join our Energy Electromechanical team at Tesla. In this role, you will be responsible for the industrialization of critical components, ensuring they meet the highest quality and reliability standards.
Main Responsibilities
Industrialization Activities: Plan, organize, and direct...
-
Site Reliability Engineer
2 weeks ago
Palo Alto, California, United States Tesla Full time
Role Description
This is a challenging opportunity to work with cutting-edge technology and contribute to the development of automation tools. As a Site Reliability Engineer, you will drive root cause analysis of system failures, manage containerization technology, and maintain site performance using various tools.
Expected Compensation
The estimated annual...
-
Cell Reliability Specialist
4 days ago
Palo Alto, California, United States Tesla Full time
Role Overview
Tesla is seeking a highly skilled Cell Reliability Specialist to join our team. As a key member of our engineering group, you will play a crucial role in developing and implementing strategies to ensure the reliability and performance of our battery cells.
In this role, you will be responsible for guiding the development of new cell technologies...