Senior High-Performance Computing Professional

7 days ago


Palo Alto, California, United States Tesla Full time
Job Description

We are seeking a highly skilled HPC Engineer to join our Supercomputing/AI infrastructure team. In this role, you will be responsible for maintaining and improving our AI infrastructure platform. This includes managing/operating our AI infrastructure, monitoring compute/GPU/network metrics, Linux troubleshooting & performance tuning, and collaborating with our Data Center team.

Your primary responsibilities will include:

  • Supporting the AI/ML cluster infrastructure on both GPU and Dojo platforms, focusing on systems automation, configuration management, and deployment at scale.
  • Improving our monitoring & self-healing pipelines, as well as security posture.
  • Working with hardware and storage vendors to tune and optimize server, storage, and network performance.
  • Performance tuning & OS provisioning on Linux systems.
  • Managing HPC clusters, workloads, and applications.
  • Automation and systems engineering.
  • Participating in a 24x7 on-call rotation.

  • Senior SRE Engineer

    17 hours ago


    Palo Alto, California, United States Luma AI Full time

    Join Our Team: Luma AI is a fast-paced, rapidly scaling company that requires experienced professionals like you. As a Senior SRE Engineer - High-Performance Computing, you will collaborate with researchers and engineers to specify availability, performance, correctness, and efficiency requirements of our GPU infrastructure.


  • Palo Alto, California, United States Tesla Full time

    As a Supercomputing Engineer, you will play a key role in maintaining and improving Tesla's AI infrastructure, ensuring our Full-Self-Driving (FSD), Tesla Bot & Dojo engineering teams have the necessary tools and resources to be productive. This includes managing/operating our AI infrastructure, monitoring compute/GPU/network metrics, Linux troubleshooting &...


  • Palo Alto, California, United States Criteo Full time

    About the Role: We are looking for a skilled Senior Network Engineer to join our global infrastructure team at Criteo. Criteo is a leader in commerce marketing, driving profit and sales for retailers and brands through its high-performing commerce marketing ecosystem. The ideal candidate will have a strong background in datacenter operations, WAN management,...


  • Palo Alto, California, United States Criteo Full time

    Criteo, a leader in commerce marketing, is building the highest performing and open commerce marketing ecosystem to drive profits and sales for retailers and brands. We are looking for a High-Performance Computing Specialist to join our team. As a key member of our engineering organization, you will play a critical role in designing, implementing, and...


  • Palo Alto, California, United States Foundry Technologies, Inc. Full time

    Foundry Technologies, Inc. is seeking a High-Performance Computing Engineer to join our team in Palo Alto, California. As an Infrastructure Engineer, you will collaborate closely with our development team to architect, build, and deploy cutting-edge infrastructure solutions. We offer a competitive salary range of $170,000 - $230,000 per year, depending on...


  • Palo Alto, California, United States SambaNova Systems Full time

    In this exciting role as a Senior Software Engineer, you will contribute to the development of innovative system software solutions for AI and machine learning applications in high-performance distributed systems. At SambaNova Systems, we value expertise in software engineering, particularly in areas like performance optimization, scalability, and...


  • Palo Alto, California, United States SambaNova Systems Full time

    Key Responsibilities: Design and implement new features for our runtime/embedded OS stack to support high-performance ML training applications. Work on system software support for the next generation RDU system. Provide tools and performance profilers for customers to configure and use the Datascale system. Qualifications: Bachelor's degree in Computer Science,...


  • Palo Alto, California, United States CV Library Full time

    **Job Summary** Career Opportunities at CV Library. We are seeking a skilled High Performance Computing Systems Engineer to join our team. As a key member of our HPC infrastructure team, you will play a critical role in building and operating the computational technology backbone of the company. The ideal candidate will have experience in managing...


  • Palo Alto, California, United States Criteo Full time

    About the Role: As a High-Performance Computing Engineer at Criteo, you will play a key role in designing and developing software that automates traditional system administration tasks. Our team works on building state-of-the-art technologies to manage billions of ad impressions every day. Responsibilities: Design and develop scalable software systems using...


  • Palo Alto, California, United States PsiQuantum Full time

    Join Our Team: PsiQuantum is dedicated to fostering a culture of innovation and excellence. We are committed to delivering cutting-edge solutions in quantum computing and empowering our employees to succeed. Key Skills and Qualifications: Expertise in quantum computing and high-performance systems. Experience in software development, particularly in scientific...


  • Palo Alto, California, United States Tesla Full time

    We are looking for a high-performance computing software engineer to join our AI team at Tesla. In this role, you will be responsible for developing and maintaining efficient software for neural network training. You will work closely with cross-functional teams to identify areas for improvement and implement performance optimization techniques to reduce...


  • Palo Alto, California, United States Gitty Inc. Full time

    We are looking for a talented Senior Java Developer to join our team at Gitty Inc. in Palo Alto, CA. As a senior developer, you will play a key role in designing and implementing high-performance systems that can handle massive traffic. Your responsibilities will include developing scalable, distributed systems, collaborating with cross-functional teams, and...


  • Palo Alto, California, United States Clockwork Inc Full time

    Company Overview: Clockwork Inc is a pioneering startup in Silicon Valley, revolutionizing computer networking and distributed systems. Founded in 2018 by a group of researchers from Stanford University, our high-precision network clock synchronization system delivers up to nanosecond accuracy at scale, powering mission-critical enterprise applications in...


  • Palo Alto, California, United States AISERA Full time

    About Aisera: Aisera is a leading provider of AI Copilot solutions, utilizing advanced Generative AI to drive business transformation and revenue growth. Our innovative approach combines industry-specific LLMs with human-like experiences to deliver exceptional results. With 400+ integrations and 1200+ prebuilt workflows, our customers achieve remarkable...


  • Palo Alto, California, United States Criteo Full time

    Job Description: Criteo is a leader in performance marketing, delivering personalized advertising solutions to businesses worldwide. We're seeking an experienced Senior Developer to join our team in Palo Alto, where you'll design, develop, and deploy high-performance software systems that meet the demands of our fast-paced business. About the Role. Salary:...


  • Palo Alto, California, United States Criteo Full time

    We are looking for an exceptional High Performance Developer to join our team in Palo Alto, California. As a key member of our platform team, you will be responsible for designing and developing high-quality, maintainable code that meets the needs of our business. You will work closely with our cross-functional teams to ensure seamless integration and...


  • Palo Alto, California, United States Criteo Full time

    About the Role: We are seeking an experienced Senior Cloud Systems Engineer to join our team in Palo Alto. As a key member of our platform team, you will be responsible for designing, developing, and deploying scalable, high-performance systems that meet the demands of our business.


  • Palo Alto, California, United States AISERA Full time

    Aisera, a global leader in AI Copilot solutions, seeks an experienced Senior Performance Engineer to join our team. In this role, you will be responsible for ensuring that our software systems and applications operate with high performance, reliability, and efficiency. Key Responsibilities: Design, develop, and execute performance tests to simulate load,...


  • Palo Alto, California, United States Broadcom Corporation Full time

    Job Description: Broadcom Corporation is a global technology leader that designs, develops, and supplies a broad range of semiconductor and infrastructure software solutions. We are seeking an experienced Principal Software Engineer to join our vMotion team and contribute to the development of our flagship feature. The successful applicant will have a strong...


  • Palo Alto, California, United States Clockwork Inc Full time

    **About Us** Clockwork Inc. is a renowned startup in Silicon Valley, focused on revolutionizing computer networking and distributed systems. We are seeking a highly skilled High-Performance System Developer to contribute to the design and build of our next-generation time-sensitive applications. In this role, you will utilize your expertise in data structures,...