High-Performance Computing Professional
3 weeks ago
As a Supercomputing Engineer, you will play a key role in maintaining and improving Tesla's AI infrastructure, ensuring our Full Self-Driving (FSD), Tesla Bot & Dojo engineering teams have the necessary tools and resources to be productive. This includes managing/operating our AI infrastructure, monitoring compute/GPU/network metrics, Linux troubleshooting & performance tuning, and collaborating with our Data Center team to coordinate the smooth operation of hundreds of servers.
- Main Responsibilities:
- Support the AI/ML cluster infrastructure on both GPU and Dojo platforms, focusing on systems automation, configuration management and deployment at scale
- Improve our monitoring & self-healing pipelines, as well as security posture
- Work with hardware and storage vendors to tune and optimize our server, storage and network performance
- Performance tuning & OS provisioning on Linux systems
- Manage HPC clusters, workloads and applications
- Automation and systems engineering
- Participate in 24x7 on-call rotation
- Requirements:
- Proficiency with scripting languages such as Python or Bash
- Proficiency with Linux & network fundamentals
- Experience with configuration management software (Ansible, etc.) and systems monitoring & alerting (Prometheus, Grafana, Telegraf, Splunk, etc.) is a plus
- Experience with high-throughput low-latency networks, GPU-based computing systems, and/or high performance storage systems is a plus
- Experience with Slurm, LSF and storage management of parallel file systems is a plus
- Bachelor's Degree in Computer Science, Computer Engineering, Electrical Engineering, Physics or proof of exceptional skills in a related field
- 3+ years of additional equivalent experience or evidence of exceptional ability related to the position
Our highly competitive salary range starts at $120,000 per year, with benefits including comprehensive health insurance, retirement plans, and more.
-
High-Performance Computing Specialist
4 weeks ago
Palo Alto, California, United States Criteo Full time
Criteo, a leader in commerce marketing, is building the highest performing and open commerce marketing ecosystem to drive profits and sales for retailers and brands. We are looking for a High-Performance Computing Specialist to join our team. As a key member of our engineering organization, you will play a critical role in designing, implementing, and...
-
High-Performance Computing Engineer
3 weeks ago
Palo Alto, California, United States Foundry Technologies, Inc. Full time
Foundry Technologies, Inc. is seeking a High-Performance Computing Engineer to join our team in Palo Alto, California. As an Infrastructure Engineer, you will collaborate closely with our development team to architect, build, and deploy cutting-edge infrastructure solutions. We offer a competitive salary range of $170,000 - $230,000 per year, depending on...
-
High-Performance Computing Engineer
3 weeks ago
Palo Alto, California, United States SambaNova Systems Full time
Key Responsibilities:
- Design and implement new features for our runtime/embedded OS stack to support high-performance ML training applications
- Work on system software support for the next generation RDU system
- Provide tools and performance profilers for customers to configure and use the Datascale system
Qualifications:
- Bachelor's degree in Computer Science,...
-
Senior High-Performance Computing Professional
3 weeks ago
Palo Alto, California, United States Tesla Full time
Job Description
We are seeking a highly skilled HPC Engineer to join our Supercomputing/AI infrastructure team. In this role, you will be responsible for maintaining and improving our AI infrastructure platform. This includes managing/operating our AI infrastructure, monitoring compute/GPU/network metrics, Linux troubleshooting & performance tuning, and...
-
High-Performance Computing Engineer
3 weeks ago
Palo Alto, California, United States Criteo Full time
About the Role
As a High-Performance Computing Engineer at Criteo, you will play a key role in designing and developing software that automates traditional system administration tasks. Our team works on building state-of-the-art technologies to manage billions of ad impressions every day.
Responsibilities
Design and develop scalable software systems using...
-
High-Performance Computing Specialist
4 weeks ago
Palo Alto, California, United States Criteo Full time
About the Role
We are looking for a skilled Senior Network Engineer to join our global infrastructure team at Criteo. Criteo is a leader in commerce marketing, driving profit and sales for retailers and brands through its high-performing commerce marketing ecosystem. The ideal candidate will have a strong background in datacenter operations, WAN management,...
-
High-Performance Computing Systems Developer
4 weeks ago
Palo Alto, California, United States PsiQuantum Full time
Join Our Team
PsiQuantum is dedicated to fostering a culture of innovation and excellence. We are committed to delivering cutting-edge solutions in quantum computing and empowering our employees to succeed.
Key Skills and Qualifications
- Expertise in quantum computing and high-performance systems.
- Experience in software development, particularly in scientific...
-
High-Performance Computing Software Engineer
2 weeks ago
Palo Alto, California, United States Tesla Full time
We are looking for a high-performance computing software engineer to join our AI team at Tesla. In this role, you will be responsible for developing and maintaining efficient software for neural network training. You will work closely with cross-functional teams to identify areas for improvement and implement performance optimization techniques to reduce...
-
Senior SRE Engineer
2 weeks ago
Palo Alto, California, United States Luma AI Full time
Join Our Team
Luma AI is a fast-paced, rapidly scaling company that requires experienced professionals like you. As a Senior SRE Engineer - High-Performance Computing, you will collaborate with researchers and engineers to specify availability, performance, correctness, and efficiency requirements of our GPU infrastructure.
-
Palo Alto, California, United States SambaNova Systems Full time
In this exciting role as a Senior Software Engineer, you will contribute to the development of innovative system software solutions for AI and machine learning applications in high-performance distributed systems. At SambaNova Systems, we value expertise in software engineering, particularly in areas like performance optimization, scalability, and...
-
High Performance Developer
3 weeks ago
Palo Alto, California, United States Criteo Full time
We are looking for an exceptional High Performance Developer to join our team in Palo Alto, California. As a key member of our platform team, you will be responsible for designing and developing high-quality, maintainable code that meets the needs of our business. You will work closely with our cross-functional teams to ensure seamless integration and...
-
High-Performance Architect
3 days ago
Palo Alto, California, United States Palantir Technologies Full time
Palantir Technologies Overview
We're a data analytics company that helps organizations make better decisions by bringing the right data to the people who need it. Our platforms are used by partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more. We're looking for talented engineers to join our team and...
-
High-Performance vMotion Developer
2 weeks ago
Palo Alto, California, United States Broadcom Corporation Full time
Job Description
Broadcom Corporation is a global technology leader that designs, develops, and supplies a broad range of semiconductor and infrastructure software solutions. We are seeking an experienced Principal Software Engineer to join our vMotion team and contribute to the development of our flagship feature. The successful applicant will have a strong...
-
High Performance Systems Engineer
3 weeks ago
Palo Alto, California, United States Clockwork Inc Full time
Company Overview
Clockwork Inc is a pioneering startup in Silicon Valley, revolutionizing computer networking and distributed systems. Founded in 2018 by a group of researchers from Stanford University, our high-precision network clock synchronization system delivers up to nanosecond accuracy at scale, powering mission-critical enterprise applications in...
-
High Performance Network Architect
3 days ago
Palo Alto, California, United States Tesla Motors Full time
Job Description
We are seeking a highly skilled Network Routing Chip Engineer to join our team. The ideal candidate will have a strong background in developing C and Python code for generating routing tables, as well as testing and validating the functionality and performance of routing algorithms and hardware health. In this role, you will be responsible...
-
High-Performance System Developer
2 weeks ago
Palo Alto, California, United States Clockwork Inc Full time
About Us
Clockwork Inc. is a renowned startup in Silicon Valley, focused on revolutionizing computer networking and distributed systems. We are seeking a highly skilled High-Performance System Developer to contribute to the design and build of our next-generation time-sensitive applications. In this role, you will utilize your expertise in data structures,...
-
High-Performance IC Package Designer
3 weeks ago
Palo Alto, California, United States Tesla Full time
About the Role
We are looking for a talented High-Performance IC Package Designer to join our team at Tesla. As a High-Performance IC Package Designer, you will be responsible for designing IC packages for high-performance computing projects, including Self-Driving Hardware and Dojo Super AI Computer. You will work closely with IC package process, SI/PI,...
-
High-Performance Software Systems Engineer
4 weeks ago
Palo Alto, California, United States Rubrik Full time
About the Role
We are seeking a highly skilled High-Performance Software Systems Engineer to join our team at Rubrik. In this role, you will take full ownership of projects from design to implementation, test and deployment. Your primary focus will be on designing, developing, and delivering hardware and OS abstraction for Rubrik CDM software services. You...
-
High-Performance System Architect
3 weeks ago
Palo Alto, California, United States Gitty Inc. Full time
About Gitty Inc.
Gitty Inc. is a leading provider of innovative software solutions, based in Palo Alto, CA. We're driven by a passion for delivering exceptional results and exceeding customer expectations.
Job Overview
We're seeking a highly skilled Java Software Engineer to join our backend development team. In this role, you'll design, develop, and maintain...
-
High-Performance Systems Developer
4 weeks ago
Palo Alto, California, United States Criteo Full time
About the Opportunity
We are looking for a highly experienced Staff Software Engineer to join our team in Palo Alto. As a key member of the platform team, you will be responsible for designing and developing large-scale, distributed systems that meet the highest standards of performance and scalability.
Key Responsibilities
Develop high-quality, maintainable...