Staff Software Engineer, Dojo Datacenter Network Specialist

4 weeks ago


Palo Alto, California, United States Tesla Full time
Job Summary

We are seeking a highly skilled Software Engineer to join our team at Tesla and contribute to the development of our Dojo Datacenter Platform.

As a key member of our infrastructure team, you will design, develop, and deploy software that ensures the reliability, availability, and scalability of our datacenter operations.

You will have a strong focus on network infrastructure and provisioning and will work closely with our Network Engineering team to ensure seamless integration of our software with our network systems.

Additionally, you will work on resource management and distributed storage systems to support our high-performance computing and data analytics workloads.

Responsibilities
  • Contribute to the system and network design for our Dojo datacenter, ensuring alignment with business requirements and industry best practices
  • Develop and implement control paths for Dojo datacenter components, including network and infrastructure elements
  • Collaborate with cross-functional teams to design, develop, and deploy infrastructure software that meets the needs of our datacenter operations
  • Develop and maintain code for infrastructure software across areas including scheduling, scalability, configuration, storage, fault tolerance, storage management and monitoring, distributed systems, and network provisioning and automation
  • Work closely with the operations team to ensure smooth deployment and operation of infrastructure software
  • Participate in the testing and validation of infrastructure software to ensure it meets quality and reliability standards
  • Collaborate with other engineers to identify and resolve technical issues and to continuously improve the design and operation of our datacenter infrastructure
Requirements
  • Degree in Computer Science, Electrical Engineering, or a related field, or equivalent experience
  • 5+ years of experience in software development, with a focus on infrastructure software and datacenter operations
  • Strong programming skills in languages such as Python, Go, or Bash
  • In-depth understanding of network protocols and technologies, including TCP/IP and the OSI model, DNS and DHCP, and BGP
  • Experience with Slurm resource management and job scheduling systems
  • Experience with distributed storage systems, including Ceph, Gluster, or other similar technologies
  • Strong understanding of system design principles, including scalability, availability, and reliability
  • Experience with agile development methodologies and version control systems such as Git
  • Excellent problem-solving skills, with the ability to analyze complex technical issues and develop creative solutions
  • Strong communication and collaboration skills, with the ability to work effectively with cross-functional teams


  • Palo Alto, California, United States Tesla Full time

    We are seeking a highly skilled Software Engineer to contribute to the development of our Dojo Datacenter Platform. As a key member of our infrastructure team, you will design, develop, and deploy software that ensures the reliability, availability, and scalability of our datacenter operations. You will have a strong focus on network infrastructure and...


  • Palo Alto, California, United States Tesla Full time

    Job Summary: We are seeking a highly skilled Software Engineer to contribute to the development of our Dojo Datacenter Platform. As a key member of our infrastructure team, you will design, develop, and deploy software that ensures the reliability, availability, and scalability of our datacenter operations. Responsibilities: Design and develop software components...


  • Palo Alto, California, United States Tesla Full time

    As a Machine Learning Software Engineer within Dojo, you will play a crucial role in bridging the gap between our cutting-edge Dojo training accelerator and the neural networks developed by our Autopilot ML team. Collaborate closely with world-class ML Researchers, Compiler and Hardware Engineers to tackle unique challenges at the intersection of AI and ML...


  • Palo Alto, California, United States Criteo Full time

    Job Description: Criteo is seeking a highly skilled Senior Network Architect to join our infrastructure team. As a key member of our team, you will be responsible for designing, implementing, and operating Criteo's global network of datacenters. Key Responsibilities: Manage operational environments, suppliers, and logistics relationships. Engineer custom hardware...


  • Palo Alto, California, United States Tesla Full time

    Tesla's Dojo team is seeking a highly skilled VLSI engineer to design and integrate SOCs, IP, circuits, tool flows, and methodologies into systems using advanced technologies. This position entails leading large design blocks and SOCs from early design stage to tape out, floorplanning and partitioning designs to meet area, timing, and power requirements, and...


  • Palo Alto, California, United States Tesla Full time

    As a Software Engineer at Tesla, you will focus on optimizing and scaling our neural network training and auto-labeling infrastructure for Autopilot and the Humanoid robot. Our autonomy capabilities rely on multiple neural networks that the Deep Learning team designs to train on large amounts of data across GPU clusters and our supercomputer Dojo. Reducing...


  • Palo Alto, California, United States Spotnana Technology Full time

    About the Role: We are seeking a highly skilled Staff Software Engineer to join our team at Spotnana Technology. As a Staff Software Engineer, you will play a critical role in the design and development of high-quality cloud-native services in our platform and products. Key Responsibilities: Work with top talent in the design and development of high-quality...


  • Palo Alto, California, United States Spotnana Technology Full time

    Transform the Travel Industry with Spotnana Technology. At Spotnana Technology, we're revolutionizing the travel infrastructure with innovative solutions. As a Staff Software Engineer, Backend, you'll play a crucial role in shaping our cloud-native services and products. Key Responsibilities: Design and develop high-quality cloud-native services. Own customer...


  • Palo Alto, California, United States Tesla Full time

    As a key member of Tesla's Autopilot AI team, you will play a pivotal role in optimizing and scaling our neural network training infrastructure. You will collaborate with a specialized team of machine learning experts and have access to one of the world's largest model training clusters. Your primary focus will be to design, implement, and maintain...


  • Palo Alto, California, United States Tesla Full time

    About the Role: Tesla's AI infrastructure team is seeking a highly skilled HPC Engineer to join our team. As a key member of our team, you will be responsible for maintaining and improving our AI infrastructure to support our Full-Self-Driving (FSD), Tesla Bot & Dojo engineering teams. Key Responsibilities: Manage and operate our AI infrastructure, including...


  • Palo Alto, California, United States Tesla Full time

    Job Title: HPC Engineer, AI Infrastructure. Tesla's AI Infrastructure team is responsible for designing and maintaining the high-performance computing systems that power our machine learning algorithms. As an HPC Engineer, you will play a critical role in ensuring the smooth operation of our AI infrastructure, including virtual simulations, Autopilot hardware,...


  • Palo Alto, California, United States Machinify Full time

    Job Overview: Machinify is a leading provider of AI-powered software products that transform healthcare claims and payment operations. The company's revolutionary AI-platform has enabled the development and deployment of industry-specific products that increase the speed and accuracy of claims processing by orders of magnitude. We're seeking a talented Staff...


  • Palo Alto, California, United States Rivian Full time

    About Rivian: Rivian is a pioneering company that's revolutionizing the electric vehicle industry. Our mission is to keep the world adventurous forever, and we're seeking a highly skilled DevOps Engineer to join our team. Role Summary: We're looking for a seasoned DevOps Engineer to further our DevOps initiatives and drive continuous integration, software...


  • Palo Alto, California, United States Tesla Full time

    Job Description: The Dojo & Self-Driving Hardware teams at Tesla are seeking an IC Package Layout Engineer to design and develop next-generation IC packages for our Self-Driving Hardware and Dojo Super AI Computer projects. Responsibilities: Design and develop IC packages using Cadence APD+ and SiP Layout. Perform feasibility studies, including die floor plan...


  • Palo Alto, California, United States Machinify, Inc. Full time

    Machinify, Inc. is a leading provider of AI-powered software products that transform healthcare claims and payment operations. Our revolutionary AI-platform has enabled us to develop and deploy industry-specific products that increase the speed and accuracy of claims processing by orders of magnitude. We're seeking a Sr/Staff Software Engineer, BE|ML to join...


  • Palo Alto, California, United States Machinify, Inc. Full time

    Machinify is a leading provider of AI-powered software products that transform healthcare claims and payment operations. Our team is responsible for developing and deploying scalable, reliable backend systems that increase the speed and accuracy of claims processing. We're looking for a talented Staff Software Engineer - Backend to join our growing engineering...


  • Palo Alto, California, United States Tesla Full time

    Electrical Distribution Systems Software Developer: As a member of the Electrical Distribution Systems (EDS) Software team at Tesla, you will be responsible for developing internal tooling, such as web applications and APIs, that are at the heart of electrical engineering development globally. The EDS team is part of the Low Voltage Architecture and Circuitry...


  • Palo Alto, California, United States Guardant Health Full time

    Job Overview: Guardant Health is a leading precision oncology company seeking a highly skilled HPC Infrastructure Specialist to join its team. The successful candidate will be responsible for designing, implementing, and maintaining the company's high-performance computing infrastructure. The ideal candidate will have a strong background in Linux/Unix...


  • Palo Alto, California, United States Machinify Full time

    About the Role: Machinify is a leading provider of AI-powered software products that transform healthcare claims and payment operations. The company's revolutionary AI-platform has enabled the development and deployment of industry-specific products that increase the speed and accuracy of claims processing by orders of magnitude. We're seeking a Staff Software...


  • Palo Alto, California, United States Yoh - A Day & Zimmerman Company Full time

    Job Description: We are seeking a highly skilled Staff Chassis Controls Integration Engineer to join our engineering team. In this role, you will be responsible for integrating and optimizing chassis control systems, including braking, suspension, and stability control, to enhance vehicle dynamics and safety. Key Responsibilities: System Integration: Integrate...