Software Development Engineer, HPC Expert

3 days ago


San Francisco, California, United States | Magic AI | Full time
About Magic AI

Magic AI is a cutting-edge technology company that aims to develop safe and reliable Artificial General Intelligence (AGI). Our mission is to accelerate humanity's progress on the world's most pressing problems by automating research and code generation. We believe in combining frontier-scale pre-training, domain-specific reinforcement learning, ultra-long context, and inference-time compute to achieve this goal.

Job Overview

We are seeking a highly skilled Software Development Engineer with expertise in High-Performance Computing (HPC) to join our Supercomputing Platform & Infrastructure team. You will design and build resilient, optimized solutions for AI workloads on massive computing clusters.

Key Responsibilities
  • Collaborate closely with training and inference teams to deliver high performance and reliability across storage, networking, and distributed computing designs.
  • Develop the software stack to run massive-scale (thousands of GPUs), highly available supercomputing infrastructure.
  • Troubleshoot and resolve complex issues across hardware-accelerated devices, networking, storage subsystems (local NVMe/Block Storage/NFS), operating systems, drivers, and cloud environments, and automate detection and recovery processes.
  • Operate data-intensive workloads at petabyte scale.
  • Increase the ease of use and self-serviceability of the compute platforms at Magic through top-notch documentation and developer workflow design.
  • Investigate and resolve incidents across security and availability.
Requirements and Qualifications
  • Experience working with production GPU deployments, data-intensive applications, large-scale model training, and HPC.
  • Strong understanding of networking-, storage-, and data-related technologies.
  • Experience with GCP, AWS, Azure, OCI, or similar cloud platforms.
  • Strong software engineering skills.
  • Strong Infrastructure as Code (IaC) knowledge with extensive experience in Terraform, Pulumi, AWS CDK/CloudFormation, or similar.
Compensation and Benefits
  • Annual salary range: $120,000 - $600,000.
  • Equity is a significant part of total compensation, in addition to salary.
  • 401(k) plan with 6% salary matching.
  • Generous health, dental, and vision insurance for you and your dependents.
  • Unlimited paid time off.
  • Visa sponsorship and relocation stipend to bring you to San Francisco, if possible.
Our Culture
  • Integrity: Words and actions should be aligned.
  • Hands-on: At Magic AI, everyone is building.
  • Teamwork: We move as one team, not N individuals.
  • Focus: Safely deploy AGI. Everything else is noise.
  • Quality: Magic AI should feel like magic.
