Senior HPC Network Engineer: RDMA, GPU Clusters

2 weeks ago


Sunnyvale, United States Institute of Foundation Models Full time

A dedicated research lab is seeking a Network Engineer to design and optimize low-latency, high-bandwidth networking solutions for AI supercomputing clusters. You will work on cutting-edge technologies in collaboration with world-class researchers. The ideal candidate has strong experience with NVIDIA RDMA technologies, networking protocols, and Kubernetes. This role offers a salary range of $200,000 - $400,000 annually, depending on level, and includes comprehensive benefits such as medical plans and a 401(k).



  • Sunnyvale, United States Institute of Foundation Models Full time

    About the Institute of Foundation Models: We are a dedicated research lab for building, understanding, using, and risk-managing foundation models. Our mandate is to advance research, nurture the next generation of AI builders, and drive transformative contributions to a knowledge-driven economy. As part of our team, you'll have the opportunity to work on the...


  • Sunnyvale, CA, United States Institute of Foundation Models Full time

    About the Institute of Foundation Models: We are a dedicated research lab for building, understanding, using, and risk-managing foundation models. Our mandate is to advance research, nurture the next generation of AI builders, and drive transformative contributions to a knowledge-driven economy. As part of our team, you'll have the opportunity to work on the...



  • RDMA Ops Engineer

    4 weeks ago


    Sunnyvale, United States Alibaba Cloud Full time

    Overview: We're seeking a skilled RDMA Ops Engineer to optimize and maintain high-performance networking infrastructure for our computing clusters. This role focuses on building and operating ultra-low latency, high-throughput networks using RDMA technologies to power next-generation computing workloads. Responsibilities: Deploy, operate and maintain RDMA-based...

  • RDMA Ops Engineer

    4 days ago


    Sunnyvale, CA, United States Alibaba Cloud Full time

    We're seeking a skilled RDMA Ops Engineer to optimize and maintain high-performance networking infrastructure for our computing clusters. This role focuses on building and operating ultra-low latency, high-throughput networks using RDMA technologies to power next-generation computing workloads. Key Responsibilities: • Deploy, operate and maintain RDMA-based...


  • Sunnyvale, CA, United States CMK Resources, Inc. Full time

    CMK Resources is partnering with a fast-scaling AI cloud platform on a high-impact, confidential search. This team is solving cutting-edge infrastructure challenges to support massive-scale AI and HPC workloads. They are seeking a Staff+ Network Engineer to lead architecture and design of next-generation networking systems. This is a contract-to-hire role (6 months) with...


  • Sunnyvale, United States Crusoe Full time

    About this Role: The Crusoe Cloud Network Engineering team is looking for an ambitious, experienced team player to join us. The team is responsible for designing, building, and operating the global edge, backbone, and data center network for High Performance Compute (HPC) Clusters with GPUs. The ideal individual will...


  • Sunnyvale, United States Epoch Biodesign Full time

    Location: Sunnyvale, CA - US. Employment Type: Full time. Location Type: On-site. Department: Cloud Engineering. Crusoe's mission is to accelerate the abundance of energy and intelligence. We're crafting the engine that powers a world where people can create ambitiously with AI, without sacrificing scale, speed, or sustainability. Be a part of the AI revolution with...


  • Sunnyvale, United States CMK Resources, Inc. Full time

    CMK Resources is partnering with a fast-scaling AI cloud platform on a high-impact, confidential search. This team is solving cutting-edge infrastructure challenges to support massive-scale AI and HPC workloads. They are urgently seeking an experienced Staff/Sr. Staff+ Network Engineer to lead architecture and design of next-generation networking...


  • Sunnyvale, CA, United States CEREBRAS SYSTEMS INC. Full time

    Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to...