AI/ML Networking Engineer for HPC Infrastructure

15 hours ago


Santa Clara, United States Boson AI Full time

A leading AI technology firm based in California is seeking an experienced Network Engineer to design and manage high-performance networking for AI/ML operations. The role involves configuring InfiniBand networks, optimizing performance, and ensuring security. Ideal candidates have over 4 years of experience, strong knowledge of networking protocols, and hands-on skills with high-speed networking. Join a team at the forefront of technology and contribute to scaling ambitious workloads.



  • Santa Clara, United States Boson AI Full time

    About The Role We’re seeking an experienced Network Engineer to design, build, and optimize the high-performance networking infrastructure powering our AI/ML operations in Toronto. You’ll work at the cutting edge of network technology—managing InfiniBand and ultra-high-speed Ethernet fabrics that connect NVIDIA H100 and A100 GPUs, over 20PB of Ceph...


  • Santa Clara, California, United States Boson AI Full time $150,000 - $250,000

    About The Role We're seeking an experienced Network Engineer to design, build, and optimize the high-performance networking infrastructure powering our AI/ML operations in Toronto. You'll work at the cutting edge of network technology—managing InfiniBand and ultra-high-speed Ethernet fabrics that connect NVIDIA H100 and A100 GPUs, over 20PB of Ceph storage,...


  • Santa Clara, United States Boson AI Full time

    About The Role We're looking for a Senior Site Reliability Engineer to help us run one of the most exciting GPU clusters around—our Toronto datacenter packed with NVIDIA H100 and A100 GPUs, over 20PB of Ceph storage, terabit networking, and hundreds of servers. You'll be hands‑on with the full lifecycle of HPC infrastructure: planning, building, testing,...


  • Santa Clara, United States Boson AI Full time

    About The Role We’re looking for a Senior Site Reliability Engineer to help us run one of the most exciting GPU clusters around—our Toronto datacenter packed with NVIDIA H100 and A100 GPUs, over 20PB of Ceph storage, terabit networking, and hundreds of servers. You’ll be hands‑on with the full lifecycle of HPC infrastructure: planning, building,...


  • Santa Barbara, United States Hewlett Packard Enterprise Full time

    A leading edge-to-cloud solutions company is seeking an AI/ML Engineer III in Santa Barbara, California. This full-time role involves researching and developing advanced networking technologies for AI and HPC solutions. The ideal candidate will possess a Doctoral degree in Computer Science or Electrical Engineering and a strong programming background,...

  • Network Engineer

    2 weeks ago


    Santa Clara, CA, United States Diverse Lynx Full time

    Top Skills: Data Center & AI Cluster Networking • High-performance interconnects - GPU, HPC, AI clusters • InfiniBand, Ultra Ethernet, ROCEv2, DCQCN • Dark Fiber / Carrier Interconnect Optimization Hybrid DC Network Architecture & Fabric Design Job Description/Responsibilities: This is a hands-on network engineering position focused on the...

  • Network Engineer

    56 minutes ago


    Santa Clara, United States Diverse Lynx Full time

    Top Skills: Data Center & AI Cluster Networking • High-performance interconnects - GPU, HPC, AI clusters • InfiniBand, Ultra Ethernet, ROCEv2, DCQCN • Dark Fiber / Carrier Interconnect Optimization Hybrid DC Network Architecture & Fabric Design Job Description/Responsibilities: This is a hands-on network engineering position focused on the architecture,...


  • Santa Clara, United States Oracle Full time

    Design, develop, troubleshoot and debug software programs for databases, applications, tools, and networks. As an AI/ML Infrastructure Engineer on the GPU Strategic Customers Engineering team, you will play a critical role in designing, implementing, and maintaining the infrastructure that supports our AI and machine learning initiatives. You will work closely with...


  • Santa Clara, United States Tenstorrent Inc. Full time

    Tenstorrent is leading the industry on cutting‑edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a...