AI/ML Networking Engineer for HPC Infrastructure
15 hours ago
A leading AI technology firm based in California is seeking an experienced Network Engineer to design and manage high-performance networking for AI/ML operations. The role involves configuring InfiniBand networks, optimizing performance, and ensuring security. Ideal candidates have over 4 years of experience, strong knowledge of networking protocols, and hands-on skills with high-speed networking. Join a team at the forefront of technology and contribute to scaling ambitious workloads.
-
Network Engineer, AI/ML Infrastructure
1 day ago
Santa Clara, United States Boson AI Full time. About The Role: We’re seeking an experienced Network Engineer to design, build, and optimize the high-performance networking infrastructure powering our AI/ML operations in Toronto. You’ll work at the cutting edge of network technology, managing InfiniBand and ultra-high-speed Ethernet fabrics that connect NVIDIA H100 and A100 GPUs, over 20PB of Ceph...
-
Network Engineer, AI/ML Infrastructure
23 hours ago
Santa Clara, California, United States Boson AI Full time. $150,000 - $250,000. About The Role: We're seeking an experienced Network Engineer to design, build, and optimize the high-performance networking infrastructure powering our AI/ML operations in Toronto. You'll work at the cutting edge of network technology, managing InfiniBand and ultra-high-speed Ethernet fabrics that connect NVIDIA H100 and A100 GPUs, over 20PB of Ceph storage,...
-
Site Reliability Engineer, AI/ML Infrastructure
19 hours ago
Santa Clara, United States Boson AI Full time. About The Role: We're looking for a Senior Site Reliability Engineer to help us run one of the most exciting GPU clusters around: our Toronto datacenter packed with NVIDIA H100 and A100 GPUs, over 20PB of Ceph storage, terabit networking, and hundreds of servers. You'll be hands‑on with the full lifecycle of HPC infrastructure: planning, building, testing,...
-
Site Reliability Engineer, AI/ML Infrastructure
3 weeks ago
Santa Clara, United States Boson AI Full time. About The Role: We’re looking for a Senior Site Reliability Engineer to help us run one of the most exciting GPU clusters around: our Toronto datacenter packed with NVIDIA H100 and A100 GPUs, over 20PB of Ceph storage, terabit networking, and hundreds of servers. You’ll be hands‑on with the full lifecycle of HPC infrastructure: planning, building,...
-
Senior AI/ML Engineer, Networking
1 day ago
Santa Barbara, United States Hewlett Packard Enterprise Full time. A leading edge-to-cloud solutions company is seeking an AI/ML Engineer III in Santa Barbara, California. This full-time role involves researching and developing advanced networking technologies for AI and HPC solutions. The ideal candidate will possess a Doctoral degree in Computer Science or Electrical Engineering and a strong programming background,...
-
Network Engineer
2 weeks ago
Santa Clara, CA, United States Diverse Lynx Full time. Top Skills: Data Center & AI Cluster Networking • High-performance interconnects - GPU, HPC, AI clusters • InfiniBand, Ultra Ethernet, RoCEv2, DCQCN • Dark Fiber / Carrier Interconnect Optimization Hybrid DC Network Architecture & Fabric Design Job Description/Responsibilities: This is a hands-on network engineering position focused on the...
-
Network Engineer
56 minutes ago
Santa Clara, United States Diverse Lynx Full time. Top Skills: Data Center & AI Cluster Networking • High-performance interconnects - GPU, HPC, AI clusters • InfiniBand, Ultra Ethernet, RoCEv2, DCQCN • Dark Fiber / Carrier Interconnect Optimization Hybrid DC Network Architecture & Fabric Design Job Description/Responsibilities: This is a hands-on network engineering position focused on the architecture,...
-
Consulting Member of Technical Staff
2 weeks ago
Santa Clara, United States Oracle Full time. Design, develop, troubleshoot and debug software programs for databases, applications, tools, and networks. As an AI/ML Infrastructure Engineer on the GPU Strategic Customers Engineering team, you will play a critical role in designing, implementing, and maintaining the infrastructure that supports our AI and machine learning initiatives. You will work closely with...
-
Staff Engineer, HPC Infrastructure
20 hours ago
Santa Clara, United States Tenstorrent Inc. Full time. Tenstorrent is leading the industry on cutting‑edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a...