Network Architect

2 weeks ago


San Francisco, CA, United States | Together AI | Full time
About the Role

Together AI is building the next-generation AI compute platform, and networking is at the center of that mission. As a Network Architect, you will define and evolve the global network architecture that powers our AI training, inference, and research platforms. This is a deeply technical and strategic role: you will own the end-to-end routing, topology, traffic engineering, and control-plane strategy for a global network spanning self-built data centers, partner colo, cloud environments, and high-capacity backbone fabrics.

You will collaborate closely with infrastructure engineering, compute systems, hardware, and operations teams to design architectures that deliver massive east-west bandwidth, low latency, high resiliency, and predictable performance at multi-terabit scale. Your work directly influences how we build, scale, and operate the physical and logical networks that underpin cutting-edge AI workloads.

This is a role for architects who are hands-on enough to validate designs in production, experienced enough to reason about systems at huge scale, and creative enough to develop architectures that don't exist yet.
Responsibilities
  • Define and evolve Together AI's global routing and backbone architecture, spanning self-built data centers, partner colocation sites, PoPs, cloud regions, and interconnect fabrics.
  • Establish the end-to-end topology strategy for high-bandwidth AI workloads: east-west fabrics, spine/superspine/core, DCI, and cross-region interconnect.
  • Design traffic engineering, load balancing, and capacity planning models to ensure low latency, deterministic performance, and fault tolerance at scale.
  • Develop the multicloud interconnect and peering strategy, including BGP policy frameworks, route leak mitigation, and security posture across heterogeneous networks.
  • Architect the control-plane stack for programmability, stability, and automation, including routing design, provisioning, configuration management, and state consistency.
  • Establish foundational observability primitives for a global backbone (telemetry, flow sampling, path validation, synthetic testing, health models).
  • Work closely with compute, storage, hardware, and data platform teams to ensure network design meets the performance demands of distributed AI training workloads.
  • Collaborate with operations and NOC teams to ensure designs are supportable, debuggable, and resilient under real-world failure conditions.
  • Provide architectural direction and mentorship to engineers across the org, influencing long-term strategy for both physical and virtual network domains.
  • Model evolving topologies for next-generation workloads (multi-Tbps east-west, high fan-in/fan-out distributed systems, GPU cluster fabrics).
  • Evaluate and guide the adoption of emerging technologies: advanced optical transport, RoCEv2, high-speed Ethernet fabrics, InfiniBand overlays, EVPN/VXLAN, SR-MPLS/SRv6, programmable data planes, and hardware offload.
Requirements
  • Have deep experience designing and operating large-scale GPU clusters or HPC-style compute fabrics, and understand the unique demands these workloads place on network design (east-west dominance, congestion behavior, fan-in/fan-out patterns, loss sensitivity).
  • Are fluent in building high-throughput data center fabrics (leaf-spine/superspine/core) that support tens of thousands of GPUs, multi-terabit east-west traffic, and strict performance SLAs.
  • Have architected or operated RoCEv2 or lossless Ethernet environments at scale, including PFC/ECN tuning, congestion control, and end-to-end stability considerations.
  • Are experienced designing backbone and DCI architectures that support GPU training clusters across multiple regions, interconnect exotic fabrics, and handle high-volume synchronization traffic.
  • Have led architecture for networks spanning multiple clouds, private backbones, and diverse PoPs, and understand how AI workloads behave across these domains.
  • Design with operational realities in mind: observability, capacity modeling, automation, telemetry, and failure-mode analysis for GPU-heavy environments.
  • Are comfortable setting architectural direction in fast-moving environments where compute, storage, and network evolution are tightly coupled.

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $250,000 - $280,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy