AI Cluster Architect
1 day ago
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
THE ROLE:
We are looking for a dynamic, energetic Lead/Principal HPC Cluster Network Architect to join our growing team. As a key contributor to the success of AMD's products, you will be part of a leading team that drives and improves AMD's ability to deliver the highest quality, industry-leading technologies to market. AMD's Systems Design Engineering team fosters and encourages continuous technical innovation to showcase successes as well as facilitate continuous career development.
THE PERSON:
The Cluster Network Architect plays a critical role in shaping the future of AI/ML training and inferencing systems as they move into the Ethernet era. This individual will collaborate with a broad range of internal and external partners, including NIC, Switch, and Software Enablement teams, to integrate state-of-the-art technology solutions that pave the way for Ethernet to become a viable network technology for the GPU-to-GPU communication required during AI training and inferencing.
KEY RESPONSIBILITIES:
- Design state-of-the-art cluster network architectures for large AI/ML training and inferencing systems that can be optimized for hyperscale capabilities
- Engage with AMD's customer base to align system and networking architecture
- Standardize Ethernet network architectures and best practices for GPU-to-GPU communication for deep learning and AI workloads using InfiniBand and Ethernet technologies
- Co-design new Ethernet technology with AMD partner companies to build the next generation of AI cluster networks
- Pioneer system and container networking strategies to facilitate seamless operation and scaling of AI clusters
- Develop scalable AI/ML training and inferencing communication network reference architectures for each generation of AMD AI/ML products
- Serve as chief network engineer on projects supporting Partner OEM co-design of AI/ML clusters
- Participate in the design phase of each AMD AI/ML GPU generation by developing cluster communication network architectures and requirements
- Collaborate across AMD internal and external partner teams to improve communication performance for AMD AI/ML clusters
PREFERRED EXPERIENCE:
- In-depth knowledge of and experience with network topologies such as Rail and Fat Tree, and technologies including InfiniBand, RDMA, RoCE, NVLink, and PCIe
- Expertise in network security, automation, and visualization, along with a solid understanding of the OSI network model and the TCP/IP suite
- Professional certifications such as Cisco CCNA, CCNP, CCIE, CompTIA Network+, and Arista ACE are highly regarded
- Extensive real-world experience designing hyperscale Ethernet networks
- Expertise in the TCP/IP protocol suite and its application
- Strong analytical and problem-solving skills and pronounced attention to detail
- Must be a self-starter, able to independently drive tasks to completion
ACADEMIC CREDENTIALS:
- Master's or PhD degree preferred in Mathematics, Statistics, Electrical Engineering, Computer Engineering, or a related computational field; equivalent experience also considered.
Location: The role could be Hybrid or Remote
Benefits offered are described in: AMD benefits at a glance.
AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.