Software Development Manager

4 weeks ago


Cupertino, California, United States Haystack Full time

Software Development Manager, LLM Inference Model Enablement, Neuron SDK | Cupertino, California | Remote-Friendly | $166,400 - $287,700
We're working with Annapurna Labs (U.S.) Inc. on this exciting opportunity.

This role offers a unique opportunity to lead a team of expert AI/ML engineers in optimizing and enabling state-of-the-art open-source and customer LLMs on custom AWS machine learning accelerators. You'll drive innovation in model enablement speed and inference usability, working across a vertically integrated system stack that includes PyTorch, Neuron compiler, and runtime.

Key Responsibilities

  • Lead a team of expert AI/ML engineers to onboard and optimize open-source and customer LLMs for inference with the Neuron SDK on Trainium and Inferentia accelerators.
  • Drive improvements in model enablement speed and overall experience.
  • Advance inference usability and quality through new features, infrastructure optimization, tools, and automation.
  • Define and deliver model enablement and performance optimization for the latest state-of-the-art LLMs in collaboration with senior management.

What You'll Need

  • 3+ years of engineering team management experience.
  • Strong background in LLM architectures, performance optimization, and inference techniques using distributed inference libraries.
  • Ability to manage demanding, fast-changing priorities in a dynamic environment.
  • Strong technical ability to understand and deliver across a vertically integrated system stack that includes the PyTorch inference library, the Neuron compiler, the runtime, and collectives.

What's On Offer

  • Opportunities for mentorship and career growth within AWS.
  • A focus on work-life harmony and flexibility.

Apply via Haystack today



  • Cupertino, California, United States Amazon Full time $212,700 - $287,700

    AWS Hardware Engineering Services support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we're looking for talented people who...


  • Cupertino, California, United States Amazon Full time

    AWS Hardware Engineering Services support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we’re looking for talented people...


  • Cupertino, California, United States Amazon Full time

    We're seeking a Software Development Manager to lead our Frameworks team within AWS Neuron, the software stack powering AWS Inferentia and Trainium machine learning accelerators. This role combines technical leadership, team management, and strategic open-source collaboration to shape the future of machine learning acceleration at AWS. As the Software...


  • Cupertino, California, United States Apple Full time $264,514 - $272,100

    Imagine what you can do here. Apple is a place where extraordinary people gather to do their life's best work. Together we create products and experiences people once couldn't have imagined, and now can't imagine living without. It's the diversity of those people and their ideas that inspires the innovation that runs through everything we do. Description: APPLE...


  • Cupertino, California, United States Apple Full time

    Summary: Imagine what you can do here. Apple is a place where extraordinary people gather to do their life's best work. Together we create products and experiences people once couldn't have imagined, and now can't imagine living without. It's the diversity of those people and their ideas that inspires the innovation that runs through everything we...


  • Cupertino, California, United States Annapurna Labs (U.S.) Inc. Full time

    Description: The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. The AWS Neuron SDK, developed by the Annapurna Labs team at AWS, is the backbone for accelerating deep learning and...


  • Cupertino, California, United States Amazon Full time $129,300 - $223,600

    Amazon Web Services (AWS) is building a central pipeline of Software Development Engineer (SDE) talent for anticipated roles in 2026. This requisition supports hiring across all AWS SDE positions, from fungible SDE roles to specialized engineering positions in areas including: Embedded Systems, Game Development, Compiler Engineering, Artificial...

