Software Engineer-AI/ML, Inference

2 days ago


Seattle, Washington, United States AWS Neuron Full time
AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators. This role is for a senior software engineer on the Machine Learning Inference Applications team, responsible for developing and optimizing the performance of the core building blocks of LLM inference: attention, MLP, quantization, speculative decoding, Mixture of Experts, and so on.

The team works side by side with chip architects, compiler engineers, and runtime engineers to deliver performance and accuracy on Neuron devices across a range of models such as Llama 3.3 70B, DBRX, and Mixtral.

Key job responsibilities
Responsibilities of this role include adapting the latest research in LLM optimization to Neuron chips to extract the best performance from both open-source and internally developed models. Working across teams and organizations is key.

About the team
Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we're building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help you develop your engineering expertise so you feel empowered to take on more complex tasks in the future.

- 3+ years of non-internship professional software development experience
- 2+ years of non-internship experience designing or architecting new and existing systems (design patterns, reliability, and scaling)
- Experience programming in at least one programming language
- Understanding of the fundamentals of machine learning models, including their architecture and their training and inference lifecycles, along with work experience optimizing model performance

- 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Bachelor's degree in computer science or equivalent
- Hands-on experience with PyTorch or JAX, preferably involving developing and deploying LLMs in production on GPUs, Neuron, TPUs, or other AI acceleration hardware

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $129,300/year in our lowest geographic market up to $223,600/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit This position will remain posted until filled. Applicants should apply via our internal or external career site.


  • Seattle, Washington, United States Google Full time $197,000 - $291,000 per year

    Minimum qualifications: Bachelor's degree or equivalent practical experience. 8 years of experience in software development. 5 years of experience testing, and launching software products, and 3 years of experience with software design and architecture. 5 years of experience building and developing large-scale infrastructure, distributed systems or networks, or...


  • Seattle, Washington, United States JPMorgan Chase Full time $200,000 - $250,000 per year

    Be an integral part of an agile team that's constantly pushing the envelope to enhance, build, and deliver top-notch technology products. As a Senior Lead Software Engineer at JPMorgan Chase within the Corporate Sector, Infrastructure Platforms team, you are an integral part of an agile team that works to enhance, build, and deliver trusted market-leading...


  • Seattle, Washington, United States JPMorgan Chase Full time $200,000 - $250,000 per year

    We have an exciting and rewarding opportunity for you to take your software engineering career to the next level. As a Software Engineer III at JPMorgan Chase within the Corporate Sector, Infrastructure Platforms team, you serve as a seasoned member of an agile team to design and deliver trusted, market-leading technology products in a secure, stable, and...


  • Seattle, Washington, United States Scale AI Full time $179,400 - $310,500

    As a Software Engineer on the ML Infrastructure team, you will design and build the next generation of foundational systems that power all ML Infrastructure compute at Scale, from model training and evaluation to large-scale inference and experimentation. Our platform is responsible for orchestrating workloads across heterogeneous compute environments (GPU,...


  • Seattle, Washington, United States Google Full time $141,000 - $202,000

    Minimum qualifications: Bachelor's degree or equivalent practical experience. 2 years of software development experience in one or more general-purpose programming languages (e.g., C++, Java, Python, Go). Experience with the software development life-cycle, including testing, deployment, and maintenance. Experience contributing to the design of software systems...

  • Senior ML Engineer

    26 minutes ago


    Seattle, Washington, United States Truveta Full time

    Truveta is the world's first health provider led data platform with a vision of Saving Lives with Data. Our mission is to enable researchers to find cures faster, empower every clinician to be an expert, and help families make the most informed decisions about their care. Achieving Truveta's ambitious vision requires an incredible team of...

  • AI/ML Engineer

    1 week ago


    Seattle, Washington, United States Optum Full time

    Optum is a global organization that delivers care, aided by technology to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers,...


  • Seattle, Washington, United States Oracle Full time $200,000 - $250,000 per year

    At Oracle Cloud Infrastructure (OCI), we are building the future of cloud computing: designed for enterprises, engineered for performance, and optimized for AI at scale. We are a fast-paced, mission-driven team within one of the world's largest cloud platforms. The Generative AI Service team within OCI is focused on developing infrastructure and tools to...

  • AI Engineer

    1 week ago


    Seattle, Washington, United States Sign AI Full time $100,000 - $190,000 per year

    Well-capitalized startup seeks extremely talented AI Engineers to help us pioneer the future of Sign Language translation. Our vision is to create a human-level AI Sign Language Interpreter available on any device, anywhere. You must have experience architecting ML pipelines, training and evaluating multimodal models, and leveraging AI to do your job faster...


  • Seattle, Washington, United States Apple Full time $171,600 - $302,200 per year

    We're building the foundation for intelligent, adaptive AI systems, from multi-agent platforms and RAG pipelines to advanced evaluation and reasoning frameworks. We're looking for a Senior Applied ML Engineer to design, build, and scale machine learning systems that power next-generation AI applications. In this role, you'll work at the intersection of...