Optimizing AI Frameworks for Global Scalability

7 days ago


Mountain View, California, United States Microsoft Full time

The AI Frameworks team at Microsoft is responsible for developing cutting-edge software that enables the execution of AI models on a wide range of devices, from powerful supercomputers to servers, desktops, mobile phones, IoT devices, and web browsers. Collaboration with hardware teams and partners allows us to build tailored software stacks for novel AI accelerators.

By working closely with machine learning researchers and developers, we optimize and scale out model training and inference processes. Our team operates at the intersection of AI algorithmic innovation, purpose-built AI hardware, systems, and software.

We own the inference performance of OpenAI and other state-of-the-art Large Language Models (LLMs), working directly with OpenAI to host these models on the Azure OpenAI service, which serves massive workloads in major Microsoft products like Office, Windows, Bing, SQL Server, and Dynamics.

This role involves working on multiple levels of the AI software stack, including fundamental abstractions, programming models, compilers, runtimes, libraries, and APIs to facilitate large-scale model training and inference. You will benchmark OpenAI and other LLMs on GPUs and Microsoft hardware, debug, optimize, and monitor their performance, and enable model deployment in the shortest time and on the least hardware possible, contributing to Microsoft Azure's capital expenditure goals.

This is a technical position that requires hands-on software design and development skills. We're seeking someone with a proven track record of solving complex technical problems and a motivation to tackle challenging tasks in building a comprehensive end-to-end AI stack.

Key Responsibilities:

  1. Drive improvements to end-to-end inference performance of OpenAI and other state-of-the-art LLMs.
  2. Benchmark performance on Nvidia/AMD GPUs and first-party Microsoft silicon.
  3. Optimize and monitor LLM performance, develop SW tooling for performance insights, and reduce computing fleet footprint to achieve Azure AI capex goals.
  4. Enable fast time-to-market for LLMs/models by building SW tools that speed up porting models to new Nvidia and AMD GPUs and Maia silicon.
  5. Design, implement, and test functions/components for our AI/DNN/LLM frameworks and tools.
  6. Streamline key components/pipelines to improve performance/effectiveness of our systems.
  7. Collaborate with internal and external partners.

Requirements:

  1. Bachelor's Degree in Computer Science or related field AND 4+ years of technical engineering experience with coding in languages such as C, C++, or Python OR equivalent experience.
  2. 2+ years of practical experience working on high-performance applications and performance debugging/optimization on CPUs/GPUs.

Preferred Qualifications:

  1. Master's Degree in Computer Science or related field AND 6+ years of technical engineering experience with coding in languages such as C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related field AND 8+ years of technical engineering experience with coding in languages such as C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  2. Technical background and solid foundation in software engineering principles, computer architecture, GPU architecture, and HW neural net acceleration.
  3. Experience in end-to-end performance analysis and optimization of state-of-the-art LLMs and HPC applications, including proficiency with GPU profiling tools.
  4. Experience in DNN/LLM inference, familiarity with one or more DL frameworks such as PyTorch, TensorFlow, or ONNX Runtime, and knowledge of CUDA, ROCm, and Triton.
  5. Cross-team collaboration skills and a willingness to collaborate with researchers and developers.
  6. Experience working with orchestration platforms such as Kubernetes (K8s) and Service Fabric.
  7. 2+ years of experience with deep learning and AI infrastructure, including diagnostic, profiling, and performance analysis tools.

The estimated salary for this position ranges between $117,200 and $229,200 per year, depending on the specific location. Certain roles may be eligible for additional benefits and compensation. Find more information here: https://careers.microsoft.com/us/en/us-corporate-pay. Microsoft accepts applications for these roles on an ongoing basis. The company is an equal opportunity employer and provides a supportive environment for all qualified applicants.


