Staff Software Engineer, AI Platform

5 days ago


Mountain View, California, United States LinkedIn Full time

Company Description
LinkedIn is the world's largest professional network, built to help members of all backgrounds and experiences achieve more in their careers. Our vision is to create economic opportunity for every member of the global workforce. Every day our members use our products to make connections, discover opportunities, build skills and gain insights. We believe amazing things happen when we work together in an environment where everyone feels a true sense of belonging, and that what matters most in a candidate is having the skills needed to succeed. It inspires us to invest in our talent and support career growth. Join us to challenge yourself with work that matters.

At LinkedIn, our approach to flexible work is centered on trust and optimized for culture, connection, clarity, and the evolving needs of our business. The work location of this role is hybrid, meaning it will be performed both from home and from a LinkedIn office on select days, as determined by the business needs of the team.

Job Description
This role can be based in Mountain View, CA, San Francisco, CA, or Bellevue, WA.

Join us to push the boundaries of scaling large models together. The team is responsible for scaling LinkedIn's AI model training, feature engineering, and serving, supporting models with hundreds of billions of parameters and large-scale feature engineering infra for all AI use cases, from recommendation models and large language models to computer vision models. We optimize performance across algorithms, AI frameworks, data infra, compute software, and hardware to harness the power of our GPU fleet with thousands of the latest GPU cards. The team also works closely with the open source community and includes many open source committers (TensorFlow, Horovod, Ray, vLLM, Hugging Face, DeepSpeed, etc.). Additionally, this team focuses on technologies like LLMs, GNNs, Incremental Learning, Online Learning, and serving performance optimizations across billions of user queries.

Model Training Infrastructure: As an engineer on the AI Training Infra team, you will play a crucial role in building the next-gen training infrastructure to power AI use cases. You will design and implement high performance data I/O, work with open source teams to identify and resolve issues in popular libraries like Hugging Face, Horovod, and PyTorch, enable distributed training of models with hundreds of billions of parameters, debug and optimize deep learning training, and provide advanced support for internal AI teams in areas like model parallelism, tensor parallelism, ZeRO++, etc. Finally, you will assist in and guide the development of containerized pipeline orchestration infrastructure, including developing and distributing stable base container images, providing advanced profiling and observability, and updating internally maintained versions of deep learning frameworks and their companion libraries like TensorFlow, PyTorch, DeepSpeed, GNNs, Flash Attention, PyTorch Lightning, and more.

Feature Engineering: this team shapes the future of AI with the state-of-the-art Feature Platform, which empowers AI users to effortlessly create, compute, store, consume, monitor, and govern features within online, offline, and nearline environments, optimizing the process for model training and serving. As an engineer on the team, you will explore and innovate within the online, offline, and nearline spaces at scale (millions of QPS, multiple terabytes of data, etc.), developing and refining the infrastructure necessary to transform raw data into valuable feature insights. Using leading open-source technologies like Spark, Beam, Flink, and more, you will play a crucial role in processing and structuring feature data, ensuring its optimal storage in the Feature Store, and serving feature data with high performance.

Model Serving Infrastructure: this team builds low-latency, high-performance applications serving very large and complex models across LLM and Personalization models. As an engineer, you will build compute-efficient infra on top of native cloud, enable GPU-based inference for a large variety of use cases, implement CUDA-level optimizations for high performance, and enable on-device and online training. Challenges include scale (10s of thousands of QPS, multiple terabytes of data, billions of model parameters), agility (experiment with hundreds of new ML models per quarter using thousands of features), and enabling GPU inference at scale.

MLOps: The MLOps and Experimentation team is responsible for the infrastructure that runs MLOps and experimentation systems across LinkedIn. From Ramping to Observability, this org powers the AI products that define LinkedIn. This team, inside MLOps, is responsible for AI Metadata, Observability, Orchestration, Ramping, and Experimentation for all models, building tools that enable our product and infrastructure engineers to optimize their models and deliver the best performance possible.

As a Staff Software Engineer, you will have first-hand opportunities to advance one of the most scalable AI platforms in the world. At the same time, you will work together with our talented teams of researchers and engineers to build your career and your personal brand in the AI industry.

Responsibilities
-Owning the technical strategy for broad or complex requirements with insightful and forward-looking approaches that go beyond the direct team and solve large open-ended problems.
-Designing, implementing, and optimizing the performance of large-scale distributed serving or training for personalized recommendation as well as large language models.
-Improving the observability and understandability of various systems with a focus on improving developer productivity and system sustenance.
-Mentoring other engineers, defining our challenging technical culture, and helping to build a fast-growing team.
-Working closely with the open-source community to participate in and influence cutting-edge open-source projects (e.g., vLLM, PyTorch, GNNs, DeepSpeed, Hugging Face, etc.).
-Functioning as the tech lead for several concurrent key initiatives in AI Infrastructure and defining the future of AI Platforms.

Basic Qualifications:
-Bachelor's Degree in Computer Science or related technical discipline, or equivalent practical experience
-4+ years of industry experience leading or building deep learning systems.
-4+ years of experience with Java, C++, Python, Go, Rust, C# and/or Functional languages such as Scala or other relevant coding languages
-Hands-on experience developing distributed systems or other large-scale systems.

Preferred Qualifications:
-BS and 8+ years of relevant work experience, MS and 7+ years of relevant work experience, or PhD and 4+ years of relevant work experience
-Previous experience working with geographically distributed co-workers.
-Outstanding interpersonal communication skills (including listening, speaking, and writing) and ability to work well in a diverse, team-focused environment with other SRE/SWE Engineers, Project Managers, etc.
-Experience building ML applications, LLM serving, GPU serving.
-Experience with search systems or similar large-scale distributed systems
-Expertise in machine learning infrastructure, including technologies like MLflow, Kubeflow, and large-scale distributed systems
-Experience with feature engineering and with distributed data processing engines like Flink, Beam, Spark, etc.
-Co-author or maintainer of any open-source projects
-Familiarity with containers and container orchestration systems
-Expertise in deep learning frameworks and tensor libraries like PyTorch, TensorFlow, JAX/FLAX

Suggested Skills:
-ML Algorithm Development
-Experience in Machine Learning and Deep Learning
-Experience in Information retrieval / recommendation systems / distributed serving / Big Data is a plus.
-Communication
-Stakeholder Management

You will Benefit from our Culture:
We strongly believe in the well-being of our employees and their families. That is why we offer generous health and wellness programs and time away for employees of all levels.

LinkedIn is committed to fair and equitable compensation practices. The pay range for this role is $156,000 - $255,000. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to skill set, depth of experience, certifications, and specific work location. This may be different in other locations due to differences in the cost of labor.

The total compensation package for this position may also include annual performance bonus, stock, benefits and/or other applicable incentive compensation plans. For additional information, visit: .

Equal Opportunity Statement
LinkedIn is committed to diversity in its workforce and is proud to be an equal opportunity employer. LinkedIn considers qualified applicants without regard to race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other legally protected class. LinkedIn is an Affirmative Action and Equal Opportunity Employer as described in our equal opportunity statement here: :b:/t/LinkedInGCI/EeE8sk7CTIdFmEp9ONzFOTEBM62TPrWLMHs4J1C_QxVTbg?e=5hfhpE. Please reference .12ScreenRdr.pdf and .pdf for more information.

LinkedIn is committed to offering an inclusive and accessible experience for all job seekers, including individuals with disabilities. Our goal is to foster an inclusive and accessible workplace where everyone has the opportunity to be successful.

If you need a reasonable accommodation to search for a job opening, apply for a position, or participate in the interview process, connect with us at accommodations@ and describe the specific accommodation requested for a disability-related limitation.

Reasonable accommodations are modifications or adjustments to the application or hiring process that would enable you to fully participate in that process. Examples of reasonable accommodations include but are not limited to:

-Documents in alternate formats or read aloud to you
-Having interviews in an accessible location
-Being accompanied by a service dog
-Having a sign language interpreter present for the interview

A request for an accommodation will be responded to within three business days. However, non-disability related requests, such as following up on an application, will not receive a response.

LinkedIn will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant. However, employees who have access to the compensation information of other employees or applicants as a part of their essential job functions cannot disclose the pay of other employees or applicants to individuals who do not otherwise have access to compensation information, unless the disclosure is (a) in response to a formal complaint or charge, (b) in furtherance of an investigation, proceeding, hearing, or action, including an investigation conducted by LinkedIn, or (c) consistent with LinkedIn's legal duty to furnish information.

Pay Transparency Policy Statement
As a federal contractor, LinkedIn follows the Pay Transparency and non-discrimination provisions described at this link: https://lnkd.in/paytransparency.

Global Data Privacy Notice for Job Candidates
This document provides transparency around the way in which LinkedIn handles personal data of employees and job applicants: https://lnkd.in/GlobalDataPrivacyNotice


