Current jobs related to Research Scientist, Systems ML and HPC - Menlo Park - Facebook


  • Menlo Park, California, United States Modern Mechanical Systems, Inc. Full time

    Meta is seeking a software engineer to join our AI & Systems Co-Design team to drive the definition of our next-generation compute and storage architectures. This person will work cross-functionally with internal software and platforms engineering teams to understand the workloads and infrastructure requirements. They will drive technology...


  • Menlo Park, United States META Full time

    Meta is seeking a Research Scientist to join our Research & Development teams. The ideal candidate will have industry experience working on AI Infrastructure related topics. The position will involve taking these skills and applying them to solve for some of the most crucial & exciting problems that exist on the web. We are hiring in multiple locations. The...


  • Menlo Park, United States META Full time

    We are looking for an Applied Research Scientist to join the Meta AI Speech team. The AI Speech team creates spoken language technology to make it faster and easier for people to build community and connect with others around the world. Our mission is to conduct product-motivated research in ML/AI and design, develop and deploy state-of-the-art models to the...


  • Oak Park, Illinois, United States ITR Full time

    Job Overview: The HPC Linux Systems Engineer position requires an individual with an active DOE Q Clearance or a DOD Top Secret Clearance that is convertible. This role necessitates onsite presence. Team Responsibilities: The team is dedicated to supporting research and development initiatives. They are tasked with the design, deployment, optimization,...


  • Menlo Park, United States META Full time

    Meta is seeking Research Scientists to join its Generative AI (GenAI) organization, focused on making significant advances in AI. We work on the flagship Llama language models and publish state-of-the-art research in Machine Learning. We are currently seeking talented researchers with experience in Language Research to join and work with us on extending the...


  • Walnut Park, United States InsideHigherEd Full time

    About the Role: We are seeking a highly skilled HPC System Administrator to join our team at InsideHigherEd. As a key member of our organization, you will be responsible for the administration and management of our high-performance computing (HPC) systems. Key Responsibilities: Administer and manage HPC systems, including cluster administration, system software...

  • Research Scientist

    4 weeks ago


    Menlo Park, United States META Full time

    Meta Platforms, Inc. (Meta), formerly known as Facebook Inc., builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps and services like Messenger, Instagram, and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens...


  • Menlo Park, United States ANNEA.ai GmbH Full time

    Meta was built to help people connect and share, and over the last decade our tools have played a critical part in changing how people around the world communicate with one another. With over a billion people using the service and more than fifty offices around the globe, a career at Meta offers countless ways to make an impact in a fast growing...


  • Oak Park, Illinois, United States ITR Full time

    Job Overview: HPC Linux Systems Engineer. A leading research and development organization is in search of exceptional candidates to enhance the security, efficiency, and dependability of its computing systems. This position involves supporting one of the most advanced supercomputers globally. As an HPC Linux Systems Engineer, you will join the Infrastructure...


  • Menlo Park, California, United States META Full time

    Meta is on the lookout for exceptional Research Scientists to be part of its Generative AI (GenAI) division, dedicated to achieving groundbreaking advancements in artificial intelligence. Our focus is on the renowned Llama language models, where we strive to publish cutting-edge research in the field of Machine Learning. We are eager to welcome skilled...


  • Menlo Park, California, United States Facebook Full time

    Meta's AI Training and Inference Infrastructure is growing exponentially to support ever-increasing use cases of AI. This results in a dramatic scaling challenge that our engineers have to deal with on a daily basis. We need to build and evolve our network infrastructure that connects myriads of training accelerators like GPUs together. In addition, we need...


  • Menlo Park, California, United States META Full time

    Meta is on the lookout for exceptional Research Scientists to become part of its Generative AI (GenAI) division, dedicated to pioneering advancements in artificial intelligence. Our focus is on the flagship Llama language models, and we are committed to publishing cutting-edge research in the field of Machine Learning. We invite skilled researchers with a...


  • Menlo Park, California, United States Meta Full time

    We are looking for researchers and applied scientists to join the Central Applied Science team. Central Applied Science is an interdisciplinary team of quantitative scientists that aims to deliver research and innovation that fundamentally contribute to Meta's success. Individuals in this role are expected to have expertise and publications within research...


  • Menlo Park, United States META Full time

    Meta is seeking a Research Scientist to join the Computational Social Science team. Meta is committed to understanding and improving our impact on important societal topics, such as fostering healthy connection and community, social cohesion, youth experiences, civic discourse, elections and democracy, institutional trust, economic opportunity, and...


  • Menlo Park, California, United States Facebook Full time

    Meta is seeking a Research Engineer to join its Fundamental AI Research (FAIR) organization, an organization focused on making significant advances in AI. We are seeking a talented research engineer with experience in LLMs and vision-language models to join the team. They will investigate how to build contextually aware agents that will be the foundation for a...


  • Menlo Park, California, United States Meta Full time

    From making valuable connections between people and businesses to building premium services that deliver high-value experiences, the Monetization organization at Meta empowers people and businesses to succeed in the global economy. As Meta focuses on building the next evolution of social experiences, the Monetization team plays a crucial role in shaping the...


  • Oak Park, Illinois, United States ITR Full time

    Job Overview: HPC Linux Systems Engineer. ITR is in search of exceptional candidates to enhance the security, efficiency, and dependability of our advanced computing systems. This position plays a crucial role in supporting one of the leading supercomputers globally. As an HPC Linux Systems Engineer, you will be integrated into the Infrastructure team within the...


  • Oak Park, Illinois, United States ITR Full time

    Job Overview: HPC Linux Systems Engineer. ITR is in search of highly skilled professionals to enhance the security, efficiency, and dependability of our computational infrastructure. This position involves working with one of the leading supercomputers globally. As an HPC Linux Systems Engineer, you will be an integral part of the Infrastructure team within the...


  • Menlo Park, California, United States Facebook Full time

    Reality Labs at Meta is seeking Research Scientists with experience in product-focused machine learning and signal processing research to advance our pioneering work in neuromotor interfaces, which has grown out of the acquisition of CTRL-labs. We're building a practical interface drawing on the rich neuromotor signals that can be measured non-invasively via...


  • Menlo Park, United States Facebook Full time

    Central Applied Science (CAS) is a research and development team, working to improve Meta's products, infrastructure, and processes. We generate real-world impact through a combination of scientific rigor and methodological innovation. Our focus is on longer-term, foundational work that addresses new opportunities and challenges across the Meta family of...

Research Scientist, Systems ML and HPC

2 months ago


Menlo Park, United States Facebook Full time
Meta is seeking a Research Scientist to join our Research & Development teams. The ideal candidate will have industry experience working on AI infrastructure topics and will apply those skills to some of the most crucial and exciting problems in the hardware/software space for AI training. We are hiring in multiple locations and across different teams:

The Model/System Co-Design team works on (1) optimizing parallelism, compute efficiency, and training paradigms to improve the scalability and reliability of large-scale distributed training systems; (2) innovating and co-designing novel model architectures for sustained scaling and hardware efficiency; and (3) co-designing the learning algorithm to improve the efficiency and robustness of training convergence. We have successfully landed a number of step-function changes to both LLM pre-training and ranking/recommendation model co-design, and continue to focus on bleeding-edge exploration to achieve industry-leading scale and efficiency.

The MTIA Training Performance team is dedicated to maximizing training performance of Generative AI and recommendation models on Meta's custom accelerators. We model and project the performance of current and future training workloads on custom hardware while it is being designed, to provide early, crucial feedback to the architecture, compiler, and kernels teams. We employ cutting-edge optimization and data parallelization strategies to maximize training throughput for the next generations of LLMs and deep recommendation models, and we work cross-functionally with many partner teams to ensure the end-to-end performance of large-scale training, so that we can more quickly deliver the next generation of Generative AI experiences to our users.

The Collectives and Communication team within AI Co-design helps drive the development, optimization, and tuning of collective communications libraries for Nvidia GPUs, MTIA accelerators, and AMD GPUs, covering both AI training and inference use cases. The comms team works to optimize communications performance at scale and investigates improvements to algorithms, tooling, and interfaces that can impact Meta workloads. We actively work in multiple HPC collective communication libraries and collaborate with teams across Meta and externally.

Research Scientist, Systems ML and HPC - SW/HW Co-Design Responsibilities:



  • Apply High-Performance Computing (HPC) algorithms and techniques to optimize large-scale AI workloads
  • Analyze, benchmark, and optimize large-scale workloads on next-generation training superclusters
  • Apply relevant AI infrastructure and software/hardware acceleration techniques to build and optimize our intelligent ML systems that improve Meta's products and experiences
  • Influence next-generation model and hardware architecture choices by projecting training performance and model efficiency
  • Set goals related to project impact, AI system design, and infrastructure/developer efficiency
  • Deliver impact directly, or by influencing partners, through deep, thorough, data-driven analysis
  • Drive large projects across multiple teams
  • Define use cases and develop methodology and benchmarks to evaluate different approaches
  • Apply in-depth knowledge of how ML infra interacts with the other systems around it
  • Experience in systems software development such as collective communications


Minimum Qualifications:



  • Currently has, or is in the process of obtaining, a Master's/PhD degree in Computer Science, Computer Vision, Generative AI, NLP, relevant technical field, or equivalent practical experience. Degree requirements must be completed prior to joining Meta
  • Currently has, or is in the process of obtaining, a Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
  • Specialized experience in one or more of the following machine learning/deep learning domains: high-performance computing, performance optimizations, SW/HW co-design, hardware accelerator architecture, GPU architecture, machine learning compilers, ML systems, AI infrastructure, machine learning frameworks (e.g. PyTorch), numerics, Collective Communication libraries (NCCL or RCCL), or model compression
  • Experience developing AI system infrastructure or AI algorithms in C/C++ or Python
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment


Preferred Qualifications:



  • Experience or knowledge of training/inference of large-scale AI models
  • Experience or knowledge of distributed and cloud systems
  • Experience or knowledge in one or more of: recommendation and ranking models, LLM and/or LDM, or Collective Communication libraries (NCCL or RCCL)


Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.

Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.