Member of Technical Staff

3 days ago


San Francisco, United States Liquid AI Full time

As we prepare to deploy our models across various device types, including GPUs, CPUs, and NPUs, we're seeking an expert who can optimize inference stacks tailored to each platform. We're looking for someone who can take our models, dive deep into the task, and return with a highly optimized inference stack, leveraging existing frameworks like ggml, vLLM, and DeepSpeed to deliver exceptional throughput and low latency.

The ideal candidate is a highly skilled engineer with extensive experience in CUDA, C++, and Triton, as well as a deep understanding of GPU, CPU, and NPU architectures. They should be self-motivated, capable of working independently, and driven by a passion for optimizing performance across diverse hardware platforms. Proficiency in building and enhancing inference stacks using frameworks like ggml, vLLM, and DeepSpeed is essential. Additionally, experience with mobile development and expertise in cache-aware algorithms will be highly valued.

Responsibilities

    • Strong ML Experience: Proficiency in Python and PyTorch to effectively interface with the ML team at a deeply technical level.
    • Hardware Awareness: Must understand modern hardware architecture, including cache hierarchies and memory access patterns, and their impact on performance.
    • Proficient in Coding: Expertise in Python, PyTorch, and at least one of CUDA, Triton, or C++ is essential for this role.
    • Optimization of Low-Level Primitives: Responsible for optimizing core primitives to ensure efficient model execution.
    • Self-Guided and Ownership: Ability to independently take a PyTorch model and inference requirements (e.g., maximize GPU throughput or minimize CPU latency) and deliver a fully optimized stack with minimal guidance.
    • Research-Driven: Should stay up-to-date with advancements in ML inference, such as new quantization techniques or speculative decoding, while maintaining focus on delivering practical solutions.


  • San Francisco, United States Acceler8 Talent Full time

    Member of Technical Staff: Backend Engineer. Are you an experienced backend engineer ready to tackle real-world AI challenges? We are seeking a Member of Technical Staff: Backend Engineer to join a leading company innovating in the AI space. This role is ideal for someone who excels at building robust backend systems and has a passion for solving complex...


  • San Francisco, United States Future House USA Full time

    About FutureHouse FutureHouse is a new, non-profit AI-for-science lab using AI to automate research in biology and other complex sciences. We are backed by Eric Schmidt, and our mission is to build AI systems that can scale scientific research and allow humanity to proceed as quickly as possible to find cures for disease, solutions for climate change, and...


  • San Francisco, United States FutureHouse, Inc. Full time

    FutureHouse is a new, non-profit AI-for-science lab using AI to automate research in biology and other complex sciences. We are backed by Eric Schmidt, and our mission is to build AI systems that can scale scientific research and allow humanity to proceed as quickly as possible to find cures for disease, solutions for climate change, and other...


  • San Francisco, United States Coframe Full time

    About Us: At Coframe, we're giving every website and app its own AI growth engineer. We envision a future where user interfaces can adapt, evolve, and personalize themselves, giving the internet its own sense of life and intelligence. We are a primarily in-person company based in San Francisco. Our values are core to who we are and how we operate: agency,...


  • San Francisco, United States Coframe Full time

    About Us: One day, our user interfaces will adapt, evolve, and personalize themselves, giving the internet its own sense of life and intelligence. Coframe is building this future. Our values are core to who we are and how we operate: agency, truth-seeking, growth, and velocity. We are well-capitalized with backing from Khosla Ventures, Nat Friedman, the founder...


  • San Francisco, United States Social Finance Ltd Full time

    Employee Applicant Privacy Notice Who we are: Shape a brighter financial future with us. Together with our members, we're changing the way people think about and interact with personal finance. We're a next-generation financial services company and national bank using innovative, mobile-first technology to help our millions of members reach their goals....


  • San Francisco, United States Social Finance (SoFi) Full time

    Employee Applicant Privacy Notice Who we are: Shape a brighter financial future with us. Together with our members, we're changing the way people think about and interact with personal finance. We're a next-generation financial services company and national bank using innovative, mobile-first technology to help our millions of members reach their goals. The...


  • San Francisco, United States Salesforce Full time

    About Salesforce. We're Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM. Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way. And, we...


  • San Francisco, United States Oracle Full time

    The Oracle Cloud Infrastructure (OCI) team builds and manages a suite of massive-scale, integrated cloud services in a broadly distributed, multi-tenant cloud environment. OCI is committed to providing the best in cloud products that meet the needs of our customers who are tackling some of the world's biggest challenges. Observability org of OCI is on a...

