Senior Inference Engineer

1 week ago


Mountain View, California, United States Contextual AI Full time
Job Overview

The AI Inference team at Contextual AI is responsible for designing, building, and operating Gen AI and LLM inference systems at scale. The team focuses on optimizing latency, throughput, and cost for all Contextual AI models powered by RAG 2.0 technology.

Key Responsibilities
  • Design, develop, test, and deploy high-performance inference solutions for state-of-the-art Gen AI model architectures, RAG 2.0, knowledge retrieval models, and language encoders.
  • Optimize end-to-end inference latency, throughput, and cost, ensuring the most efficient use of our inference cluster.
  • Drive system architecture, spearhead best practices, and mentor junior engineers.
  • Improve the reliability, scalability, and observability of our distributed inference infrastructure.
  • Stay up to date with emerging techniques by reading papers and consulting with scientists, and integrate those insights into our roadmap.
  • Design and experiment with new algorithms, benchmarking the latency and accuracy of your implementations.
Requirements
  • M.Sc. or PhD in Computer Science, Engineering, Statistics, Mathematics, or a related field.
  • 5+ years of non-internship professional software development experience, including experience in leading design or architecture of new and existing systems.
  • Experience as a mentor, tech lead, or engineering team leader.
  • Proficiency in Python, PyTorch, multi-threaded asynchronous C++/Go, and performance optimization.
  • Experience with GPU programming and the GPU inference stack: TensorRT-LLM, Triton, CUDA, and CUPTI.
  • Proficiency in the TensorFlow and/or PyTorch frameworks.
  • Experience with Linux kernel system calls or the POSIX API (process control, communication, and device management).
  • A problem-solving mindset, owning tasks end-to-end and acquiring the necessary knowledge to get the job done.
  • A good intuition for when off-the-shelf solutions are sufficient and the ability to build tools to accelerate your workflow when they aren't.
  • The ability to move quickly in an environment where things are sometimes loosely defined and may have competing priorities or deadlines.
Location and Compensation

Location: Mountain View, CA

Salary Range for California-Based Applicants: $140,000 - $300,000 + equity + benefits (actual compensation will be determined based on experience, location, and other factors permitted by law).

Equal Opportunity

Contextual AI is an equal opportunity employer and complies with all applicable federal, state, and local fair employment practices laws. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, ancestry, sex, sexual orientation, gender, gender expression, gender identity, genetic information or characteristics, physical or mental disability, marital/domestic partner status, age, military/veteran status, medical condition, or any other characteristic protected by law.



  • Mountain View, California, United States Otter Full time

    About the Role: We are seeking a highly skilled Research Engineer to join our Platform team at Otter. As a key member of our team, you will play a critical role in advancing the frontier of AI by maximizing efficiency and performance to achieve feats previously thought impossible. Key Responsibilities: Collaborate with machine learning researchers to understand...


  • Mountain View, California, United States Otter Full time

    About the Opportunity: We are seeking a highly skilled Research Engineer to join our Platform team, where you will play a key role in advancing the frontier of AI by optimizing and deploying machine learning models for real-time applications. Your Impact: Model Optimization: Collaborate with machine learning researchers to understand model architectures and...


  • Mountain View, California, United States Gatik Full time

    About Gatik: Gatik is a leader in autonomous middle mile logistics, delivering goods safely and efficiently using its fleet of light & medium-duty trucks. We focus on short-haul, B2B logistics for Fortune 500 customers, enabling them to optimize their hub-and-spoke supply chain operations, enhance service levels, and reduce labor costs. Job Summary: We're seeking...


  • Mountain View, California, United States Coupang Full time

    About the Role: Coupang is seeking a highly skilled Senior Staff Machine Learning Infrastructure Engineer to join our Search and Discovery organization. As a key member of our team, you will be responsible for designing and implementing durable and efficient software solutions that handle massive volumes of structured and unstructured data needed to train...


  • Mountain View, California, United States Athelas Full time

    About the Role: At Commure + Athelas, we're on a mission to transform the healthcare industry through innovative technology solutions. As a Senior Backend Engineer on our Strongline Engineering team, you'll play a pivotal role in enhancing our healthcare real-time location system (RTLS) product. This is an exceptional opportunity for someone who is passionate...


  • Mountain View, California, United States Otter Full time

    About the Role: We are seeking a highly skilled Research Engineer to join our Platform team at Otter.ai. As a Research Engineer, you will play a critical role in advancing the frontier of AI by maximizing efficiency and performance to achieve feats previously thought impossible. Key Responsibilities: Model Optimization: Collaborate with machine learning...


  • Mountain View, California, United States Groq Full time

    About Groq: Groq is a cutting-edge technology company that's revolutionizing the AI economy. We believe in making AI accessible to all, and our mission is to deliver the fastest inference engine in the world. Job Title: Sr. Site Reliability Engineer, Distributed Systems. We're seeking a highly skilled Sr. Site Reliability Engineer to join our team. As a key...

  • Research Engineer

    3 weeks ago


    Mountain View, California, United States Otter Full time

    About Otter.ai: We are shaping the future of work by making conversations more valuable. With over 1B meetings transcribed, Otter.ai is the world's leading tool for meeting transcription, summarization, and collaboration. Using artificial intelligence, Otter generates real-time automated meeting notes, summaries, and other insights from in-person and virtual...


  • Mountain View, California, United States Groq Full time

    About Groq: We're a company that believes in an AI economy powered by human agency. Our mission is to make AI accessible to all, and we're working towards a world where processing power is better, faster, and more affordable than it is today. Job Title: Sr. Site Reliability Engineer, Distributed Systems. We're looking for a highly skilled Sr. Site Reliability...


  • Mountain View, California, United States Toyota Motor Sales, U.S.A., Inc. Full time

    Job Title: Senior Research Scientist. We are seeking a highly motivated and experienced Senior Research Scientist to join our team at Toyota InfoTech Labs. As a key member of our research team, you will be responsible for developing innovative connected and cooperative mobility applications and technologies. Key Responsibilities: Develop and implement advanced...


  • Mountain View, California, United States Inworld AI Full time

    Why Inworld AI: Inworld AI is a leading AI engine for games, enabling developers to build groundbreaking game mechanics, dynamic NPCs and worlds that evolve with each action. Our technology powers experiences built by top game developers and has partnerships with key industry players such as Microsoft/Xbox, Epic Games, and Unity. We are seeking a highly...


  • Mountain View, California, United States Otter Full time

    About the Opportunity: We are seeking a highly skilled Research Engineer to join our Platform team at Otter, a leading AI-powered collaboration platform. As a Research Engineer, you will play a critical role in advancing the frontier of AI by maximizing efficiency and performance to achieve feats previously thought impossible. Your Impact: Model Optimization:...


  • Mountain View, California, United States Applied Intuition Full time

    About Applied Intuition: Applied Intuition is a leading provider of AI-powered software solutions for the automotive industry. We accelerate the adoption of safe and intelligent machines worldwide by delivering a comprehensive toolchain, vehicle platform, and autonomy stack to our customers. Our team is passionate about building innovative solutions that help...


  • Mountain View, California, United States IBM Full time

    About the Role: We are seeking a highly skilled Machine Learning Engineer to join our team at IBM. As a key member of our conversational AI group, you will be responsible for designing, implementing, and deploying machine learning solutions to improve our conversational AI products. Key Responsibilities: Investigate and experiment with new model architectures to...


  • Mountain View, California, United States Inworld AI Full time

    About Inworld AI: Inworld AI is a pioneering startup in the field of artificial intelligence and games, boasting a $500 million valuation and backing from top-tier investors. We were recognized by CB Insights as one of the 100 most promising AI companies in the world and were nominated alongside Anthropic, DeepMind, OpenAI, and Nvidia for Generative AI...


  • Mountain View, California, United States IBM Full time

    About the Role: We are seeking a highly skilled Machine Learning Engineer to join our team at IBM. As a key member of our conversational AI group, you will play a critical role in designing, developing, and deploying advanced machine learning models that power our conversational AI system. Key Responsibilities: Investigate and experiment with new model...


  • Mountain View, California, United States Coupang Full time

    About Coupang: Coupang is a leading e-commerce company that is revolutionizing the way people shop, eat, and live. Our mission is to build the future of commerce, and we're looking for talented individuals to join our team. Job Overview: The Search Analytics team is a critical component of Coupang's Search and Discovery experiences. As a Senior Staff Data...

  • Software Engineer

    2 weeks ago


    Mountain View, California, United States Syntiant Full time

    About the Role: We are seeking a highly skilled Software Engineer to join our team at Syntiant, a leading provider of AI solutions for embedded devices. As a Software Engineer on our team, you will be responsible for building and deploying machine learning solutions for our customers. As a key member of our team, you will work closely with our Core ML and...


  • Mountain View, California, United States Microsoft Corporation Full time

    Job Title: Senior Hardware Engineer. Microsoft Corporation is seeking a highly skilled Senior Hardware Engineer to join our team. As a Senior Hardware Engineer, you will be responsible for designing and developing innovative hardware solutions for our cloud infrastructure. Responsibilities: Lead the design of hardware components and systems. Collaborate with...


  • Mountain View, California, United States Toyota Motor Sales, U.S.A., Inc. Full time

    Job Summary: We are seeking a highly skilled Senior Research Scientist to join our team at Toyota Motor Sales, U.S.A., Inc. As a key member of our research and development team, you will be responsible for researching and developing innovative connected and cooperative mobility applications and technologies. About the Role: This is an exciting opportunity to work...