Software Engineer, Inference

7 days ago


San Francisco, California, United States | Anthropic | Full time

About Anthropic

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole.

Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the role:

Our Inference team builds the service that generates outputs from our models in production. This service is the key driver of our efficiency, latency, and reliability.

As an engineer on this team, you'll work on improving those metrics by solving complex distributed-systems problems across all layers of our stack.

You may be a good fit if you:

  • Have significant software engineering experience
  • Are results-oriented, with a bias towards flexibility and impact
  • Pick up slack, even if it goes outside your job description
  • Enjoy pair programming (we love to pair)
  • Want to learn more about machine learning research
  • Care about the societal impacts of your work

Strong candidates may also have experience with:

  • High-performance, large-scale distributed systems
  • Kubernetes
  • Python
  • Machine learning

Representative projects:

  • Improving how inference requests are routed to model servers to maximize compute efficiency
  • Building a performance model to predict the impact of future architecture and hardware improvements
  • Implementing inference for a new model architecture down to the Jax / PyTorch / Kernel layers
  • Analyzing observability data to tune performance based on production workloads
  • Implementing inference on a new hardware platform
  • Building instrumentation to detect and eliminate Python GIL contention
  • Optimizing the efficiency of our accelerator kernels
  • Ensuring smooth and regular deployment of inference services

Deadline to apply:

None. Applications will be reviewed on a rolling basis.

The expected salary range for this position is:

$280,000-$510,000 USD

Logistics

Location-based hybrid policy:

Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.

Visa sponsorship:

We do sponsor visas. However, we aren't able to successfully sponsor visas for every role and every candidate.

We encourage you to apply even if you do not believe you meet every single qualification.

Not all strong candidates will meet every single qualification as listed.

Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work.

We think AI systems like the ones we're building have enormous social and ethical implications.

We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

Compensation and Benefits for Full-Time Employees

Anthropic's compensation package consists of three elements: salary, equity, and benefits.

We are committed to pay fairness and aim for these three elements collectively to be highly competitive with market rates.

Equity

  • For eligible roles, equity will be a major component of the total compensation.

We aim to offer higher-than-average equity compensation for a company of our size, and communicate equity amounts at the time of offer issuance.

US Benefits for Full-Time Employees

The following benefits are for our US-based employees:

  • Optional equity donation matching
  • Comprehensive health, dental, and vision insurance for you and all your dependents
  • 401(k) plan with 4% matching
  • 22 weeks of paid parental leave
  • Unlimited PTO - most staff take 4 to 6 weeks each year, sometimes more
  • Stipends for education, home office improvements, commuting, and wellness
  • Fertility benefits via Carrot
  • Daily lunches and snacks in our office
  • Relocation support for those moving to the Bay Area

UK Benefits for Full-Time Employees

The following benefits are for our UK-based employees:

  • Optional equity donation matching
  • Private health, dental, and vision insurance for you and your dependents
  • Pension contribution (matching 4% of your salary)
  • 21 weeks of paid parental leave
  • Unlimited PTO - most staff take 4 to 6 weeks each year, sometimes more
  • Health cash plan
  • Life insurance and income protection
  • Daily lunches and snacks in our office

This compensation and benefits information is based on Anthropic's good faith estimate for this position as of the date of publication and may be modified in the future.

Employees based outside of the UK or US will receive a different benefits package.

The level of pay within the range will depend on a variety of job-related factors, including your placement on our internal performance ladders, which is based on factors such as past work experience, relevant education, and performance in our interviews or in a work trial.

How we're different

We believe that the highest-impact AI research will be big science. At Anthropic, we work as a single cohesive team on just a few large-scale research efforts.

We value impact (advancing our long-term goals of steerable, trustworthy AI) over work on smaller and more specific puzzles.

We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science.

We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time.

As such, we greatly value communication skills.

The easiest way to understand our research directions is to read our recent research.

This research continues many of the directions our team worked on prior to Anthropic, including:

GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.

Come work with us

Anthropic is a public benefit corporation headquartered in San Francisco.

We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.


