AI Inference Systems Architect

5 days ago

San Francisco, California, United States Tbwa ChiatDay Inc Full time

Perplexity Perks

We are seeking a skilled AI Inference Engineer to join our rapidly growing team in the San Francisco Bay Area. The base salary range for this role is $190,000 - $240,000.

About the Role

In this position, you will have the opportunity to work on large-scale deployments of machine learning models for real-time inference. Your responsibilities will include developing APIs for AI inference, benchmarking and addressing bottlenecks throughout our inference stack, improving the reliability and observability of our systems, and exploring novel research and implementing LLM inference optimizations.

To be successful in this role, you should have experience with ML systems and deep learning frameworks (e.g., PyTorch, TensorFlow, ONNX), familiarity with common LLM architectures and inference optimization techniques, and experience with deploying reliable, distributed, real-time model serving at scale.

What We Offer

As a valued member of our team, you can expect comprehensive health, dental, and vision insurance for you and your dependents, as well as a 401(k) plan. In addition to the base salary, equity is part of the total compensation package.

Get Started

Our mobile apps have been installed over 1 million times across iOS and Android devices, and we've served over 500 million queries from users around the globe. If you're passionate about working on large-scale inference systems and have a proven track record, please apply.

AI Inference Software Architect

7 days ago

San Francisco, California, United States Untether AI Full time

Software Architect for AI Inference: We are seeking an exceptional Software Architect to join our team at Untether AI, where you will play a key role in designing and developing software that interacts with our innovative chip. As part of our top-notch team, you will collaborate closely with hardware engineers and fellow software engineers to create software...

Transformative AI Systems Architect

4 days ago

San Francisco, California, United States Abridge AI Inc. Full time

Unlock the Potential of Healthcare with Abridge: Abridge AI Inc. is revolutionizing the healthcare industry with cutting-edge AI technology, empowering clinicians to focus on patient care while streamlining clinical documentation processes. About the Role: We are seeking an experienced Transformative AI Systems Architect to join our team and play a pivotal role...

Data Inference Specialist

7 days ago

San Francisco, California, United States Perplexity AI Full time

We are seeking an experienced Data Inference Specialist to join our team at Perplexity AI. Overview: At Perplexity AI, we've achieved tremendous growth and adoption since launching the world's first fully functional conversational answer engine. Our AI-powered search assistant has amassed 10 million monthly active users, with mobile apps installed over 1...

AI Inference Deployment Specialist

3 weeks ago

San Francisco, California, United States Tbwa ChiatDay Inc Full time

We are seeking an experienced AI Inference Deployment Specialist to join our team at Skild AI. As a key member of our robotics team, you will be responsible for deploying cutting-edge AI models and optimizing their performance in real-world environments. Role Overview: In this role, you will work closely with our cross-functional team to design and develop...

Chief AI Security Architect

20 hours ago

San Francisco, California, United States Magic AI Full time

About Magic: Magic is a cutting-edge technology company committed to developing safe Artificial General Intelligence (AGI) that accelerates humanity's progress on the world's most pressing challenges. Our mission revolves around automating research and code generation to improve models and solve alignment more reliably than humans alone. We believe our approach...

Cloud Systems Architect

10 hours ago

San Francisco, California, United States Magic AI Full time

About Magic AI: At Magic AI, we're building safe Artificial General Intelligence (AGI) to accelerate humanity's progress on the world's most pressing challenges. Our approach combines frontier-scale pre-training, domain-specific reinforcement learning, ultra-long context, and inference-time compute to achieve this goal. We're seeking a skilled Distributed...

Expert Machine Learning Engineer for Real-Time Inference

5 days ago

San Francisco, California, United States Perplexity AI Full time

We are a fast-growing AI company looking for an expert machine learning engineer to join our team. Our current stack is Python, C++, TensorRT-LLM, and Kubernetes. You will have the opportunity to work on large-scale deployment of machine learning models for real-time inference. The ideal candidate should have experience with ML systems and deep learning...

Senior AI Infrastructure Software Architect

3 days ago

San Francisco, California, United States ZipRecruiter Full time

Unlock Your Potential as a Senior AI Infrastructure Software Architect. Overview: At ZipRecruiter, we're pioneering vertically integrated, purpose-built AI infrastructure solutions trusted by Fortune 500 companies to power their most advanced AI applications. We're redefining AI cloud infrastructure with a mission to align the future of computing with the...

Scalable AI Architect

5 days ago

San Francisco, California, United States Anyscale Full time

About Anyscale: Anyscale is a pioneering technology company that empowers software developers to harness the full potential of distributed computing. By commercializing Ray, an open-source project, we're creating an ecosystem of libraries for scalable machine learning. Companies like OpenAI, Uber, Spotify, Instacart, and Cruise trust Ray as a critical...

Advanced AI Infrastructure Engineer

5 days ago

San Francisco, California, United States Together AI Full time

About the Role: We are seeking an experienced Systems Research Engineer to join our team at Together AI. As a key member of our research-driven artificial intelligence company, you will play a crucial role in researching and building the next generation AI platform. Company Overview: Together AI is committed to creating open and transparent AI systems that drive...

Senior Cybersecurity Architect

4 days ago

San Francisco, California, United States Magic AI Full time

Job Overview: Magic AI's mission is to build safe Artificial General Intelligence (AGI) that accelerates humanity's progress on the world's most pressing challenges. We believe the most promising path to safe AGI lies in automating research and code generation to improve models and solve alignment more reliably than humans alone. About the Role: This Senior...

Senior Inference Systems Engineer

3 weeks ago

San Francisco, California, United States Genmo Inc. Full time

At Genmo Inc., we are a research lab dedicated to building state-of-the-art models for video generation. Our goal is to unlock the potential of Artificial General Intelligence (AGI). Job Overview: We are seeking a senior/staff software engineer to join our inference team. This role involves designing and scaling our inference systems to support millions of...

AI Infrastructure Architect

1 day ago

San Francisco, California, United States Abridge AI Inc. Full time

Abridge AI Inc. is a pioneering force in healthcare technology, utilizing artificial intelligence to empower deeper understanding and improve clinical documentation efficiency. Role Overview: We are seeking an exceptional ML Systems Engineer to join our team, responsible for scaling and deploying machine learning models to handle increasing traffic demands and...

AI Solutions Architect

5 days ago

San Francisco, California, United States Perplexity AI Full time

Company Overview: We're Perplexity AI, a rapidly growing company that has experienced tremendous growth and adoption since launching the world's first fully functional conversational answer engine. Our AI-powered search assistant has amassed 10 million monthly active users, with our mobile apps installed over 1 million times across iOS and Android devices....

AI Systems Architect

5 days ago

San Jose, California, United States Capital One Full time

About Capital One: At Capital One, we are pushing the boundaries of what is possible with AI. We believe that responsible and reliable AI systems can change banking for good. Our team of experts is dedicated to creating innovative solutions that empower our customers and businesses to achieve their goals. About the Role: We are seeking a skilled Distinguished AI...

Chief Healthcare AI Solutions Architect

7 days ago

San Francisco, California, United States Abridge Full time

Abridge is a pioneering healthcare technology company that leverages artificial intelligence to revolutionize medical conversations and clinical documentation. Our mission-driven team is committed to empowering deeper understanding in healthcare through innovative solutions. We are seeking an experienced Chief Healthcare AI Solutions Architect to join our...

AI Infrastructure Architect

4 days ago

San Francisco, California, United States Crusoe Energy Inc Full time

Crusoe Energy Inc is on a mission to unlock value in stranded energy resources through innovative technology. We are inspired by making sure that the energy meeting the demand for data centers is sourced in an environmentally responsible fashion. Crusoe co-locates mobile data centers with stranded energy resources, like flare gas and underloaded renewables,...

Distinguished AI Solutions Architect

5 days ago

San Jose, California, United States Capital One Full time

Transformative AI Expert: We are seeking a visionary Distinguished AI Solutions Architect to join our team at Capital One. As an expert in artificial intelligence, you will play a key role in designing and developing innovative AI solutions that drive business growth and customer satisfaction. About the Role: This is a unique opportunity to work on cutting-edge...

AI Infrastructure Systems Architect

5 days ago

San Francisco, California, United States CV Library Full time

About Kuzco: We are building a large-scale distributed LLM inference network that combines idle GPU capacity from around the world into a single cohesive plane of compute. Our team is a small, well-funded group of staff-level engineers who work together to tackle difficult, high-impact engineering problems in downtown San Francisco. We value creativity alongside...

Cloud AI Infrastructure Architect

10 hours ago

San Francisco, California, United States Crusoe Full time

About the Role: As a Senior/Staff Software Engineer on the Managed AI team at Crusoe, you'll play a pivotal role in shaping the architecture and scalability of our next-generation AI inference platform. You will lead the design and implementation of core systems for our AI services, including resilient fault-tolerant queues, model catalogs, and scheduling...
