Current jobs related to Senior Software Engineer, Infrastructure Solutions - San Francisco, California - Anthropic Limited


  • San Francisco, California, United States Baton (A Ryder Technology Lab) Full time

    Job Title: Senior Software Engineer - Infrastructure. We are seeking a highly skilled Senior Software Engineer to join our Infrastructure team at Baton, a technology innovation lab for Ryder. As a key member of our team, you will be responsible for designing and developing our core web infrastructure, ensuring scalability, reliability, and security. Key...


  • San Francisco, California, United States Baton Full time

    Job Description: Baton is a technology innovation lab for Ryder, a leading logistics company. We're seeking a Senior Software Engineer to join our Infrastructure team. This role involves creating robust testing infrastructure to enhance engineering productivity and set a high standard for code quality. You'll work with our Head of Engineering to enable our...


  • San Francisco, California, United States Baton (A Ryder Technology Lab) Full time

    Job Title: Senior Software Engineer - Infrastructure. Baton, a technology innovation lab for Ryder, is seeking a highly skilled Senior Software Engineer to join our Infrastructure team. As a key member of our team, you will play a crucial role in designing and implementing robust testing infrastructure that enhances engineering productivity and sets a high...


  • San Francisco, California, United States Acceler8 Talent Full time

    About the Role: We are seeking a Senior Software Engineer (AI Infrastructure / MLOps) to join our pioneering AI startup focused on enhancing data quality for machine learning. This role offers the chance to work on large-scale web applications and tackle complex challenges in a rapidly growing field. As a Senior Software Engineer (AI Infrastructure / MLOps),...


  • San Francisco, California, United States Informal Systems Full time

    Job Overview: Informal Systems is a pioneering company in the field of blockchain technology, specializing in the security of interoperable, fault-tolerant networks. We are seeking a highly skilled Senior Software Engineer to join our team as a Blockchain Infrastructure Engineer, focusing on our core staking operations and service offerings on Ethereum,...


  • San Francisco, California, United States Acceler8 Talent Full time

    About the Role: We are seeking a highly skilled Senior Software Engineer to join our team as an AI Infrastructure Specialist. This role offers the opportunity to work on large-scale web applications and tackle complex challenges in a rapidly growing field. As a Senior Software Engineer, you will be responsible for developing and maintaining our flagship web...


  • San Francisco, California, United States Deepscribe Full time

    About the Role: We are seeking a Senior Software Engineer to join our ML Infrastructure team at DeepScribe. As a key member of our team, you will be responsible for building and optimizing infrastructure for audio processing, transcription, and LLM orchestration, ensuring scalability, reliability, and performance. You will collaborate with product and AI...


  • San Francisco, California, United States Rippling Full time

    Senior Staff Software Engineer - Infrastructure Lead. About Rippling: Rippling is a leading provider of cloud-based human capital management (HCM) solutions, offering a comprehensive platform for businesses to manage their workforce, payroll, benefits, and other HR-related tasks. Our mission is to empower organizations to streamline their operations, improve...


  • San Francisco, California, United States HashiCorp Full time

    About Us: HashiCorp is a leading provider of cloud infrastructure management solutions. Our team is dedicated to delivering innovative products that enable organizations to manage their cloud, private datacenter, and SaaS infrastructure with ease. About the Role: We are seeking a highly skilled Senior Engineer to join our Terraform Enterprise team. As a key...


  • San Francisco, California, United States Pomelo Full time

    About the Role: Pomelo is a financial technology platform that combines consumer credit and global remittances. We're looking for a skilled Senior Software Engineer, Infrastructure to join our team in San Francisco. As a vital member of our Infrastructure team, you'll play a key role in building and maintaining the core systems that keep our platform reliable,...


  • San Francisco, California, United States Acceler8 Talent Full time

    About the Role: We are seeking a highly skilled Senior Software Engineer to join our pioneering AI startup, specializing in enhancing data quality for machine learning. This role offers the opportunity to work on large-scale web applications and tackle complex challenges in a rapidly growing field. As a Senior Software Engineer, you will be responsible for...


  • San Francisco, California, United States Triunity Software Full time

    Job Title: Senior Java Software Engineer. We are seeking a highly skilled Senior Java Software Engineer to join our team at Triunity Software. Key Responsibilities: * Design, develop, and test complex software applications using Java * Collaborate with cross-functional teams to identify and prioritize project requirements * Develop and maintain high-quality,...


  • San Francisco, California, United States Crusoe Full time

    About the Role: We are seeking a Senior/Staff Software Engineer to join our team at Crusoe Energy, a company on a mission to unlock value in stranded energy resources through the power of computation. As a key member of our engineering team, you will design and develop internal admin tooling and infrastructure management systems for Crusoe Cloud, a leading...


  • San Francisco, California, United States Human Capital Solutions Full time

    About the Job: At Human Capital Solutions, we are seeking a talented Security Infrastructure Software Engineer to join our pioneering team. Our company develops leading software for data-driven decisions and operations, empowering partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more. The Role: Our products...


  • San Francisco, California, United States MESH Full time

    About Mesh: MESH is a pioneering financial operating system that enables seamless transactions and financial solutions. Founded in 2020, our mission is to create an open, connected, and secure ecosystem for businesses and users. Job Description: We're seeking a highly skilled Senior Site Reliability Engineer to join our infrastructure team. As a key member,...


  • San Francisco, California, United States Crusoe Full time

    About the Role: As a Senior/Staff Software Engineer on the Managed AI team at Crusoe, you'll have a pivotal role in shaping the architecture and scalability of our next-generation AI inference platform. You will lead the design and implementation of core systems for our AI services, including resilient fault-tolerant queues, model catalogs, and scheduling...


  • San Francisco, California, United States Parafin Inc Full time

    About Us: At Parafin, we're dedicated to empowering small businesses to grow and thrive. Our mission is to provide innovative financial services that make a real difference in the lives of entrepreneurs and their communities. We're a team of passionate individuals who share a common goal: to build a platform that enables small businesses to access the...


  • San Francisco, California, United States University of California, San Francisco Full time

    Job Summary: The Senior Network Infrastructure Engineer plays a pivotal role in the planning, design, coordination, and management of IT infrastructure delivery for network connectivity services expansions, repairs, and installations. They ensure that IT infrastructure meets both current and future network needs across the UCSF community, aligning with...


  • San Francisco, California, United States INSPYR Solutions Full time

    Job Title: Senior Software Engineer. About the Role: We are seeking a highly skilled Senior Software Engineer to join our team at INSPYR Solutions. As a key member of our engineering team, you will be responsible for designing, developing, and maintaining our cutting-edge Digital Advertising Platform. Key Responsibilities: * Design and implement scalable,...

  • Software Engineer

    1 month ago


    San Francisco, California, United States Unreal Gigs Full time

    Job Title: Software Engineer, Infrastructure. Unreal Gigs is seeking a highly skilled Software Engineer, Infrastructure to join our team. As a key member of our infrastructure team, you will play a critical role in designing, building, and maintaining our software infrastructure to support seamless research experiences. Responsibilities: Tool Development: Design...

Senior Software Engineer, Infrastructure Solutions

2 months ago


San Francisco, California, United States Anthropic Limited Full time

Position Overview:

Anthropic Limited is in search of skilled and seasoned Infrastructure Engineers to enhance our capabilities in developing, scaling, and maintaining innovative AI systems. By becoming part of our Infrastructure division, you will engage with pioneering AI technologies and play a vital role in advancing frontier models, aligning with Anthropic's vision to foster safe and dependable AI systems that serve humanity's interests.

Current Opportunities:
  • Data Infrastructure: The Data Infrastructure group is tasked with architecting, constructing, and sustaining the data frameworks that drive our AI research and offerings. You will collaborate with diverse teams to ascertain data needs, deliver robust and dependable data solutions, and continually refine our data infrastructure. Your responsibilities will include developing and enhancing data pipelines, applying data governance best practices, monitoring system performance, troubleshooting issues, and formulating technical strategies for high-scale, reliable data infrastructure and pipelines. Familiarity with technologies such as Spark, Airflow, dbt, and cloud services from GCP and AWS is essential. You will also design processes that ensure effective team operations and ongoing improvement.

  • Research Infrastructure: The research infrastructure team focuses on creating and scaling systems that empower researchers to iterate swiftly while ensuring that critical systems/components utilized by researchers can operate at production scale as our model footprint expands.

  • Site Reliability Engineering: As a Site Reliability Engineer at Anthropic, you will devise and implement scalable solutions, partner with development teams to enhance infrastructure reliability, and establish monitoring systems, Service Level Objectives (SLOs), and Service Level Indicators (SLIs). You will apply fault-tolerant design patterns, construct automation tools, and participate in an on-call rotation. By utilizing Infrastructure as Code (IaC) principles, you will work with cross-functional teams to guarantee reliability and scalability in new features and services, while also accelerating engineering reliability through superior tooling.

  • Systems: The systems team is responsible for managing some of the largest and most sophisticated clusters in the industry, which are used for training, research, and ultimately serving AI models. Your contributions will be critical in ensuring Anthropic's ability to reliably and safely train frontier models. You will oversee the construction of systems and the operation of extensive Kubernetes clusters with GPU/TPU/Trainium workloads.

  • Observability: The observability team is dedicated to designing, building, and maintaining the observability infrastructure that guarantees the reliability, performance, and efficiency of our AI systems and services. You will work with various teams to comprehend their observability requirements and deliver solutions using technologies such as Prometheus, Splunk, Cloud Logging, Grafana, and Honeycomb. Your role will involve developing a configuration-driven approach to manage dashboards and alerts, implementing structured logging and tracing, optimizing the observability stack, and creating a reliable system that requires minimal maintenance. You will promote a culture of operational excellence, proactive monitoring, and continuous improvement by providing managed, centralized, and user-friendly observability tools.

Key Responsibilities:
  • Lead the development of industry-leading AI clusters (ranging from thousands to hundreds of thousands of machines), collaborating closely with cloud service providers on cluster development and necessary features.
  • Engage with various stakeholders to thoroughly understand infrastructure, data, and computational needs, identifying potential solutions to support advanced research and product development.
  • Establish technical strategies and oversee the creation of high-scale, reliable infrastructure systems.
  • Mentor and guide top technical talent.
  • Design processes (e.g., postmortem reviews, incident response, on-call rotations) that facilitate effective team operations and prevent recurring failures.

You Might Be a Great Fit If You:
  • Possess 8+ years of relevant industry experience, with at least 3 years leading large-scale, complex projects or teams as an engineer or technical lead.
  • Exhibit a strong passion for distributed systems at scale, infrastructure reliability, scalability, security, and continuous improvement.
  • Demonstrate strong proficiency in at least one programming language (e.g., Python, Rust, Go, Java).
  • Showcase excellent problem-solving skills and the ability to work independently.
  • Have a keen interest in supporting internal partners, such as research teams, to understand their needs.
  • Possess outstanding communication skills to build consensus with stakeholders, both internally and externally.
  • Have extensive knowledge of modern cloud infrastructure, including Kubernetes, Infrastructure as Code, AWS, and GCP.

Preferred Qualifications:
  • Expertise in security and privacy best practices.
  • Experience with machine learning infrastructure, including GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL.
  • Familiarity with low-level systems, such as Linux kernel tuning and eBPF.
  • Technical acumen in quickly understanding systems design trade-offs and keeping pace with rapidly evolving software systems.

Application Deadline: None. Applications will be reviewed on a rolling basis.