Distributed Infrastructure Lead

3 days ago


San Francisco, United States ZipRecruiter Full time

Job Description

Infrastructure Lead (Agent Networks)

About this role

We are seeking an exceptional Infrastructure Lead to architect and build the foundational systems that will power the next generation of AI agent networks at Naptha AI. This is a rare opportunity to shape the future of AI agent infrastructure at a massively ambitious scale, backed by industry veterans and technical leaders through NVIDIA Inception, Google for Startups, and Microsoft for Startups.

We're building the foundational infrastructure for the next wave of AI companies, enabling frontier AI developers (many leaving labs like OpenAI, Anthropic, and DeepMind) to build products powered by enormous networks of highly capable next-generation AI agents. As our Infrastructure Lead, you'll design and implement the systems that will enable billions of AI agents to interact, coordinate, and scale efficiently across distributed environments.

Core Responsibilities

  1. Design and implement scalable infrastructure for massive agent networks
  2. Architect systems for efficient agent communication and coordination
  3. Build robust, distributed systems for agent deployment and execution
  4. Create monitoring, observability, and debugging systems for agent networks
  5. Develop performance optimization strategies for large-scale agent operations
  6. Design fault-tolerant systems for reliable agent interactions
  7. Lead technical decisions around infrastructure architecture

Technical Challenges You'll Tackle

  1. Designing distributed systems that can handle millions of concurrent agent interactions
  2. Building efficient communication protocols for agent-to-agent interactions
  3. Creating scalable orchestration systems for agent deployment
  4. Implementing robust monitoring and debugging tools for complex agent networks
  5. Optimizing resource utilization across distributed agent systems
  6. Developing infrastructure that can adapt to emerging AI capabilities

You're a good fit if you have:

  1. Deep expertise in distributed systems and scalable architecture
  2. Strong experience with high-performance computing or large-scale systems
  3. Track record of building reliable, production-grade infrastructure
  4. Experience with modern cloud platforms and containerization
  5. Strong coding abilities in systems programming
  6. Understanding of AI/ML deployment challenges
  7. Passion for solving complex infrastructure problems

Required Technical Experience:

  1. Proven experience building distributed systems at scale
  2. Expertise in performance optimization and system reliability
  3. Strong programming skills (Go, Rust, or a similar systems language)
  4. Experience with container orchestration (Kubernetes, etc.)
  5. Understanding of network protocols and distributed computing
  6. Experience with observability and monitoring systems

About the hiring process:

  1. Technical architecture discussion
  2. Systems design deep dive
  3. Coding and problem-solving session
  4. Team collaboration interview
  5. Infrastructure vision presentation

Compensation & Benefits:

  1. Highly competitive salary and significant equity stake
  2. Remote-first work environment
  3. Full medical, dental, and vision coverage
  4. Flexible PTO policy
  5. Learning and development budget
  6. Conference attendance support

This is a unique opportunity to shape the infrastructure that will power the next generation of AI systems. You'll be working at the intersection of distributed systems, AI, and platform design, creating the foundation for how future AI agents will interact and scale.

