Director of Engineering, Production Infrastructure

4 days ago


Boston, Massachusetts, United States Klaviyo Full time $250,000 - $350,000 per year

Klaviyo's mission is to help businesses own their growth. To do that at scale, our engineers need rock‑solid platform primitives: clear, reliable building blocks that make it simple to run, store, observe, and ship in production. As Director, Production Infrastructure, you will lead and build a product‑quality platform that accelerates every R&D team's path from idea to customer value. You and your team will define what is (and isn't) a platform primitive, set strong ownership boundaries, and deliver "golden paths" that answer questions like: "I want to run X. Where do I run it?", "My product needs a single table to store a bit of data. Where should I put it?", and "How do I get data from my service to the frontend?"

This is a hands‑on, execution‑first leadership role for a platform‑minded builder who measures success in developer velocity, system reliability, and business impact.

What You'll Do
  • Own the Production Infrastructure charter. Define the platform primitives Klaviyo provides (compute runtimes, data storage options, messaging/eventing, service networking, observability) and the clear "contract" for each: APIs, SLIs/SLOs, support model, and runbooks, ensuring consistency with our company‑wide operational excellence best practices.
  • Publish golden paths and decision trees that make default choices obvious (e.g., "run X here," "store a bit here," "expose data to frontend via Y"), minimizing one‑off work and increasing self‑service.
  • Raise reliability and safety bars across production: incident prevention and response (blameless postmortems, on‑call health), change management, capacity planning, and resilient multi‑tenant patterns.
  • Accelerate developer velocity by improving time‑to‑first‑service, deployment lead time, and mean time to recovery; partner with product teams to remove infrastructure bottlenecks and reduce cognitive load.
  • Engineer for cost‑effectiveness at scale. Establish clear cost guardrails, usage quotas, and right‑sizing policies; partner with Finance and Security to balance spend, risk, and speed.
  • Lead and grow high‑performing teams of managers and senior ICs; set crisp goals, coach for impact, and cultivate an inclusive, ownership‑driven culture. 
  • Partner cross‑functionally with engineering leaders, security, and others to sequence investments, clarify ownership boundaries, and land platform changes safely.
  • Measure what matters. Define and report a concise scorecard (e.g., SLO coverage, incident frequency/severity, lead time for changes, MTTR, developer NPS for platform, infra cost‑to‑serve).
  • Transform workflows by putting AI at the center, building smarter systems and ways of working from the ground up; continuously experiment with AI tools and share learnings to keep the org ahead of the curve.
Who You Are
  • Platform‑minded, execution‑oriented leader with a track record of building and operating production platforms at scale (e.g., multi‑tenant compute, storage, networking, CI/CD, observability). You prioritize measurable outcomes such as reliability, efficiency, and developer productivity.
  • Experienced people leader: 10+ years in infrastructure/SRE/platform engineering, including 5+ years managing managers and senior ICs; you set high bars, coach well, and build inclusive teams.
  • Reliability first. Deep familiarity with SRE practices, SLO/SLI design, incident management, capacity planning, and operational readiness.
  • Great system thinker & communicator. You reduce ambiguity, create clarity in docs and diagrams, and influence across product, data, and security to land org‑wide changes.
  • Outcome‑driven and accountable. You set crisp goals, instrument the work, and hold teams to impact, not just activity. You're comfortable saying "no" and narrowing scope to ship.
  • AI‑curious and hands‑on. You've already experimented with AI in work or personal projects and are eager to learn fast, using AI responsibly to make your team's work smarter and more efficient.
  • Technical stack familiarity (mix of): public cloud (AWS/GCP), container orchestration, service meshes/ingress, data stores (SQL/NoSQL/object), eventing/streaming, IaC, and modern observability.
Nice to Haves
  • Experience productizing internal platforms (treating infra as a product with SLAs, roadmaps, and developer experience metrics).
  • Background in data or event‑driven architectures at scale; prior partnership with a centralized data platform (e.g., KDP) to define clean ownership boundaries.
  • Prior success improving cost‑to‑serve and reliability in a high‑growth SaaS environment.

We use Covey as part of our hiring and/or promotional process. For jobs or candidates in NYC, certain features may qualify it as an AEDT. As part of the evaluation process, we provide Covey with job requirements and candidate‑submitted applications. We began using Covey Scout for Inbound on April 3, 2025.

Please see the independent bias audit report covering our use of Covey here



  • Boston, Massachusetts, United States Harnham Full time $180,000 - $300,000 per year

    Director, AI/ML, Compute, Infra & DevOps Platforms. Full time. Hybrid, 2x per week onsite. Greater Boston Area. This is a major investment in the future of data-driven discovery, built to leverage data, knowledge, and prediction to accelerate the discovery of new medicines. We're a full-stack organization spanning product and portfolio leadership, data...


  • Boston, Massachusetts, United States Global Infrastructure Full time $117,300 - $192,000

    TYLin is a globally recognized, full-service infrastructure consulting firm committed to providing innovative, cost-effective, constructible designs for the global infrastructure market. With over 3,000 employees throughout the Americas, Asia, and Europe, the firm provides support on projects of varying size and complexity. Together, we enhance conventional...


  • Boston, Massachusetts, United States PathAI Full time $150,000 - $250,000 per year

    Who We Are: PathAI's mission is to improve patient outcomes with AI-powered pathology. Our AiSight platforms and AI products promise substantial improvements to the accuracy of diagnosis and the efficacy of treatment of diseases like cancer, leveraging modern approaches in machine learning. Our team, comprising diverse employees with a wide range of...


  • Boston, Massachusetts, United States Workato Full time $210,000 - $290,000

    About Workato: Workato transforms technology complexity into business opportunity. As the leader in enterprise orchestration, Workato helps businesses globally streamline operations by connecting data, processes, applications, and experiences. Its AI-powered platform enables teams to navigate complex workflows in real-time, driving efficiency and...


  • Boston, Massachusetts, United States AcuityMD Full time $180,000 - $210,000

    Senior Software Engineer, Infrastructure. AcuityMD is a software and data platform that accelerates access to medical technologies. We help MedTech companies understand how their products are used, why customers vary, and identify opportunities for physicians to better serve their patients. Each year, the FDA approves ~6,000 new medical devices. Our solution...


  • Boston, Massachusetts, United States Apollo Solutions Full time $120,000 - $180,000 per year

    Senior Infrastructure Systems Engineer – Hybrid Cloud (Azure / On-Prem / M365). We're seeking a Senior Infrastructure Systems Engineer for a leading investment management firm to take ownership of critical projects across a hybrid cloud environment. You'll design, deploy, and maintain secure, scalable infrastructure solutions spanning Azure and on-prem systems,...


  • Boston, Massachusetts, United States Better Life Partners Full time $144,500 - $182,750

    Who we are: At Better Life Partners, we provide what it takes to heal from addiction. Wherever. Whenever. We work alongside community-based organizations to meet our members where they are, no matter what recovery looks like to them. By combining virtual and in-person counseling, community support, and access to life-saving medication, we help people move...

  • DevOps Engineer

    4 days ago


    Boston, Massachusetts, United States Wizards of the Coast Full time $100,000 - $150,000 per year

    At Wizards of the Coast, we connect people around the world through play and imagination. From our genre-defining games like Magic: The Gathering and Dungeons & Dragons to our growing multiverse, we continue to innovate and build new ways to foster friendship and connection. That's where you come in. As a DevOps Engineer on our Digital Innovations Team, you'll...

  • Product Engineer

    5 days ago


    Boston, Massachusetts, United States Chelsea Clock Full time $60,000 - $80,000 per year

    Product Engineer. Chelsea, MA | Full-Time. Department: Product Development & Service Operations. Reports To: Director, Product and Repair Services. About Chelsea: Chelsea is an American manufacturer of premium clocks and gift items, built upon a proud 125+ year heritage of craftsmanship and precision. Every Chelsea timepiece is made with care, combining...


  • Boston, Massachusetts, United States Grand Circle Travel Full time $145,000 - $158,000 per year

    Productivity Solutions Engineer. Reporting to: VP, Infrastructure. Department: Infrastructure Services. Location: Boston, MA (Hybrid, 3x/wk onsite required). Position Summary: Grand Circle Corporation is the leader in international travel, adventure and discovery for Americans aged 50+. Headquartered in Boston, MA, and with more than 45 offices globally, more than two...