Senior Site Reliability Engineer

2 days ago


San Francisco, California, United States | Sibitalent Corp | Full time | $180,000 - $250,000 per year

Job Title: Staff Site Reliability Engineer (SRE)

Location: San Francisco, CA (Hybrid, Local Only)

Duration: 6+ month contract

Experience: 12+ years

Employment type: W2 or C2C (either is acceptable)

Job Description:

As our Staff SRE, you'll be the primary expert responsible for our entire compute ecosystem. Your key responsibilities will include:

  • As a Staff SRE, you'll operate at the highest level of technical expertise and influence. You won't just solve problems; you'll prevent them at a fundamental level across organizational boundaries.
  • Design, implement, and lead large-scale, cross-functional projects to improve the reliability, performance, and efficiency of our core services and infrastructure (10x impact).
  • Drive the reduction of toil by developing and deploying sophisticated automation tools and frameworks, championing the "everything as code" philosophy.
  • Serve as a technical escalation point for critical incidents, perform deep-dive root cause analyses (RCAs), and implement robust corrective measures to prevent recurrence.
  • Define and implement SLOs, SLIs, and error budgets for critical services. Enhance our monitoring, logging, and tracing systems to provide comprehensive visibility into system health.
  • Set the technical direction and best practices for the entire SRE and engineering organization. Mentor mid-level and senior engineers on design patterns, operational rigor, and reliability principles.
  • We're looking for a leader and a deep technical expert with a proven track record of solving the hardest scaling and reliability challenges.

Required Qualifications

  • 8+ years of progressive experience in Site Reliability Engineering, Production Engineering, or a closely related role.
  • Expert-level proficiency with AWS, including networking, compute, and storage.
  • Deep expertise in Kubernetes and the cloud-native ecosystem.
  • Fluency in at least one major scripting/programming language for automation and tooling (e.g., Python, Go, or Java).
  • Solid experience with monitoring and logging solutions (Datadog).
  • Proven ability to design and implement robust, highly available distributed systems.
  • Demonstrated experience with Infrastructure as Code tools like Terraform.
  • Exceptional communication skills, capable of explaining complex technical issues to both technical and non-technical audiences.

Nice-to-Have

  • Experience implementing Service Mesh technologies (e.g., Istio, Linkerd).
  • A strong understanding of security principles and practices in a cloud environment.
  • Certifications such as CKA (Certified Kubernetes Administrator) or CKAD (Certified Kubernetes Application Developer).

