Site Reliability Engineer

2 days ago


San Diego, California, United States | A-Line Staffing Solutions | Full time | $144,000 per year

Site Reliability Engineer (SRE)


Location: San Diego, CA (Hybrid)


Rate: $70–80/hr on W-2 (No C2C)

Overview

We are seeking an experienced Site Reliability Engineer (SRE) to join our cross-functional team supporting cloud-based systems in a regulated healthcare environment. This role is ideal for an engineer who thrives on automation, scalability, and observability, and on keeping enterprise cloud infrastructure reliable and performant. You'll work closely with development, platform, and operations teams to optimize our AWS and Azure environments.

Key Responsibilities

Cloud Infrastructure Management

  • Design, deploy, and maintain scalable, secure, and highly available infrastructure on AWS and Azure.
  • Develop Infrastructure-as-Code using Terraform, AWS CDK, or CloudFormation.
  • Script and automate workflows using TypeScript, PowerShell, or Go.
  • Ensure compliance with SOC 2, HIPAA requirements for ePHI, and healthcare data security standards.

Observability & Monitoring

  • Implement and optimize Datadog for comprehensive application and infrastructure monitoring.
  • Build alerting mechanisms for key performance indicators (latency, system health, error rates).
  • Create and maintain real-time performance dashboards and incident response runbooks.

Performance Optimization & Troubleshooting

  • Identify and resolve system bottlenecks; ensure reliability and scalability of production systems.
  • Conduct root cause analysis and participate in on-call rotations.
  • Continuously improve architecture, security posture, and disaster recovery strategies.

Collaboration & DevOps Enablement

  • Partner with development teams to enhance CI/CD pipelines (Jenkins, GitHub Actions, or Azure DevOps).
  • Champion infrastructure as code and automation across the organization.
  • Collaborate with security and compliance teams to uphold all regulatory standards.

Security & Compliance

  • Maintain the security posture of healthcare data systems in alignment with SOC 2 and HIPAA/ePHI.
  • Implement IAM best practices, encryption policies, and regular audit processes.

Qualifications

  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 3+ years as an SRE managing cloud environments on AWS and/or Azure.
  • Hands-on experience with observability tools (Datadog, Prometheus, Grafana, etc.).
  • Expertise in Terraform, CloudFormation, or AWS CDK.
  • Strong background in Kubernetes and Docker.
  • Experience with Ansible, Puppet, or Chef for automation.
  • Proficiency with CI/CD tools (Jenkins, GitHub Actions, Azure DevOps).
  • Healthcare compliance experience (SOC 2, HIPAA/ePHI) strongly preferred.

Nice to Have

  • Experience in regulated industries (Healthcare, Medical Devices).
  • Certifications: AWS Solutions Architect, Azure Administrator, CKA.
  • Exposure to AI/ML models for predictive performance and maintenance.
  • Familiarity with serverless technologies (AWS Lambda, Azure Functions).

Additional Attributes

  • Strong analytical and decision-making skills.
  • Collaborative and effective communicator with cross-functional teams.
  • Action-oriented and solutions-focused mindset.
  • Proven ability to influence without direct authority.
  • Excellent written skills for documenting processes and technical plans.

