Site Reliability Engineer

2 days ago

San Diego, California, United States A-Line Staffing Solutions Full time $144,000 per year

Site Reliability Engineer (SRE)

Location: San Diego, CA (Hybrid)

Rate: $70–80/hr on W-2 (No C2C)

Overview

We are seeking an experienced Site Reliability Engineer (SRE) to join our cross-functional team supporting cloud-based systems in a regulated healthcare environment. This role is ideal for an engineer who thrives on automation, scalability, observability, and ensuring the reliability and performance of enterprise cloud infrastructure. You'll work closely with development, platform, and operations teams to optimize our AWS and Azure environments.

Key Responsibilities

Cloud Infrastructure Management

Design, deploy, and maintain scalable, secure, and highly available infrastructure on AWS and Azure.
Develop Infrastructure-as-Code using Terraform, AWS CDK, or CloudFormation.
Script and automate workflows using TypeScript, PowerShell, or Go.
Ensure compliance with SOC 2, ePHI, and healthcare data security standards.

Observability & Monitoring

Implement and optimize Datadog for comprehensive application and infrastructure monitoring.
Build alerting mechanisms for key performance indicators (latency, system health, error rates).
Create and maintain real-time performance dashboards and incident response runbooks.

Performance Optimization & Troubleshooting

Identify and resolve system bottlenecks; ensure reliability and scalability of production systems.
Conduct root cause analysis and participate in on-call rotations.
Continuously improve architecture, security posture, and disaster recovery strategies.

Collaboration & DevOps Enablement

Partner with development teams to enhance CI/CD pipelines (Jenkins, GitHub Actions, or Azure DevOps).
Champion infrastructure as code and automation across the organization.
Collaborate with security and compliance teams to uphold all regulatory standards.

Security & Compliance

Maintain security posture for healthcare data systems in alignment with SOC 2 and HIPAA/ePHI.
Implement IAM best practices, encryption policies, and regular audit processes.

Qualifications

Bachelor's in Computer Science, Engineering, or related field (or equivalent experience).
3+ years as an SRE managing cloud environments on AWS and/or Azure.
Hands-on experience with observability tools (Datadog, Prometheus, Grafana, etc.).
Expertise in Terraform, CloudFormation, or AWS CDK.
Strong background in Kubernetes and Docker.
Experience with Ansible, Puppet, or Chef for automation.
Proficiency with CI/CD tools (Jenkins, GitHub Actions, Azure DevOps).
Healthcare compliance experience (SOC 2, ePHI) strongly preferred.

Nice to Have

Experience in regulated industries (Healthcare, Medical Devices).
Certifications: AWS Solutions Architect, Azure Administrator, CKA.
Exposure to AI/ML models for predictive performance and maintenance.
Familiarity with serverless technologies (AWS Lambda, Azure Functions).

Additional Attributes

Strong analytical and decision-making skills.
Collaborative and effective communicator with cross-functional teams.
Action-oriented and solutions-focused mindset.
Proven ability to influence without direct authority.
Excellent written skills for documenting processes and technical plans.

Site Reliability Engineer

3 days ago

San Diego, California, United States SPECTRAFORCE Full time $120,000 - $180,000 per year

Role: Site Reliability Engineer (Only on W2)
Location: San Diego, CA - Onsite
Duration: 12 Months
Job Description: The Site Reliability Engineer (SRE) will work closely with cross-functional teams, including software development, platform, and operations, to support the availability and performance of our cloud-based systems. You will take ownership of the cloud...

Supervisor, Site Reliability Engineering

4 days ago

San Diego, California, United States ServiceNow Full time $126,700 - $215,400

Company Description: It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today, and ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500. Our intelligent cloud-based...

Supervisor, Site Reliability Engineering

5 days ago

San Diego, California, United States ServiceNow Full time $104,000 - $174,000 per year

Company Description: It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today, and ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500. Our intelligent cloud-based...

Staff, Site Reliability Engineer

4 days ago

San Diego, California, United States Lytx, Inc. Full time $183,500 - $232,500

Why Lytx: We are a team of hungry, low-ego, and capable engineers who design and support our IoT infrastructure. Are you interested in "Operations as Code", "Infrastructure as Code", and infrastructure automation solutions? If so, keep reading. The Site Reliability Engineering team is responsible for the availability, reliability, observability and resilience of...

Site Reliability Engineer

5 days ago

San Francisco, California, United States Air Apps Full time $150,000 - $250,000 per year

About Air Apps: At Air Apps, we believe in thinking bigger and moving faster. We're a family-founded company on a mission to create the world's first AI-powered Personal & Entrepreneurial Resource Planner (PRP), and we need your passion and ambition to help us change how people plan, work, and live. Born in Lisbon, Portugal, in 2018, and now with offices in...

Senior Site Reliability Engineer

4 days ago

San Francisco, California, United States Sibitalent Corp Full time $180,000 - $250,000 per year

Job Title: Staff Site Reliability Engineer (SRE)
Location: San Francisco, CA (Hybrid, Local Only)
Duration: 6+ months Contract
12+ Years of profile
W2 OR C2C (Either will work)
Job Description: As our Staff SRE, you'll be the primary expert responsible for our entire compute ecosystem. Your key responsibilities will include: As a Staff SRE, you'll operate at the...

Infrastructure Site Reliability Engineer

1 day ago

San Francisco, California, United States Maxonic Inc. Full time $120,000 - $180,000 per year

Maxonic maintains a close and long-term relationship with our direct client. In support of their needs, we are looking for an Infrastructure Site Reliability Engineer.
Job Description:
Job Title: Infrastructure Site Reliability Engineer
Job Type: Contract (4+ months) with strong possibility to convert to full-time
Job Location: San Francisco, CA
Work Schedule:...

Staff Site Reliability Engineer

5 days ago

San Francisco, California, United States Heartflow Full time $185,750 - $250,922 per year

Heartflow is a medical technology company advancing the diagnosis and management of coronary artery disease, the #1 cause of death worldwide, using cutting-edge technology. The flagship product, an AI-driven, non-invasive cardiac test supported by the ACC/AHA Chest Pain Guidelines called the Heartflow FFRCT Analysis, provides a color-coded, 3D model of a...

Principal Site Reliability Engineer

5 days ago

San Francisco, California, United States Harrison Clarke Full time $120,000 - $180,000 per year

Harrison Clarke are working with several high-profile companies that are seeking a Principal Site Reliability Engineer (SRE) to lead the design, implementation, and scaling of the infrastructure and systems that support their products. The ideal candidate should have extensive experience in designing highly scalable infrastructure, building systems, and...

Site Reliability Engineer

1 day ago

San Jose, California, United States TikTok Full time $359,720 per year

Responsibilities
Team Intro: TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. U.S. Data Security ("USDS") is a subsidiary of TikTok in the U.S. Site Reliability Engineering (SRE) at TikTok combines software and systems engineering to build and run large-scale, massively distributed, and...
