Site Reliability Engineer



Austin, Texas, United States | Cafell Technologies | Full time | $80,000 - $150,000 per year

Dear Applicant,

Role: Site Reliability Engineer (SRE) - Onsite

Location: Columbus, OH / Austin, TX / Charlotte, NC

Employment Type: Full Time. Visa Type: USC/GC preferred; H-1B/H4 EAD accepted.

Experience: 7 to 9 years

Job Description:

Position Summary:
As a Cloud Infrastructure Site Reliability Engineer (SRE) with expertise in multiple public cloud platforms, you will be responsible for operating infrastructure solutions, following the principles and practices pioneered by Google's SRE model. Your work will ensure our cloud services meet uptime, reliability, and performance targets, and you will drive automation and continuous improvement across our production environments. This role involves collaborating with cross-functional teams to strengthen our cloud reliability posture and streamline processes through automation.

Key Responsibilities:

Design, build, and maintain highly available, scalable, and secure cloud infrastructure on platforms such as AWS, GCP, or Azure.

Develop and implement automation for provisioning, monitoring, scaling, and incident response using Infrastructure-as-Code tools (e.g., Terraform, CloudFormation, Ansible).

Monitor system reliability, capacity, and performance; proactively detect and address issues before they impact users. Strong experience implementing SRE monitoring systems and developing dashboards for application reliability using Splunk, Dynatrace, Grafana, AppDynamics, Datadog, and BigPanda.

Collaborate with software engineering and security teams to ensure new services and features are production-ready and meet reliability standards.

Build and maintain tools for deployment, monitoring, and operations; automate manual processes to reduce toil. Experience with automation principles and tools (Ansible, etc.), including hands-on toil identification.

Document operational processes and system architectures to ensure knowledge sharing and repeatability.

Qualifications:

Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.

3+ years of experience in software development with proficiency in at least one programming or scripting language (e.g., Python, Go, Java, curl scripting).

Experience administering cloud platforms (AWS, GCP, Azure), including networking, security, containerization, storage, data management, and serverless technologies.

Solid understanding of Unix/Linux systems, Windows Server, Oracle, MSSQL, MongoDB, networking fundamentals, virtualized and distributed systems, and file systems. Deep understanding of observability (monitoring, alerting, and logging) tools in cloud environments. Ability to set up and maintain monitoring dashboards, alerts, and logs.

Experience with observability tools: AppDynamics, Geneos, Dynatrace, ECS-based internal tooling, Grafana, Prometheus, Splunk, ThousandEyes, etc.

Familiarity with Continuous Integration/Continuous Deployment (CI/CD) tools for automated testing, deployments, provisioning, and observability.

Ability to manage and respond to incidents, perform root cause analysis, and conduct postmortem reviews.

Understanding of setting, monitoring, and maintaining Service-Level Objectives (SLOs) and Service-Level Agreements (SLAs) for system reliability.

Additional Qualifications a Plus:

5+ years of experience in SRE, DevOps, infrastructure, or cloud engineering roles, preferably supporting large-scale, distributed systems.

Excellent problem-solving, troubleshooting, and communication skills.

Experience leading technical projects or mentoring junior engineers.

Certifications: Certified Engineer, DevOps, SRE, CSREF


