Sr. Site Reliability Engineer

2 weeks ago


Salt Lake City, UT, United States O.C. Tanner Full time

O.C. Tanner is the global leader in software and services that improve workplace culture through meaningful employee experiences. Our Culture Cloud is a suite of apps designed to enhance the employee experience with strategic recognition, service awards, wellbeing, leadership, and events that help people thrive at work. Our Culture by Design approach provides expert services to organizations looking to create great workplaces.

Our global team of 1,500 people hails from 58 countries and speaks 62 languages. As programmers, researchers, designers, client professionals, and craftspeople, we create the tech, tools, and awards that connect employees to purpose at thousands of companies. Join us as we help people all over the world thrive at work.

About O.C. Tanner

O.C. Tanner is the #1 provider of employee recognition solutions, helping organizations worldwide create thriving workplace cultures. Our mission is simple yet powerful: we help people thrive at work by fostering appreciation, engagement, and connection. Through our award-winning recognition platform, we empower companies to celebrate achievements, strengthen relationships, and build workplaces where employees feel valued and inspired.

About the Role

We are seeking a Senior Site Reliability Engineer to join our team and help ensure the reliability, scalability, and performance of our world-class employee recognition platform. This role is ideal for someone who thrives at the intersection of software engineering and operations, with a passion for building resilient systems and improving customer experience.

Key Responsibilities

  • Reliability & Performance: Design, implement, and maintain monitoring, alerting, and tracing solutions for cloud-native applications. Drive improvements in uptime, latency, and overall system performance.
  • Observability: Develop and manage observability platforms (e.g., OpenTelemetry, Datadog, Coralogix) to provide actionable insights. Collaborate with engineering teams to define metrics, logs, and traces for new and existing services.
  • Incident Management: Lead incident response efforts, including root cause analysis and postmortems. Implement best practices for incident detection, escalation, and resolution.
  • Cloud & Infrastructure: Manage and optimize Kubernetes clusters and AWS cloud resources. Automate infrastructure provisioning and scaling using Infrastructure-as-Code tools.
  • Collaboration: Partner with software engineering teams to gather requirements for monitoring and reliability. Advocate for SRE principles and help teams adopt best practices for resilience and performance.
  • On-Call Responsibilities: Participate in a rotating on-call schedule to ensure 24/7 coverage for critical systems. Respond to alerts promptly, troubleshoot issues, and restore service during outages. Continuously improve on-call processes to reduce noise and enhance response efficiency.

Required Qualifications

  • Experience: 5+ years in Site Reliability Engineering, DevOps, or related roles.
  • Programming Skills: Proficiency in at least one modern programming language (e.g., Python, Go, Java).
  • Infrastructure-as-Code: Experience with Infrastructure-as-Code tools (Terraform, CloudFormation).
  • Observability Expertise: Hands-on experience with OpenTelemetry, Datadog, or similar platforms.
  • Cloud & Containers: Strong knowledge of AWS services and Kubernetes.
  • Incident Management: Proven track record in managing production incidents and improving MTTR.
  • Monitoring & Alerting: Deep understanding of metrics, logging, tracing, and alerting strategies for distributed systems.
  • Collaboration: Ability to work closely with software engineers to design reliable systems and improve application performance.
  • On-Call Readiness: Comfortable participating in on-call rotations and handling high-pressure situations.

Preferred Qualifications

  • Familiarity with CI/CD pipelines and automation frameworks.
  • Knowledge of e-commerce platforms and high-traffic systems.

Why Join Us?

  • Work on a high-scale e-commerce platform impacting millions of customers.
  • Collaborate with talented engineers in a culture that values reliability, innovation, and ownership.
  • Competitive compensation, benefits, and opportunities for growth.