Principal Site Reliability Engineer

4 weeks ago


Atlanta, Georgia, United States UKG (Ultimate Kronos Group) Full time
About the Role

As a Principal Site Reliability Engineer at UKG, you will play a critical role in ensuring the reliability and efficiency of our cloud-based services. You will be responsible for designing, implementing, and maintaining scalable and highly available systems, as well as developing software solutions to enhance our service delivery processes.

You will work closely with cross-functional teams to identify and prioritize areas for improvement, then develop and implement solutions to address them. This work includes building and managing CI/CD deployment pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering, and auto-remediation.

You will also be responsible for defining and implementing standards and best practices related to system architecture, service delivery, metrics, and the automation of operational tasks. Additionally, you will support SRE team members, services, product, and engineering teams by providing common tooling and frameworks to deliver increased availability and improved incident response.

Key Responsibilities:

  • Design and implement scalable and highly available systems
  • Develop software solutions to enhance service delivery processes
  • Collaborate with cross-functional teams to identify and prioritize areas for improvement
  • Develop and implement solutions to address areas for improvement
  • Define and implement standards and best practices related to system architecture, service delivery, metrics, and automation of operational tasks
  • Support SRE team members, services, product, and engineering teams

About You

We are looking for a highly skilled and experienced Principal Site Reliability Engineer who has a passion for learning and evolving with current technology trends. You should have a strong background in cloud infrastructure, service delivery, and software development, as well as excellent communication and collaboration skills.

Key Qualifications:

  • 10+ years of hands-on engineering or cloud experience
  • Minimum 5 years' experience with public cloud platforms (e.g. GCP, AWS, Azure)
  • Minimum 5 years' experience in configuration and maintenance of applications and/or systems infrastructure for large-scale customer-facing companies
  • Experience with distributed system design and architecture

What We Offer

At UKG, we offer a comprehensive benefits package, including a competitive salary, health insurance, a retirement savings plan, and generous paid time off. We also offer opportunities for professional growth and development, as well as a dynamic and inclusive work environment.



  • Atlanta, Georgia, United States Motion Recruitment Full time

    Job Summary: Join our team as a GCP Senior Principal Reliability Engineer and take on the challenge of enhancing our client's supply chain organization. As a principal-level Site Reliability Engineer, you will work on our enterprise environment, guiding software design and development across multiple engineering teams. This full-time position offers a...


  • Atlanta, Georgia, United States Ditto Job Board Full time

    Job Title: Site Reliability Engineer. At Ditto, we're on a mission to unleash the full power of edge devices by removing all the plumbing required to build amazing applications. As a Site Reliability Engineer, you'll play a critical role in helping us achieve this goal. About the Role: We're seeking a highly skilled Site Reliability Engineer to join our Federal...


  • Atlanta, Georgia, United States JobRialto Full time

    Job Summary: The Site Reliability Engineer is responsible for ensuring the availability, scalability, and performance of critical services and systems. This role requires expertise in OpenShift and CloudFormation, along with a deep understanding of site reliability principles, container technologies, monitoring tools, and automation. Key Responsibilities: Ensure...


  • Atlanta, Georgia, United States Navtech Full time

    Job Title: Site Reliability Engineer. Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Navtech. As a Site Reliability Engineer, you will be responsible for ensuring the availability, scalability, and performance of our production systems. Key Responsibilities: Provide L4 technical support for production 24x7. Design and...


  • Atlanta, Georgia, United States Geotab Full time

    About Geotab: Geotab is a global leader in IoT and connected transportation, certified as a Great Place to Work™. We are a company of diverse and talented individuals who work together to help businesses grow and succeed, and increase the safety and sustainability of our communities. Our team is growing, and we're looking for people who follow their passion,...


  • Atlanta, Georgia, United States Della Infotech Full time

    Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Della Infotech. As a key member of our DevOps team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design and implement scalable and reliable cloud infrastructure using AWS...


  • Atlanta, Georgia, United States Jonas Software UK Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Jonas Software UK. As a key member of our technical operations team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based infrastructure. Key Responsibilities: Design, implement, and maintain scalable and highly...


  • Atlanta, Georgia, United States Kobiton Full time

    About the Role: Kobiton is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and scalability of our systems and services. You will work closely with development and operations teams to build and maintain robust infrastructure, automate...


  • Atlanta, Georgia, United States Microsoft Corporation Full time

    We are seeking a highly skilled Senior Site Reliability Engineer to join our Windows Servicing and Delivery team at Microsoft Corporation. The ideal candidate will have a strong background in software engineering, network engineering, or systems administration, with a proven track record of delivering high-quality solutions that meet customer needs. As a...


  • Atlanta, Georgia, United States Cynet Systems Full time

    Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Cynet Systems. The ideal candidate will have a strong background in application development, architecture, and consulting, with a proven track record of performing assessments and providing roadmaps with project plans. The successful candidate will have a good...


  • Atlanta, Georgia, United States SIDEARM Sports Full time

    Job Summary: At SIDEARM Sports, we're seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our SRE team, you'll play a critical role in ensuring the reliability, availability, and performance of our live services, which impact millions of customers across the entertainment space. Key Responsibilities: Collaborate with...


  • Atlanta, Georgia, United States STORD Full time

    About the Role: Stord is seeking a highly skilled Senior Site Reliability Engineer to join our team. As a key member of our SRE team, you will be responsible for designing and implementing scalable, efficient, and secure infrastructure and platform solutions. You will collaborate with cross-functional teams to deliver high-quality products and services to our...


  • Atlanta, Georgia, United States Microsoft Corporation Full time

    Job Summary: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Microsoft Corporation. As a key member of our Windows Servicing and Delivery team, you will be responsible for ensuring the reliability and performance of our product offerings, including Windows client, Windows Update, and Windows Autopatch. Key Responsibilities...


  • Atlanta, Georgia, United States Motion Recruitment Full time

    Job Title: Site Reliability Engineer - Azure Cloud Expert. About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team in Atlanta. As a Site Reliability Engineer, you will be responsible for ensuring the scalability and reliability of our ecommerce applications on Azure cloud. Key Responsibilities: Proactively monitor and...


  • Atlanta, Georgia, United States Now100 Full time

    Job Title: Site Reliability Engineer - Cloud Infrastructure Specialist. Company Overview: Now100 is a leading provider of technology solutions, committed to delivering exceptional results for our clients. We match thoroughly vetted resources to contract, contract-to-hire, and permanent positions in all industries. Job Description: We are seeking a highly...


  • Atlanta, Georgia, United States Microsoft Corporation Full time

    About the Role: Microsoft Corporation is seeking a highly skilled Senior Site Reliability Engineering Manager to lead the delivery of critical features in Office 365 government cloud offerings. As a key member of the Office 365 team, you will be responsible for combining your passion for quality, reliability, and creativity to drive evolution in the continuous...


  • Atlanta, Georgia, United States Motion Recruitment Full time

    Exciting Opportunity in Atlanta, GA: Motion Recruitment is seeking a highly skilled Site Reliability Engineer (SRE) to join our team in Atlanta, GA. This is an on-site position that requires a strong background in software solutions and a passion for ensuring system reliability and performance. About the Company: Our client specializes in providing cutting-edge...


  • Atlanta, Georgia, United States Cox Communications Full time

    About the Role: This is an exciting opportunity to join our team as a Senior Site Reliability Engineer. As a key member of our Manheim Logistics SRE team, you will play a crucial role in designing and maintaining AWS infrastructure and deployment pipelines for our 15+ development teams. We are looking for a highly skilled and experienced engineer who can work...


  • Atlanta, Georgia, United States Pyramid Consulting Full time

    Job Summary: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Pyramid Consulting, Inc. This is a contract opportunity with long-term potential and is located in Atlanta, GA. Key Responsibilities: Design and implement SLOs / SLIs / error budgets and manage reliability for infrastructure and applications. Proven experience with...


  • Atlanta, Georgia, United States Pyramid Consulting Full time

    Pyramid Consulting is seeking a talented Senior Site Reliability Engineer to join our team. This is a contract opportunity with long-term potential and is located in a major US city. The successful candidate will have a strong background in setting SLOs / SLIs / error budgets and managing reliability for infrastructure and applications. Key...