Senior Principal Software Engineer, Site Reliability

1 day ago


Atlanta, United States | Engineer Up | Full time

About Us: Engineer Up is on a mission to disrupt how good, hard-working people advance their careers in tech. We partner with Fortune 500 companies to deliver customized IT consulting services spanning from software development to digital transformation.


Position: Senior Principal Software Engineer

Employer: Engineer Up

Location: Remote

Role: Consultant


Position Overview:

Engineer Up is looking for a Senior Principal Software Engineer to lead engineering initiatives and guide technical strategies for optimizing system reliability across multiple teams. This role focuses on performance engineering and chaos engineering while ensuring consistency in software development practices. The Senior Principal will provide expert-level guidance and drive the adoption of new technologies within diverse product portfolios.

The successful candidate will lead 4-6 development teams, focusing on resource efficiency, cost optimization in the cloud, and enhancing technical processes. They will also manage relationships with external vendors, ensuring smooth integration with existing systems and maximizing operational efficiency.


Key Responsibilities:

  • Technical Leadership (50%): Lead discussions with engineering teams to align on technical strategies and share best practices. Architect reusable, scalable solutions across portfolios. Develop proof-of-concept projects to implement new technologies. Ensure efficient use of cloud resources and implement cost-saving strategies using Google Cloud and other platforms.
  • Team Development (10%): Mentor and support engineering teams, providing leadership and expert advice on complex technical issues. Conduct regular performance reviews and help foster a culture of innovation. Attract and retain top talent, ensuring the team is equipped to meet project goals.
  • Strategic Planning (15%): Collaborate with senior technical leadership to set technology roadmaps and strategies. Work across teams to remove technical roadblocks and ensure alignment on project priorities. Foster an environment that encourages learning and innovation within the organization.
  • Operational Support (25%): Serve as the escalation point for critical technical issues. Help resolve conflicts across teams and troubleshoot high-priority outages. Work closely with vendors to integrate new features and tools. Promote the successful adoption of technologies such as Prometheus, Terraform, and GitHub Actions across teams.


Skills and Technologies (Mandatory):

  • Google Cloud
  • Prometheus
  • Terraform
  • GitHub Actions
  • Java
  • Performance Engineering
  • Chaos Engineering


Reporting Structure:

  • Reports to multiple individuals
  • Manages 4-6 reports


Travel Requirements:

  • Occasional travel, typically 5-20% of the time.


Working Conditions:

  • Fully remote with a flexible work environment.


Minimum Qualifications:

  • Must be 18 years or older and legally authorized to work in the United States.
  • 7+ years of experience in software engineering and technical leadership.
  • Extensive experience with Google Cloud, Terraform, Prometheus, and GitHub Actions.
  • Proficiency in performance and chaos engineering techniques.


Preferred Qualifications:

  • 7-9 years of professional experience in software development.
  • Deep knowledge of Java and cloud-native technologies.
  • Strong ability to understand and work with complex systems.
  • Leadership experience managing engineering teams or initiatives.


