Site Reliability Engineering Manager

5 days ago


TBG | The Bachrach Group
Charlotte, North Carolina, United States
Full time

A leading global investment management firm is seeking an exceptional SRE Manager to lead and transform their Site Reliability Engineering function. This is a newly created leadership role offering a unique opportunity to build and scale a world-class global SRE team while driving the organization's evolution toward modern, cloud-based infrastructure solutions.

The Role

As SRE Manager, you will lead the strategic expansion of site reliability practices across a global organization, transforming operational workflows into proactive, automation-driven processes. You'll build and mentor a high-performing team responsible for ensuring the reliability, scalability, and performance of both cloud and on-premises infrastructure and services.

Key Responsibilities

Team Leadership & Development

  • Lead the growth and development of a global SRE team from a small, high-performing unit to a comprehensive function
  • Oversee recruitment, onboarding, and professional development initiatives
  • Design and deliver tailored training programs on SRE principles, cloud operations, and automation tools
  • Foster a culture of excellence, collaboration, and continuous improvement

Strategic Transformation

  • Evaluate current operational workflows, RACIs, and skill assessments across the global team
  • Execute a comprehensive roadmap to transition reactive operations into proactive, SRE-aligned processes
  • Identify and eliminate toil through automation and process optimization
  • Define and implement sustainable automation frameworks to reduce operational risk

Technical Excellence

  • Collaborate with architects, platform engineering, ServiceNow developers, and application teams to design and implement comprehensive observability frameworks
  • Define, monitor, and regularly review SLIs, SLOs, SLAs, and error budgets
  • Enhance proactive incident detection capabilities and reduce MTTR
  • Oversee incident response processes and champion a blameless post-mortem culture across teams

Stakeholder Management

  • Build strong partnerships with internal and external stakeholders
  • Prepare and present operational performance reports to leadership
  • Drive alignment between SRE and application development teams

Required Qualifications

Leadership Experience

  • Proven track record of building and leading operational and engineering teams
  • Demonstrated ability to foster collaboration between SRE and development teams
  • Experience driving operational excellence and reducing downtime while accelerating delivery cycles

SRE & Incident Management Expertise

  • Strong experience applying SRE principles, including defining and monitoring SLIs, SLOs, SLAs, and error budgets
  • Skilled in incident response, post-incident analysis, and facilitating blameless post-mortems
  • Track record of implementing proactive measures to prevent recurring incidents

Technical Proficiency

  • Deep expertise in Azure technologies (experience with other cloud providers highly beneficial)
  • Proven experience with Infrastructure as Code tools, particularly Terraform
  • Hands-on experience with monitoring and observability tools such as LogicMonitor, Azure Monitor, Prometheus, Grafana, Dynatrace, and Splunk
  • Strong scripting or programming skills (Python, PowerShell)
  • Experience with ServiceNow and Azure DevOps
  • Understanding of container orchestration platforms
  • Solid grasp of Agile, ITIL, and ITSM frameworks

Preferred Qualifications

  • Experience managing other managers
  • SharePoint administration experience
  • Demonstrated success spearheading automation initiatives that significantly reduced infrastructure provisioning time

Soft Skills

  • Excellent communication and presentation abilities
  • Strong stakeholder management capabilities
  • Strategic thinking with hands-on execution ability

Why This Role?

This is a rare opportunity to shape the SRE function of a global organization from the ground up. You'll have the autonomy to build processes, develop talent, and drive meaningful transformation that directly impacts business outcomes. If you're passionate about reliability engineering, team development, and driving operational excellence at scale, this role offers the platform to make a significant impact.


