Site Reliability Engineering Manager
5 days ago
A leading global investment management firm is seeking an exceptional SRE Manager to lead and transform their Site Reliability Engineering function. This is a newly created leadership role offering a unique opportunity to build and scale a world-class global SRE team while driving the organization's evolution toward modern, cloud-based infrastructure solutions.
The Role
As SRE Manager, you will lead the strategic expansion of site reliability practices across a global organization, transforming operational workflows into proactive, automation-driven processes. You'll build and mentor a high-performing team responsible for ensuring the reliability, scalability, and performance of both cloud and on-premise infrastructure and services.
Key Responsibilities
Team Leadership & Development
- Lead the growth and development of a global SRE team from a small, high-performing unit to a comprehensive function
- Oversee recruitment, onboarding, and professional development initiatives
- Design and deliver tailored training programs on SRE principles, cloud operations, and automation tools
- Foster a culture of excellence, collaboration, and continuous improvement
Strategic Transformation
- Evaluate current operational workflows, RACIs, and skill assessments across the global team
- Execute a comprehensive roadmap to transition reactive operations into proactive, SRE-aligned processes
- Identify and eliminate toil through automation and process optimization
- Define and implement sustainable automation frameworks to reduce operational risk
Technical Excellence
- Collaborate with architects, platform engineering, ServiceNow developers, and application teams to design and implement comprehensive observability frameworks
- Define, monitor, and regularly review SLIs, SLOs, SLAs, and error budgets
- Enhance proactive incident detection capabilities and reduce MTTR
- Oversee incident response processes and champion blameless post-mortem culture across teams
Stakeholder Management
- Build strong partnerships with internal and external stakeholders
- Prepare and present operational performance reports to leadership
- Drive alignment between SRE and application development teams
Required Qualifications
Leadership Experience
- Proven track record building and leading operational and engineering teams
- Demonstrated ability to foster collaboration between SRE and development teams
- Experience driving operational excellence and reducing downtime while accelerating delivery cycles
SRE & Incident Management Expertise
- Strong experience applying SRE principles, including defining and monitoring SLIs, SLOs, SLAs, and error budgets
- Skilled in incident response, post-incident analysis, and facilitating blameless post-mortems
- Track record of implementing proactive measures to prevent recurring incidents
Technical Proficiency
- Deep expertise in Azure technologies (experience with other cloud providers highly beneficial)
- Proven experience with Infrastructure as Code tools, particularly Terraform
- Hands-on experience with monitoring and observability tools such as LogicMonitor, Azure Monitor, Prometheus, Grafana, Dynatrace, and Splunk
- Strong scripting or programming skills (Python, PowerShell)
- Experience with ServiceNow and Azure DevOps
- Understanding of container orchestration platforms
- Solid grasp of Agile, ITIL, and ITSM frameworks
Preferred Qualifications
- Experience managing other managers
- SharePoint administration experience
- Demonstrated success spearheading automation initiatives that significantly reduced infrastructure provisioning time
Soft Skills
- Excellent communication and presentation abilities
- Strong stakeholder management capabilities
- Strategic thinking with hands-on execution ability
Why This Role?
This is a rare opportunity to shape the SRE function of a global organization from the ground up. You'll have the autonomy to build processes, develop talent, and drive meaningful transformation that directly impacts business outcomes. If you're passionate about reliability engineering, team development, and driving operational excellence at scale, this role offers the platform to make a significant impact.
-
Site Reliability Engineer
12 hours ago
Charlotte, North Carolina, United States. Simple Software Solutions Group, Inc. Full time. Onsite - Charlotte, NC. 12 months. Job Title: Site Reliability Engineer (SRE) Practice Lead. Job Summary: The SRE Practice Lead is a senior technical leader responsible for building and guiding the SRE function to ensure the highest levels of reliability, availability, and scalability of critical utility infrastructure, control systems, and operational technology....
-
Site Reliability Engineer
1 week ago
Charlotte, North Carolina, United States. Tekgence Inc. Full time. $102,000 - $230,000 per year. Job Title: Site Reliability Engineer (SRE) Practice Lead. Duration: 12+ months contract. Location: Charlotte, NC - Onsite. Job Summary: The SRE Practice Lead is a senior technical leader responsible for building and guiding the SRE function to ensure the highest levels of reliability, availability, and scalability of critical utility infrastructure, control...
-
Site Reliability Engineer
1 week ago
Charlotte, North Carolina, United States. Sibitalent Corp. Full time. Job Duties: Technically proficient Mid-Level Site Reliability Engineer, focusing on the stability, performance, and scalability of Apigee API platform environments. This role bridges the gap between development and operations, applying software engineering principles to operational challenges. Required Qualifications & Experience: Experience: 7+ years of...
-
Site Reliability Engineer
1 week ago
Charlotte, North Carolina, United States. Ryan Consulting Group, LLC. Full time. $120,000 - $160,000 per year. No Corp to Corp or candidates requiring sponsorship now or in the future will be considered. All candidates must be able to work as a W2 employee for any employer in the US to be considered. Job Title: Site Reliability Engineer / Sr. DevOps. Type: Full-Time, On-Site. Location: Charlotte, NC (South End). Compensation: $120–$160K. Overview: We're seeking a seasoned...
-
Site Reliability Engineer
2 weeks ago
Charlotte, North Carolina, United States. NIMBUSAITECH LLC. Full time. $65,000 - $130,000 per year. Job Description: Senior Site Reliability Engineer (SRE) – Full Stack Observability. Location: Charlotte, North Carolina - Hybrid. We are seeking a highly skilled Senior Site Reliability Engineer (SRE) with extensive experience in full-stack observability for data applications across SaaS, hybrid cloud, and on-prem environments. The ideal candidate will be...
-
Site Reliability Engineering Lead
2 weeks ago
Charlotte, North Carolina, United States. Bank of America. Full time. $120,000 - $200,000 per year. Job Description: At Bank of America, we are guided by a common purpose to help make financial lives better through the power of every connection. We do this by driving Responsible Growth and delivering for our clients, teammates, communities and shareholders every day. Being a Great Place to Work is core to how we drive Responsible Growth. This includes our...
-
Senior Site Reliability Engineer
2 weeks ago
Charlotte, North Carolina, United States. Mindlance. Full time. $200,000 - $250,000 per year. Please find details for this position below. Client: Banking/Financial Industry. Title: Senior Site Reliability Engineer / Senior DevOps Engineer, Senior Cloud Engineer, and Senior Platform Engineer. Location: Charlotte, NC - Hybrid Roles. Duration: 12+ month(s), extend or convert based on performance. Required Qualifications: 8+ years of Software...
-
Site Reliability Engineer
4 days ago
Charlotte, North Carolina, United States. Red Ventures. Full time. This role is not open to visa sponsorship or transfer of visa sponsorship, including those on H1-B, F-1, OPT, STEM-OPT, or TN visa, nor is it available to work corp-to-corp. This is a hybrid opportunity. Candidates are asked to report to our Fort Mill, SC office 3x per week, Tuesday through Thursday, and work remotely on Monday and Friday. The Growth and...
-
Site Reliability Engineer
6 days ago
Charlotte, North Carolina, United States. VySystems. Full time. Job Summary: We are looking for an experienced Site Reliability Engineer (SRE) specializing in Production Support to ensure high availability, reliability, and performance of mission-critical applications. The ideal candidate will have hands-on experience with Splunk and Dynatrace for monitoring, alerting, and root cause analysis, along with strong troubleshooting...
-
Site Reliability Engineer
1 week ago
Charlotte, North Carolina, United States. Career Mentors. Full time. $124,000 - $1,240,000 per year. Job description: Site Reliability Engineer. Pay Rate: up to $75 per hour on W2. W2 Candidates: visa open. Location: Charlotte, NC / Chandler, AZ / Jersey City, NJ - nearby candidates. Previously functioned in an SRE role within a large production environment, with a focus on automation testing experience. Hands-on production experience with vSphere, Aria Suite, and...