Principal Site Reliability Engineer
4 weeks ago
At Brightspeed, we are reimagining how people live, work, play, and connect by providing fast, reliable internet connections and an exceptional customer experience in twenty states throughout the Midwest and South.
We are seeking a highly skilled Principal Site Reliability Engineer to join our team. As a key member of our engineering team, you will play a critical role in designing and building a world-class fiber network and creating a customer experience second to none.
Key Responsibilities:
- Implement and maintain monitoring systems that track the performance and availability of business-critical systems and infrastructure, using metrics to identify trends and potential issues.
- Work closely with development teams, operations, and other stakeholders to ensure that new services and features are reliable and scalable.
- Develop scripts and tools to automate repetitive tasks, such as deployment, scaling, and monitoring.
- Respond to system outages and performance issues, performing root cause analysis to prevent recurrence.
- Define and measure Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to ensure services meet required performance and availability targets.
- Conduct postmortems after incidents to identify what went wrong and what can be improved.
- Lead the onshore and offshore team of site reliability engineers and mentor them in the support activities required for system reliability.
Qualifications:
- Master's degree in computer science, telecommunications, or a similar field, with a minimum of 10 years of software engineering experience, including a minimum of 5 years as a site reliability engineer.
- Proven track record of managing mission-critical customer-facing applications for reliability.
- 5+ years of experience supporting operations and maintenance for cloud-native applications in production that are fault-tolerant, self-healing, scalable, and highly available.
- Excellent troubleshooting and problem-solving skills, with a keen attention to detail to identify and resolve complex production issues.
- Deep understanding of cloud computing platforms (GCP) and containerization technologies (e.g., Docker, Kubernetes).
- Solid experience with core Kubernetes concepts such as Pods, Workloads, Services, Ingress/Egress, Deployments, ConfigMaps, HPA, liveness probes, and Secrets.
- Strong knowledge of infrastructure as code tools (e.g., Terraform, Ansible, ArgoCD) and CI/CD pipelines.
- Strong experience integrating code quality tools (SonarQube or Checkmarx) into CI/CD pipelines.
- Strong experience with monitoring, logging, and observability tools such as Splunk, GCP logging, and Dynatrace.
- Ability to work independently and as part of a collaborative team, effectively communicating technical concepts to both technical and non-technical stakeholders.
- Proven written and verbal communication skills, including presentations using tools like PowerPoint.
- Ability to communicate with multiple target audiences and abstract the message appropriately for each, including senior business and IT leadership, technology teams, and business teams.
Benefits:
- Competitive compensation and comprehensive benefits.
- Opportunity to work with a cutting-edge technology stack.
- Collaborative and dynamic work environment.
- Professional development and growth opportunities.
We aspire to contemporary ways of working. Recognized as a Top Workplace by the Charlotte Observer, Brightspeed HQ is located on the 7th floor of the new Vantage South End - East Tower in Charlotte, NC. We prioritize hiring talent in the Charlotte area, whenever possible, to make it a truly vibrant destination for our hybrid workforce.
We also invest in technology that enables our entire team to stay connected. Why? Because Brightspeed recognizes the value of finding the best talent for the job, wherever they may be.