Principal Site Reliability Engineer

4 days ago


Charlotte, North Carolina, United States Brightspeed Full time
Job Description

We are seeking a highly skilled Principal Site Reliability Engineer to join our team at Brightspeed. As a key member of our engineering team, you will play a critical role in ensuring the reliability and scalability of our cloud-based infrastructure.

Key Responsibilities:
  • Implement and maintain monitoring systems to track the performance and availability of business-critical systems and infrastructure.
  • Respond to system outages and performance issues, performing root cause analysis to prevent recurrence.
  • Develop scripts and tools to automate repetitive tasks, such as deployment, scaling, and monitoring.
  • Work closely with development teams, operations, and other stakeholders to ensure that new services and features are reliable and scalable.
  • Define and measure Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to ensure services meet required performance and availability targets.
  • Conduct postmortems after incidents to identify what went wrong and what can be improved.
  • Lead the onshore and offshore team of site reliability engineers, and mentor them in the support activities required for system reliability.
Requirements:
  • Master's degree in computer science, telecommunications, or a similar field, with a minimum of 10 years of software engineering experience, including a minimum of 5 years as a site reliability engineer.
  • Proven track record of managing mission-critical customer-facing applications for reliability.
  • 5+ years of experience supporting operations and maintenance for cloud-native applications in production that are fault-tolerant, self-healing, scalable, and highly available.
  • Excellent troubleshooting and problem-solving skills, with keen attention to detail, to identify and resolve complex production issues.
  • Deep understanding of cloud computing platforms (GCP) and containerization technologies (e.g., Docker, Kubernetes).
  • Solid experience with core Kubernetes concepts such as Pods, Workloads, Services, Ingress/Egress, Deployments, ConfigMaps, HPA, liveness probes, and Secrets.
  • Strong knowledge of infrastructure as code tools (e.g., Terraform, Ansible, ArgoCD) and CI/CD pipelines.
  • Strong experience integrating code quality tools (e.g., SonarQube or Checkmarx) into CI/CD pipelines.
  • Strong experience with monitoring, logging, and observability tools such as Splunk, GCP logging, and Dynatrace.
  • Ability to work independently and as part of a collaborative team, effectively communicating technical concepts to both technical and non-technical stakeholders.
  • Must have proven written and verbal communication skills, including presentations using tools like PowerPoint.
  • Must be able to tailor and abstract messaging for multiple target audiences, including senior business and IT leadership, technology teams, and business teams.
What We Offer:
  • Competitive compensation and comprehensive benefits.
  • Opportunity to work with a cutting-edge technology stack.
  • Collaborative and dynamic work environment.
  • Professional development and growth opportunities.

We are an equal opportunity employer and welcome applications from diverse candidates. Please note that we do not discriminate based on race, color, religion, sex, national origin, age, disability, or any other protected characteristic.



  • Charlotte, North Carolina, United States Brightspeed Full time

    Principal Site Reliability Engineer. We are currently seeking a skilled Principal Site Reliability Engineer to join our dynamic team at Brightspeed. As a Principal Site Reliability Engineer, you will play a key role in implementing and maintaining monitoring systems for business-critical systems and infrastructure. Your expertise will be crucial in ensuring...


  • Charlotte, North Carolina, United States Brightspeed Full time

    Job Summary: We are seeking a highly skilled Principal Site Reliability Engineer to lead our team in ensuring the reliability and scalability of our business-critical systems and infrastructure. Key Responsibilities: Design and implement monitoring systems to track performance and availability of critical systems and infrastructure. Develop and maintain scripts...


  • Charlotte, North Carolina, United States Brightspeed Full time

    Job Description: We are seeking a highly skilled Principal Site Reliability Engineer to join our team at Brightspeed. In this role, you will be responsible for implementing and maintaining monitoring systems to track the performance and availability of business-critical systems and infrastructure. Key Responsibilities: Implement and maintain monitoring systems...


  • Charlotte, North Carolina, United States Brightspeed Full time

    Transforming the Future of Internet Connectivity. At Brightspeed, we're redefining the way people live, work, and connect by delivering fast, reliable internet connections and exceptional customer experiences across the Midwest and South. Our vision is to accelerate the upgrade of copper to fiber optic technologies, bringing faster and more reliable internet...


  • Charlotte, North Carolina, United States Brightspeed Full time

    Job Description: Are you passionate about ensuring the reliability and scalability of business-critical systems and infrastructure? Do you have a proven track record of managing mission-critical customer-facing applications? If so, we have an exciting opportunity for you. Brightspeed is a leading company reimagining how people live, work, play, and connect by...


  • Charlotte, North Carolina, United States Brightspeed Full time

    Job Summary: Brightspeed is seeking a highly skilled Principal Site Reliability Engineer to join our team. As a key member of our engineering organization, you will play a critical role in designing and building a world-class fiber network and ensuring the reliability and scalability of our systems. Key Responsibilities: Implement and maintain monitoring systems...


  • Charlotte, North Carolina, United States Motion Recruitment Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Motion Recruitment. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our systems and applications. Key Responsibilities: Design, implement, and maintain scalable and highly...


  • Charlotte, North Carolina, United States City National Bank Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at City National Bank. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and maximum uptime of our systems in the Data Center or Cloud Platform. Key Responsibilities: Implement solutions that improve stability, security,...


  • Charlotte, North Carolina, United States Brightspeed Full time

    Lead Site Reliability Engineer. Brightspeed is on the lookout for an experienced Lead Site Reliability Engineer to enhance our innovative team. In this pivotal role, you will be responsible for establishing and sustaining monitoring frameworks for essential systems and infrastructure. Your proficiency will be vital in guaranteeing the efficiency and uptime of...


  • Charlotte, North Carolina, United States Synechron Full time

    Job Title: Site Reliability Engineer. Location: Charlotte, NC & Iselin, NJ. Full Time/Permanent. About the Role: We are seeking a seasoned Site Reliability Engineer to join our Trade Distribution System (TDS) software development team. As a Site Reliability Engineer, you will be responsible for advancing and enhancing reliability practices, with a strong focus on...


  • Charlotte, North Carolina, United States HDR Full time

    About HDR: At HDR, we're passionate about creating a better world through engineering, architecture, environmental, and construction services. Our multidisciplinary teams bring together experts in various fields to deliver innovative solutions that make a lasting impact. Job Summary: We're seeking a Principal Site Civil Engineer to join our South Atlantic...


  • Charlotte, North Carolina, United States SERC Reliability Corporation Full time

    Job Overview. About SERC: The electric grid plays a crucial role in our daily lives, underpinning the health, safety, and prosperity of our communities. SERC's mission is to enhance the reliability and security of the electric grid, ensuring its resilience for both present and future needs. To fulfill this mission, we cultivate a diverse team of specialists...


  • Charlotte, North Carolina, United States SERC Reliability Corporation Full time

    Job Overview. Company Mission: The electric grid is essential for our daily existence, underpinning the health, safety, and prosperity of our communities. It serves as the foundation for our economy and the advancements in society and technology. The mission of SERC is to mitigate risks to the reliability and security of the electric grid, ensuring its...


  • Charlotte, North Carolina, United States SERC Reliability Corporation Full time

    Job Overview. About SERC: The electric grid plays a crucial role in our daily lives, serving as the backbone for community health, safety, and economic growth. SERC's primary goal is to mitigate risks associated with the reliability and security of the electric grid, ensuring its stability for both present and future generations. To fulfill this mission, we have...


  • Charlotte, North Carolina, United States Wells Fargo Full time

    About this Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Wells Fargo. As a key member of our Application Support and SRE team, you will play a critical role in introducing and advancing SRE discipline across multiple applications and vertical lines of business. Key Responsibilities: Instantiate Site Reliability...


  • Charlotte, North Carolina, United States HDR Full time

    About HDR: At HDR, we specialize in engineering, architecture, environmental, and construction services. Our multidisciplinary teams include scientists, economists, builders, analysts, and artists, making us a unique and dynamic organization. Job Summary: We are seeking a Principal Site Civil Engineer to join our South Atlantic Building Engineering Services (BES)...


  • Charlotte, North Carolina, United States Jobot Full time

    Job Overview: Remote Senior Azure Site Reliability Engineer needed at a dynamic technology consulting firm. About Us: We are in search of a Senior Cloud SRE Engineer with a strong foundation in Azure Cloud to become a vital part of our fully remote workforce. In this position, you will play a key role in executing the comprehensive implementation of Site...


  • Charlotte, North Carolina, United States VDart Inc Full time

    Job Overview. Position Title: Site Reliability Engineer. Company: VDart Inc. Employment Type: Contract. Contract Duration: 6+ months. Position Summary: We are seeking a highly proficient and experienced Site Reliability Engineer (SRE) to join our team, focusing on delivering exceptional service for our clients in the insurance sector. The ideal candidate will have a...


  • Charlotte, North Carolina, United States Electric Power Research Institute (EPRI) Full time

    Fuel Reliability Principal Team Lead. EPRI is seeking a highly skilled Fuel Reliability Principal Team Lead to join our team. As a key member of our organization, you will be responsible for leading a team of experts in fuel reliability and performance. Key Responsibilities: Facilitate industry meetings and present at utility and member meetings, requiring...