Principal Site Reliability Engineer

2 days ago


Charlotte, North Carolina, United States Brightspeed Full time
Transforming the Future of Internet Connectivity

At Brightspeed, we're redefining the way people live, work, and connect by delivering fast, reliable internet connections and exceptional customer experiences across the Midwest and South. Our vision is to accelerate the upgrade of copper to fiber optic technologies, bringing faster and more reliable internet service to underserved rural markets, while delivering best-in-class customer experiences.

We're seeking a Principal Site Reliability Engineer to join our growing team. In this role, you'll implement and maintain monitoring systems to track the performance and availability of business-critical systems and infrastructure, using metrics to identify trends and potential issues. You'll work closely with development teams, operations, and other stakeholders to ensure that new services and features are reliable and scalable.

Key Responsibilities:

  • Design and implement monitoring systems to track system performance and availability
  • Respond to system outages and performance issues, performing root cause analysis to prevent recurrence
  • Develop scripts and tools to automate repetitive tasks, such as deployment, scaling, and monitoring
  • Collaborate with development teams, operations, and stakeholders to ensure new services and features are reliable and scalable
  • Work on reducing latency and improving data transmission speed across the network
  • Define and measure Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to ensure services meet performance and availability targets
  • Conduct postmortems after incidents to identify areas for improvement
  • Lead a team of site reliability engineers, mentoring them on support activities required for system reliability

Requirements:

  • Master's degree in computer science, telecommunications, or a related field, with a minimum of 10 years of software engineering experience, including 5 years as a site reliability engineer
  • Proven track record of managing mission-critical customer-facing applications for reliability
  • 5+ years of experience operating and maintaining cloud-native applications in production that are fault-tolerant, self-healing, scalable, and highly available
  • Excellent troubleshooting and problem-solving skills, with keen attention to detail, to identify and resolve complex production issues
  • Deep understanding of cloud computing platforms (GCP) and containerization technologies (e.g., Docker, Kubernetes)
  • Solid experience with core Kubernetes concepts, such as Pods, Workloads, Services, Ingress/Egress, Deployments, ConfigMaps, HPA, liveness probes, and Secrets
  • Strong knowledge of infrastructure as code tools (e.g., Terraform, Ansible, ArgoCD) and CI/CD pipelines
  • Strong experience integrating code quality tools (such as SonarQube or Checkmarx) into CI/CD pipelines
  • Strong experience with monitoring, logging, and observability tools such as Splunk, GCP logging, and Dynatrace
  • Ability to work independently and as part of a collaborative team, effectively communicating technical concepts to both technical and non-technical stakeholders
  • Proven written and verbal communication skills, including presentations using tools like PowerPoint

Why Join Us?

We prioritize hiring talent in the Charlotte area, whenever possible, to make it a vibrant destination for our hybrid workforce. At Brightspeed, we have roles that are designated as remote, hybrid, office, or field-based, depending on the position, business needs, and individual circumstances. We invest in technology that enables our entire team to stay connected. We offer competitive compensation and comprehensive benefits, reflecting our underlying belief in promoting overall wellness through physical, emotional, and financial health. Brightspeed is an Equal Opportunity Employer/Veterans/Disabled.
