Principal Site Reliability Engineer
3 weeks ago
Company Description

At Brightspeed, we are reimagining how people live, work, play and connect by providing fast, reliable internet connections and an awesome customer experience in twenty states throughout the Midwest and South. Backed by funds managed by Apollo Global Management, our vision is to accelerate the upgrade of copper to fiber optic technologies, bringing faster and more reliable internet service to many rural markets traditionally underserved by broadband providers, while delivering best-in-class customer experience.

Be a part of the team that will make this vision a reality: designing and building a world-class fiber network and creating a customer experience second to none. Check us out on the web.

Job Description

We are currently looking for a Principal Site Reliability Engineer to join our growing team. In this role, you will implement and maintain monitoring systems to track the performance and availability of business-critical systems and infrastructure, using metrics to identify trends and potential issues. You will also work closely with development teams, operations, and other stakeholders to ensure that new services and features are reliable and scalable.

As a Principal Site Reliability Engineer, your duties and responsibilities will include:
Implement and maintain monitoring systems to track the performance and availability of business-critical systems and infrastructure.
Use metrics to identify trends and potential issues.
Respond to system outages and performance issues, performing root cause analysis to prevent recurrence.
Develop scripts and tools to automate repetitive tasks, such as deployment, scaling, and monitoring.
Work closely with development teams, operations, and other stakeholders to ensure that new services and features are reliable and scalable.
Work on reducing latency and improving the speed of data transmission across the network.
Define and measure Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to ensure services meet required performance and availability targets.
Conduct postmortems after incidents to identify what went wrong and what can be improved.
Work with Lead Application owners and internal Change Management to review code changes and support deployments.
Lead the onshore/offshore team of site reliability engineers, mentoring them in the support activities required for system reliability.
Tailor and communicate messaging for multiple target audiences, including senior business and IT leadership, technology, and business teams.

Qualifications

WHAT IT TAKES TO CATCH OUR EYE:
Master's degree in computer science, telecommunications, or similar areas, with a minimum of 10 years of software engineering experience, including a minimum of 5 years as a site reliability engineer.
Proven track record of managing mission-critical customer-facing applications for reliability.
5+ years of experience supporting operations and maintenance for cloud-native applications in production that are fault-tolerant, self-healing, scalable, and highly available.
Excellent troubleshooting and problem-solving skills, with keen attention to detail to identify and resolve complex production issues.
Deep understanding of cloud computing platforms (GCP) and containerization technologies (e.g., Docker, Kubernetes).
Solid experience with core Kubernetes concepts such as Pods, Workloads, Services, Ingress/Egress, Deployments, ConfigMaps, HPA, Liveness Probes, and Secrets.
Strong knowledge of infrastructure as code tools (e.g., Terraform, Ansible, ArgoCD) and CI/CD pipelines.
Strong experience integrating code quality tools (SonarQube or Checkmarx) with CI/CD pipelines.
Strong experience with monitoring, logging, and observability tools like Splunk, GCP log, Dynatrace, etc.
Ability to work independently and as part of a collaborative team, effectively communicating technical concepts to both technical and non-technical stakeholders.
Proven written and verbal communication skills, including presentations using tools like PowerPoint.

BONUS POINTS FOR:
Certifications such as Google Professional Cloud DevOps Engineer or AWS Certified DevOps Engineer.

Additional Information

WHY JOIN US?

We aspire to contemporary ways of working. Recognized as a Top Workplace by the Charlotte Observer, Brightspeed HQ is located on the 7th floor of the new Vantage South End - East Tower in Charlotte, NC. We prioritize hiring talent in the Charlotte area, whenever possible, to make it a truly vibrant destination for our hybrid workforce. At Brightspeed, we have roles that are designated as remote, hybrid, office, or field-based, depending on the position, business needs, and individual circumstances. We also invest in technology that enables our entire team to stay connected.

We offer competitive compensation and comprehensive benefits. Our benefits and paid time off programs reflect our underlying belief in promoting overall wellness through physical, emotional, and financial health. Brightspeed offers a comprehensive benefit program, including competitive medical, dental, vision, and life insurance; an employee assistance program; a 401(k) plan with company match; and a host of voluntary benefits.

Diversity, equity, and inclusion are at the center of our grounding belief in Being Real. When we bring our authentic selves to work, everyone is better as a result. A diverse team helps us be fierce advocates for more accessible, inclusive, and high-quality internet, because we believe doing so promotes equity in the communities we serve.

Brightspeed is an Equal Opportunity Employer/Veterans/Disabled.

For all applicants, please take a moment to review our Privacy Notices: Brightspeed's Privacy Notice for California Residents. Brightspeed's Privacy Notice.
-
Principal Site Reliability Engineer
1 week ago
Charlotte, North Carolina, United States Brightspeed Full time
We are currently seeking a skilled Principal Site Reliability Engineer to join our dynamic team at Brightspeed. As a Principal Site Reliability Engineer, you will play a key role in implementing and maintaining monitoring systems for business-critical systems and infrastructure. Your expertise will be crucial in ensuring...
-
Principal Site Reliability Engineer
2 months ago
Charlotte, United States Brightspeed Full time
Job Description: We are currently looking for a Principal Site Reliability Engineer to join our growing team. In this role, you will implement and maintain monitoring systems to track the performance and availability of business-critical systems and infrastructure using metrics to identify trends and potential issues. You will also work closely with...
-
Principal Site Reliability Engineer
2 months ago
Charlotte, United States Brightspeed Full time
Company Description: At Brightspeed, we are reimagining how people live, work, play and connect by providing fast, reliable internet connections and an awesome customer experience in twenty states throughout the Midwest and South. Backed by funds managed by Apollo Global Management, our vision is to accelerate the upgrade of...
-
Principal Site Reliability Engineer
1 week ago
Charlotte, North Carolina, United States Brightspeed Full time
Job Description: Are you passionate about ensuring the reliability and scalability of business-critical systems and infrastructure? Do you have a proven track record of managing mission-critical customer-facing applications? If so, we have an exciting opportunity for you. Brightspeed is a leading company reimagining how people live, work, play, and connect by...
-
Site Reliability Engineer
1 month ago
Charlotte, United States VDart Inc Full time
Job Title: Site Reliability Engineer
Location: Charlotte downtown, NC (Hybrid)
Job Type: Contract
Duration: 6+ months
Job Description: Highly skilled and experienced SWAT Site Reliability Engineer (SRE) for a US-based insurance customer. The candidate should possess a strong foundation in full-stack technologies and an...
-
Lead Site Reliability Engineer
2 weeks ago
Charlotte, North Carolina, United States Brightspeed Full time
Brightspeed is on the lookout for an experienced Lead Site Reliability Engineer to enhance our innovative team. In this pivotal role, you will be responsible for establishing and sustaining monitoring frameworks for essential systems and infrastructure. Your proficiency will be vital in guaranteeing the efficiency and uptime of...
-
Senior Engineer for Reliability Assessment
2 weeks ago
Charlotte, North Carolina, United States SERC Reliability Corporation Full time
Job Overview
About SERC: The electric grid plays a crucial role in our daily lives, underpinning the health, safety, and prosperity of our communities. SERC's mission is to enhance the reliability and security of the electric grid, ensuring its resilience for both present and future needs. To fulfill this mission, we cultivate a diverse team of specialists...
-
Software Engineer III- Site Reliability
2 weeks ago
Charlotte, United States Truist Financial Corporation Full time
Truist is seeking a Software Engineer III to focus on Site Reliability work for Credit Card, reporting directly to our Credit Card Production Support Group Head. This individual will work as part of a team of Monitoring Engineers to enhance the obser...
-
Senior Reliability Assessment Engineer
1 week ago
Charlotte, North Carolina, United States SERC Reliability Corporation Full time
Job Overview
COMPANY MISSION: The electric grid is essential for our daily existence, underpinning the health, safety, and prosperity of our communities. It serves as the foundation for our economy and the advancements in society and technology. The mission of SERC is to mitigate risks to the reliability and security of the electric grid, ensuring its...
-
Senior Engineer for Reliability Assessment
2 weeks ago
Charlotte, North Carolina, United States SERC Reliability Corporation Full time
Job Overview
ABOUT SERC: The electric grid plays a crucial role in our daily lives, serving as the backbone for community health, safety, and economic growth. SERC's primary goal is to mitigate risks associated with the reliability and security of the electric grid, ensuring its stability for both present and future generations. To fulfill this mission, we have...
-
Senior Site Reliability Engineer
3 weeks ago
Charlotte, United States CEDENT Full time
Role: Senior Site Reliability Engineer (DevOps)
Location: Charlotte, NC
Terms of Hire: Full time
Salary: Open
Position Purpose:
Position Responsibilities: Leverage various technologies to improve feedback and delivery by deploying containerized systems and services in support of operational...
-
Platform/Site Reliability Engineer
3 months ago
Charlotte, United States Syntricate Technologies Full time
Platform/Site Reliability Engineer, 6 Months Contract to Hire, Charlotte, NC
JOB DESCRIPTION: We're looking for a Senior Platform Engineer to come help us automate everything, enable our developer teammates, and create and support world-class platforms. As a Senior Platform Engineer, you will be an integral member of the Platform Engineering team, helping the...
-
Senior Reliability Engineer, RAPA
3 months ago
Charlotte, United States SERC Reliability Corporation Full time
Job Description
SERC OVERVIEW: The electric grid is vital to our everyday lives. It is fundamental for the health, safety, and well-being of our communities, and provides the platform for our economy and our societal and technological advances. SERC's mission is to reduce risks to the reliability and security of the electric grid (also known...
-
Site Reliability Engineering
3 weeks ago
Charlotte, United States The Hartford Full time
You are a driven and motivated problem solver ready to pursue meaningful work. You strive to make an impact every day, not only at work, but in your personal life and community too. If that sounds like you, then you've landed in the right place. "IT Engineer - focused on Site Reliability Engineering" The Hartford's Corporate IT team is seeking a highly...
-
Senior Site Reliability Engineer
4 days ago
Charlotte, United States Albano Systems, Inc. Full time
W2 ONLY - NO 3RD PARTIES - NO CORP TO CORP
We are seeking an experienced and highly motivated Performance & Observability Engineer who will be responsible for assessing, strategizing and building solutions for complex application and infrastructure observability needs to ensure production stability and visibility. Position will be responsible for driving...
-
Senior Azure Site Reliability Engineer
2 weeks ago
Charlotte, North Carolina, United States Jobot Full time
Job Overview
Remote Senior Azure Site Reliability Engineer needed at a dynamic technology consulting firm.
About Us: We are in search of a Senior Cloud SRE Engineer with a strong foundation in Azure Cloud to become a vital part of our fully remote workforce. In this position, you will play a key role in executing the comprehensive implementation of Site...
-
Reliability Operations Engineer
2 weeks ago
Charlotte, North Carolina, United States VDart Inc Full time
Job Overview
Position Title: Site Reliability Engineer
Company: VDart Inc
Employment Type: Contract
Contract Duration: 6+ months
Position Summary: We are seeking a highly proficient and experienced Site Reliability Engineer (SRE) to join our team, focusing on delivering exceptional service for our clients in the insurance sector. The ideal candidate will have a...
-
Charlotte, United States Smart IT Frame LLC Full time
Experience Level: 13 years
Note: Only locals in NC
Required Qualifications:
5+ years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
3+ years working with Google Cloud Platform (GCP) and a proven track record of building complex infrastructure...
-
Principal Civil Engineer
2 weeks ago
Charlotte, North Carolina, United States Apollo Technical Full time
Apollo Technical is seeking a dedicated Structural Engineer to join our team. This is a full-time, direct hire opportunity.
Position Overview: The Principal Structural Engineer will take charge of the design process for structural frameworks supporting both new constructions and upgrades. Collaborating within a multidisciplinary design team, the candidate will...
-
Staff Reliability Engineer
1 week ago
Charlotte, United States Cynet Systems Full time
Job Description
Responsibilities: Configure, maintain our applications and infrastructure's observability capabilities in partnership with SRE, AIOps and Assess code, conf, or infra changes readiness for production. Monitor and develop SLOs and SLIs through customer user journey; Advise on SLA; Establish error budgets. Strategize, analyze and tune...