Principal Site Reliability Engineer

3 weeks ago


Austin, United States SecurityScorecard Full time

Base pay range: $220,000.00/yr - $290,000.00/yr
This range is provided by SecurityScorecard. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more.

About SecurityScorecard:
SecurityScorecard is the global leader in cybersecurity ratings, with over 12 million companies continuously rated, operating in 64 countries. Founded in 2013 by security and risk experts Dr. Alex Yampolskiy and Sam Kassoumeh and funded by world-class investors, SecurityScorecard’s patented rating technology is used by over 25,000 organizations for self-monitoring, third-party risk management, board reporting, and cyber insurance underwriting, making all organizations more resilient by allowing them to easily find and fix cybersecurity risks across their digital footprint.

We are headquartered in New York City, and our culture has been recognized by Inc Magazine as a “Best Workplace,” by Crain’s NY as one of the “Best Places to Work in NYC,” and as one of the 10 hottest SaaS startups in New York for two years in a row. Most recently, SecurityScorecard was named to Fast Company’s annual list of the World’s Most Innovative Companies for 2023 and to the Achievers 50 Most Engaged Workplaces in 2023 award recognizing “forward-thinking employers for their unwavering commitment to employee engagement.” SecurityScorecard is proud to be funded by world-class investors including Silver Lake Waterman, Moody’s, Sequoia Capital, GV and Riverwood Capital.

Role Overview
As a Principal Site Reliability Engineer, you will play a strategic and technical leadership role in shaping the reliability, scalability, and velocity of our engineering platform. Your primary focus will be advancing our Kubernetes-based infrastructure and CI/CD systems to support high-scale, high-availability services.
You will partner with engineering leaders across the organization to define and drive platform-wide initiatives that enable fast, safe, and repeatable deployments, and foster a culture of reliability and operational excellence.

Key Responsibilities
  • Lead the design and evolution of Kubernetes-based infrastructure to support multi-tenant, high-scale applications with strong isolation, resilience, and security.
  • Architect and optimize CI/CD pipelines to support fast and reliable build, test, and deploy cycles across a polyglot environment.
  • Establish and evangelize best practices for GitOps, canary deployments, rollback strategies, and progressive delivery.
  • Define and implement scalable Infrastructure as Code (IaC) patterns using tools such as Terraform, Helm, and Crossplane.
  • Drive the adoption of automated testing throughout the delivery lifecycle (unit, integration, load, and chaos testing) to ensure high confidence in production changes.
  • Guide teams in designing for observability, SLOs, and alerting, ensuring actionable signals and minimizing alert fatigue.
  • Partner with security, compliance, and development teams to ensure infrastructure and delivery systems meet modern security and governance standards.
  • Lead incident response retrospectives and foster a blameless culture of continuous improvement.
  • Mentor and influence senior engineers across multiple teams, helping to up-level platform reliability capabilities organization-wide.

Qualifications
  • 8+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure roles, with 2+ years in a technical leadership or principal capacity.
  • Deep expertise with Kubernetes internals (controllers, networking, autoscaling, operators, etc.) and production-grade clusters on cloud providers (EKS, GKE, or AKS).
  • Proven experience designing and scaling CI/CD systems using tools such as GitHub Actions, Argo CD, Tekton, Spinnaker, or similar.
  • Strong proficiency in Terraform and modern IaC practices.
  • Advanced knowledge of automated testing strategies, including performance, load, and failure testing.
  • Proficient in one or more programming/scripting languages (Python, Go, Bash, etc.).
  • Deep experience with monitoring and observability stacks such as Prometheus, Grafana, OpenTelemetry, and Datadog.
  • Strong communicator with the ability to align technical initiatives to business objectives and influence across engineering teams.

Nice-to-Have
  • Exposure to chaos engineering and building resilient distributed systems.
  • Familiarity with compliance frameworks (SOC 2, HIPAA, etc.) as they relate to infrastructure and deployment.
  • Contributions to open-source Kubernetes tooling or SRE frameworks.
  • Familiarity with JVM- or Node-based application stacks.

Specific to each country, we offer a competitive salary, stock options, health benefits, unlimited PTO, parental leave, tuition reimbursements, and much more.

The estimated total compensation range for this position is $220,000 - $290,000 (base plus bonus). Actual compensation for the position is based on a variety of factors, including, but not limited to, affordability, skills, qualifications and experience, and may vary from the range. In addition to base salary, employees may also be eligible for annual performance-based incentive compensation awards and equity, among other company benefits.

SecurityScorecard is committed to Equal Employment Opportunity and embraces diversity. We believe that our team is strengthened through hiring and retaining employees with diverse backgrounds, skill sets, ideas, and perspectives.
We make hiring decisions based on merit and do not discriminate based on race, color, religion, national origin, sex or gender (including pregnancy), gender identity or expression (including transgender status), sexual orientation, age, marital, veteran, disability status or any other protected category in accordance with applicable law.

We also consider qualified applicants regardless of criminal histories, in accordance with applicable law. We are committed to providing reasonable accommodations for qualified individuals with disabilities in our job application procedures. If you need assistance or accommodation due to a disability, please contact talentacquisitionoperations@securityscorecard.io.

Any information you submit to SecurityScorecard as part of your application will be processed in accordance with the Company’s privacy policy and applicable law.

SecurityScorecard does not accept unsolicited resumes from employment agencies. Please note that we do not provide immigration sponsorship for this position.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Data Security Software Products, Computer and Network Security, and Software Development
Benefits (inferred from the description for this job): Medical insurance, Vision insurance, 401(k), Paid maternity leave, Paid paternity leave, Child care support, Tuition assistance, Disability insurance



  • Austin, United States Cafell Technologies Full time

    Senior Manager - Recruitment / Client Relations. Role: Site Reliability Engineer (SRE) – Onsite. Experience: 7 to 9 years. Job Description: As a Cloud Infrastructure Site Reliability Engineer (SRE) with expertise in multiple public cloud service provider platforms, you will be responsible for operating infrastructure solutions, following the principles and...


  • Austin, Texas, United States Cafell Technologies Full time $80,000 - $150,000 per year

    Dear Applicant, Role: Site Reliability Engineer (SRE) - Onsite. Location: Columbus OH / Austin / Charlotte NC (Full Time). Visa Type: USC/GC preferred. H1B/H4 EAD accepted. Experience: 7 to 9 yrs. Job Description: Position Summary: As a Cloud Infrastructure Site Reliability Engineer (SRE) with expertise in multiple public cloud service provider platforms, you will be...


  • Austin, United States Dell GmbH Full time

    Principal Mechanical Reliability Engineer. Mechanical Engineering leads and delivers the development of innovative and compliant mechanical design solutions, as well as cross-functional interfaces for desktop, portable and server computer systems and peripherals. Our team conducts the analysis, feasibility studies and testing of mechanical products,...


  • Austin, United States Interactive Resources - iR Full time

    Our client is seeking a highly motivated and skilled Site Reliability Engineer (SRE) to join their Advisor Platform Engineering team. This critical position focuses on maintaining the availability, performance, and scalability of a mission-critical Azure-hosted platform serving thousands of...


  • Austin, TX, United States Dell Technologies Full time

    Principal Electrical Reliability Engineer. Our Electrical Engineering team puts the spark into the full hardware development lifecycle, from concept to production. It takes experts in system architecture definition, design, analysis, prototyping, sourcing & the debugging and validation of layouts or routes to deliver state-of-the-art products for a changing...


  • Austin, United States Dell Full time

    Principal Mechanical Reliability Engineer. Mechanical Engineering leads and delivers the development of innovative and compliant mechanical design solutions, as well as cross-functional interfaces for desktop, portable and server computer systems and peripherals. Our team conducts the analysis, feasibility studies and testing of mechanical products,...


  • Austin, United States Dell Full time

    Principal Electrical Reliability Engineer. Our Electrical Engineering team puts the spark into the full hardware development lifecycle, from concept to production. It takes experts in system architecture definition, design, analysis, prototyping, sourcing & the debugging and validation of layouts or routes to deliver state-of-the-art products for a changing...


  • Austin, United States Paradromics, Inc. Full time

    Site Reliability Engineer. About Paradromics: Brain-related illness is one of the last great frontiers in medicine, not because the brain is unknowable, but because it has been inaccessible. Paradromics is building a brain-computer interface (BCI) platform that records brain activity at the highest possible resolution: the individual neuron. AI algorithms then...