Staff Site Reliability Engineer, Infrastructure

3 weeks ago


Columbus, United States Sunrun Full time

Everything we do at Sunrun is driven by a determination to transform the way we power our lives. We know that starts at the individual employee level. We strive to foster an environment you can thrive in through our commitment to diversity, inclusion and belonging.

A renewable energy revolution is beginning to blossom into the world's largest industrial transformation since the personal computer. The aging and vulnerable electrical grid is rapidly being supplemented and replaced by rooftop solar and rechargeable batteries. That evolution is about to explode as drivers flock to EVs and fuel those cars with electricity from home, leaving gas stations a relic of the past and spiking demand for sustainable, reliable, affordable electricity. Sunrun is in the driver's seat to lead this energy revolution as America's leading residential solar and renewable energy provider, and is leading the change by modernizing and re-imagining the future of home electrification. We are looking for skilled individuals to help us drive this transformation with an entrepreneurial and customer-first spirit, helping cement Sunrun as the leader of the revolution.

Objectives:

We are actively seeking an accomplished Staff Site Reliability Engineer (SRE) with exceptional leadership skills and a wealth of experience in infrastructure management to join our innovative team. As a pivotal, senior member of our SRE team, you will play a critical role in shaping the reliability, security, and performance of our software solutions.

Responsibilities:

Infrastructure Leadership
  • Provide strategic leadership in designing, implementing, and managing the overall infrastructure strategy for our organization.
Cloud Technologies
  • Leverage cloud platforms (e.g., AWS, Azure) to design, deploy, and manage scalable infrastructure solutions.
Define and Elevate Monitoring Standards
  • Spearhead the definition of advanced monitoring requirements and elevate SLAs.

  • Collaborate with the engineering team and TPM to implement and enhance monitoring practices.

Exceptional Communication Skills
  • Expertly convey intricate technical information to diverse stakeholders with clarity and precision.
Leadership in SRE Principles and System Design
  • Provide leadership in integrating advanced SRE principles into applications and services.

  • Lead the implementation of sophisticated system design measures for heightened security, performance, and resiliency.

Strategic Notification Strategies and Incident Response
  • Develop notification strategies for production outages.

  • Leverage SLOs and SLIs to measure and optimize availability, latency, and response time.

  • Lead and strategize emergency response efforts, conduct retrospectives with RCA, and manage on-call workloads effectively.

Holistic Production Environment Oversight
  • Oversee the holistic health of the production environment, emphasizing availability and proactive monitoring.

  • Drive advanced practices in application performance, capacity testing, and auto-scaling.

Innovative Support and Release Strategies
  • Spearhead innovative support and release strategies in collaboration with cross-functional teams.

  • Lead initiatives to elevate services through advanced testing and release procedures.

Exemplary Documentation and Automation Practices
  • Champion exemplary documentation practices for actions, findings, and automation procedures.

  • Identify and lead initiatives for advanced automation solutions.

Strategic Influence on Product Roadmap
  • Collaborate closely with engineering and product counterparts to strategically influence improved resiliency and reliability.

  • Identify and lead major projects for substantial enhancements in reliability, cost savings, and revenue.

Strategic Efficiency and Capacity Planning
  • Drive strategic efforts in efficiency and capacity planning.

  • Establish and communicate clear requirements while optimizing system resource usage.

Required Skills and Qualifications:

  • Minimum of 8 years of extensive SRE experience or equivalent.
Leadership Excellence and Communication Skills
  • Demonstrated excellence in leadership with a strong commitment to mentoring others.

  • Exceptional written and in-person communication skills.

Technical Mastery
  • Mastery in software systems engineering fundamentals.

  • Profound expertise in databases, version control, deployment logs, cloud technologies, scaling, architectural patterns, and APIs.

Unwavering Professionalism and Motivation
  • Unwavering commitment to the highest ethical and professional standards.

  • A demonstrated passion for continual learning and professional growth.

Advanced Technical Proficiency
  • Extensive experience with AWS and related technologies (e.g., Lambda, Kinesis, API Gateway, SQS).

  • Advanced knowledge of open-source solutions such as Grafana and Terraform.

  • Advanced proficiency in deployment strategies, CI/CD, and programming in Python, Java, Ruby, or JavaScript.

If you possess these advanced qualifications and are ready to take on a leadership role at the staff level within our esteemed SRE team, we encourage you to apply and contribute to the ongoing success of our cutting-edge technology solutions.

Please note that the compensation information that follows is a good faith estimate for this position only and is provided pursuant to acts such as the Equal Pay Transparency Act. It assumes that the successful candidate will be located in markets within the United States that warrant the compensation listed. Candidates in locations outside this local area may have a different starting salary range for this opportunity, which may be higher or lower. Please speak with your recruiter to learn more.

The starting salary/wage for this opportunity is: $151,09 to $194,266.

Other rewards may include annual bonus eligibility, which is based on company and individual performance, short- and long-term incentives, and program-specific awards. Sunrun provides a variety of benefits to employees, including health insurance coverage, an employee wellness program, life and disability insurance, a retirement savings plan, paid holidays, and paid time off (PTO). A candidate's salary history will not be used in compensation decisions.

Recruiter:

Tyrone Taylor (see below)

This description sets forth the general nature and level of the qualifications and duties required of employees in this job classification, as well as some of the essential functions of this role. It is not designed to be a comprehensive inventory of all essential duties and qualifications. If you have a disability or special need that may require reasonable accommodation in order to participate in the hiring process or to perform this role if you are offered employment, please let us know by contacting us at (see below).


