Site Reliability Engineer

2 weeks ago


Newport Beach, United States TEKsystems Full time

Description:
As a Lead SRE, you will provide technical leadership, direction and accountability for platform engineering, system design and end-to-end implementation to meet and exceed the product or platform non-functional requirements, including quality, security, reliability, availability and performance. The main responsibilities include, but are not limited to, optimizing design and engineering for new systems and enhancements, including processes and day-to-day activities, to reliably support product rollout and operation in production. The role includes both oversight of production operations for our portfolio of systems and development/engineering of solutions to optimize system reliability and automation.

How you'll help move us forward:

  1. Lead the design, build and implementation of orchestration and tooling solutions to ensure that repetitive administration tasks are performed efficiently and free of defects.
  2. Establish best practices for structuring, automating, building, deploying and monitoring complex distributed software products and environments.
  3. Ensure the reliability and traceability of software releases and deployments of software and infrastructure changes.
  4. Create and maintain platform architecture and design specifications to aid development, testing and maintenance of software environments.
  5. Design and implement monitoring and recovery tools to provide for site high availability (HA) and disaster recovery (DR).
  6. Design and develop highly available infrastructure and platform components to meet the needs of our growing and evolving product lines.
  7. Design and implement security engineering best practices across all our deployed platforms and environments.
  8. Triage alerts, diagnose and resolve critical issues, and manage the implementation of changes.
  9. Manage the coordination, documentation, and tracking of critical incidents and the corresponding root cause analysis, ensuring rapid and complete issue resolution and closing the loop with customers and other key stakeholders.
  10. Collaborate with Delivery Engineers and DevExp Engineers to enhance and implement the continuous integration/continuous deployment (CI/CD) orchestration system to reduce friction in software delivery to production.
  11. Lead, grow and mentor other SRE team members.
  12. Evangelize the DevSecOps culture and SRE mindset, and mentor others about reliability and best practices.
  13. Identify and work with other engineering disciplines to implement opportunities for:
    1. Automation
    2. Signal-to-noise reduction
    3. Prevention of recurring issues
    4. Other actions to reduce time to mitigate service-impacting events and increase the productivity of cloud operations and development resources.
  14. Maintain a strong understanding of IaaS, PaaS, and SaaS offerings for building and maintaining a state-of-the-art, cloud-based environment for large-scale data processing.
  15. Design and implement processes, technology and automation for performance testing.
  16. Ensure that implementations and solutions are fully documented and deployed with fully operationalized processes to support the solution lifecycle.

Skills:
Microservices, Release Software, Docker, Cloud

Additional Skills & Qualifications:
The experience you bring:
  1. 10-15 years of experience in infrastructure, system engineering, software engineering.
  2. Advanced knowledge in software engineering in test, testing automation frameworks and tools for application and/or any-as-code (infrastructure, configuration, development tools such as documentation or diagram as code).
  3. Advanced knowledge in at least 3 of the following key areas: Cloud native and IaaS Architecture (performance testing, monitoring, operations), Design (compliance, security), Cloud Engineering (planning, provisioning), Container orchestration solutions.
  4. Strong understanding of business technology drivers and their impact on architecture design, performance and monitoring.
  5. Advanced knowledge of observability engineering, with hands-on experience implementing and integrating at least 2-3 monitoring and observability platforms such as AppDynamics, Dynatrace, Splunk, Grafana Cloud or cloud-based observability services in AWS or Azure.
  6. A systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
  7. Hands-on experience in designing, analyzing, scaling, and troubleshooting medium to large scale distributed systems.
  8. Practiced in and well-versed with SRE methodologies, and passionate about solving operational problems through automation and software engineering.
  9. Ability to communicate effectively vertically and horizontally within the organization about technical strategy in clear, concise, understandable terms appropriate to the audience's technical understanding and expertise.
  10. Demonstrated ability to conceptualize, launch and deliver multiple engineering projects on time and within budget.
  11. Demonstrated ability to understand and troubleshoot complex problems under pressure.

Experience Level:
Intermediate Level

Benefits:
Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to specific elections, plan, or program terms. If eligible, the benefits available for this temporary role may include the following:
  • Medical, dental & vision
  • Critical Illness, Accident, and Hospital
  • 401(k) Retirement Plan - Pre-tax and Roth post-tax contributions available
  • Life Insurance (Voluntary Life & AD&D for the employee and dependents)
  • Short and long-term disability
  • Health Spending Account (HSA)
  • Transportation benefits
  • Employee Assistance Program
  • Time Off/Leave (PTO, Vacation or Sick Leave)

About TEKsystems:
We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.

The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.

  • Newport Beach, California, United States Pacific Life Full time

    Job Description: At Pacific Life, we're committed to delivering exceptional service to our policyholders. To achieve this, we're seeking a talented Lead Site Reliability Engineer to join our Engineering Excellence team. As a key member of our team, you'll provide technical leadership and direction for platform engineering, system design, and end-to-end...


  • Newport Beach, United States LEDGENT Technology & Engineering - Roth Staffing Companies, L.P. Full time

    We are seeking a Lead Site Reliability Engineer for a 6+ month project in Newport Beach, CA. This position is onsite. LEAD SRE As a Lead SRE, you will provide technical leadership, direction, and accountability for platform engineering, system design, and end-to-end implementation to meet and exceed the product or platform non-functional requirements...


  • Newport Beach, United States LEDGENT Technology & Engineering - Roth Staffing Companies, L.P. Full time

    We are seeking a Lead Site Reliability Engineer for a 6+ month project in Newport Beach, CA. This position is onsite. LEAD SRE As a Lead SRE, you will provide technical leadership, direction, and accountability for platform engineering, system design, and end-to-end implementation to meet and exceed the product or platform non-functional requirements including...


  • Newport Beach, United States Pacific Life Full time

    Job Description: Pacific Life is seeking a talented Lead Site Reliability Engineer (SRE) to join our Engineering Excellence team in a hybrid work environment. As a key member of our team, you will be responsible for providing technical leadership, direction, and accountability for platform engineering, system design, and end-to-end implementation to meet and...


  • Virginia Beach, Virginia, United States ECS Limited Full time

    Job Title: Site Reliability Engineering Manager. ECS Limited is seeking a highly skilled Site Reliability Engineering Manager to join our team. As a Site Reliability Engineering Manager, you will be responsible for defining, implementing, and growing our SRE practice to ensure the reliability, availability, and performance of our critical production...


  • Virginia Beach, Virginia, United States ECS Limited Full time

    About the Role: ECS Limited is seeking a talented Site Reliability Engineering Manager to play a key role in defining, implementing, and growing our SRE practice to ensure the reliability, availability, and performance of our critical production environments. The successful candidate will have demonstrated hands-on experience designing, implementing, and...


  • Fort Walton Beach, Florida, United States Booz Allen Hamilton Full time

    Job Overview: We are seeking a highly skilled Reliability Engineer, Lead to join our team at Booz Allen Hamilton. As a key member of our engineering team, you will be responsible for ensuring the reliability and availability of our clients' systems. Key Responsibilities: Conduct reliability analyses, including root cause and corrective action, Weibull, life...


  • Long Beach, California, United States Safran Full time

    About the Role: We are seeking a highly skilled Reliability and Maintainability Engineer to join our team at Safran. As a key member of our Engineering Department, you will play a pivotal role in reviewing and analyzing system and equipment design for compliance to reliability and maintainability requirements. Key Responsibilities: Review and analyze system and...


  • Palm Beach Gardens, United States NextEra Energy Full time

    Job Overview: NextEra Energy Resources is seeking a highly motivated and energetic engineer to join our PGD Wind Reliability team located at our PGA Office in Palm Beach Gardens, Florida. As a reliability engineer, you will join a team of engineers and technical specialists and collaborate with cross-functional teams to participate in projects and...


  • Palm Beach Gardens, United States NextEra Energy Full time

    Requisition ID: 82704. NextEra Energy Resources is the world's largest generator of renewable energy from the wind and sun, and a world leader in battery storage. We provide energy-related products and services that grow our economy, protect the environment, support our communities and help customers meet their energy needs. We are leading...


  • Craig Beach, Ohio, United States Grundy County Engineer Full time

    Job Opportunity: We are seeking a highly skilled and reliable individual to fill the role of Utility Person/Truck Driver for the Grundy County Engineer's Office. This position requires a strong mechanical aptitude and the ability to operate a Motor Grader out of Conrad or a Truck Driver out of Grundy Center, depending on the season. Responsibilities include: •...

  • Reliability Engineer

    4 weeks ago


    Palm Beach Gardens, Florida, United States NextEra Energy, Inc. Full time

    Job Overview: NextEra Energy Resources is seeking a highly motivated and energetic engineer to join our PGD Wind Reliability team. As a reliability engineer, you will collaborate with cross-functional teams to optimize wind turbine performance, increase reliability, and reduce costs. The ideal candidate will be able to operate independently, creatively solve...


  • Long Beach, California, United States Relativity Space Full time

    About the Role: As a Mission Reliability Engineer at Relativity Space, you will be responsible for ensuring the reliability of our rockets through rigorous engineering analysis and decision-making. Working across all engineering disciplines, you will identify technical deficiencies and implement solutions to mitigate excess risk incurred during design, build,...


  • Long Beach, CA, United States Relativity Space Full time

    About the Team: At Relativity Space, our Integrated Performance teams are dedicated to ensuring that our products work seamlessly across all systems and disciplines. From trajectory design to aerodynamics, reliability analysis, and beyond, our teams work together to size the Terran R rocket, design its missions for customer success, reliability, and...

  • Reliability Engineer

    1 month ago


    Round Lake Beach, Illinois, United States Sterling Engineering Full time

    Job Title: Reliability Engineer - Process Optimization. Job Summary: The Reliability Engineer is responsible for ensuring the reliability and continuous improvement of manufacturing processes within a medical device manufacturing environment. This role involves analyzing current processes, implementing changes to optimize efficiency, and maintaining compliance...


  • Long Beach, CA, United States Relativity Space Full time

    About the Role: As a Mission Reliability Engineer at Relativity Space, you will be responsible for ensuring the reliability of our launch vehicles. This involves working across all engineering disciplines to identify and mitigate technical risks, drive architectural decision-making, and implement solutions to improve vehicle reliability. Key responsibilities...

  • Civil Engineering

    2 weeks ago


    Newport Beach, United States Velocity Search Group Full time

    Project Engineer - Architecture Firm. General Summary: Our client, a leading engineering and design firm, is seeking a highly skilled Project Engineer to join their team. The Project Engineer will report to a Project Manager or Senior Project Manager and will be responsible for overseeing day-to-day project activities, ensuring quality control and project...

  • Site Civil Engineer

    1 month ago


    Virginia Beach, United States Insight Global Full time

    Insight Global is seeking a motivated engineer to join an infrastructure team as a production-focused contributor on project-based work. The ideal candidate is eager to grow both personally and professionally, taking ownership of portions of projects, demonstrating resourcefulness, and actively incorporating feedback. Day-to-day work will be 95% design...

  • Site Civil Engineer

    3 weeks ago


    Virginia Beach, United States Insight Global Full time

    Insight Global is seeking a motivated engineer to join an infrastructure team as a production-focused contributor on project-based work. The ideal candidate is eager to grow both personally and professionally, taking ownership of portions of projects, demonstrating resourcefulness, and actively incorporating feedback. Day-to-day work will be 95% design...
