Senior Site Reliability Engineer

3 weeks ago


Los Angeles, United States Avesta Computer Services Full time

Title - Senior Site Reliability Engineer

Location - Los Angeles, California, United States OR Tempe, Arizona, United States - Hybrid

Type - Full-time Permanent


JOB DESCRIPTION

Our client stands as a beacon of innovation, crafting world-class, large-scale digital products that redefine the entertainment experience. We're on the lookout for visionary individuals to join our pioneering team, tasked with shaping the future of streaming products. Now is your chance to be part of creating and delivering extraordinary digital experiences spanning Sports and Entertainment. As a key member of our team, you'll drive innovation and contribute significantly to our mission of pioneering the next generation of streaming products. This is your opportunity to create unparalleled fan experiences for iconic sports events. Our current advanced digital solutions, accessed by millions across web, mobile, and living room devices, are just the start of our ambitious journey.


ABOUT THE ROLE

Our client is hiring a Principal SRE to build and operate the infrastructure and platforms that support our direct-to-consumer APIs for major live events such as the Super Bowl, World Cup, and World Series. The principal engineer will be the technical lead for solving thundering herd problems, which includes partnering with the application team to load test, scale up, and scale back down again, and will help design the platform and infrastructure to meet the application team's needs.

A collaborative, peacemaking mindset is a must, as is fostering a culture of learning and continuous improvement for the entire team. The principal engineer will also work with the Director, Platform Engineering to visualize workflows and refine processes and policies to keep the team's throughput high.


A SNAPSHOT OF YOUR RESPONSIBILITIES

  • Serve as technical lead for the implementation and operation of cloud-based infrastructure and platforms, including EKS and other AWS services, that support direct-to-consumer APIs, and solve the associated thundering herd problems, including load testing, scaling up, and scaling back down again
  • Work closely with Video & Player Engineering and third-party teams to help design and implement scalability, cost visibility, and observability in the platform
  • Help to mentor and train less senior members of the team
  • Assist with product/technology selection, including evaluating maturity and support, and designing and implementing POCs
  • Work with the Director, Site Reliability Engineering to foster a culture of learning and continuous improvement, and help to conceptualize and visualize workflows and processes
  • Perform post-incident analysis to identify root causes and potential workarounds/solutions
  • Be fluid and open to change and evolving processes and tools
  • Other duties as assigned


WHAT YOU WILL NEED

  • Expertise in EKS, Kubernetes, and AWS, including IAM, autoscaling, networking, and load balancing/request routing
  • Proven experience solving scalability problems, both scaling up and scaling down, including thundering herd scenarios
  • Expertise in troubleshooting and root cause analysis
  • Expertise in at least 2 programming languages
  • Strong analytical skills
  • Strong communication skills, both verbal and written
  • Proven experience with building deployment pipelines and enabling self-service
  • Strong teamwork and willingness to collaborate with others
  • Proven experience with training and mentoring engineers



  • Los Angeles, United States Dice Full time

    Dice is the leading career destination for tech experts at every stage of their careers. Our client, Motion Recruitment Partners, LLC, is seeking the following. Apply via Dice today! Job Description A Fortune 500 consulting company is looking for SREs with Subject Matter Expertise with Dynatrace. You'll design, install, and configure Dynatrace onto...


  • Los Angeles, United States Motion Recruitment Partners LLC Full time

    Senior Site Reliability Engineer / Dynatrace. Los Angeles, California (Open to Remote). Contract, $60/hr - $85/hr. Job Description: A Fortune 500 consulting company is looking for SREs with Subject Matter Expertise with Dynatrace. You'll design, install, and configure Dynatrace onto Kubernetes clusters for various enterprise clients. This role is remote with...


  • Los Angeles, United States X (formerly Twitter) Full time

    Are you prepared to join the X team and help build the ultimate real-time information-sharing app, revolutionizing how people connect? At X, we're on a mission to become a trusted global digital public square, committed to minimal censorship within legal boundaries. Our goal is to empower every user to freely create and share ideas, fostering open public...


  • Los Angeles, United States Sustainable Talent Full time

    Join the Sustainable Talent team, supporting NVIDIA as a Senior Site Reliability Engineer supporting the Infrastructure, Planning, and Process organization. This is a W-2 full-time contract based in Santa Clara, CA, with Hybrid work options. We offer competitive pay $75 - $90/hr based on factors like experience, education, location, etc. and provide full...


  • Los Angeles, United States Adastra replica Full time

    Job Description: Our client is looking for an experienced Site Reliability Engineer to design, operate, maintain, and scale mission-critical infrastructure and products. Products include (but are not limited to) automated Hardware-In-The-Loop (HITL) data analysis systems, vehicle configuration sign-off tools, continuous integration systems for...


  • Los Angeles, United States eTek IT Services, Inc. Full time

    Job Description Overview: The Site Reliability Engineer will play a crucial role in ensuring the reliability, scalability, and performance of our infrastructure and applications, ultimately contributing to the seamless operations of our systems. This role is vital in maintaining a high level of uptime and system efficiency, enhancing the overall...


  • Los Angeles, United States Drax Group Full time

    It's the people who make Drax a great place to work. If you’re interested in joining us on our journey to create a zero carbon, lower cost energy future, take a look at our current vacancies. Nothing suitable yet? You can create a job alert and be the first to know as soon as something suitable is posted. Reliability Engineer Full-time,...

  • Reliability Engineer

    2 months ago


    Los Angeles, United States Kindeva Drug Delivery Company Full time

    The Reliability Engineer will lead the site's Asset Reliability agenda, effectively promoting analytical problem-solving techniques and structured reliability improvement processes. We have an immediate opening for a Reliability Engineer at Kindeva’s Northridge, CA manufacturing facility. The Reliability Engineer will lead the site's Asset Reliability...


  • Los Angeles, United States Saxon Global Full time

    Looking for a highly motivated Site Reliability Engineer who is capable of building and running large-scale, massively distributed, fault-tolerant systems. The individual will work with teams across the organization, ensure the reliability of core services, and keep an eye on capacity and performance. This is for a migration from AWS into GCP. Knowledge and experience with...


  • Los Angeles, United States Avesta Computer Services Full time

    Job Title: Senior Site Reliability Engineer (DevOps) - (Live Streaming, Video, Media processing). Location: Tempe, Arizona / Los Angeles, California, United States. Type: Full-time. Job Description: Our client stands as a beacon of innovation, crafting world-class, large-scale digital products that redefine the entertainment experience. We're on the lookout for...


  • Los Angeles, United States forhyre.com Full time

    Job Description Forhyre is looking for engineers who can bring unique perspectives and innovative ideas to all areas of development and are interested in continuing to improve our platform through the ever-changing technology landscape. To be successful in this role: You'll have the opportunity to design and implement major infrastructure components,...


  • Uncapped Games

    1 month ago


    Los Angeles, United States LightSpeed Studios Full time

    Uncapped Games - Site Reliability Engineer. Remote type: Hybrid. Locations: US-Los Angeles, US-California-Remote. Time type: Full time. Posted 30+ days ago. Job requisition ID: R100321. Work Mode: Onsite. Responsibilities: Description: Seeking the opportunity to build a game from scratch and...


  • Los Angeles, United States Dice Full time

    Dice is the leading career destination for tech experts at every stage of their careers. Our client, Beacon Hill Staffing Group, is seeking the following. Apply via Dice today! Please make sure you read the following details carefully before making any applications. NOTE: This is a Hybrid position in Los Angeles Linux Site Reliability Engineer (SRE) If...


  • Los Angeles, California, United States Motion Recruitment Full time

    Job Description: A Fortune 500 consulting company is looking for SREs with Subject Matter Expertise with Dynatrace. You'll design, install, and configure Dynatrace onto Kubernetes clusters for various enterprise clients. This role is remote with travel to one of their offices once a month. Locations include the Bay Area and Dallas. Required Skills &...