Senior Site Reliability Engineer

2 weeks ago


Los Angeles, United States Avesta Computer Services Full time

Job Title: Senior Site Reliability Engineer (DevOps) - (Live Streaming, Video, Media Processing)

Location: Tempe, Arizona / Los Angeles, California, United States

Type: Full-time


Job Description:

Our client stands as a beacon of innovation, crafting world-class, large-scale digital products that redefine the entertainment experience. We're on the lookout for visionary individuals to join our pioneering team, tasked with shaping the future of streaming products. Now is your chance to be part of creating and delivering extraordinary digital experiences spanning Sports and Entertainment. As a key member of our team, you'll drive innovation and significantly contribute to our mission of pioneering the next generation of streaming products. Your opportunity to create unparalleled fan experiences for iconic sports events is here. Our current advanced digital solutions, accessed by millions across web, mobile, and living room devices, signify just the start of our ambitious journey.


About The Role:

Our client is hiring a Principal SRE to build and operate the infrastructure and platforms that support our live direct-to-consumer APIs for major live events such as the Super Bowl, World Cup, and World Series. The principal engineer will be the technical lead for solving thundering herd problems, which includes partnering with the application team to load test, scale up, and scale back down again, and helping design the platform and infrastructure to meet their needs.

A collaborative, peacemaker mindset is a must, as is fostering a culture of learning and continuous improvement for the entire team. The principal engineer will also work with the Director, Platform Engineering to visualize workflows and refine processes and policies to keep team throughput high.


A Snapshot of Your Responsibilities:

  • Serve as technical lead for the implementation and operation of cloud-based infrastructure and platforms, including EKS and other AWS services, that support direct-to-consumer APIs, and solve the associated thundering herd problems through load testing, scaling up, and scaling back down again.
  • Work closely with Video & Player Engineering and third-party teams to help design and implement scalability, cost visibility, and observability in the platform.
  • Help to mentor and train less senior members of the team.
  • Assist with product/technology selection, including evaluating maturity and support, and designing and implementing POCs.
  • Work with the Director, Site Reliability Engineering to foster a culture of learning and continuous improvement, and help to conceptualize and visualize workflows and processes.
  • Perform post-incident analysis to identify root causes and potential workarounds/solutions.
  • Be fluid and open to change and evolving processes and tools.
  • Other duties as assigned.


What You Will Need:

  • Expert with EKS, Kubernetes, and AWS, including IAM, auto scaling, networking, and load balancing/request routing.
  • Proven experience solving scalability problems, both scaling up and scaling down, including thundering herd scenarios.
  • Expert with troubleshooting and root cause analysis.
  • Expert with at least two programming languages.
  • Strong analytical skills.
  • Strong communication skills, both verbal and written.
  • Proven experience with building deployment pipelines and enabling self-service.
  • Strong teamwork and willingness to collaborate with others.
  • Proven experience with training and mentoring engineers.


Nice To Have, But Not a Deal Breaker:

  • BS or equivalent
  • AWS Solutions Architect Professional certification

