No more applications are being accepted for this job

Site Reliability Engineer

4 weeks ago

Austin, United States JobRialto Full time

Description:

The Client Site Reliability team is responsible for the operations and infrastructure of all consumer-facing production systems and developer-facing systems at Client Games, including NBA Client game services, customer-facing account services, and websites. This team handles systems and services spanning multiple datacenters, both terrestrial and cloud-based.

What We Need:

We are looking for an expert engineer who is passionate about building multi-datacenter infrastructure and services. Robust systems and problem-solving skills are required as we develop solutions for game studios and support data centers around the world alongside a group of outstanding engineers. In this role, you will collaborate with network engineers, systems architects, and development staff to support our gamers and the needs of the business.

What You Will Do

Build and operate highly resilient systems in a multi-datacenter and cloud global environment serving game and consumer services

Develop tools for the management and automation of the systems and service infrastructure

Define and implement standards that will impact systems, services, and multiple software environments

Diagnose and resolve technical issues from both internal and external customers and drive improvements to prevent them from recurring

Participate in Site Reliability Engineering's on-call rotation

Who We Believe Will Be an Outstanding Fit

You are eager to work in a fast-paced environment with other highly skilled engineers who are passionate about service availability and health

You are moved by the idea of building data center infrastructure services from greenfield to implementation

Required Qualifications

6+ years of demonstrated influence across one or more teams on large-scale projects that drive impact and improvement across the organization

6+ years of experience in an SRE role for online services in a multi-region, multi-cloud environment with specific experience in reliability and resiliency

6+ years of developing tools for automating processes or augmenting off-the-shelf tool functionality

6+ years of AWS and/or GCP cloud experience running highly elastic, mission-critical workloads

6+ years of coding experience in at least one of Python, Ruby, Java, or Go, and a good understanding of code management

6+ years of experience using Infrastructure as Code tools like Terraform, Pulumi, or others

Extensive knowledge of software build, test, and deploy processes using Git, Jenkins, Puppet, Ansible, Docker/containers, and Kubernetes

Experience with system analysis and troubleshooting

Ability to mentor junior engineers and provide technical leadership to the organization

Bonus Points

Prior hands-on experience running large-scale multiplayer video games

Experience designing and crafting software for systems and network automation

Debugging, code optimization, and routine task automation skills

Demonstrated ability to decompose sophisticated problems. Ability to engage in lateral investigations.

Must Haves:

3 to 5 years of experience with Kubernetes, Datadog, cloud services, large-scale systems, AWS and GCP, and minor Azure exposure

GKE, EKS, home-built on-prem clusters, and AKS (very small)

Consistent upgrades across all the clusters and clouds

Education: Bachelor's degree


