Site Reliability Engineer
4 weeks ago
The Client Site Reliability team is responsible for the operations and infrastructure of all consumer-facing production systems and developer-facing systems at Client Games, including NBA Client game services, customer-facing account services, and websites. This team handles systems and services spanning multiple datacenters both terrestrial and cloud-based.
What We Need:
We are looking for an expert engineer who is passionate about building multi-datacenter infrastructure and services. Robust systems and problem-solving skills are required as we develop solutions for game studios and support data centers around the world alongside a group of outstanding engineers. In this role, you will collaborate with network engineers, systems architects, and development staff to support our gamers and the needs of the business.
What You Will Do
Build and operate highly resilient systems in a global multi-datacenter and cloud environment serving game and consumer services
Develop tools for the management and automation of the systems and service infrastructure
Define and implement standards that will impact systems, services, and multiple software environments
Diagnose and resolve technical issues from both internal and external customers and drive improvements to prevent them from recurring
Participate in Site Reliability Engineering's on-call rotation
Who We Believe Will Be an Outstanding Fit
You are eager to work in a fast-paced environment with other highly skilled engineers who are passionate about service availability and health
You are moved by the idea of building data center infrastructure services from greenfield to implementation
Required Qualifications
6+ years of demonstrated influence across one or more teams on large-scale projects that drive impact and improvement across the organization
6+ years of experience in an SRE role for online services in a multi-region, multi-cloud environment with specific experience in reliability and resiliency
6+ years of developing tools for automating processes or augmenting off-the-shelf tool functionality
6+ years of AWS and/or GCP cloud experience running highly elastic, mission-critical workloads
6+ years of coding experience in at least one of Python, Ruby, Java, or Go and a good understanding of code management
6+ years of experience using Infrastructure as Code tools like Terraform, Pulumi, or others
Extensive knowledge of software build, test, and deploy processes using Git, Jenkins, Puppet, Ansible, Docker/containers, and Kubernetes
Experience with system analysis and troubleshooting
Ability to mentor junior engineers and provide technical leadership to the organization
Bonus Points
Prior hands-on experience running large-scale multiplayer video games
Experience designing and crafting software for systems and network automation
Debugging, code optimization, and routine task automation skills
Demonstrated ability to decompose sophisticated problems. Ability to engage in lateral investigations.
Must Haves:
3 to 5 years of experience with Kubernetes, Datadog, cloud services, large-scale systems, AWS and GCP, and some Azure
GKE, EKS, homegrown on-prem clusters, and AKS (very small)
Consistent upgrades across all the clusters and clouds
Education: Bachelor's degree
-
Site Reliability Engineer
1 week ago
Austin, United States Virtu Financial Full time
Virtu is a leading financial firm that leverages cutting-edge technology to deliver liquidity to the global markets and innovative, transparent trading solutions to our clients. As a market maker, Virtu provides deep liquidity that helps to create more efficient markets around the world. Our market structure expertise, broad diversification, and execution...
-
Senior Site Reliability Engineer- Remote
4 weeks ago
Austin, United States ClickHouse Full time
We are committed to providing our customers with reliable and secure services, so we are building out our newly formed Site Reliability Engineering team. As one of the first joiners to our Reliability Engineering Team at ClickHouse, you will be responsible for building and leading processes to ensure the reliability, availability, scalability, and performance...
-
Site Reliability Engineer
4 days ago
Austin, United States Zenoss Careers Full time
Description: We are hiring a Site Reliability Engineer to support, configure, and build our SaaS offerings. You will be troubleshooting and administering multiple environments including performance and quality of disaster recovery. You will support all aspects of the technical infrastructure by troubleshooting system configuration, installation, and other...
-
Site Reliability Engineer II
4 weeks ago
Austin, Texas, United States Procore Technologies Full time
Job Description: What if you could use your technology skills to develop a product that impacts the way communities’ hospitals, homes, sports stadiums, and schools across the world are built? Construction impacts the lives of nearly everyone in the world, and yet it’s also one of the world’s least digitized industries. That’s why we’re looking for...
-
Site Reliability Engineer
2 days ago
Austin, United States Frontline Education Full time
Location Requirements: This role is hybrid to one of our offices: Austin, Naperville, or Wayne. Overview: We are looking for an outgoing and dynamic Site Reliability Engineer to manage the successful operation and support of Frontline application environments. This position is responsible...
-
Senior Site Reliability Engineer
4 weeks ago
Austin, Texas, United States Visa Full time
Job Description: As a part of the Product Reliability Engineering (PRE) Organization of VISA, you will be responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning. In this role, your time will be split between operations/on-call duties and developing systems and software that...
-
Site Reliability Developer
1 month ago
Austin, United States Oracle Full time
Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate...
-
SITE OPERATIONS ENGINEER
1 day ago
Austin, United States Adva IT Services, Inc. Full time
Job Summary: We are looking for an operations engineer to join the Crypto Services SRE team. The Crypto Services SRE team is responsible for systems and services that support a vast number of both Apple's internal services as well as services that Apple users directly use. As an Operations Engineer, you will play a crucial role in helping ensure our systems...
-
Site Operations Engineer
2 days ago
Austin, United States VeeAR Projects Inc. Full time
Position: Site Operations Engineer. Location: Austin, TX (Hybrid). Duration: 12+ months contract with possible extension. Job Description: We are looking for an operations engineer to join the Crypto Services SRE team. The Crypto Services SRE team is responsible for systems and services that support a vast number of both Apple’s internal services as well as...
-
Site Operations Engineer
1 day ago
Austin, United States VeeAR Projects Inc. Full time
Position: Site Operations Engineer. Location: Austin, TX (Hybrid). Duration: 12+ months contract with possible extension. Job Description: We are looking for an operations engineer to join the Crypto Services SRE team. The Crypto Services SRE team is responsible for systems and services that support a vast number of both Apple’s internal services as well as...
-
Site Operations Engineer
1 day ago
Austin, United States Veear Full time
Position: Site Operations Engineer. Location: Austin, TX (Hybrid). Duration: 12+ months contract with possible extension. Job Description: We are looking for an operations engineer to join the Crypto Services SRE team. The Crypto Services SRE team is responsible for systems and services that support a vast number of both Apple's internal services as well as...
-
Site Operations Engineer
41 minutes ago
Austin, United States Beth Page tech Full time
Job Description. Role: Site Operations Engineer. Location: Austin, TX / Santa Clara, CA. Job Summary: We are looking for an operations engineer to join the Crypto Services SRE team. The Crypto Services SRE team is responsible for systems and services that support a vast number of both Apple's internal services as well as services that Apple users...
-
Site Operations Engineer
13 hours ago
Austin, United States Zyxware Technologies Full time
Title: Site Operations Engineer (Only W2). Location: AST or SCV (AST = Austin, TX and SCV = Santa Clara Valley). Duration: 6 months. Job Summary: We are looking for an operations engineer to join the Crypto Services SRE team. The Crypto Services SRE team is responsible for systems and services that support a vast number of both internal services as well as services...
-
Site Reliability Developer Join OCI-Ns2
16 hours ago
Austin, United States Oracle Full time
Work with Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Responsible for the design and delivery of the critically important stack, with focus on...
-
Senior Database Reliability Engineer
1 month ago
Austin, Texas, United States NinjaOne Full time
Senior Database Reliability Engineer (DBRE). About the Role: At NinjaOne, we are passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Database Reliability Engineer (DBRE) to join our SRE team in the Platform Engineering organization and help us scale our products to millions of...
-
Senior Site Reliability Engineer I, Core SRE
2 days ago
Austin, United States Sumo Logic Full time
Location: Ideally Austin, TX. We will, however, also look at 100% remote talent based elsewhere in the USA and Canada. Summary of role: Own availability, the most important product feature, by continually striving for sustained operational excellence of Sumo's planet-scale observability and security products. Work with your global SRE team to optimize...
-
Senior Database Reliability Engineer
1 month ago
Austin, United States NinjaOne Full time
Senior Database Reliability Engineer (DBRE). About the Role: At NinjaOne, we are passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Database Reliability Engineer (DBRE) to join our SRE team in the Platform Engineering organization and help us scale our products to millions of...
-
Observability Response and Reliability Engineer
3 weeks ago
Austin, United States Netspend Full time
About the Company: Ouro is dedicated to delivering financial empowerment to millions of Americans, leveraging a proprietary payments technology platform that fuels its fintech product innovations. From prepaid, credit and debit account solutions, to digital account and money movement services, Ouro has a broad suite of products and technologies that deliver...
-
Austin, United States Pinnacle Group Full time
SRE with Java Development, AWS. Day 1 Onsite, Austin, TX. Hybrid - 3 Days / Week. Duration: Long-term contract. Job details: We are actively looking for an SRE engineer with a strong Java development + AWS background. Minimum 8 - 12 years of experience needed. Pay Range: $65/hr - $70/hr. The specific compensation for this position will be determined by a number of factors,...
-
Austin, United States Pinnacle Group, Inc. Full time
SRE with Java Development, AWS. Day 1 Onsite, Austin, TX. Hybrid - 3 Days / Week. Duration: Long-term contract. Job details: We are actively looking for an SRE engineer with a strong Java development + AWS background. Minimum 8 - 12 years of experience needed. Pay Range: $65/hr - $70/hr. The specific compensation for this position will be determined by a number of factors,...