Site Reliability Engineering Leader

2 weeks ago

Seattle, United States Apple Full time

People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here. Join Apple, and help us leave the world better than we found it.

The Apple Services Engineering (ASE) team builds and provides systems and infrastructure that fuel Apple’s services (such as iCloud, iTunes, Siri, and Maps). We are the foundation on which Apple’s software developers build the products that our customers love. We are looking for a passionate and talented Site Reliability Engineering Leader to continue our focus on providing our customers the highest quality Apple Services experience. Our services have to scale globally, stay highly available, and “just work.” If you love designing, engineering and running systems and infrastructure that will help millions of customers, then this is the place for you.

The services that this team manages are foundational security services. From host access and disk encryption to identity and authentication, these services are critical both for operations and for securing Apple's fleet and protecting our most critical data. You would lead a new SRE team dedicated to these security services, while partnering closely with the Security Development team, to bring up and mature new services as part of our infrastructure investments.

Description

The SRE organization requires a strong SRE leader for its rapidly growing Security SRE team. In this role, you will oversee critical security infrastructure services and focus on improving their reliability, observability and manageability. You’ll establish a global SRE team to support these services, and you’ll be a senior engineering leader within the Infrastructure organization.

These are important services responsible for all of Apple infrastructure; therefore, evolved approaches to changes, reliability, and resiliency are required for this role. The SRE teams are responsible for the reliability and performance of on-prem and cloud-based services. You’ll collaborate with multi-functional teams to design, implement, and maintain security measures, incident response protocols, and automation tools to strengthen our organization's overall security posture.

Minimum Qualifications

- 8+ years of engineering management experience.
- 5 years of experience managing SRE teams and mission-critical production services, with progressively larger charters.
- Demonstrated success leading SRE teams and managing infrastructure development engineers.
- Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts.
- Proficient in at least one of Python, Golang, Java, or Rust.
- Experience working in a standard SDLC.
- Understanding of key Infrastructure Security concepts and principles.

Preferred Qualifications

- Proven experience with large-scale, highly available, distributed, and fault-tolerant systems.
- Excellent understanding of operating systems concepts including multi-threading, memory management, networking and storage, performance and scale.
- Experience with Kubernetes, Docker, and containerization (CNCF Kubernetes Administrator or equivalent).
- Deep knowledge of Linux security primitives, systems, packaging, container security and SELinux.
- Understanding of macOS security primitives.
- BS/MS in Computer Science or equivalent (5+ years of software development or production operations experience in a large-scale environment).
- Prior experience in security-related fields (or equivalent experience).
- Certs like OSCP, OSCE, OSEE, etc. helpful but not vital.

Pay & Benefits

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $239,700 and $360,200, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including: comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation. Learn more about Apple Benefits.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We take affirmative action to ensure equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.

Site Reliability Engineer

3 weeks ago

Seattle, United States Prodigy Resources Full time

About Us: Prodigy is seeking an SRE to join our client's organization which is leading the charge in fintech innovation, providing state-of-the-art solutions that drive financial success and empower our clients. As they embark on an exciting Greenfield project, they're seeking an experienced Site Reliability Engineer to join their team. This role is critical...
Staff Site Reliability Engineer

1 week ago

Seattle, United States Moloco Full time

About the Role: Moloco is a machine learning company that operates at massive scale (we ingest 10 petabytes of training data per day), and our models are blazingly fast (they return predictions in 10 milliseconds or less). We are also a profitable unicorn (we are valued at $2 billion and have been profitable for the last 13+ quarters). We are looking for an exceptional...
Site Reliability Engineering Leader

3 days ago

Seattle, Washington, United States Apple Full time

About the Role: We are seeking a highly experienced Senior SRE Leader to lead our Security SRE team, responsible for the reliability and performance of critical security infrastructure services. As a key member of our Infrastructure organization, you will oversee the development and maintenance of these services, ensuring they meet the highest standards of...
Principal Site Reliability Engineer

3 weeks ago

Seattle, United States Oracle Full time

OCI Incident Response is the first line of defense for maintaining the high availability of Oracle’s cloud. We make customer-impacting events shorter, less frequent, and less impactful by providing large-scale incident management. We are front-and-center in driving down event duration by using our operational experience, knowledge of standard processes,...
Sr Site Reliability Engineer

4 days ago

Seattle, United States eTek IT Services, Inc. Full time

Job Description Overview: The Senior Site Reliability Engineer plays a critical role in ensuring the reliability, scalability, and performance of our systems and services. They are responsible for designing and implementing tools and automated solutions to improve system reliability, monitoring, and incident response. Key Responsibilities: Develop...
Site Reliability Engineer

3 months ago

Seattle, United States Capgemini Full time

**Site Reliability Engineer** **FTE with benefits** Our team is looking to add an experienced Site Reliability / DevOps Engineer.
+ Experienced with **Python and Shell Scripting.**
+ **Should have extensive experience with Azure or AWS (Azure preferred)**
+ **Experience with Monitoring and Observability - Datadog**
+ **Experience with Infrastructure as...
Principal Site Reliability Engineer

1 month ago

Seattle, United States Oracle Full time

OCI Incident Response is the first line of defense for maintaining the high availability of Oracle’s cloud. We make customer-impacting events shorter, less frequent, and less impactful by providing large-scale incident management. We are front-and-center in driving down event duration by using our operational experience, knowledge of standard processes,...
Lead Site Reliability Engineer

2 weeks ago

Seattle, Washington, United States Oracle Full time

Overview: The OCI Incident Response team serves as the primary defense mechanism for ensuring the uninterrupted operation of Oracle's cloud services. Our mission is to reduce the frequency and impact of customer-affecting incidents by implementing effective large-scale incident management strategies. We leverage our operational expertise, adherence to...
Site Reliability Engineer

3 days ago

Seattle, Washington, United States Apple Full time

About the Role: We are seeking a highly skilled and motivated Security Site Reliability Engineer to join our dynamic and growing team. As a Security SRE, you will play a critical role in ensuring the security, reliability, and scalability of our systems and infrastructure. Key Responsibilities: Collaborate with cross-functional teams to design, implement, and...
Site Reliability Engineer

5 days ago

Seattle, Washington, United States TikTok Full time

About the Role: TikTok is a leading destination for short-form mobile video, and our mission is to inspire creativity and bring joy. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our cloud infrastructure. Key Responsibilities: Develop and maintain automation procedures to maximize system efficiency...
Senior Site Reliability Engineer

3 weeks ago

Seattle, United States Apple Full time

Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. Join Apple’s Cloud Service Infrastructure team as a site reliability...
Cloud Infrastructure Engineer

6 days ago

Seattle, Washington, United States Moloco Full time

About the Role: Moloco is a leading machine learning company that operates at massive scale, processing 10 petabytes of training data daily, and boasts blazingly fast models that return predictions in 10 milliseconds or less. As a profitable unicorn valued at $2 billion, we have been profitable for over 13 quarters. We are seeking an exceptional Staff Software...
Senior Software Engineer

6 days ago

Seattle, Washington, United States Moloco Full time

About the Role: Moloco is a leading machine learning company that operates at massive scale, processing 10 petabytes of training data daily, and boasts blazingly fast models that return predictions in 10 milliseconds or less. As a profitable unicorn valued at $2 billion, we have been profitable for over 13 quarters. We are seeking an exceptional Staff Software...
Lead Site Reliability Engineer

2 weeks ago

Seattle, Washington, United States Apple Full time

Overview: The Apple Services Engineering team exemplifies Apple's dedication to merging creativity with technology. We invite you to join the Apple Services Engineering Cloud Service Infrastructure team as a Site Reliability Engineer, where you will play a pivotal role in supporting and expanding cloud services for millions of Apple users....
Site Reliability Engineer

5 days ago

Seattle, Washington, United States Apple Full time

About the Role: We are seeking a highly skilled and motivated Security Site Reliability Engineer to join our dynamic and growing team at Apple. As a Security SRE, you will play a critical role in ensuring the security, reliability, and scalability of our systems and infrastructure. Key Responsibilities: Design, implement, and maintain security measures, incident...
Senior Site Reliability Engineer

1 month ago

Seattle, United States West500 Partners Full time

Our client is a fast-growing downtown Seattle startup developing AI automation for professional services, including legal technology and medical records. They have a great product market fit and rapidly increasing revenues and are currently in need of a local Software Engineering Lead with CI/CD expertise, an AWS background, and a keen interest in innovative...
