Cloud Reliability Engineer

2 weeks ago


Cupertino, California, United States | Apple Inc. | Full time

About Apple Services Engineering

At Apple, we don’t just create products; we design experiences that our customers cherish and rely on. The Apple Services Engineering (ASE) team is responsible for developing and maintaining the systems that facilitate these everyday experiences. If you have engaged with Apple products, you have likely interacted with our team. The iCloud Services SRE teams are dedicated to the systems and services that directly enhance our customers' experiences. We are in search of enthusiastic and skilled Site Reliability Engineers to uphold our commitment to delivering the highest quality Apple Services experience.

Your Role

Are you passionate about engineering and managing systems and infrastructure that will impress millions of users? Imagine the impact you could have here. At Apple, innovative ideas rapidly evolve into remarkable products, services, and customer experiences. When you infuse passion and commitment into your work, the possibilities are endless. Our ideal candidates possess proven software development capabilities, robust expertise in distributed systems, a solid understanding of SRE principles, and the knowledge required to operate services at Apple’s scale. Our team plays a vital role in the daily operations of services relied upon throughout Apple.

Key Responsibilities

  1. Drive a data-informed roadmap and quarterly planning for a selection of core services from a reliability standpoint.
  2. Oversee the entire software lifecycle of core services from a reliability perspective, which includes infrastructure setup, capacity planning, deployment, monitoring, architecture, and software implementation, in close collaboration with the development team.

We seek a creative, adaptable, and passionate individual who enjoys tackling engineering challenges and collaborating across a dynamic organization. If this resonates with you, we encourage you to explore this opportunity.

Minimum Qualifications

  • 5+ years of experience in software development or production operations within a large-scale environment.
  • Meticulous attention to detail, a strong aversion to ambiguity, and a strong sense of deadlines.
  • Proficiency in SRE principles, including monitoring, alerting, error budgets, fault analysis, and automation.
  • Experience coding in a general-purpose programming language such as Java, Go, or Python.
  • Exceptional troubleshooting and problem-solving abilities.
  • Strong written and verbal communication skills.
  • Experience improving the full lifecycle of global services, from inception through deployment, operations, and refinement.
  • Familiarity with Linux/Unix, networking, systems management, and systems security.
  • Experience managing a diverse array of systems.
  • Willingness to participate in on-call service support.

Preferred Qualifications

  • In-depth understanding and practical experience with Kubernetes and container orchestration implementations.
  • Experience leading teams or projects from scoping to delivery.
  • Strong leadership and partnership skills.
  • Experience managing customer-facing internet-scale systems.
  • Experience managing distributed systems or large-scale systems operations.

Education

  • Bachelor's degree in Computer Science, or 5 years of equivalent experience.

