Senior Site Reliability Engineer

3 weeks ago


Seattle, United States Apple Full time

Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish.

Join Apple’s Cloud Service Infrastructure team as a site reliability engineer to help support and scale cloud services for thousands of development and operations engineers. This is a hands-on role to establish SRE practices for a private cloud service to accelerate our ability to reliably and consistently deliver thousands of applications.

Description

As a Site Reliability Engineer, you will be responsible for providing the platform for mission-critical cloud systems to maintain constant uptime, scale seamlessly, and allow for new applications and services to flourish. The successful candidate will be highly self-motivated with a passion for excellence, quality, and detail. The SRE will not only support operations but also work closely with the developers and architects within the team to aid in the design and assist with the implementation to improve stability, security, and scalability.

AS AN SRE AT APPLE, YOU WILL:
- Operate, monitor, and triage all aspects of our production and non-production environments.
- Pioneer and implement the next-generation telemetry system.
- Prepare alert handling procedures, runbooks, and collaborate with the off-shore SRE teams.
- Automate deployment and orchestration of services into the cloud environment as well as other routine processes.
- Actively participate in capacity planning, scale testing, and disaster recovery exercises.
- Interact with and support partner teams, including engineering, QA, and program management.
- Cultivate and maintain relationships with internal and external third-party vendors.

Minimum Qualifications
- Bachelor's Degree in Computer Science, an engineering-related field, or equivalent related experience. Advanced Degree preferred.
- 5+ years in a Site Reliability Engineering, DevOps, or infrastructure-focused role
- Must be an expert and have in-depth professional experience working with Kubernetes
- Proficient in GoLang

Preferred Qualifications
- Ability to implement and coordinate telemetry using monitoring and observability tools such as Splunk, Grafana, and Prometheus
- Experience operating large-scale multi-tenant infrastructure as a managed service
- Knowledge of the Linux operating system and its variations
- Experience with GitOps, deployment strategies, and CI/CD tools such as Spinnaker and Argo
- Able to troubleshoot issues across the entire infrastructure stack
- Outstanding organizational and communication skills

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $166,600 and $296,300, and your base pay will depend on your skills, qualifications, experience, and location.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including: comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation.

Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We take affirmative action to ensure equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics.



  • Seattle, Washington, United States Apple Full time

    Senior Site Reliability Engineer: We are seeking a highly skilled Senior Site Reliability Engineer to join our Apple Services Engineering team in Seattle, Washington. As a key member of our dynamic team, you will play a critical role in ensuring the availability, latency, and overall health of our object store orchestration service. Key...


  • Seattle, United States eTek IT Services, Inc. Full time

    Job Description. Overview: The Senior Site Reliability Engineer plays a critical role in ensuring the reliability, scalability, and performance of our systems and services. They are responsible for designing and implementing tools and automated solutions to improve system reliability, monitoring, and incident response. Key Responsibilities: Develop...


  • Seattle, United States Prodigy Resources Full time

    About Us: Prodigy is seeking an SRE to join our client's organization which is leading the charge in fintech innovation, providing state-of-the-art solutions that drive financial success and empower our clients. As they embark on an exciting Greenfield project, they're seeking an experienced Site Reliability Engineer to join their team. This role is critical...


  • Seattle, United States Apple Full time

    Senior Site Reliability Engineer, Object Storage. Seattle, Washington, United States. Software and Services. The Apple Services Engineering (ASE) team is one of the most exciting examples of Apple’s long-held passion for combining art and technology. These are the people who power the App Store, Apple TV, Apple Music, Apple Podcasts, and Apple Books. They...


  • Seattle, Washington, United States Apple Full time

    Job Summary: The Apple Services Engineering team is a dynamic and innovative group of engineers who power the App Store, Apple TV, Apple Music, Apple Podcasts, and Apple Books. As a Senior Site Reliability Engineer, Object Storage, you will play a critical role in building secure, end-to-end solutions that meet Apple's high expectations. Key...


  • Seattle, United States West500 Partners Full time

    Our client is a fast-growing downtown Seattle startup developing AI automation for professional services, including legal technology and medical records. They have a great product market fit and rapidly increasing revenues and are currently in need of a local Software Engineering Lead with CI/CD expertise, an AWS background, and a keen interest in innovative...


  • Seattle, Washington, United States eTek IT Services, Inc. Full time

    Job Description. Overview: The Senior Site Reliability Engineer plays a critical role in ensuring the reliability, scalability, and performance of our systems and services. They are responsible for designing and implementing tools and automated solutions to improve system reliability, monitoring, and incident response. Key Responsibilities: Design and Implement...


  • Seattle, United States Oracle Full time

    OCI Incident Response is the first line of defense for maintaining the high availability of Oracle’s cloud. We make customer-impacting events shorter, less frequent, and less impactful by providing large-scale incident management. We are front-and-center in driving down event duration by using our operational experience, knowledge of standard processes,...


  • Seattle, Washington, United States F5 Networks Full time

    About F5 Networks: At F5 Networks, we are dedicated to shaping a superior digital landscape. Our teams empower organizations worldwide to create, secure, and operate applications that enhance our interactions with the ever-evolving digital environment. We are deeply committed to cybersecurity, safeguarding consumers from fraud, and enabling businesses to...


  • Seattle, Washington, United States Apple Full time

    About the Role: We are seeking a highly skilled Senior Cloud Reliability Engineer to join our Cloud Service Infrastructure team at Apple. As a key member of our team, you will be responsible for ensuring the reliability, scalability, and security of our cloud services. Key Responsibilities: Design and implement scalable and reliable cloud infrastructure...


  • Seattle, Washington, United States Apple Full time

    About the Role: We are seeking a highly skilled Senior Cloud Reliability Engineer to join our Cloud Service Infrastructure team at Apple. As a key member of our team, you will play a critical role in ensuring the reliability, scalability, and security of our cloud services. Key Responsibilities: Design and implement scalable and reliable cloud infrastructure to...


  • Seattle, United States Capgemini Full time

    Site Reliability Engineer. FTE with benefits. Our team is looking to add an experienced Site Reliability / DevOps Engineer to our team. + Experienced with Python and Shell Scripting. + Should have extensive experience with Azure or AWS (Azure preferred) + Experience with Monitoring and Observability - Datadog + Experience with Infrastructure as...


  • Seattle, United States eTek IT Full time

    Overview: The Senior Site Reliability Engineer plays a critical role in ensuring the reliability, scalability, and performance of our systems and services. They are responsible for designing and implementing tools and automated solutions to improve system reliability, monitoring, and incident response. Key Responsibilities: Develop and maintain...


  • Seattle, Washington, United States Oracle Full time

    Overview: The OCI Incident Response team serves as the primary defense mechanism for ensuring the uninterrupted operation of Oracle's cloud services. Our mission is to reduce the frequency and impact of customer-affecting incidents by implementing effective large-scale incident management strategies. We leverage our operational expertise, adherence to...


  • Seattle, Washington, United States Moloco Full time

    About the Role: Moloco is a leading machine learning company that operates at massive scale, processing 10 petabytes of training data daily, and boasts blazingly fast models that return predictions in 10 milliseconds or less. As a profitable unicorn valued at $2 billion, we have been profitable for over 13 quarters. We are seeking an exceptional Staff Software...