Senior Site Reliability Engineer, Kubernetes ASE

3 days ago


Austin, Texas, United States | Apple | Full time
Job Summary

We are seeking a highly skilled Senior Site Reliability Engineer to join our Apple Service Engineering team. As a key member of our team, you will play a pivotal role in supporting and scaling cloud services for thousands of development and operations engineers.

Key Responsibilities
  • Establish and maintain SRE practices for our private/public cloud service, ensuring uncompromising scalability, high availability, and seamless performance.
  • Design, build, and implement innovative software solutions to address existing challenges and proactively anticipate future needs.
  • Collaborate closely with developers and architects to design and implement solutions for improved stability, security, and scalability.
  • Operate, monitor, and prioritize tasks across all production and non-production environments, demonstrating strong operational focus.
  • Automate service deployment and orchestration in the cloud environment, as well as other routine processes, to streamline operations.
Requirements
  • Deep understanding of Kubernetes architecture, components, and best practices.
  • Proficiency in managing Kubernetes clusters, deploying applications, and automating workflows using tools like Helm and Kustomize.
  • Experience with major public cloud providers and their cloud-native services.
  • Familiarity with infrastructure as code (IaC) tools like Terraform or Ansible.
  • Adherence to SRE principles, including monitoring, alerting, error budgets, fault analysis, and automation.
  • Expertise in implementing and coordinating telemetry using tools like Splunk, Grafana, and Prometheus.
  • Proficiency in Go (Golang) for developing automation scripts, tools, and custom applications.
  • Excellent interpersonal and communication skills.
Preferred Qualifications
  • Production & Non-Production Environments: Operate, monitor, and prioritize tasks across all production and non-production environments, demonstrating strong operational focus.
  • Innovative Problem Solver: Design, build, and implement innovative software solutions to address existing challenges and proactively anticipate future needs.
  • Documentation & Collaboration: Create clear alert handling procedures and runbooks, ensuring knowledge transfer and collaboration within and between SRE teams.
  • Automation Champion: Automate service deployment and orchestration in the cloud environment, as well as other routine processes, to streamline operations.
  • Resilience & Growth: Actively participate in capacity planning, scale testing, and disaster recovery exercises, ensuring our systems remain resilient.
  • Team Player: Foster strong relationships and provide support to partner teams like engineering, QA, and program management.
Education & Experience

BS or MS in Computer Science or equivalent proven experience.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We take affirmative action to ensure equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics.


