Principal Site Reliability Engineer

2 days ago


Santa Clara, California, United States Palo Alto Networks Full time

Palo Alto Networks is seeking a highly skilled Principal Site Reliability Engineer to join our team. As a key member of our engineering team, you will be responsible for designing, building, and operating reliable, secure cloud infrastructure.

About the Role

We are looking for a seasoned engineer with expertise in cloud native applications, infrastructure automation, and site reliability engineering. You will work closely with our development teams to ensure that applications are production-ready, scalable, and reliable. Your expertise in configuration management, infrastructure automation, and public or private cloud will be essential in driving our cloud infrastructure strategy.

Responsibilities
  • Design and build reliable, secure cloud infrastructure using Terraform, Kubernetes, and other cloud native tools.
  • Develop expertise in new technologies and contribute to the success of our SRE and DevOps teams.
  • Work with developers, researchers, data scientists, and security experts to design and implement scalable, reliable cloud native applications.
  • Develop tools and automation frameworks to streamline infrastructure deployment and monitoring.
  • Orchestrate end-to-end monitoring and alerting to ensure seamless application performance.
  • Participate in the on-call rotation and lead root cause analysis of critical business and production issues.
Requirements
  • 7+ years of experience as an engineer in infrastructure, operations, DevOps, or system engineering.
  • 7+ years of experience building highly available, scalable cloud native applications on AWS or GCP.
  • BS or MS in Computer Science or a related field, or equivalent professional or military experience, required.
  • Expertise in configuration management with a framework such as Ansible, Terraform, or Helm.
  • Expertise in automating infrastructure tasks with Python and shell scripting.
  • Experience in site reliability engineering, production engineering, or DevOps.
  • Expertise in public or private cloud.
  • Solid experience in Kubernetes and containers.
  • Experience with Linux administration, internals, and network troubleshooting.
  • Proficiency in programming languages such as Python, Java, and Golang, plus shell scripting, to automate tasks.
  • Experience with CI/CD pipelines, GitLab, and ArgoCD preferred.
  • Ability to diagnose and troubleshoot complex distributed systems handling high-volume transactions.
  • Excellent written and verbal communication skills, with the ability to collaborate and rally support.
  • Self-disciplined, self-managed, and self-motivated, with a strong sense of ownership, urgency, and drive.
  • Passion for infrastructure and monitoring as code.
  • Ability to understand and dissect new technology stacks quickly.
What We Offer

Palo Alto Networks is committed to providing a dynamic and inclusive work environment that fosters innovation, creativity, and collaboration. We offer a competitive compensation package, including a base salary, restricted stock units, and a bonus. Our benefits package includes a comprehensive health insurance plan, 401(k) matching, and a generous paid time off policy.

We are an equal opportunity employer and welcome applications from diverse candidates. If you require assistance or accommodation due to a disability or special need, please contact us at [insert contact email].



  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: Palo Alto Networks is seeking a highly skilled Principal Site Reliability Engineer to join our team. As a key member of our infrastructure platform, you will be responsible for designing, building, and operating reliable and secure cloud infrastructure. Key Responsibilities: Contribute to the success of SRE and DevOps teams by developing expertise...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: We are seeking a highly skilled Principal Site Reliability Engineer to join our team at Palo Alto Networks. As a key member of our Global Customer Operation Team, you will be responsible for designing, building, maintaining, and scaling production services and server farms within our FedRAMP SASE product portfolio. Key Responsibilities: Design and...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: We are seeking a highly skilled Principal Site Reliability Engineer to join our team at Palo Alto Networks. As a key member of our Global Customer Operation Team, you will be responsible for designing, building, maintaining, and scaling production services and server farms within our FedRAMP SASE product portfolio. Key Responsibilities: Design and...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: Palo Alto Networks is seeking a highly skilled Principal Site Reliability Engineer to join our Global Customer Operation Team. As a key member of our team, you will be responsible for designing, building, maintaining, and scaling production services and server farms within our FedRAMP SASE product portfolio. Key Responsibilities: Design and...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Palo Alto Networks. As a Site Reliability Engineer, you will play a critical role in designing, building, and maintaining scalable and reliable infrastructure for our FedRAMP SASE product portfolio. Key Responsibilities: Design and implement scalable and reliable...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Palo Alto Networks. As a Site Reliability Engineer, you will play a critical role in designing, building, and maintaining scalable and reliable infrastructure for our FedRAMP SASE product portfolio. Key Responsibilities: Design and implement scalable and reliable...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Palo Alto Networks. As a Site Reliability Engineer, you will play a critical role in designing, building, and maintaining scalable and reliable infrastructure for our FedRAMP SASE product portfolio. Key Responsibilities: Design and implement scalable and reliable...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Palo Alto Networks. As a Site Reliability Engineer, you will play a critical role in ensuring the high availability and reliability of our applications and infrastructure. Key Responsibilities: Design, implement, and maintain scalable and reliable infrastructure. Build...


  • Santa Clara, California, United States Palo Alto Networks Full time

    Job Description: Palo Alto Networks is seeking a highly skilled Site Reliability Engineer to join our Global Customer Operation Team. As a Site Reliability Engineer, you will play a critical role in designing, building, maintaining, and scaling production services and server farms within our FedRAMP SASE product portfolio. Key Responsibilities: Design and...


  • Santa Clara, California, United States Palo Alto Networks Full time

    Job Title: Principal Kafka Site Reliability Engineer DevOps. We are revolutionizing the cybersecurity landscape with our cloud-delivered security services, and our cloud infrastructure is rapidly expanding with a global presence. We're seeking exceptional SREs and software engineers interested in production engineering to help us scale the largest enterprise...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Palo Alto Networks. As a key member of our Global Customer Operation Team, you will be responsible for designing, building, maintaining, and scaling production services and server farms within our FedRAMP SASE product portfolio. Key Responsibilities: Design and implement...


  • Santa Clara, California, United States Diverse Lynx Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based applications and infrastructure. Key Responsibilities: Design, implement, and maintain cloud infrastructure on...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Role: Palo Alto Networks is seeking a highly skilled Principal Site Reliability Engineer to join our team. As a key member of our Global Customer Operation Team, you will be responsible for designing, building, maintaining, and scaling production services and server farms within our FedRAMP SASE product portfolio. Key Responsibilities: Design and...


  • Santa Clara, California, United States Palo Alto Networks Full time

    Job Description: At Palo Alto Networks, we're seeking a highly skilled Site Reliability Engineer to join our SASE team. As a key member of our team, you will be responsible for designing, building, and maintaining scalable and reliable infrastructure to support our FedRAMP SASE product portfolio. Your Impact: Design and implement Terraform code and Terragrunt to...


  • Santa Clara, California, United States Diverse Lynx Full time

    Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a key member of our infrastructure team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based systems. Key Responsibilities: Design, implement, and maintain scalable and highly available cloud...


  • Santa Clara, California, United States Syntricate Technologies Full time

    Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Syntricate Technologies. As a key member of our infrastructure team, you will be responsible for ensuring the reliability, scalability, and performance of our cloud-based systems. Key Responsibilities: Design, implement, and maintain cloud infrastructure on AWS,...


  • Santa Clara, California, United States Centrify Corporation Full time

    Cloud Site Reliability Engineer: At Centrify Corporation, we're seeking a skilled Cloud Site Reliability Engineer to join our Cloud DevOps team. As a key member of our operations team, you'll play a critical role in ensuring the uptime and delivery of our cloud-based services. Key Responsibilities: Manage our cloud application using DevOps and Agile practices to...


  • Santa Clara, California, United States Veear Full time

    Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team at Veear. As a key member of our infrastructure team, you will play a critical role in ensuring the security, compliance, and reliability of our systems. Key Responsibilities: Partner with development teams to ensure that applications have scalability and reliability...


  • Santa Clara, California, United States Palo Alto Networks Full time

    Join Our Mission to End Breaches and Protect Digital Life: Palo Alto Networks is the fastest-growing security company in history, and we're looking for a motivated, intelligent, and creative individual to join our team as a Site Reliability Engineer DevOps. About the Role: We offer the chance to be part of an important mission: ending breaches and protecting our...


  • Santa Clara, California, United States OMNIVISION Full time

    Job Overview: We are seeking a Staff Reliability Engineer to join our team at OMNIVISION. The ideal candidate will possess a strong educational background and relevant experience in the field of reliability engineering. Qualifications: A Bachelor’s degree in Physics, Electrical Engineering, Materials Science, or a related engineering field, with...