Sr Site Reliability Engineer

3 weeks ago

Santa Clara, United States Palo Alto Networks Full time

Company Description

Our Mission

At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we’re looking for innovators who are as committed to shaping the future of cybersecurity as we are.

Our Approach to Work

We lead with flexibility and choice in all of our people programs. We have disrupted the traditional view that all employees have the same needs and wants. We offer personalization and give our employees the opportunity to choose what works best for them as often as possible, from your wellbeing support to your growth and development, and beyond.

At Palo Alto Networks, we believe in the power of collaboration and value in-person interactions. This is why our employees generally work from the office three days per week, leaving two days for choice and flexibility to work where you feel most effective. This setup fosters casual conversations, problem-solving, and trusted relationships. While details may evolve, our goal is to create an environment where innovation thrives, with office-based teams coming together three days a week to collaborate and thrive, together.

Job Description

Your Career

Palo Alto Networks runs a large hybrid infrastructure and is one of the largest GCP customers. As a Site Reliability Engineer, you will be part of a team supporting the services running on this infrastructure. This includes automation, architecture, performance, metrics, troubleshooting, security, and reliability. Our stack includes Kubernetes, Docker, GCP, AWS, Terraform, Vault, GitLab CI, Datadog, Elasticsearch, MySQL, Python, and Go. We don’t expect you to know all these, but we do expect you to learn the ones needed for this role.

Your Impact

Contribute to the success of SRE and DevOps
Develop expertise in new technologies
Work with developers, researchers, data scientists, and security experts
Design, build and operate reliable, secure Cloud infrastructure
Ensure that applications are production-ready, scalable, and reliable
Develop tools and automation frameworks
Automate robust deployment of robust services
Orchestrate end-to-end monitoring and alerting
Participate with SRE and Dev teams in the on-call rotation
Lead root cause analysis of critical business and production issues
Mentor and champion SRE culture
Participate in design reviews

Qualifications

Your Experience

BS or MS in Computer Science, a related field, or equivalent professional experience or equivalent military experience required
Strong Linux administration, internals, and network troubleshooting
Expertise in configuration management with a framework such as Terraform, Helm or Ansible
Experience in Production Engineering, DevOps, or Site Reliability
Expertise in public or private cloud, especially GCP or AWS
Experience in container or Kubernetes setup and configuration
Able to automate tasks with programming languages such as Python, shell scripting or Golang
Familiarity with CI/CD pipelines, GitLab CI preferred
Ability to diagnose and troubleshoot complex distributed systems handling high volume transactions
Excellent written and verbal communication, able to collaborate and rally support
Self-disciplined, self-managed, self-motivated, with a strong sense of ownership, urgency, and drive
Passion for infrastructure and monitoring as code
Ready to understand and dissect new technology stacks quickly
Participate in the on-call rotation

Additional Information

The Team

Our engineering team is at the core of our products – connected directly to the mission of preventing cyberattacks. We are constantly innovating – challenging the way we, and the industry, think about cybersecurity. Our engineers don’t shy away from building products to solve problems no one has pursued before. We define the industry, instead of waiting for directions. We need individuals who feel comfortable in ambiguity, excited by the prospect of a challenge, and empowered by the unknown risks facing our everyday lives that are only enabled by a secure digital environment.

Our Commitment

We’re trailblazers that dream big, take risks, and challenge cybersecurity’s status quo. It’s simple: we can’t accomplish our mission without diverse teams innovating, together.

We are committed to providing reasonable accommodations for all qualified individuals with a disability. If you require assistance or accommodation due to a disability or special need, please contact us at accommodations@paloaltonetworks.com.

Palo Alto Networks is an equal opportunity employer. We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or other legally protected characteristics.

All your information will be kept confidential according to EEO guidelines.

The compensation offered for this position will depend on qualifications, experience, and work location. For candidates who receive an offer at the posted level, the starting base salary (for non-sales roles) or base salary + commission target (for sales/commissioned roles) is expected to be between $124,600/yr and $201,650/yr. The offered compensation may also include restricted stock units and a bonus. A description of our employee benefits may be found here.

Site Reliability Engineer

3 hours ago

Santa Clara, California, United States Diverse Lynx Full time

About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based applications and infrastructure. Key Responsibilities: Design, implement, and maintain cloud infrastructure on...
Sr. Site Reliability Engineer

3 months ago

Santa Clara, United States TCWGlobal Full time

Sr. SRE Engineer. W2 Contract to Possible Hire. Hybrid, Santa Clara, CA. $75-90/hr + PTO, Paid Holidays, Benefits. We are looking for a seasoned SRE to join our multifaceted and fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced crew that develops and maintains...
Site Reliability Engineer

3 weeks ago

Santa Clara, United States Veear Full time

Position: Site Reliability Engineer Location: Remote role Duration: 12+ Months Contract with possible extension Job Description: We seek development-heavy Site Reliability Engineers to design, build, maintain, and scale production services and server farms within our FedRAMP SASE product portfolio. We want passionate engineers who bring new ideas to all...
Sr Principal Site Reliability Engineer

3 weeks ago

Santa Clara, United States Palo Alto Networks Full time

Our Mission At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we’re looking...
Site Reliability Engineer

3 weeks ago

Santa Clara, United States VeeAR Projects Inc. Full time

Position: Site Reliability Engineer Location: Remote role Duration: 12+ Months Contract with possible extension Job Description: We seek development-heavy Site Reliability Engineers to design, build, maintain, and scale production services and server farms within our FedRAMP SASE product portfolio. We want passionate engineers who bring new ideas to all facets...
Site Reliability Engineer

2 months ago

Santa Clara, United States NVIDIA Full time

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and outstanding people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers,...
Cloud Site Reliability Engineer

2 months ago

Santa Clara, United States Centrify Corporation Full time

Our software runs on public clouds with 99.9% or better uptime and is mission critical for our customers. Our cloud operations team is where the rubber meets the road and needs innovative Site Reliability Engineers. Join a professional team of smart and hard-working professionals building enterprise-class cloud-based services in the rapidly growing market of...
Site Reliability Engineering Manager

2 weeks ago

Santa Clara, California, United States Promote Project Full time

About Promote Project: Promote Project is a leader in innovative technology solutions, dedicated to pushing the boundaries of what is possible in the realm of artificial intelligence and cloud computing. Our commitment to excellence is reflected in our talented workforce and our pursuit of groundbreaking advancements. Position Overview: We are seeking a...
Site Reliability Engineering Manager

2 weeks ago

Santa Clara, California, United States Promote Project Full time

About the Company: Promote Project is at the forefront of innovation, leveraging cutting-edge technology to redefine the landscape of AI and computing. Our mission is to harness the power of advanced computing to create transformative solutions that impact various industries. Position Overview: We are seeking a Manager of Site Reliability Engineering to...
Site Reliability Engineer

2 days ago

Santa Clara, California, United States Veear Full time

About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Veear. As a key member of our infrastructure team, you will play a critical role in ensuring the reliability, scalability, and security of our cloud-based systems. Key Responsibilities: Collaboration and Partnership: Partner with cross-functional teams to ensure security...
Lead Site Reliability Engineer

1 week ago

Santa Clara, California, United States Palo Alto Networks Full time

Job Overview. Company Overview: To comply with U.S. federal government requirements, U.S. citizenship is required for this position. Our Mission: At Palo Alto Networks, our mission is clear: To be the cybersecurity partner of choice, safeguarding our digital existence. We envision a world where each day is safer and more secure than the last. Our foundation is built...
Senior Site Reliability Engineer

2 weeks ago

Santa Clara, California, United States ServiceNow Full time

Company Overview: At ServiceNow, we harness technology to create a better world for everyone, driven by our talented workforce. We prioritize speed and innovation to meet the demands of our customers and communities. Joining ServiceNow means becoming part of a dynamic team of innovators who possess a relentless curiosity and a commitment to creativity. We...
Senior Site Reliability Engineer

2 weeks ago

Santa Clara, California, United States ServiceNow Full time

Company Overview: At ServiceNow, we harness technology to enhance global operations, and our dedicated workforce makes it all possible. We operate swiftly because the world demands it, innovating uniquely for our clients and communities. By becoming part of ServiceNow, you join a dynamic team of innovators who possess a relentless curiosity and a passion for...
Site Reliability Engineering

5 days ago

Santa Clara, United States Diverse Lynx Full time

Skills: Site Reliability Engineering (SRE), Git (Bitbucket), Jenkins, AWS CodeBuild, AWS CodeDeploy. Job Description: AWS application and CI/CD pipelines, Microsoft Server admin and workload support (Data center and AWS). • Initial responsibility is application platform promotion to controlled environments for test, staging, and production AWS accounts. o...
Principal Site Reliability Engineer

2 weeks ago

Santa Clara, United States Palo Alto Networks Full time

Principal Site Reliability Engineer (SASE) Full-time Job Country: United States of America To comply with U.S. federal government requirements, U.S. citizenship is required for this position. Our Mission At Palo Alto Networks, everything starts and ends with our mission: being the cybersecurity partner of choice, protecting our digital way of life. Our...
Principal Site Reliability Engineer

6 hours ago

Santa Clara, United States Palo Alto Networks Full time

To comply with U.S. federal government requirements, U.S. citizenship is required for this position Our Mission At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company...
Senior Site Reliability Engineer

4 days ago

Santa Clara, United States Geospatial And Cloud Analytics Inc Full time

Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and availability using a combination of software and systems engineering practices. This is a highly specialized discipline which demands knowledge across different systems, networking, coding, database,...
Site Reliability Engineering Manager

2 weeks ago

Santa Clara, California, United States Promote Project Full time

About the Company: Promote Project is at the forefront of innovation, focusing on redefining technology and enhancing the capabilities of AI. We are dedicated to creating groundbreaking solutions that push the boundaries of what is possible in computing. Position Overview: We are seeking a Manager for Site Reliability Engineering to spearhead our cloud...
