Site Reliability Engineer

3 weeks ago


Santa Clara, United States Veear Full time
Job Description:
  • Partner with teams to ensure security and compliance requirements are met.
  • Work with development teams to ensure that applications have scalability and reliability built-in from day one - Agile is second nature to you and you're excited to work in scrum teams and represent the SRE perspective
  • Design and enhance software architecture to improve scalability, service reliability, cost, and performance - You've helped create services that are critical to their customers' success
  • Deploy automation for provisioning and operating infrastructure at large scale - You are experienced in Infrastructure as Code concepts and have put them into production
  • Partner with teams to improve CI/CD processes and technology - Helping teams in delivering value early is what you strive for
  • Drive the adoption of observability practices and a data-driven mindset - You love metrics, graphs, and gaining a deep understanding of why things happen in a system, helping others gain visibility into the things they build
  • Participate in the occasional on-call rotation supporting the infrastructure owned by the SRE team - Finding ways to reduce the time to resolution and improve the reliability of services is key to running a trusted platform
Skills
  • 5+ years of total experience with Unix/Linux (shell/tools/kernel/networking/storage)
  • 2+ years of working with microservice architectures running on Kubernetes and containers
  • CI/CD pipelines using GitLab and ArgoCD
  • Terraform
  • Ansible
  • Artifactory or equivalent experience
  • Vault or equivalent experience
  • Demonstrated experience in building tools and automation
  • Experience with public cloud (GCP highly preferred) at medium to large scale
  • Vulnerability Management for containers and VMs
  • Go or Python
  • GitLab with some GCP


  • Santa Clara, California, United States Diverse Lynx Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based applications and infrastructure. Key Responsibilities: Design, implement, and maintain cloud infrastructure on...


  • Santa Clara, United States VeeAR Projects Inc. Full time

    Position: Site Reliability Engineer. Location: Remote role. Duration: 12+ months contract with possible extension. Job Description: We seek development-heavy Site Reliability Engineers to design, build, maintain, and scale production services and server farms within our FedRAMP SASE product portfolio. We want passionate engineers who bring new ideas to all facets...


  • Santa Clara, United States NVIDIA Full time

    NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and outstanding people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers,...


  • Santa Clara, United States Centrify Corporation Full time

    Our software runs on public clouds with 99.9% or better uptime and is mission critical for our customers. Our cloud operations team is where the rubber meets the road and needs innovative Site Reliability Engineers. Join a professional team of smart and hard-working professionals building enterprise-class cloud-based services in the rapidly growing market of...


  • Santa Clara, California, United States Promote Project Full time

    About Promote Project: Promote Project is a leader in innovative technology solutions, dedicated to pushing the boundaries of what is possible in the realm of artificial intelligence and cloud computing. Our commitment to excellence is reflected in our talented workforce and our pursuit of groundbreaking advancements. Position Overview: We are seeking a...


  • Santa Clara, California, United States Promote Project Full time

    About the Company: Promote Project is at the forefront of innovation, leveraging cutting-edge technology to redefine the landscape of AI and computing. Our mission is to harness the power of advanced computing to create transformative solutions that impact various industries. Position Overview: We are seeking a Manager of Site Reliability Engineering to...


  • Santa Clara, California, United States Veear Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Veear. As a key member of our infrastructure team, you will play a critical role in ensuring the reliability, scalability, and security of our cloud-based systems. Key Responsibilities: Collaboration and Partnership: Partner with cross-functional teams to ensure security...


  • Santa Clara, California, United States Palo Alto Networks Full time

    Job Overview. Company Overview: To comply with U.S. federal government requirements, U.S. citizenship is required for this position. Our Mission: At Palo Alto Networks, our mission is clear: To be the cybersecurity partner of choice, safeguarding our digital existence. We envision a world where each day is safer and more secure than the last. Our foundation is built...


  • Santa Clara, California, United States ServiceNow Full time

    Company Overview: At ServiceNow, we harness technology to create a better world for everyone, driven by our talented workforce. We prioritize speed and innovation to meet the demands of our customers and communities. Joining ServiceNow means becoming part of a dynamic team of innovators who possess a relentless curiosity and a commitment to creativity. We...


  • Santa Clara, California, United States ServiceNow Full time

    Company Overview: At ServiceNow, we harness technology to enhance global operations, and our dedicated workforce makes it all possible. We operate swiftly because the world demands it, innovating uniquely for our clients and communities. By becoming part of ServiceNow, you join a dynamic team of innovators who possess a relentless curiosity and a passion for...


  • Santa Clara, United States Diverse Lynx Full time

    Skills: Site Reliability Engineering (SRE), Git (Bitbucket), Jenkins, AWS CodeBuild, AWS CodeDeploy. Job Description: AWS application and CI/CD pipelines, Microsoft Server admin and workload support (data center and AWS). • Initial responsibility is application platform promotion to controlled environments for test, staging, and production AWS accounts. o...


  • Santa Clara, United States Palo Alto Networks Full time

    Principal Site Reliability Engineer (SASE). Full-time. Job Country: United States of America. To comply with U.S. federal government requirements, U.S. citizenship is required for this position. Our Mission: At Palo Alto Networks, everything starts and ends with our mission: being the cybersecurity partner of choice, protecting our digital way of life. Our...


  • Santa Clara, United States Palo Alto Networks Full time

    To comply with U.S. federal government requirements, U.S. citizenship is required for this position. Our Mission: At Palo Alto Networks, everything starts and ends with our mission: being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company...


  • Santa Clara, United States Geospatial And Cloud Analytics Inc Full time

    Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build, and maintain large-scale production systems with high efficiency and availability using a combination of software and systems engineering practices. This is a highly specialized discipline which demands knowledge across different systems, networking, coding, database,...


  • Santa Clara, California, United States Promote Project Full time

    About the Company: Promote Project is at the forefront of innovation, focusing on redefining technology and enhancing the capabilities of AI. We are dedicated to creating groundbreaking solutions that push the boundaries of what is possible in computing. Position Overview: We are seeking a Manager for Site Reliability Engineering to spearhead our cloud...


  • Santa Clara, United States Palo Alto Networks Full time

    Job Description. Your Career: The Global Customer Operation Team is responsible for building products that protect data, workloads, and infrastructure for some of the largest enterprise customers in the world. We help our customers in their journey to the public cloud by ensuring they have the best-in-class protection. The public cloud market has been...


  • Santa Clara, California, United States Centrify Corporation Full time

    **About Centrify Corporation** Centrify Corporation is a leading provider of cloud-based identity and access management solutions. Our software runs on public clouds with 99.9% or better uptime and is mission critical for our customers. **Job Summary** We are seeking a highly skilled Cloud Site Reliability Engineer to join our Cloud DevOps team. As a Cloud Site...


  • Santa Clara, United States Nvidia Full time

    Senior Site Reliability Engineer - Storage. Locations: US, CA, Santa Clara. Time type: Full time. Job requisition ID: JR1979072. NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and...


  • Santa Clara, United States Palo Alto Networks Full time

    The Global Customer Operation Team is responsible for building products that protect data, workloads, and infrastructure for some of the largest enterprise customers in the world. We help our customers in their journey to the public cloud by ensuring...