Site Reliability Engineer

3 weeks ago


Santa Clara, United States VeeAR Projects Inc. Full time

Position: Site Reliability Engineer

Location: Remote role

Duration: 12+ month contract with possible extension



Job Description:

We seek development-heavy Site Reliability Engineers to design, build, maintain, and scale production services and server farms within our FedRAMP SASE product portfolio. We want passionate engineers who bring new ideas to all facets of DevOps. We are looking for leaders who take ownership of their focus areas and are driven to solve problems at every level. Collaboration and partnership are at the foundation of our culture, and we need engineers who can communicate at a high level and work as a team toward achieving a common goal.


Your Impact

  • Write Terraform and Terragrunt code.
  • Build automation in Python or Go.
  • Build BGP and network monitoring and remediation tools.
  • Engage with customers on escalations to provide remediation.
  • Software Architecture and Scalability
      • Design and enhance software architecture to improve the scalability of networking (such as BGP and OSPF), service reliability, capacity, and performance.
      • Collaborate with development teams to ensure applications align with infrastructure requirements, focusing on scalability and reliability.
  • Automation and Infrastructure Provisioning
      • Write automation code for provisioning and operating infrastructure at massive scale.
      • Work with Dev/QA teams to build pipelines and automation for delivering and deploying applications to production.
  • On-call Support and Incident Resolution
      • Participate in occasional on-call rotations to support the infrastructure.
      • Investigate incidents, formulate hypotheses, and identify root causes to resolve issues promptly.
      • Write postmortem reviews and provide remediation recommendations.
  • Cross-Functional Collaboration
      • Provide technical assistance to Systems Administrators (SA), Systems Engineers (SE), Customer Support (CS), and Professional Services (PS) teams regarding the product.
      • Identify missing product features and communicate them to the Product Management (PM) teams.
  • Customer Interaction and Collaboration
      • Work with external parties and clients, participating in Proof of Concept (POC) and Proof of Value (POV) activities with SEs and SASE architects for customers.
      • Conduct customer training sessions and technical webinars.
      • Identify gaps and collaborate with PMs to make features accessible to customers.
  • Continuous Improvement
      • Collaborate with PMs to characterize new features and establish a vision for the product's evolution.
      • Actively seek ways to enhance the infrastructure, streamline processes, and improve overall system efficiency.


Your Experience

  • A bachelor's degree or higher in computer science, engineering, or a related field, or equivalent military experience, is required.
  • CCIE in routing and switching.
  • Strong knowledge of IPv6, NAT64, and IPv6 subnetting.
  • Proven experience in designing, implementing, and maintaining scalable and reliable infrastructure.
  • Strong proficiency in automation scripting and infrastructure as code (IaC).
  • Excellent problem-solving skills and the ability to troubleshoot complex issues.
  • Effective communication skills, both written and verbal.
  • Experience working in collaborative, cross-functional environments.
  • Demonstrated ability to lead and mentor teams.
  • Python/Go programming.


  • Santa Clara, United States Veear Full time

    Position: Site Reliability Engineer Location: Remote role Duration: 12+ Months Contract with possible extension Job Description: We seek development-heavy Site Reliability Engineers to design, build, maintain, and scale production services and server farms within our FedRAMP SASE product portfolio. We want passionate engineers who bring new ideas to all...


  • Santa Clara, United States NVIDIA Full time

    NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and outstanding people. Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers,...


  • Santa Clara, United States Centrify Corporation Full time

    Our software runs on public clouds with 99.9% or better uptime and is mission critical for our customers. Our cloud operations team is where the rubber meets the road and needs innovative Site Reliability Engineers. Join a professional team of smart and hard-working professionals building enterprise-class cloud-based services in the rapidly growing market of...


  • Santa Clara, California, United States Promote Project Full time

    About Promote Project: Promote Project is a leader in innovative technology solutions, dedicated to pushing the boundaries of what is possible in the realm of artificial intelligence and cloud computing. Our commitment to excellence is reflected in our talented workforce and our pursuit of groundbreaking advancements. Position Overview: We are seeking a...


  • Santa Clara, California, United States Promote Project Full time

    About the Company: Promote Project is at the forefront of innovation, leveraging cutting-edge technology to redefine the landscape of AI and computing. Our mission is to harness the power of advanced computing to create transformative solutions that impact various industries. Position Overview: We are seeking a Manager of Site Reliability Engineering to...


  • Santa Clara, California, United States Palo Alto Networks Full time

    Job Overview. Company Overview. To comply with U.S. federal government requirements, U.S. citizenship is required for this position. Our Mission: At Palo Alto Networks, our mission is clear: to be the cybersecurity partner of choice, safeguarding our digital existence. We envision a world where each day is safer and more secure than the last. Our foundation is built...


  • Santa Clara, California, United States ServiceNow Full time

    Company Overview: At ServiceNow, we harness technology to create a better world for everyone, driven by our talented workforce. We prioritize speed and innovation to meet the demands of our customers and communities. Joining ServiceNow means becoming part of a dynamic team of innovators who possess a relentless curiosity and a commitment to creativity. We...


  • Santa Clara, California, United States ServiceNow Full time

    Company Overview: At ServiceNow, we harness technology to enhance global operations, and our dedicated workforce makes it all possible. We operate swiftly because the world demands it, innovating uniquely for our clients and communities. By becoming part of ServiceNow, you join a dynamic team of innovators who possess a relentless curiosity and a passion for...


  • Santa Clara, United States Diverse Lynx Full time

    Skills: Site Reliability Engineering (SRE), Git (Bitbucket), Jenkins, AWS CodeBuild, AWS CodeDeploy. Job Description: AWS application and CI/CD pipelines, Microsoft Server admin and workload support (data center and AWS). Initial responsibility is application platform promotion to controlled environments for test, staging, and production AWS accounts...


  • Santa Clara, United States Palo Alto Networks Full time

    Principal Site Reliability Engineer (SASE). Full-time. Job Country: United States of America. To comply with U.S. federal government requirements, U.S. citizenship is required for this position. Our Mission: At Palo Alto Networks, everything starts and ends with our mission: being the cybersecurity partner of choice, protecting our digital way of life. Our...


  • Santa Clara, United States Geospatial And Cloud Analytics Inc Full time

    Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build, and maintain large-scale production systems with high efficiency and availability using a combination of software and systems engineering practices. This is a highly specialized discipline that demands knowledge across different systems, networking, coding, database,...


  • Santa Clara, California, United States Promote Project Full time

    About the Company: Promote Project is at the forefront of innovation, focusing on redefining technology and enhancing the capabilities of AI. We are dedicated to creating groundbreaking solutions that push the boundaries of what is possible in computing. Position Overview: We are seeking a Manager for Site Reliability Engineering to spearhead our cloud...


  • Santa Clara, United States Palo Alto Networks Full time

    Job Description. Your Career: The Global Customer Operation Team is responsible for building products that protect data, workloads, and infrastructure for some of the largest enterprise customers in the world. We help our customers in their journey to the public cloud by ensuring they have best-in-class protection. The public cloud market has been...


  • Santa Clara, California, United States Centrify Corporation Full time

    About Centrify Corporation: Centrify Corporation is a leading provider of cloud-based identity and access management solutions. Our software runs on public clouds with 99.9% or better uptime and is mission critical for our customers. Job Summary: We are seeking a highly skilled Cloud Site Reliability Engineer to join our Cloud DevOps team. As a Cloud Site...


  • Santa Clara, United States Nvidia Full time

    Senior Site Reliability Engineer - Storage. Locations: US, CA, Santa Clara. Time type: Full time. Job requisition ID: JR1979072. NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and...


  • Santa Clara, United States Palo Alto Networks Full time

    Company Description. Our Mission: At Palo Alto Networks, everything starts and ends with our mission: being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done,...


  • Santa Clara, California, United States Nvidia Full time

    NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables unique creativity and discovery, and powers what were...

  • Reliability Engineer

    4 weeks ago


    Santa Clara, United States Siri InfoSolutions Inc Full time

    Job Description: Reliability Engineer. Santa Clara, California, United States (On-site). Work in the Board Level Reliability lab environment and set up functional test hardware and software for various products, including large server systems, and perform various functional tests for GPU/Tegra products. Generate script for automated test...
