Senior Site Reliability Engineer

2 weeks ago


San Jose, California, United States VDart Inc Full time
Job Overview

Position: Lead Site Reliability Engineer

Location: San Jose, CA (Hybrid Work Model)

Contract Duration: 6+ months

Experience Required: 14+ Years

Role Summary:

We are seeking a highly experienced, proactive Site Reliability Engineer Consultant for this role.

Key Responsibilities:

  • Enhancing the reliability, performance, and uptime of software systems.
  • Acting as a liaison between traditional IT operations and software development, applying a software engineering mindset to system administration.
  • Developing and maintaining automation scripts (Shell, Ansible, Python) for infrastructure deployment, validation, and monitoring to streamline operational tasks.
  • Scheduling monitoring scripts using cron and Apache Airflow.
  • Utilizing monitoring tools such as Dynatrace, Apica, and Grafana for system oversight.
  • Managing databases effectively.
  • Constructing CI/CD pipelines.
  • Handling incidents and managing problems efficiently.

Essential Skills:

  • Proficiency in Ansible and Python.
  • Experience with monitoring tools including Dynatrace, Apica, and Grafana.

Educational Background:

A Bachelor's degree in Computer Science or a related discipline is required.

Professional Experience:

  • 14+ years of experience in IT infrastructure.
  • Extensive knowledge of Linux distributions such as RHEL/CentOS, including shells, filesystems, and utilities.
  • Strong Python programming and Ansible automation skills.
  • Familiarity with distributed computing and experience with container orchestration, including on-premises Kubernetes and Rancher-managed Kubernetes, along with a solid understanding of Kubernetes objects.
  • Experience with storage solutions, preferably ONTAP, including volume management, aggregates, backups, and disaster recovery planning.
  • Proficiency in scheduling monitoring scripts using cron and Apache Airflow.
  • Database expertise, including SQL and NoSQL databases.
  • Experience in building CI/CD pipelines is preferred.
  • Knowledge of cloud platforms, specifically AWS, is essential.

Key Competencies:

SRE, AWS, Python, Monitoring Tools (Dynatrace, Apica, Grafana)


