Current jobs related to Lead Site Reliability Engineer - San Jose, California - VDart


  • San Jose, California, United States Adobe Full time

    Job Title: Site Reliability Engineer. About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud services. You will work closely with our development team to design, deploy, and optimize our cloud services,...


  • San Jose, California, United States Diverse Lynx Full time

    Job Title: Site Reliability Engineer. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Diverse Lynx LLC. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our data pipelines. Key Responsibilities: * Debugging data pipelines * Monitoring alerts and troubleshooting...


  • San Jose, California, United States TikTok Full time

    About Team: Site Reliability Engineering at TikTok. TikTok's mission is to inspire creativity and bring joy. Our platform is built to help imaginations thrive, and our Site Reliability Engineering team plays a crucial role in making this happen. Responsibilities: Design and implement software platforms and monitor frameworks for efficient, automated, and...


  • San Jose, California, United States HireIO Inc Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at HireIO Inc. As a Site Reliability Engineer, you will be responsible for designing and developing solutions to automate the technical operations of large-scale systems, working closely with teams to improve stability from a Software Development Lifecycle...


  • San Francisco, California, United States Outdefine Full time

    About the Job: We are seeking a highly skilled Site Reliability Engineer to join our team at Outdefine. As a key member of our engineering team, you will be responsible for ensuring the reliability, scalability, and performance of our ecommerce platform. Key Responsibilities: Design and implement scalable and highly available cloud infrastructure using Kubernetes...


  • San Jose, California, United States Adobe Full time

    Job Title: Site Reliability Engineering Manager, AI Platform. About the Role: We are seeking an experienced Site Reliability Engineering Manager to lead our AI Inference Platform team at Adobe. As a key member of our Engineering organization, you will be responsible for developing and implementing strategies to ensure the reliability, scalability, and security...


  • San Francisco, California, United States Roman Health Pharmacy LLC Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Xero. As a key member of our Reliability Enablement team, you will play a critical role in ensuring the reliability and performance of our systems. Key Responsibilities: Investigate operational surprises and support teams in post-incident activities. Conduct in-depth...


  • San Jose, California, United States TikTok Full time

    Job Summary: TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. As a Site Reliability Engineer on our Compute Platform team, you will play a critical role in ensuring the reliability of all Big Data services and products across the company. Key Responsibilities: Responsible for the reliability of...


  • San Francisco, California, United States YO HR CONSULTANCY Full time

    Job Title: Site Reliability Engineer. Job Description: At YO HR CONSULTANCY, we are seeking a highly skilled Site Reliability Engineer to join our team. Key Responsibilities: * Extensive experience working with Linux flavors like RHEL/CentOS OS, shells, filesystems, and utilities * Knowledge of distributed computing and experience working with container...


  • San Francisco, California, United States GRNET Full time

    GRNET is seeking a highly skilled Site Reliability Engineer to join its team. As an SRE, you will be responsible for designing and implementing fault-tolerant, scalable, and distributed services. You will work closely with the team to bring your technical opinion and vision to the table, handle problems that require under-the-hood investigation, and lead...


  • San Jose, California, United States Tata Consultancy Services Full time

    Key Responsibilities: As a Site Reliability Engineer (SRE) at Tata Consultancy Services, you will be responsible for ensuring the reliability and scalability of our cloud infrastructure. This includes administration experience in Tableau, debugging data pipelines, and monitoring alerts using Grafana, Azure Metrics, Kusto Functions, and LogicApps. You will also...


  • San Jose, California, United States TikTok Full time

    Job Summary: TikTok is seeking a highly skilled Site Reliability Engineer to join our Compute Platform team. As a key member of our team, you will be responsible for ensuring the reliability and performance of our Big Data services and products. Responsibilities: Design and implement proactive measures to prevent service disruptions and ensure high availability...


  • San Jose, California, United States Altius Technologies, Inc. Full time

    Job Summary: At Altius Technologies, Inc., we are seeking a highly skilled Site Reliability Performance Engineer to join our team. As a key member of our infrastructure team, you will be responsible for creating and supporting automation scripts for infrastructure deployments, validations, and monitoring. Your expertise in Ansible, Python, and monitoring tools...


  • San Jose, California, United States Adobe Full time

    Job Title: Site Reliability Engineer, AI Platform Training. Job Summary: We are seeking a highly skilled Site Reliability Engineer to join our team at Adobe. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and security of our AI Platform. About the Role: * Identify and implement methodologies and solutions to...


  • San Leandro, California, United States Omni Inclusive Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Omni Inclusive. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and availability of our Digital Sales & Marketing platforms. Key Responsibilities: Collaborate with Engineering teams to maintain the...


  • San Leandro, California, United States Omni Inclusive Full time

    About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Omni Inclusive. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, performance, and availability of our Digital Sales & Marketing platforms. Key Responsibilities: Design, implement, and maintain scalable and efficient systems to...


  • San Jose, California, United States Adobe Full time

    Transforming Digital Experiences with Adobe: We're a company that's passionate about empowering people to create beautiful and powerful digital experiences. Our mission is to give everyone the tools they need to design and deliver exceptional experiences across every screen. The Opportunity: We're seeking an exceptional Site Reliability Engineering Manager to...


  • San Francisco, California, United States Withorb Full time

    About Us: Orb is a cutting-edge technology company on a mission to revolutionize the way businesses approach revenue growth. Our team is passionate about building a robust infrastructure that enables our customers to unlock their full potential. Job Description: We are seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our...


  • San Jose, California, United States Zscaler Full time

    About Zscaler: Zscaler is a leading cloud security company that serves thousands of enterprise customers worldwide, including 40% of Fortune 500 companies. Our mission is to make the cloud a safe place to do business and a more enjoyable experience for enterprise users. As the operator of the world's largest security cloud, Zscaler accelerates digital...


  • San Francisco, California, United States Orb Full time

    About the Role: Orb is seeking a skilled Site Reliability Engineer to join our team. As a key member of our engineering organization, you will play a critical role in maintaining and scaling our robust infrastructure, ensuring stability, scalability, and performance. You will be responsible for tackling complex engineering challenges, from scaling our data...

Lead Site Reliability Engineer

1 month ago


San Jose, California, United States VDart Full time
Job Title: Lead Site Reliability Engineer
Location: San Jose, CA (2 Days Hybrid)
Duration: 6+ months

Job Description:
Experience Desired: 14+ Years.

Responsibilities:
We are seeking a highly skilled and dynamic Site Reliability Engineer to join our team. In this role, you will be responsible for maintaining and improving the reliability, performance, and availability of software systems.

Key Responsibilities:
* Creating and supporting automation scripts for infrastructure deployments, validations, and monitoring to improve operational tasks
* Scheduling monitoring scripts using cron and Airflow
* Monitoring using tools including Dynatrace, Apica, and Grafana
* Database handling
* Building CI/CD pipelines
* Incident handling and problem management

Mandatory Skills:
* Experience in Ansible/Python
* Monitoring Tools - Dynatrace/Apica/Grafana

Required Education:
Bachelor's degree in computer science or a related field.

Required Experience:
* 14 plus years of IT Infrastructure experience
* Extensive experience working with Linux flavors like RHEL/CentOS, shells, filesystems, and utilities
* Experience with Python and Ansible
* Knowledge of distributed computing and experience working with container orchestration frameworks including on-prem and Rancher Kubernetes
* Experience working with storage (ONTAP preferred): volumes, aggregates, backups, DR planning
* Experience scheduling monitoring scripts using cron and Airflow
* Experience with monitoring tools including Dynatrace, Apica, Grafana, etc.
* Database knowledge including SQL and NoSQL DBs
* Experience building CI/CD pipelines (preferred)
* Cloud platform knowledge (specifically AWS) is required

Key Skills: SRE, AWS, Python, Monitoring Tools - Dynatrace/Apica/Grafana