
Sr. Manager of SRE Operations

3 months ago


San Jose, United States Zededa Full time

ZEDEDA is a simple and scalable cloud-based IoT edge orchestration solution that delivers visibility, control and security for the distributed edge, with the freedom of deploying and managing any app on any hardware at scale and connecting to any cloud or on-premises system. With ZEDEDA, customers can seamlessly manage and deploy any compute node to instantly unlock the value of IoT data, make real-time decisions, maximize operational efficiency and drive new business outcomes. We are looking for an experienced Senior Site Reliability Engineer (SRE) who is seeking new challenges and wants to make their mark by contributing to the design and upkeep of an exciting start-up.

Reporting to the VP of Engineering, the Sr. Manager of SRE Operations is responsible for ensuring the availability of our SaaS platform and exceeding the uptime and performance requirements of our Fortune 500 customers. Together with the SRE Operations team, you will implement processes and procedures that ensure the quality and predictability of disaster recovery, performance monitoring, alerting and reporting. ZEDEDA is ISO 27001 and SOC 2 certified, which means that incidents need to be handled according to those standards. As the lead of the team, you will play a key role in ensuring the team performs beyond expectations, and you will assist in growing the team. On-call responsibility is part of the role, as is implementing a strategy that supports 24 x 7 x 365 availability of the SRE Operations team. You will also be the initial escalation point for incidents and are responsible for ensuring they get resolved, including other teams if needed.

You will work with the SRE Technical Lead and team, as well as other groups in engineering, to suggest and implement improvements for operating the platform. Regular reporting on the performance of the platform to upper management is expected. This is a hands-on role and you will perform your duties as part of the SRE Operations Team. You will interface with the Customer Experience Organization and, when required, meet with our customers. You are an energetic self-starter, fully committed to our customers' success by putting yourself in our customers' shoes and constantly striving to make sure they can use our product at all times by:

* Creating ecstatic customers
* Ensuring frictionless deployments
* Managing escalations
* Performing on-call duties
* Radiating energy and enthusiasm
* Being a (technical) leader to the team

Qualifications

* MS in Computer Science, Information Technology or similar experience
* 10+ years of experience in SRE, with 5+ years of experience in an SRE Operating Lead role
* Leadership qualities and aspirations
* Project and escalation management skills
* Proven technical writing skills
* Excellent verbal and written communication skills (English)

Requirements

* An infrastructure with global presence in USA, EMEA, China and GovCloud
* A large, complex infrastructure with 20+ SaaS instances, 500+ VMs, 100+ databases, 10+ logging services
* Meeting SLOs and creating robust and insightful metrics for large infrastructures and multiple SaaS instances
* Capacity planning of a complex solution with 50k+ connected devices
* Continuously driving cost down to maintain a competitive advantage
* Managing a successful 24x7x365 on-call team and being the point of escalation
* Implementing a structured incident management approach, from the start of an incident through resolution to root cause analysis
* Industry standards compliance: ISO 27001, SOC 2
* Strong leadership skills with the ability to coach and hire A-players, and to foster a culture of continuous improvement and automation.
* Putting security at the center of everything you do.
* Hands-on knowledge of:
  * AWS, Azure or GCP
  * Terraform, Ansible
  * Python, Shell script
  * (managed) Kubernetes, ArgoCD
  * GitOps, Jenkins, GitHub Actions
  * Datadog, Grafana Stack and OpenTelemetry
  * PostgreSQL, Redis, HashiCorp Vault, InfluxDB and OpenSearch
  * Lacework, Blameless, Vanta

Pay & Benefits

ZEDEDA's main compensation philosophy is to provide you with the opportunity to progress as you grow and develop with the company. The base pay range for this role, dependent on your skills, qualifications, experience and location, is between $175,000 and $200,000. Total compensation also includes commission, equity and benefits components.

