Manager, Site Reliability Engineering

1 week ago


Oakland, United States Fivetran Full time

Overview

From Fivetran's founding until now, our mission has remained the same: to make access to data as simple and reliable as electricity. With Fivetran, customer data arrives in their warehouses, canonical and ready to query, with no engineering or maintenance required. We're proud that more organizations continue to leverage our technology every day to become truly data-driven.

About the Role

Fivetran is building data pipelines to power the modern data stack for thousands of companies. As a Manager of Site Reliability Engineering, you will take on the responsibility for the Serbia-based group of SRE Engineers. Together with other SRE managers and engineers in Ireland, India, and the US, you will take ownership of the reliability of Fivetran's service, including building and monitoring repeatable infrastructure, reliability, and robustness of the continuously deployed release pipeline, as well as timely and effective incident response and resolution. You will co-own the responsibility for the scalability and reliability of Fivetran's connector infrastructure on AWS, GCP, and Azure. You will bring together and grow a Serbia-based team that reliably delivers excellent results while maintaining a culture of strong collaboration, engagement, and continuous improvement.

This is a full-time position based out of our Novi Sad office. Our hybrid work model offers a blend of remote flexibility and in-person collaboration, including two days in the office each week to connect and build as a team.
Technologies You'll Use

• Cloud Providers: AWS, Azure, Google Cloud Platform (GCP)
• Kubernetes: EKS, AKS, GKE (managed services)
• CI/CD: Buildkite, ArgoCD
• Databases: PostgreSQL, Cloud Datastore
• Programming Languages: Go, Java
• Scripting: Python, Shell
• Infrastructure as Code (IaC): Terraform, Pulumi
• API Frameworks: FastAPI (RESTful APIs)
• Cloud Networking: PrivateLinks (AWS, Azure), Private Service Connect (GCP), site-to-site VPNs across major cloud providers
• Monitoring & Observability: Grafana

What You'll Do

Leadership and Talent Management

• Build, hire, and plan the growth of the Serbia-based SRE organization
• Help engineers advance in their careers; actively guide and coach them
• Set clear expectations and create a positive work environment based on accountability
• Establish strong global and cross-team relationships with product, field, software teams, and the other SRE teams around the world

SRE Subject Matter Expertise

• Drive initiatives that improve service reliability, scalability, and performance through automation, observability, and proactive problem-solving
• Advocate for simple, elegant, and easily scalable system design
• Support new services before they go live through activities such as system design consulting/review, capacity planning, and launch reviews
• Ability to be hands-on and willing to act as player-coach in SRE areas such as IaC, Observability & Alerting, and Release Management
• Demonstrate strong accountability for infrastructure cost management
• Optimize our continuous integration and deployment process, striving for safe, frequent, and automated releases
• Oversee incident management practices, ensuring timely response, effective/blameless postmortems, and systemic improvements
• Stay current with emerging technologies, tools, and industry best practices relevant to reliability engineering

Skills We're Looking For

• Experience in managing or leading a Site Reliability Engineering (SRE), DevOps, or Infrastructure Engineering team operating in a public cloud at scale
• Demonstrate significant working knowledge of Continuous Integration and Deployment processes and tooling
• Proven experience in cloud-based infrastructure design and IaC
• Strong understanding and experience in security control design, implementation, and operations
• Solid technical working experience on AWS, GCP, or Azure, distributed systems, networking, and container orchestration (Kubernetes)
• Deep understanding of reliability concepts, including monitoring/observability, capacity planning, and disaster recovery
• Experience leading incident response, root cause analysis, and reliability-focused postmortems
• Familiarity with cost optimization strategies in large-scale cloud environments
• Excellent leadership, communication, and stakeholder management skills
• Ability to iterate in the context of an evolving service environment
• Experience in managing changes and getting buy-in from the organization
• A passion for SRE/DevOps and running highly resilient/automated systems

(Optional) Bonus Skills

• Knowledge of compliance and security practices in production environments (SOC2, ISO27001, etc.)
• Experience with multi-cloud support
• Hands-on coding/scripting experience in languages such as Python, Go, or Java

Perks and Benefits

• 100% employer-paid medical insurance
• Generous paid time-off policy (PTO), plus paid sick time, inclusive parental leave policy, holidays, and volunteer days off
• RSU stock grants
• Professional development and training opportunities
• Company virtual happy hours, free food, and fun team-building activities
• Monthly cell phone stipend
• Access to an innovative mental health support platform that offers personalized care and resources in areas such as therapy, coaching, and self-guided mindfulness exercises for all covered employees and their covered dependents

Equal Opportunity Employer, including disability/protected veterans



