Senior Site Reliability Engineer

2 days ago


Palo Alto, California, United States Rubrik Full time
About the Role

Rubrik is seeking a highly skilled Senior Site Reliability Engineer. As a key member of our engineering team, you will be responsible for ensuring the high availability and durability of our databases, establishing best practices for internal teams to write performant SQL queries, and performing periodic database upgrades with minimal downtime for our customers.

Key Responsibilities
  • Design, implement, and maintain relational database systems for performance and reliability
  • Manage and run backend systems like Kubernetes, MySQL, and everything in between
  • Drive reliability, availability, and efficiency improvements to Rubrik's Polaris Cloud Platform
  • Participate in on-call rotations across continents, using a follow-the-sun model
  • Write and review code, plan and execute upgrades, develop documentation and capacity plans, and debug production issues
  • Work cross-functionally with various engineering teams
  • Build monitoring tools and automation to increase efficiency of all teams
Requirements
  • 8+ years of experience designing and managing relational databases with a focus on performance, scalability, reliability, high availability, and disaster recovery
  • Experience in database design and architecture supporting large enterprise customers with high SLO and SLA requirements
  • Experience operating the database layer of a large-scale SaaS product
  • Experience in one or more of the following: Golang, Python, Java, Scala, C++
  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
  • Ability to debug and optimize code and automate routine tasks
  • Strong operational experience with Unix/Linux operating systems and networking
  • Experience with Google Cloud Platform or other public cloud technologies
  • Minimum 3-5 years of experience as a Development, DevOps, or Site Reliability Engineer
  • Willingness to provide 24/7 coverage
  • Strong documentation skills
  • Experience working with multiple departments and divisions within an organization
  • Strong understanding of databases is a definite plus
  • Experience leading support personnel
  • Experience with FedRAMP certification is strongly desired
Additional Requirements

Due to the criteria and security levels for Rubrik's FedRAMP program, this position requires the following:

  • U.S. citizenship at the time of hire
  • Residence within the contiguous United States (i.e., the lower 48 states and the District of Columbia)
  • Willingness to undergo a Single Source Background Investigation if required
Security and Privacy Responsibilities

This position carries special Security and Privacy Responsibilities for protecting the U.S. Federal Government's interests:

  • Know, acknowledge, and follow system-specific security policies and procedures
  • Protect data and individual privacy per requirements and regulations
  • Perform ongoing activities in compliance with service and contractual obligations
  • Participate in role-based training, completing assignments on a timely basis
  • Report security issues promptly, and aid investigation when needed
  • Support controlled changes and vulnerability remediation activities
  • Work collaboratively with Information Security in designing, implementing, assessing, or enhancing system-specific security and privacy controls
Position Risk Designation

This position carries duties and responsibilities involving the U.S. Federal Government's interests. The selected incumbent may be subject to one or both of the additional background checks with periodic re-screening as noted below:

Position Risk Designation: Non-Sensitive, Low Risk, Tier 1

Incumbents without access to U.S. Government data may be required to complete Standard Form 85 and undergo a Tier 1 (T1) Investigation for non-sensitive positions of Low Risk (baseline screening; formerly National Agency Check and Inquiries, or NACI).

Position Risk Designation: Non-Sensitive, Moderate Risk, Tier 2 (Public Trust)

Incumbents with access to U.S. Government data may be required to complete Standard Form 85P and undergo a Tier 2 (T2) Investigation for non-sensitive positions designated Moderate Risk.

Position Risk Designation: Moderate Risk Law Enforcement (CJIS)

When hired for a position where access to Moderate Risk criminal justice information is required, the employee must complete a fingerprint-based national criminal history background check within 30 days after the employee's start date.


