Senior Site Reliability Engineer, PLM Operations

1 week ago


Palo Alto, California, United States | Tesla | Full time
Job Summary

We are seeking a highly skilled Senior Site Reliability Engineer to join our PLM Operations team at Tesla. As a key member of our team, you will be responsible for ensuring the reliability and performance of our engineering design tools, which are used by thousands of Tesla engineers worldwide.

Responsibilities
  • Define Service Level Objectives (SLOs) around latency, traffic, errors, and saturation to ensure reliability and performance.
  • Maintain Tesla-custom Helm Charts to deploy highly customized and evolving 3DExperience services running on on-prem Kubernetes.
  • Modernize our deployment infrastructure using custom GitHub Actions, ArgoCD, Atlantis, and Terraform.
  • Keep services performing well using tools such as Prometheus, Grafana, Catchpoint, Splunk, and OpsGenie.
  • Participate in an on-call rotation, manage incidents as Incident Commander, and write actionable incident reports.
  • Manage tasks via Jira for observability and human capacity planning, maintaining excellent Jira hygiene.
  • Write and review design documents, including testing frameworks, deployment models, environment definitions, etc.
Requirements
  • Deep networking experience, including troubleshooting outages from L7 to L3 and contributing to infra or networking GitHub repos or publications.
  • Deep Oracle Database experience, including indexing deltas and schema migrations.
  • Docker/Kubernetes experience, including performing in-situ kubelet upgrades and using skopeo or CRI-O intentionally.
  • Experience diagnosing problems in legacy enterprise Java stacks.
  • Experience installing, managing, or using 3DExperience, or similar experience with other PLM software.
  • Outstanding experience with scientific computing or LIMS.
  • Deep understanding of hypervisor technology (VMware).
Benefits

As a full-time Tesla employee, you will be eligible for the following benefits from day 1 of hire:

  • Aetna PPO and HSA plans with $0 payroll deduction.
  • Family-building, fertility, adoption, and surrogacy benefits.
  • Dental and vision plans with options including $0 paycheck contribution.
  • Company-paid HSA contribution when enrolled in the High Deductible Aetna medical plan with HSA.
  • Healthcare and Dependent Care Flexible Spending Accounts (FSA).
  • LGBTQ+ care concierge services.
  • 401(k) with employer match, Employee Stock Purchase Plans, and other financial benefits.
  • Company-paid Basic Life, AD&D, short-term, and long-term disability insurance.
  • Employee Assistance Program.
  • Sick and Vacation time (Flex time for salary positions), and Paid Holidays.
  • Back-up childcare and parenting support resources.
Voluntary Benefits

Optional benefits include:

  • Critical illness, hospital indemnity, accident insurance, theft & legal services, and pet insurance.
  • Weight Loss and Tobacco Cessation Programs.
  • Tesla Babies program.
  • Commuter benefits.
  • Employee discounts and perks program.
Expected Compensation

$104,000 - $348,000 annual salary, depending on level, plus cash and stock awards and benefits. Pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience.


