Reliability and Performance Specialist

2 days ago


Santa Clara, California, United States Selector Software, Inc. Full time

**Job Description**

We're seeking a highly skilled Reliability and Performance Specialist to join our team and help us deliver a world-class AIOps platform.

In this role, you'll be responsible for designing, implementing, and maintaining the platform's infrastructure using infrastructure-as-code (IaC) tools.

**Key Responsibilities:**

  • Design and implement infrastructure using IaC tools like Terraform or Ansible.
  • Automate software deployments and configuration management using GitOps practices and Kubernetes.
  • Configure and manage monitoring tools to identify and troubleshoot performance issues.
  • Develop and implement incident response procedures to ensure rapid resolution of service disruptions.

**Requirements:**

  • Bachelor's degree or higher in a relevant field.
  • Solid understanding of Kubernetes operations.
  • Experience with multi-node Kubernetes deployments on EKS, GKE, AKS, and RKE2.
  • Expertise in infrastructure automation tools like Terraform, Ansible, or Chef.
  • Proficiency in scripting languages like Python, Bash, or PowerShell.
  • In-depth knowledge of Linux operating systems.
  • Excellent troubleshooting and problem-solving skills.
  • Strong communication and collaboration skills.
  • Ability to work independently and as part of a team.

The estimated salary for this role is $150,000 - $180,000 per year, based on industry standards and location.



  • Santa Clara, California, United States Intel Full time

    Join us at Intel, where we're shaping the future of technology. We're seeking a highly skilled Reliability and Quality Assurance Specialist to join our team. As a member of the Pre-Silicon Design Quality and Reliability Engineer (Pre-Si QRE) group, you'll support the development of CPU and Hard IPs on the most advanced Intel processes. Key responsibilities...


  • Santa Clara, California, United States Pure Storage, Inc. Full time

    Drive product reliability at scale as a Software Engineering Manager for Pure Storage's Fleet Reliability Engineering team. Lead a group dedicated to ensuring the highest reliability of customer FlashBlade systems. Your mission will be to manage and improve systems and processes that monitor and respond to fleet reliability issues, whether through reactive,...


  • Santa Barbara, California, United States AppFolio Full time

    We're innovators, changemakers, and collaborators at AppFolio. We deliver magical experiences for our customers by pioneering cloud and AI technology. As a key member of our team, you'll help build common infrastructure and improve the reliability, quality of services, and observability patterns. You'll collaborate with engineering teams to enhance the...


  • Santa Clara, California, United States Intel Full time

    Intel is at the forefront of innovation, driving progress in the world of technology. We are committed to enriching the lives of every person on earth. The Client Computing Group (CCG) is responsible for driving business strategy and product development for Intel's PC products and platforms. This role will be part of the Pre-Silicon Design Quality and...


  • Santa Clara, California, United States Advanced Micro Devices, Inc. Full time

    About the Role: We are seeking a highly skilled Data Center Performance Optimization Specialist to join our team. In this role, you will be responsible for ensuring that AMD Instinct GPU-accelerated systems operate at peak performance before being deployed to solve the world's most challenging problems. Key Responsibilities: Define performance suites and best...

  • AMD Software Engineer

    4 weeks ago


    Santa Clara, California, United States Advanced Micro Devices, Inc. Full time

    About the Role: We are seeking a highly skilled software engineer to join our team as a performance optimization specialist. This is an exciting opportunity to work with our cutting-edge hardware and software technology to improve the performance of key applications and benchmarks. Key Responsibilities: Collaborate with our architecture specialists to design...


  • Santa Clara, California, United States Roche Holdings Inc. Full time

    Responsibilities: The Principal DevOps Engineer will lead the design automation and perform deployment of various algorithms on dev, test, and production environments. This role requires excellent teamwork and mentoring skills, with a track record of guiding cross-functional teams to achieve DevOps automation goals and enhance overall productivity. The...


  • Santa Clara, California, United States OmniVision Technologies Full time

    OmniVision Technologies, a leading CMOS Image Sensor Manufacturer, is seeking a highly skilled Staff Reliability Engineer to join its team in Santa Clara, CA. About the Role: We are looking for an exceptional engineer with expertise in reliability systems to help us design and develop high-quality image sensors. As a Staff Reliability Engineer, you will play a...


  • Santa Clara, California, United States SiTime Corporation Full time

    Job Overview: SiTime Corporation is a leading precision timing company. Our semiconductor MEMS programmable solutions offer high performance, smaller size, lower power and better reliability. This role will be responsible for the growth and support of HW development that is critical to SiTime's test infrastructure. Key Responsibilities: Manage vendors to execute...


  • Santa Clara, California, United States Macom Technology Solutions Holdings, Inc. Full time

    Job Description: MACOM is seeking a highly skilled High Performance Analog ICs Specialist to join our team. As an expert in high-speed ICs for optical media access and signal transport, you will provide expert-level engineering support to customers, sales, FAEs, and marketing. Key Responsibilities: Specialize in MACOM's High-Performance Analog ICs and their...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About the Job: Palo Alto Networks is seeking an experienced Reliability Engineer to join our team. The ideal candidate will have a strong background in reliability engineering and networking products. The successful candidate will be responsible for establishing controls and document procedures related to NPI product quality and reliability, aiding Development...


  • Santa Clara, California, United States Apple Full time

    Company Overview: We are a leading technology company dedicated to creating innovative products that empower people worldwide. Our mission is to drive progress and improve lives through groundbreaking technologies. Salary: The base pay for this role ranges from $175,800 to $312,200, depending on your skills, qualifications, experience, and location. You'll also...


  • Santa Monica, California, United States System One Full time

    **Job Title:** Medical Device Reliability Specialist. We are seeking a highly skilled Medical Device Reliability Specialist to join our team at System One. This role is ideal for an experienced engineer who wants to make a real impact in the medical device industry. **Estimated Salary:** $95,000 - $115,000 per year (dependent on experience). **About the Role:** In...


  • Santa Clara, California, United States Pure Storage, Inc. Full time

    About the Role: Pure Storage, Inc. is a leading technology company dedicated to delivering innovative solutions for data storage and management. Job Description: We are seeking an experienced Observability and Site Reliability Engineer to join our team in Santa Clara, CA. As an Observability and SRE Engineer, you'll be responsible for managing and enhancing the...


  • Santa Clara, California, United States Celestial AI Full time

    About Celestial AI: Celestial AI is at the forefront of a technological revolution in data center infrastructure. As Generative AI continues to advance, the performance drivers are shifting from systems-on-chip (SOCs) to systems of chips. In the era of Accelerated Computing, data center bottlenecks are no longer limited to compute performance, but rather the...


  • Santa Clara, California, United States Palo Alto Networks Full time

    About Palo Alto Networks: Our mission is to protect our digital way of life. We strive to be the cybersecurity partner of choice, working tirelessly to safeguard our customers' public cloud workloads with resilient, scalable, and always-on firewall solutions. We're a team of innovators who challenge the status quo and drive meaningful change in the...


  • Santa Clara, California, United States Palo Alto Networks Full time

    We are looking for a highly skilled Cloud Service QA Specialist to join our team at Palo Alto Networks. As a Cloud Service QA Specialist, you will be responsible for ensuring the quality and reliability of our cloud-based services. The ideal candidate will have a strong background in software testing and performance engineering, with experience working with...


  • Santa Clara, California, United States PayNearMe Full time

    About Us: At PayNearMe, we're revolutionizing the way businesses accept, disburse, and manage payments. Our cutting-edge technology enables seamless payment experiences, reducing costs and increasing acceptance rates. We're a dynamic team of over 200 employees, headquartered in Silicon Valley, with a passion for innovation and excellence. We're committed to...


  • Santa Clara, California, United States City of Santa Clara, CA Full time

    Job Title: Electrical Systems Specialist. Job Summary: We are seeking a skilled Electrical Systems Specialist to join our team at the City of Santa Clara, CA. As an Electrical Systems Specialist, you will be responsible for maintaining and repairing high voltage substation equipment, power generation equipment, and industrial manufacturing facilities. About Us:...


  • Santa Clara, California, United States AdvancedPCB Full time

    Job Overview: AdvancedPCB is seeking a skilled Electrical Test Specialist to perform electrical testing on printed circuit boards to ensure functionality and compliance with quality standards. Key Responsibilities: Perform electrical testing on printed circuit boards. Ensure compliance with quality standards. Collaborate with management to ensure appropriate...