Senior Bilingual Site Reliability Engineer

4 weeks ago


San Diego, California, United States IntelliPro Group Inc. Full time
Job Description

We are seeking a highly skilled Site Reliability Engineer to join our team at IntelliPro Group Inc. As a Site Reliability Engineer, you will be responsible for maintaining and optimizing our robust database infrastructure, leveraging automation to ensure reliability, performance, and security. You will design scalable solutions that meet our expanding data and business needs.

Key Responsibilities:

  • Collaborate with cross-functional teams to ensure the right toolsets are in place for generating, collecting, analyzing, and visualizing operational data, and for alerting on it.
  • Own and operate critical open-source services, including Elasticsearch, Kafka, RabbitMQ, and Redis.
  • Design and build tools that improve observability, system resiliency, and platform performance.
  • Proactively manage and triage site availability incidents, working to minimize mean time to recovery (MTTR) for critical customer-impacting events.
  • Partner with service owners to define and implement Service Level Metrics (SLMs) and Service Level Objectives (SLOs).
  • Document technical processes, network diagrams, and runbooks to enhance efficiency and improve the reliability of the infrastructure.
  • Participate in a 24/7/365 on-call rotation to ensure continuous system availability.

Requirements:

  • Bachelor's degree in Computer Science, Information Systems, or a related field (or foreign equivalent).
  • 4-7 years of experience with mission-critical, real-time, high-traffic applications in cloud environments.
  • Proficiency in cloud systems, continuous integration, Java, SQL/NoSQL databases, and observability tools such as Grafana, Prometheus, or Zabbix.
  • Experience in scripting/programming (Python, Go) and container technologies such as Docker, Kubernetes, or Mesos.
  • Knowledge of open-source technologies (Elasticsearch, Kafka, Redis) is essential.

Benefits:

  • Competitive bonus and RSU offerings.
  • Comprehensive healthcare (medical, dental, vision, prescription).
  • Health Savings Account with employer contributions.
  • Flexible Spending Accounts for healthcare and dependent care.
  • Company-paid life and disability insurance.
  • Voluntary benefits (Critical Illness, Accident, Hospital Indemnity).
  • Employee Assistance Program (EAP) and Business Travel Accident Insurance.
  • 401(k) plan with discretionary company match and financial advisory access.
  • Generous paid time off (vacation, holidays, sick days, and floating holidays).
  • Employee discounts and free weekly catered lunch.
  • Dog-friendly office and gym access (in select locations).
  • Free snacks, beverages, and company swag.
  • Invitations to company events and annual holiday parties.


  • San Francisco, California, United States Tampa Gardens Senior Living Full time

    About the Role: We are seeking a highly skilled Senior Site Reliability Engineer to join our Cloud Infrastructure Team. As a key member of our team, you will be responsible for deploying, managing, optimizing, and upgrading the systems that run Sight Machine software. You will work closely with our Development Engineering team to ensure the stability,...


  • San Diego, California, United States Intellipro Group Full time

    We are seeking a highly skilled Site Reliability Engineer to join our team at IntelliPro Group. As a key member of our SRE team, you will be responsible for maintaining and optimizing our robust database infrastructure, leveraging automation to ensure reliability, performance, and security. Key Responsibilities: Collaborate with cross-functional teams to...


  • San Francisco, California, United States Astranis Full time

    Astranis Mission: Astranis is revolutionizing global connectivity by developing the next generation of smaller, more cost-effective spacecraft. Our mission is to bridge the digital divide and connect the four billion people worldwide who lack internet access. Job Summary: We are seeking a highly motivated and experienced Senior Site Reliability Engineer to join...


  • San Diego, California, United States Qualcomm Full time

    Job Title: Site Reliability Engineer. At Qualcomm, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the stability, sustainability, and security of our infrastructure and services. Key Responsibilities: Monitor system health and detect anomalies to prevent service...


  • San Diego, California, United States Insight Global Full time

    Job Title: Site Reliability Engineer. We are seeking a highly skilled Site Reliability Engineer to join our team at Insight Global. As a Site Reliability Engineer, you will play a critical role in ensuring the high availability and performance of our cloud-based systems. Key Responsibilities: Design and implement scalable and highly available cloud...


  • San Diego, California, United States Commserve Technologies Inc Full time

    Job Title: Site Reliability Engineer. At Commserve Technologies Inc, we are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and performance of our enterprise-level applications. Key Responsibilities: Configure, architect, and maintain...


  • San Diego, California, United States BAE Systems USA Full time

    Job Description: At BAE Systems USA, we're seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the seamless delivery of our cloud-based services. Key Responsibilities: Work collaboratively with cross-functional teams to design, implement, and maintain scalable and reliable...


  • San Diego, California, United States BAE Systems USA Full time

    Job Description: BAE Systems USA is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our cloud-based systems. Key Responsibilities: Design and implement robust automation solutions to streamline infrastructure deployment and...


  • San Diego, California, United States Becton, Dickinson & Company Full time

    About the Role: A Site Reliability Engineering Manager at Becton, Dickinson & Company is responsible for ensuring the smooth operation of complex systems and services. They oversee a team of Site Reliability Engineers to maintain infrastructure, handle incident response, and implement continuous improvement initiatives. Key Responsibilities: Lead a team of Site...


  • San Diego, California, United States BAE Systems USA Full time

    Job Description: At BAE Systems USA, we're seeking a skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you'll play a critical role in ensuring the seamless delivery of our cloud-based services. Your expertise in cloud technologies, service lifecycle management, and infrastructure automation will be instrumental in driving our...


  • San Francisco, California, United States Outdefine Full time

    About the Role: We are seeking a skilled Senior Site Reliability Engineer to join our team at Outdefine. As a key member of our engineering team, you will be responsible for ensuring the reliability and scalability of our blockchain-based infrastructure. Key Responsibilities: Design and implement scalable and reliable infrastructure solutions for our...


  • San Francisco, California, United States Twitter Full time

    Job Summary: Twitter is seeking a Senior Site Reliability Engineer to lead a team of engineers working to keep our services reliable and scalable. The ideal candidate will have experience managing services in a distributed environment and be comfortable working with on-prem and cloud-based infrastructure. Responsibilities: Lead a team of site reliability...


  • San Francisco, California, United States WEX Full time

    Job Summary: The WEX Site Reliability Engineering team is seeking a highly motivated and quick-learning individual to join our team as a Site Reliability Engineer Level 1. As a key member of our team, you will be responsible for ensuring the reliability, performance, and security of our systems. Key Responsibilities: Actively participate in training and...


  • San Diego, California, United States BAE Systems USA Full time

    Job Description: BAE Systems USA is seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our Platform Engineering group, you will play a critical role in developing and deploying cutting-edge IaaS, PaaS, and SaaS solutions using the latest technologies. Key Responsibilities: Work in a team of SREs to ensure seamless, continuous...


  • San Diego, California, United States BAE SYSTEMS Full time

    Job Title: Principal Site Reliability Engineer. At BAE Systems, we are seeking a highly skilled Principal Site Reliability Engineer to join our team. As a key member of our Platform Engineering group, you will play a critical role in developing and deploying cutting-edge technologies to support our customers' missions. Key Responsibilities: Design and implement...


  • San Jose, California, United States Hireio, Inc. Full time

    Job Overview: We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Hireio, Inc. The ideal candidate will have a strong background in software development, systems engineering, and cloud infrastructure. They will be responsible for designing, implementing, and maintaining large-scale, distributed systems that are highly available,...


  • San Jose, California, United States Triune Infomatics Inc Full time

    Role: Senior Site Reliability Manager. Triune Infomatics Inc is seeking an experienced Senior Site Reliability Manager to join our team and contribute to the design and upkeep of our cloud-based IoT edge orchestration solution. Job Summary: The Senior Site Reliability Manager will be responsible for ensuring the availability of our SaaS platform and meeting the...


  • San Diego, California, United States BD Full time

    Job Title: Site Reliability Engineering Manager. Job Summary: A Site Reliability Engineering Manager is responsible for ensuring that systems and services run smoothly, reliably, and efficiently at scale. They manage a team of SREs to maintain infrastructure, handle incident response, and improve the system's reliability and performance. Key...