We have other current jobs related to this field, listed below:


  • Seattle, Washington, United States Flexe Full time

    Flexe solves the hardest omnichannel logistics problems for the world's largest retailers and brands. Integrating technology, open logistics networks, and elastic economic models allows Flexe customers to move fast, at scale, and with precision. Founded in 2013 and headquartered in Seattle, Flexe brings deep logistics expertise and enterprise-grade...


  • Seattle, Washington, United States Apple Full time

    Senior Site Reliability Engineer. We are seeking a highly skilled Senior Site Reliability Engineer to join our Apple Services Engineering team in Seattle, Washington. As a key member of our dynamic team, you will play a critical role in ensuring the availability, latency, and overall health of our object store orchestration service. Key...


  • Seattle, United States Prodigy Resources Full time

    About Us: Prodigy is seeking an SRE to join our client's organization which is leading the charge in fintech innovation, providing state-of-the-art solutions that drive financial success and empower our clients. As they embark on an exciting Greenfield project, they're seeking an experienced Site Reliability Engineer to join their team. This role is critical...


  • Seattle, United States Apple Full time

    Senior Site Reliability Engineer, Object Storage. Seattle, Washington, United States. Software and Services. The Apple Services Engineering (ASE) team is one of the most exciting examples of Apple’s long-held passion for combining art and technology. These are the people who power the App Store, Apple TV, Apple Music, Apple Podcasts, and Apple Books. They...


  • Seattle, United States Apple Full time

    Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. Join Apple’s Cloud Service Infrastructure team as a site reliability...


  • Seattle, United States West500 Partners Full time

    Our client is a fast-growing downtown Seattle startup developing AI automation for professional services, including legal technology and medical records. They have a great product-market fit and rapidly increasing revenues, and are currently in need of a local Software Engineering Lead with CI/CD expertise, an AWS background, and a keen interest in innovative...


  • Seattle, United States Oracle Full time

    OCI Incident Response is the first line of defense for maintaining the high availability of Oracle’s cloud. We make customer-impacting events shorter, less frequent, and less impactful by providing large-scale incident management. We are front-and-center in driving down event duration by using our operational experience, knowledge of standard processes,...


  • Seattle, Washington, United States F5 Networks Full time

    About F5 Networks: At F5 Networks, we are dedicated to shaping a superior digital landscape. Our teams empower organizations worldwide to create, secure, and operate applications that enhance our interactions with the ever-evolving digital environment. We are deeply committed to cybersecurity, safeguarding consumers from fraud, and enabling businesses to...


  • Seattle, United States Capgemini Full time

    Site Reliability Engineer (FTE with benefits). Our team is looking to add an experienced Site Reliability / DevOps Engineer. Experienced with Python and shell scripting; should have extensive experience with Azure or AWS (Azure preferred); experience with monitoring and observability (Datadog); experience with Infrastructure as...


  • Seattle, Washington, United States Oracle Full time

    Overview: The OCI Incident Response team serves as the primary defense mechanism for ensuring the uninterrupted operation of Oracle's cloud services. Our mission is to reduce the frequency and impact of customer-affecting incidents by implementing effective large-scale incident management strategies. We leverage our operational expertise, adherence to...


  • Seattle, United States Moloco Full time

    About the Role: Moloco is a machine learning company that operates at massive scale (we ingest 10 petabytes of training data per day), builds blazingly fast models (they return predictions in 10 milliseconds or less), and is a profitable unicorn (we are valued at $2 billion and have been profitable for the last 13+ quarters). We are looking for an exceptional...


  • Seattle, Washington, United States Circle Full time

    About the Role: We are seeking a highly skilled Cloud Engineer to join our team at Circle, a leading financial technology company. As a Senior Site Reliability Engineer, you will play a critical role in designing, building, and maintaining our cloud infrastructure estate to meet the growing demands of our worldwide customer base. You will be responsible for...


  • Seattle, Washington, United States Apple Full time

    Overview: The Apple Services Engineering team exemplifies Apple's dedication to merging creativity with technology. We invite you to join the Apple Services Engineering Cloud Service Infrastructure team as a Site Reliability Engineer, where you will play a pivotal role in supporting and expanding cloud services for millions of Apple users....


  • Seattle, United States Oracle Full time

    We are seeking experienced cloud technologists, interested in solving hard problems on tight schedules, to join our Major Incident Management team. OCI Incident Response is the first line of defense for maintaining the high availability of Oracle's c...

Senior Site Reliability Engineer

3 months ago


Seattle, United States SingleStore Full time

Position Overview

MemSQL is seeking a Senior Site Reliability Engineer to help drive the Kubernetes product strategy for our managed service. You will be at the forefront: crafting the design, building out the collaborative vision, and sustaining the product strategy you envision.

This role will be an integral part of building our managed service product line and influencing the future direction of the organization. As a technical leader in the space, you will collaborate with the entire engineering team, guiding decisions critical to both team and company success.

Role and Responsibilities

  • Help MemSQL craft its production container orchestration strategy.
  • Design, build, and run elastic Kubernetes clusters across on-prem, AWS, Azure, and Google Cloud environments.
  • Design systems for peak reliability, scalability, and performance.
  • Efficiently operate within a data center environment: monitoring the performance and health of hardware and software, installing new servers, and upgrading as needed.
  • Participate in an SLA-driven on-call rotation, which will include after-hours, weekend, and rotating holiday participation.

Required Skills and Experience

  • Expert-level knowledge of Kubernetes and the container ecosystem.
  • Strong working knowledge of configuration management tools such as Ansible and Puppet.
  • Experience with Unix/Linux operating system internals and administration (e.g., filesystems, inodes, system calls) and networking (e.g., TCP/IP, routing, network topologies and hardware, SDN), plus a keen interest in relational databases.
  • Familiarity with at least one of AWS, Azure, or Google Cloud.
  • Experience debugging, diagnosing, and troubleshooting complex production software.
  • C, Python, and POSIX shell programming experience required. Experience with C++ or Go is a strong plus.
  • Familiarity with JunOS, routing protocols (BGP), IPsec, and Ceph storage is a plus.
  • B.S. degree in Computer Science or a related field.