Site Reliability Engineer

Boston, MA, United States - Rockset, Inc. - Full time

ABOUT ROCKSET

At Rockset, we’ve built the real-time analytics database for the world's data applications. Our team and technology come from a rich heritage, rooted in the experience of building massive scale data systems at the world’s leading companies, and we created Rockset to make those kinds of powerful data platforms available to real-time application developers everywhere. We are creating a world where developers can go from complex data sets to fast, interactive applications and analysis effortlessly.

We’re a fast-growing company that values curiosity, diversity, and open-mindedness. You will solve interesting problems, surrounded by exceptional people, while making customers happy. We work hard, but also take our personal lives and experiences seriously. We are backed by Greylock Partners and Sequoia Capital, and headquartered in San Mateo, CA with offices in Boston, MA and London, UK and remote employees throughout the US.

As a site reliability engineer, you will be responsible for the automation, stability, security, configuration, monitoring, alerting, and capacity planning of Rockset's network, systems, and infrastructure. You will also build tools that help the rest of the engineering team be more productive, including the ones that Rockset engineers use to deploy and manage their services. You will have a foundational impact on shaping the team and the systems we create. The on-call pager is shared by most of the engineering team, not just SRE.

Our infrastructure is hosted entirely in Amazon Web Services. We use a variety of homegrown, open-source, and commercial tools, including Kubernetes, Docker, Kafka, Zookeeper, Prometheus, Grafana, Salt, Terraform, Phacility, and Buildkite. We try to deploy new code to our production environment twice a week, but as an SRE you can expect to make production changes on a daily basis.

You should expect to collaborate with all other engineering teams to develop solutions that meet reliability, security, and business requirements. Lastly, you will diagnose, triage, and build solutions for complex technical issues at scale.

The US base salary range for this full-time position is $140,000/year to $215,000/year, plus equity and benefits. The actual pay may vary based on factors such as location, experience, and skills. Final salary will be commensurate with the candidate’s level and location. This range represents base salary only.

You'd be a great fit if you are:
  • Passionate about distributed systems, database technologies, and highly scalable services
  • Poised under fire and willing to share an on-call rotation with the rest of the team
  • A self-starter who thrives in a fast-paced environment
  • Willing to learn new skills and technologies
  • Attentive to details and comfortable with ambiguity
It would be even more awesome if you also have:
  • Bachelor's or Master's degree in Computer Science or a related field, or relevant work experience
  • Experience as an SRE for 3+ years
  • Experience building and operating public-facing 24x7 web applications at scale
  • Experience working with cloud infrastructure and patterns (AWS preferred)
  • Strong programming skills in a scripting language (Python, Ruby, Bash)
  • Experience with Kubernetes, Mesos, Swarm, or similar container orchestration tools
  • Experience with Terraform, Salt, Chef, Packer, or similar configuration management tools
  • Experience with Grafana, Prometheus, Datadog, or similar monitoring tools
  • Experience with Azure a plus

OUR COMMITMENT TO DIVERSITY

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

