Senior Site Reliability Engineer

2 weeks ago

Seattle WA United States The Calyx Institute Full time

The Calyx Institute is seeking applications for Senior Systems Developer positions. As contributors to the Calyx engineering team, Senior Systems Developers will be primarily responsible for building and organizing components of our infrastructure.

Ideal candidates should have experience with the architecture of systems used to support the development of complex software products. They should be energetic, unflappable, and flexible, and should thrive in a highly technical, collaborative environment. The Calyx Institute works collaboratively on projects with a diverse, global community. We work hard to choose communications tools that encourage participation, where every contributor is heard and respected. Senior Systems Developers must be team players with excellent verbal and written communication skills.

The focus of the Senior Systems Developer role will range from the maintenance and automation of CI/CD pipelines (Jenkins, Ansible) to helping bootstrap the CalyxVPN project and CalyxCloud (Kubernetes, S3 object storage). Work on these projects will include developing roadmaps and milestones in cooperation with project managers responsible for these products, identifying team staffing needs, and helping to ensure timely completion of project deliverables. Some of these projects will require coordination with outside/upstream groups and flexibility with what may initially be "moving target deliverables." More software projects will be brought into the pipeline over time, in coordination with team leads and project managers.

Calyx teams are distributed globally across multiple time zones, so the ability to organize and communicate asynchronously via chat platforms, email, GitLab tickets, etc. is crucial.

Candidates must have strong organizational skills and the ability to communicate in writing to diverse audiences. They will be expected to identify areas for project expansion or alignment, support team leads on the execution of projects, and help to write reports on the technical status of projects. Senior Systems Developers are expected to be technical experts within their specialty, and capable of training/developing skills in their subordinates. This position reports to the Engineering Director (but reporting may be delegated to a Project Manager).

This role is an opportunity to both apply your existing skills and gain new ones. The following lists indicate the skills we think are necessary to execute various Calyx projects, but you do not need to be an expert in all of them to apply.

Required Skills

Git: functional knowledge of forking, branching, merging, reviewing, and rebasing
Ansible: understanding of and experience with the concept of infrastructure as code, building/deploying containers and services using Ansible
SSH, Bash, Linux/UNIX, L2-L4 networking, FOSS tools/solutions, CLI expertise
Containers (Podman|RKE2): experience/interest in CNCF projects
RAS-based (reliability, availability and serviceability) approach to design, deployment, and maintenance of infrastructure
Comfortable with stand-up review of written code and justification/critique of methods
Participate/engage in continuous refactoring and addressing technical debt
Experience in/with leadership roles

Preferred Skills & Qualifications

Designing and building applications for container-native deployment strategies
CI/CD: Design and maintenance of a CI/CD pipeline (using Jenkins, Zuul, etc.)
Experience with deployments on tier-1 public clouds (e.g., AWS, Azure), demonstrating a "best practices" approach
Experience building and deploying container registries
Hands-on experience with enterprise hardware (metal servers, racking, switches, etc.)
Container orchestration: Docker, Kubernetes, Podman, MetalLB, Traefik, etc.
Debian: experience with Debian or another apt-based Linux distribution
Red Hat: experience with RHEL or another rpm-based Linux distribution
Networking: configuring switches (Juniper|Cumulus), network optimization, segmentation, BGP routing, DDoS mitigation, and associated IaaS tools, etc.
Logging: Matomo, ELK, Prometheus, etc.
Applications: Nextcloud, Zammad, Tor, NGINX, mailservers, Gitea|GitLab, Snipe-IT
Communications: Signal, Zulip, XMPP
Four-year degree (or equivalent) in an operations or technical field (operations management, engineering, maths, sciences, etc.)
Graduate degrees and professional certifications are a plus
4+ years professional experience in similar roles (show us a portfolio)

Details & To Apply

Compensation for the Senior Systems Developer position ranges from $120,000 to $140,000. This position is full-time, hybrid-remote in the greater Seattle, WA area. Rare short-notice travel to data centers in Western Washington state is required. Calyx offers health, vision, and dental insurance, flexible paid time off, and a 401(k) plan with a generous employer contribution.

To apply, submit your required application materials via the link to Greenhouse below. Applications without a cover letter or submitted via email will not be reviewed.

Required materials:

1-page cover letter in PDF format
1-2 page resumé in PDF format
*3 professional references with contact information

*We will not contact references without informing you first

Optional materials:

Portfolio of past experience in similar roles

Application Deadline: May 1, 2024

Start Date: June 1, 2024

