Senior Site Reliability Engineer

1 month ago


Seattle, United States Censys Full time

Censys knows the internet and cloud better than anyone else. Attack Surface Management provides customers with an attacker-centric view of all externally facing internet and cloud assets to extend visibility and to prioritize and remediate the most critical risk exposures that will actually lead to a breach. Our daily IPv4 scans and the world’s largest SSL/TLS certificate database give customers the most accurate and continuously updated attack surfaces. Enterprise security teams leverage Censys to keep pace with the speed of the business and gain an advantage over rapidly evolving cyberattack threats.

Role Summary

As a Site Reliability Engineer on the Infrastructure and Ops platform team, you will help design, build, and deploy the tools that empower our development teams and production applications. We’re looking for talented engineers who will help grow our operational maturity and who equally enjoy mastering cloud-native technologies to build and support the growth and reliability of our microservice architecture.

As a Developer Efficiency and Experience focused SRE, you will be responsible for improving the efficiency of our engineering and development teams by supporting the SDLC and workflows of our developers. The work ranges from writing supporting application code and automation to, most importantly, empowering developers to have the confidence to create, deploy, and manage their services end-to-end inside the platform.

Why Censys Sticks Out
  • Humble Brilliance: We celebrate intelligence and foster a culture where tackling cybersecurity challenges is exciting and rewarding.
  • A Mosaic of Minds: Our diverse team brings together various talents and backgrounds, fostering true innovation through diverse perspectives.
  • Thinkers and Tinkerers: At Censys, we encourage questioning the status quo, experimenting, and pushing boundaries to drive breakthroughs.
  • Coffee-Fueled Collaboration: Collaboration is at the heart of our success, fueled by a shared passion for disrupting the cybersecurity scene.
  • Trailblazing Impact: Your innovations at Censys make waves in real-world cybersecurity, shaping the future and leaving a lasting legacy.
  • Life Beyond the Firewall: We prioritize work-life balance and support your well-being inside and outside the office.
  • Empowered Voices: Your voice matters at Censys; we champion an inclusive culture where diverse perspectives drive change.
What you will do
  • Build and maintain tooling to support our applications in Kubernetes and in the Google Cloud Platform.
  • Work with development teams to help them build, ship, and deploy services and applications with ease and confidence, and promote service resilience and reliability.
  • Help ensure smooth operation of our production environments, and work with developers to debug complex issues as they arise. This includes instrumenting our applications to capture and monitor the 4 golden signals.
  • Help to create a self-service platform by working with the rest of the SRE and infrastructure team to accelerate and promote developer velocity, including service catalogs, repository tooling and documentation. We believe in the self-service model and treat the development team as our internal customers, including listening to feedback, seeking out improvements, and quickly iterating to continually provide value.
  • Participate in a shared on-call rotation schedule. We believe in end-to-end service ownership, and as such both development teams and SRE participate in on-call. Our SRE team is responsible for maintaining and being on call for our infrastructure environments and ensuring primary site uptime.
Required Qualifications
  • Experience deploying, managing, and debugging applications in a Kubernetes environment. We leverage Helm and Crossplane heavily to deploy our applications.
  • Experience building, securing, and managing container images.
  • Experience working with Cloud-based environments, and interacting with Cloud services such as CloudSQL databases, Pub/Sub, Memorystore, and others.
  • Familiarity with Infrastructure-as-code Tools, such as Terraform, Crossplane, or similar.
  • Experience with tools and solutions used to monitor the 4 golden signals (latency, traffic, errors, and saturation), including Prometheus, Grafana, and OpenTelemetry.
  • Familiarity with a monorepo, trunk-based development model with monolithic build tooling and CI/CD, with a strong desire to achieve Continuous Deployment. Familiarity with CI/CD systems such as GitHub Actions, ArgoCD, or similar.
  • Ability to communicate with developers and support them with empathy in their day-to-day roles, seeking ways to automate and promote self-service so that developers can move with higher velocity and confidence through the entire SDLC.
Preferred Qualifications
  • Experience building and supporting a gRPC microservice architecture. Familiarity with Kubernetes Service Mesh, such as Istio or similar, to support our microservice architecture observability, multi-cluster routing, and network efficiency is highly desirable.
  • Ability to work with application code to help introduce best practices, golden path standardization, shared libraries, etc. The majority of our applications are written in Go. Python and Scala are present to a lesser degree.
  • Familiarity with Application Security tooling, such as dependency scanning, static analysis, and other linting tooling to help shift security left in the SDLC and CI process, and bridge engineering practices with our Security Operations team.
  • Familiarity and comfort with Linux-based environments.
Qualities
  • Have a passion for clean, concise architecture and enjoy working in a GitOps based environment.
  • Comfortable with projects that have a large degree of uncertainty and risk.
  • Desire to collaborate with and advise product management and leadership to balance long term maintainability of software against rapid development, as well as clearly communicate BCDR implications.
  • Understands and practices the principles of continuous delivery to ensure quick, safe, and sustainable development in the face of changing priorities and uncertainty.
What will make you stand out
  • Basic understanding of infrastructure operations, including load-balancers, ingresses, routing, DNS, and VPC design. We operate several data center environments across the globe in addition to our cloud infrastructure.
  • Not being afraid to dig into code to understand how our applications work, whether to better facilitate testing, integration, and development environments, to help instrument metrics, or to improve service reliability.
  • Deep understanding of how to optimize and support web-based applications and help protect public-facing assets with tooling such as anti-DDoS and Web-Application Firewall technologies.

Our target salary range for this role is between $160,000 USD and $190,000 USD, plus bonus eligibility and equity.

Our roots are in Ann Arbor, Michigan with location hubs in Seattle, the Bay Area, Washington D.C., and Dublin, Ireland. Our innovation is fueled by the team’s global perspectives and diverse backgrounds. 

Don't meet every single requirement? Studies have shown that women and people of color are less likely to apply to jobs unless they feel they meet every qualification. At Censys we are dedicated to building a diverse, inclusive, and authentic workplace, so if you're excited about this role but your past experience doesn't align perfectly with every listed requirement in the job description, we encourage you to apply anyway. You may be exactly who we need to fill this role or others.

We value diversity and are committed to creating an inclusive environment for all employees. Censys is an equal opportunity employer.