Senior Cloud Reliability Engineer

1 month ago


San Francisco, California, United States | Crusoe | Full time
About Crusoe Energy Systems

Crusoe Energy Systems is a pioneering company that's revolutionizing the way we approach energy resources. Our mission is to unlock value in stranded energy resources through the power of computation.

We're driven by a vision to align the long-term interests of the climate with the future of global computing infrastructure. As the power footprint of data centers grows exponentially to deliver technology to all connected devices, we're committed to ensuring that the energy meeting that demand is sourced in an environmentally responsible way.

Our innovative approach involves co-locating mobile data centers with stranded energy resources, like flare gas and underloaded renewables, to deliver low-cost, carbon-negative distributed computing solutions. Crusoe Cloud is a managed cloud services platform powered by stranded energy that enables climate-friendly innovation in computationally intensive fields, including artificial intelligence, graphics rendering, and computational biology.

About the Role

We're seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer at Crusoe Energy Systems, you'll play a pivotal role in ensuring the reliability and performance of our infrastructure.

Our SRE team is dedicated to detecting, analyzing, and preventing issues in order to meet our Service Level Agreements (SLAs), tracked through Service Level Indicators (SLIs) and Service Level Objectives (SLOs). Through automation and proactive remediation, our SREs not only resolve common errors automatically but also advise engineering teams on building resilient code.

We prioritize anticipating and resolving issues before they impact our customers, conducting thorough post-mortems, and driving continuous improvement. Our customer-centric approach ensures that clients always have access to the virtual machines they depend on.

Responsibilities
  • Collaborate with the SRE team to design, implement, and maintain scalable and reliable infrastructure
  • Develop and maintain tools to enhance monitoring capabilities and automate routine processes
  • Work closely with software engineers to advise on best practices for resilient code and review changes before deployment
  • Participate in incident response drills, post-mortems, and root cause analysis sessions to learn from past issues and prevent future ones
  • Document work, share insights with the team, and plan for the next day's challenges
Requirements
  • 5+ years of professional SRE experience
  • 5+ years of experience contributing to architecture and design of new and current systems
  • Bachelor's Degree in Computer Science or related field, or 8+ years relevant work experience
  • Solid understanding of infrastructure design, including operational trade-offs of various designs
  • Experience writing high-quality code with at least one programming language (Python, Go, or similar)
  • Experience building with modern infrastructure tools such as Docker, Kubernetes, Ansible, CloudFormation, and Terraform
  • Experience building with modern CI/CD practices and build systems, such as GitLab CI/CD, CircleCI, and GitHub Actions
  • Experience with logging, monitoring, and alerting systems and tools
  • Experience with Unix/Linux environments
  • Experience with TCP/IP and network programming
  • Experience with information security best practices
  • Excellent communication skills
  • Must be able to pass a background check
  • Embody the Company values
Benefits
  • Hybrid work schedule
  • Industry-competitive pay
  • Restricted Stock Units in a fast-growing, well-funded technology company
  • Health insurance options, including HDHP and PPO plans, plus vision and dental coverage for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance and short-term and long-term disability coverage
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
  • Subscription to the Calm app
  • MetLife Legal
  • Company-paid commuter benefit; $50 per pay period

Crusoe Energy Systems is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.


