Senior/Staff Site Reliability Engineer



San Francisco, California, United States Crusoe Full time

About Crusoe Energy Systems

Crusoe Energy Systems is a pioneering company that aims to unlock value in stranded energy resources through the power of computation. Our mission is to align the long-term interests of the climate with the future of global computing infrastructure.

Our Approach

We co-locate mobile data centers with stranded energy resources, such as flare gas and underloaded renewables, to deliver low-cost, carbon-negative distributed computing solutions. Our managed cloud services platform, Crusoe Cloud, enables climate-friendly innovation in computationally intensive fields, including artificial intelligence, graphics rendering, and computational biology.

About This Role

We are seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer at Crusoe Energy Systems, you will play a pivotal role in ensuring the reliability and performance of our infrastructure. Your primary responsibility will be to detect, analyze, and prevent issues so that we meet our Service Level Agreements (SLAs), as measured by Service Level Indicators (SLIs) and tracked against Service Level Objectives (SLOs).

Key Responsibilities

  • Collaborate with the SRE team to design, implement, and maintain scalable and reliable infrastructure
  • Develop and maintain monitoring tools that track SLIs and help us meet our SLOs
  • Participate in incident response drills, post-mortems, and root cause analysis sessions to learn from past issues and prevent future ones
  • Work closely with software engineers to advise on best practices for resilient code and review changes before deployment
  • Automate routine processes and develop tools to enhance our monitoring capabilities

Requirements

  • 5+ years of professional SRE experience
  • 5+ years of experience contributing to architecture and design of new and current systems
  • Bachelor's Degree in Computer Science or related field, or 8+ years relevant work experience
  • Solid understanding of infrastructure design, including the operational trade-offs of various designs
  • Experience writing high-quality code with at least one programming language (Python, Go, or similar)
  • Experience building with modern infrastructure tools such as Docker, Kubernetes, Ansible, CloudFormation, and Terraform
  • Experience building with modern CI/CD practices and build systems, such as GitLab CI/CD, CircleCI, and GitHub Actions
  • Experience with logging, monitoring, and alerting systems and tools
  • Experience with Unix/Linux environments
  • Experience with TCP/IP and network programming
  • Experience with information security best practices
  • Excellent communication skills
  • Must be able to pass a background check
  • Embody the Company values

Benefits

  • Hybrid work schedule
  • Industry-competitive pay
  • Restricted Stock Units in a fast-growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance and short-term and long-term disability coverage
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
  • Subscription to the Calm app
  • MetLife Legal
  • Company-paid commuter benefit of $50 per pay period

Compensation Range

Compensation will be paid in the range of $183,000 - $250,000. Restricted Stock Units are included in all offers. Compensation is determined by the applicant's education, experience, knowledge, skills, and abilities, as well as internal equity and alignment with market data.

Crusoe Energy is an Equal Opportunity Employer

Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.


