Senior Cloud Reliability Engineer
2 weeks ago
Crusoe Energy Systems is a pioneering company that's revolutionizing the way we approach energy resources. Our mission is to unlock value in stranded energy resources through the power of computation.
We're driven by a vision to align the long-term interests of the climate with the future of global computing infrastructure. As data centers consume an exponentially growing power footprint to deliver technology to all connected devices, we're committed to making sure that the energy meeting that demand is sourced in an environmentally responsible fashion.
Our innovative approach involves co-locating mobile data centers with stranded energy resources, like flare gas and underloaded renewables, to deliver low-cost, carbon-negative distributed computing solutions. Crusoe Cloud is a managed cloud services platform powered by stranded energy that enables climate-friendly innovation in computationally intensive fields, including artificial intelligence, graphics rendering, and computational biology.
About the Role
We're seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer at Crusoe Energy Systems, you'll play a pivotal role in ensuring the reliability and performance of our infrastructure.
Our SRE team is dedicated to detecting, analyzing, and preventing issues so that we meet our Service Level Agreements (SLAs), tracked through Service Level Indicators (SLIs) and Service Level Objectives (SLOs). Through automation and proactive remediation, our SREs not only resolve common errors automatically but also advise various engineering teams on building resilient code.
We prioritize anticipating and resolving issues before they impact our customers, conducting thorough post-mortems, and driving continuous improvement. Our customer-centric approach ensures that clients always have access to the virtual machines they depend on.
Responsibilities
- Collaborate with the SRE team to design, implement, and maintain scalable and reliable infrastructure
- Develop and maintain tools to enhance monitoring capabilities and automate routine processes
- Work closely with software engineers to advise on best practices for resilient code and review changes before deployment
- Participate in incident response drills, post-mortems, and root cause analysis sessions to learn from past issues and prevent future ones
- Document work, share insights with the team, and plan for the next day's challenges
Qualifications
- 5+ years of professional SRE experience
- 5+ years of experience contributing to architecture and design of new and current systems
- Bachelor's Degree in Computer Science or related field, or 8+ years relevant work experience
- Solid understanding of infrastructure design, including operational trade-offs of various designs
- Experience writing high-quality code with at least one programming language (Python, Go, or similar)
- Experience building with modern infrastructure tools such as Docker, Kubernetes, Ansible, CloudFormation, and Terraform
- Experience building with modern CI/CD practices and build systems such as GitLab CI/CD, CircleCI, and GitHub Actions
- Experience with logging, monitoring, and alerting systems and tools
- Experience with Unix/Linux environments
- Experience with TCP/IP and network programming
- Experience with information security best practices
- Excellent communication skills
- Must be able to pass a background check
- Embody the Company values
Benefits
- Hybrid work schedule
- Industry-competitive pay
- Restricted Stock Units in a fast-growing, well-funded technology company
- Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
- Employer contributions to HSAs
- Paid Parental Leave
- Paid life insurance and short-term and long-term disability coverage
- Teladoc
- 401(k) with a 100% match up to 4% of salary
- Generous paid time off and holiday schedule
- Cell phone reimbursement
- Tuition reimbursement
- Subscription to the Calm app
- MetLife Legal
- Company-paid commuter benefit of $50 per pay period
Crusoe Energy Systems is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.