Staff Site Reliability Engineer
2 weeks ago
We are seeking a highly skilled Staff Site Reliability Engineer to join our Data Engineering team. As a key member of our team, you will be responsible for maintaining and enhancing the reliability of our data infrastructure.
Your work will directly impact the availability and performance of our data services, enabling the organization to make better decisions.
You will collaborate closely with data engineers and software engineers to develop and drive 100% automation and best practices for deep monitoring and alerting.
This role will report to our Director of Data Engineering.
About You
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 12+ years of experience in site reliability engineering, database operations, or a related role with a focus on data platforms, data stores, and data operations.
- Extensive experience with the AWS cloud platform and its data-related services.
- Proficiency in monitoring tools (e.g., Datadog, CloudWatch, DevOps Guru, DB Performance Insights).
- Proficiency in one or more programming languages (e.g., Python, Java).
- Proficiency in automation frameworks (e.g., Terraform, CloudFormation).
- Strong understanding of performance metrics at both a high level and a low level, such as disk I/O saturation.
- Experience in identifying and eliminating system bottlenecks.
- Strong understanding of database internals such as index types, schemas, and query plans.
- Strong understanding of database systems (e.g., SQL, NoSQL) and experience in managing large-scale data infrastructures.
- Strong understanding and hands-on implementation of CI/CD pipelines and DataOps practices.
- Experience with data governance, compliance, and lifecycle management.
- Ability to own and execute projects while effectively collaborating with the team to influence and shape the vision of the data engineering organization.
We value:
- Courage. We believe that when we overcome fear, we enable our best selves.
- Curiosity. We are curious, which is the gateway to empathy, inclusion, and understanding.
- Service. We serve our community with humility, enabling joy and belonging for others.
- Kaizen. We have a growth mindset committed to constant forward progress.
We are an equal opportunity employer and value diversity at Crunchyroll. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
-
Staff Site Reliability Engineer
1 week ago
San Francisco, California, United States Crunchyroll Full time
About Crunchyroll: We're a global entertainment company dedicated to delivering the art and culture of anime to a passionate community. Our mission is to help everyone belong, and we're looking for talented individuals to join our team. The Role: We're seeking a Staff Site Reliability Engineer to maintain and enhance the reliability of our data infrastructure. As...
-
Staff Site Reliability Engineer
1 month ago
San Francisco, California, United States Gusto Full time
About Gusto: Gusto is a modern, online people platform that empowers small businesses to take care of their teams. Our comprehensive suite of tools includes full-service payroll, health insurance, 401(k)s, expert HR, and team management solutions. With offices in Denver, San Francisco, and New York, we serve over 300,000 businesses nationwide. Our Mission: We...
-
Staff Site Reliability Engineer
1 month ago
San Francisco, California, United States Crunchyroll Full time
About Crunchyroll: We're a global entertainment company dedicated to delivering the art and culture of anime to a passionate community. Our mission is to help everyone belong, and we're committed to creating a workplace that reflects this value. The Role: We're seeking a highly skilled Staff Site Reliability Engineer to join our Data Engineering team. As a key...
-
Staff Site Reliability Engineer
3 weeks ago
San Francisco, California, United States Gusto Full time
About Gusto: Gusto is a leading provider of modern, cloud-based people management solutions for small businesses. Our platform offers a comprehensive suite of tools, including payroll, benefits, and HR management, designed to help businesses thrive. Job Summary: We are seeking an experienced Staff Site Reliability Engineer to join our Infrastructure Engineering...
-
Cloud Platform Staff Site Reliability Engineer
2 weeks ago
San Francisco, California, United States Zilliz Full time
Job Title: Cloud Platform Staff Site Reliability Engineer. We are seeking a highly skilled Cloud Platform Staff Site Reliability Engineer to join our team at Zilliz. As a key member of our SRE team, you will be responsible for ensuring the reliability, availability, and performance of our distributed database systems. Key Responsibilities: Design and build tools...
-
Senior Staff Site Reliability Engineer
4 days ago
San Francisco, California, United States WEX Full time
About the Role: The WEX Site Reliability Engineering team is seeking a technical leader to drive the design and implementation of complex systems at scale. As a Senior Staff SRE, you will work closely with engineering teams to ensure that our systems are reliable, performant, and secure. Key Responsibilities: Provide technical guidance and mentorship to other...
-
Senior Staff Site Reliability Engineer
4 days ago
San Francisco, California, United States WEX Full time
The WEX Site Reliability Engineering team is seeking a Senior Staff SRE who is passionate about developing software and solutions focused on observability, incident response, reliability, and performance. The team will be part of the Benefits Reliability organization which supports our internal stakeholders and our Benefits Platform teams. As part of the...
-
Site Reliability Engineer
2 weeks ago
San Francisco, California, United States Unreal Gigs Full time
Job Title: Site Reliability Engineer. At Unreal Gigs, we're seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the high availability, scalability, and performance of our complex distributed systems. Key Responsibilities: Design and implement monitoring, logging, and alerting...
-
Site Reliability Engineer
4 weeks ago
San Francisco, California, United States Wasmer Full time
About the Role: We are seeking an exceptional Site Reliability Engineer to join our team at Wasmer. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining scalable and reliable infrastructure solutions for our Edge computing platform. Key Responsibilities: Design and implement scalable and reliable infrastructure...
-
Site Reliability Engineer
4 weeks ago
San Francisco, California, United States Instabase Full time
About Instabase: At Instabase, we're passionate about democratizing access to cutting-edge AI innovation to enable any organization to solve previously unsolvable unstructured data problems in their industry. With customers representing some of the largest and most complex organizations in the world, and investors like Greylock, Andreessen Horowitz, and Index...
-
Staff Site Reliability Engineer, GovCloud
3 weeks ago
San Francisco, California, United States Medallia Full time
About the Role: We are seeking a highly skilled Staff Site Reliability Engineer to join our GovCloud team at Medallia. As a Staff Engineer, you will play a critical role in ensuring the reliability and availability of our applications and infrastructure for our US Government customers. Key Responsibilities: Design and implement highly available and scalable...
-
Staff Site Reliability Engineer, GovCloud
2 weeks ago
San Francisco, California, United States Medallia Full time
About the Role: We are seeking a highly skilled Staff Site Reliability Engineer to join our GovCloud team at Medallia. As a Staff Engineer, you will be responsible for ensuring the reliability and availability of Medallia applications for our US Government customers and infrastructure in a highly available, secure, and scalable environment. Key...
-
Site Reliability Engineer
1 month ago
San Francisco, California, United States Apollo Solutions Full time
Site Reliability Engineer. Apollo Solutions has partnered with a pioneering artificial intelligence business that is revolutionizing the use of AI/ML in gaming and security. The company is working closely with government contracts and gaming console companies and is seeking a Site Reliability Engineer to join their growing team. The Site Reliability Engineer...
-
Site Reliability Engineer
2 weeks ago
San Francisco, California, United States DaVita Full time
About the Role: The WEX Site Reliability Engineering team is seeking a skilled Site Reliability Engineer to join our Platform Reliability organization. As a key member of our team, you will be responsible for developing software and solutions focused on observability, incident response, reliability, and performance. You will collaborate with our engineering...
-
Site Reliability Engineer
2 weeks ago
San Francisco, California, United States Roman Health Pharmacy LLC Full time
About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Xero. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our cloud-based platform. Key Responsibilities: Investigate operational surprises and support teams in post-incident activities. Conduct in-depth incident...
-
Staff Site Reliability Engineer, GovCloud
4 weeks ago
San Francisco, California, United States Medallia Full time
About Medallia: Medallia is the pioneer and market leader in Experience Management. Our award-winning SaaS platform, Medallia Experience Cloud, leads the market in the understanding and management of experience for candidates, customers, employees, patients, citizens, and residents. The Role and Team: We are looking for a talented Staff Site Reliability Engineer...
-
Site Reliability Engineer
2 weeks ago
San Francisco, California, United States Instabase Full time
About Instabase: At Instabase, we're passionate about harnessing the power of AI innovation to democratize access to cutting-edge technology and empower organizations to solve complex unstructured data problems. With a strong presence in the market and a talented team, we're committed to delivering top-tier solutions that drive business success. Job...
-
Site Reliability Engineer
1 month ago
San Francisco, California, United States Wasmer Full time
About the Role: We are seeking an exceptional Site Reliability Engineer to join our team at Wasmer. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and performance of our Edge computing platform. Key Responsibilities: Design, implement, and maintain scalable and reliable infrastructure solutions for our Edge computing...
-
Site Reliability Engineer
3 weeks ago
San Francisco, California, United States SpeedCast Full time
Job Title: Site Reliability Engineer. At Speedcast, we're seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud-based communication solutions. Key Responsibilities: Analyze and design continuous...