Current jobs related to Site Reliability Engineer - Mountain View, California - VentureDive

Site Reliability Engineer

4 weeks ago

Mountain View, California, United States Optomi Full time

Job Title: Site Reliability Engineer. Optomi, in partnership with a large consulting firm, is seeking an experienced Site Reliability Engineer for their remote team. This position requires a versatile, highly motivated individual capable of supplying frontline technical and operational support to our Site Reliability teams. As a vital part of the Reliability...
Site Reliability Engineer

2 weeks ago

Mountain View, California, United States Moveworks Full time

About Moveworks: Moveworks is a leading AI startup that provides a universal AI copilot for search and automation across all business applications. Our mission is to empower employees to work faster and more efficiently by eliminating repetitive support issues and delivering instant knowledge. Job Description: We are seeking a highly skilled Site Reliability...
Site Reliability Engineer

2 weeks ago

Mountain View, California, United States Moveworks Full time

About Moveworks: Moveworks is a leading AI-powered automation platform that helps businesses streamline their operations and improve employee productivity. Our innovative technology enables employees to find information and get support in one place, reducing costs and increasing efficiency. Job Description: We are seeking a highly skilled Site Reliability...
Site Reliability Engineer

1 week ago

Mountain View, California, United States Atlassian Full time

About the Role: We are seeking a highly skilled Site Reliability Engineer to join our team at Atlassian. As a Site Reliability Engineer, you will play a critical role in ensuring the performance, reliability, and scalability of our cloud-based services. Responsibilities: Design, implement, and maintain scalable and reliable cloud infrastructure. Collaborate with...
Site Reliability Engineer

3 weeks ago

Mountain View, California, United States Moveworks Full time

About Moveworks: Moveworks is a leading AI startup that provides a universal AI copilot for search and automation across all business applications. Our mission is to empower employees to work faster and more efficiently by eliminating repetitive support issues and delivering instant knowledge. Job Description: We are seeking a highly skilled Site Reliability...
Site Reliability Engineer

3 weeks ago

Mountain View, California, United States TikTok Full time

About the Role: We are seeking a skilled Site Reliability Engineer to join our Applied Machine Learning (AML) team. As a Site Reliability Engineer, you will be responsible for designing, building, and maintaining highly available, scalable, and fault-tolerant systems. Responsibilities: Design and develop large-scale systems that meet the needs of our AML...
Site Reliability Engineer

4 weeks ago

Mountain View, California, United States Synopsys Full time

About the Role: We are seeking a highly skilled Site Reliability Engineer to join our Platform Team at Synopsys. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and performance of our engineering environment. You will work closely with our development teams to design, implement, and operate scalable and efficient...
Site Reliability Engineer

1 week ago

Mountain View, California, United States Atlassian Full time

About the Role: We're seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the performance and reliability of our services. You will work closely with our teams to identify and resolve issues, and develop solutions to improve our systems. Key Responsibilities: Investigate...
Site Reliability Engineer, Edge

3 weeks ago

Mountain View, California, United States TikTok Full time

Job Title: Site Reliability Engineer, Edge. At TikTok, we're committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe, and so does our workplace. About the Role: We're seeking a highly skilled Site Reliability Engineer to join our Edge team. As a...
Senior Site Reliability Engineer

2 weeks ago

Mountain View, California, United States Groq Full time

Job Title: Senior Site Reliability Engineer. We are seeking a highly skilled Senior Site Reliability Engineer to join our team at Groq. As a key member of our infrastructure operations team, you will be responsible for ensuring the reliability, scalability, and performance of our tools and services. Key Responsibilities: Design and implement scalable and...
Senior Site Reliability Engineer

1 week ago

Mountain View, California, United States Groq Full time

Unlock the Power of AI with Groq: We're on a mission to democratize access to AI, and we need your expertise to make it happen. As a Senior Site Reliability Engineer at Groq, you'll play a critical role in ensuring the reliability, scalability, and performance of our tools and services. Key Responsibilities: Design and implement scalable and reliable...
Staff Site Reliability Engineer

1 month ago

Mountain View, California, United States Moveworks Full time

About Moveworks: Moveworks is a leading AI startup that provides a universal AI copilot for search and automation across all business applications. Our mission is to empower employees to work faster and more efficiently by eliminating repetitive support issues and delivering instant knowledge. Job Description: We are seeking a highly skilled Staff Site...
Site Reliability Engineer

2 weeks ago

Mountain View, California, United States TikTok Full time

About the Role: We are seeking a skilled Site Reliability Engineer to join our AML team, where you will play a critical role in designing, building, and maintaining highly available, scalable, and fault-tolerant systems. Responsibilities: Design and implement large-scale systems to ensure high availability and scalability. Monitor and analyze system performance,...
Staff Site Reliability Engineer

2 days ago

Mountain View, California, United States Moveworks Full time

About the Role: Moveworks is the universal AI copilot for search and automation across all your business applications. We give employees one place to go to find information and get support while reducing costs for your business. The Moveworks Copilot is powered by an industry-leading Reasoning Engine that uses a combination of public and proprietary language...
Site Reliability Engineer

2 weeks ago

Mountain View, California, United States TikTok Full time

About TikTok U.S. Data Security: TikTok is a leading destination for short-form mobile video, inspiring creativity and bringing joy to millions of users worldwide. Our mission is to empower creators and communities to express themselves authentically, while ensuring the security and integrity of our platform. Job Summary: We are seeking a highly skilled Site...
Senior Site Reliability Engineer

7 days ago

Mountain View, California, United States Groq Full time

Job Description: At Groq, we're revolutionizing the AI economy by making processing power more accessible, faster, and more affordable. As a Senior Site Reliability Engineer, you'll play a critical role in ensuring the reliability, scalability, and performance of our tools and services. Responsibilities: Design and implement scalable and reliable architectures...
Senior Site Reliability Engineer

1 week ago

Mountain View, California, United States Groq Full time

Unlock the Power of AI with Groq: At Groq, we're revolutionizing the AI economy by making processing power more accessible, faster, and more affordable. Our Language Processing Unit (LPU) outpaces the GPU in speed, power, efficiency, and cost-effectiveness, empowering a world where AI is universally accessible. Join Our Mission: We're seeking a Senior Site...
Site Reliability Engineer

3 days ago

Mountain View, California, United States Insight Global Full time

Site Reliability Engineer Opportunity in the Bay Area: We are seeking a highly motivated Site Reliability Engineer to join our team in the Bay Area. As a Site Reliability Engineer, you will be responsible for ensuring the reliability and scalability of our cloud infrastructure. Key Responsibilities: * Strong Linux System Admin fundamentals (bash/shell...
Site Reliability Engineer

1 month ago

Mountain View, California, United States Bayone Full time

Job Description: At Bayone, we are seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for ensuring the high availability and scalability of our online production environment. Minimum Qualifications: Bachelor's degree in Computer Science or a related technical field, or...
Site Reliability Engineer

2 weeks ago

Mountain View, California, United States NewsBreak Full time

Transform Local News with NewsBreak: At NewsBreak, we're revolutionizing the way users interact with local news and their communities. Our mission is to foster safer, more vibrant, and authentically connected lives through robust collaborations with local publishers and businesses across the nation. As a Site Reliability Engineer, you'll play...

Site Reliability Engineer

2 months ago

Mountain View, California, United States VentureDive Full time

Job Brief:
As a Data Platform Site Reliability Engineer, you will manage infrastructure and applications on cloud computing platforms to deliver data processing, governance, and storage. Our platform teams work with exabytes of data, terabytes of memory, and hundreds of thousands of jobs to enable predictable and performant data analytics.
As an SRE, you'll need to solve problems that arise using empirical data, teamwork, and your own unique expertise.
The Data Platform SRE will work directly with our data platform and engineering teams in an embedded SRE model, operating in unison with the developers to deliver seamless experiences for our customers. We run a mix of open-source, vendor-licensed, and internally developed tools, which you will use and have opportunities to improve. The cross-functional team collaborates to ensure we apply a consistent incident management process across all data platform services and provide user-journey-based SLOs derived from exhaustive observability metrics, high-availability architecture, and automation for deployments. We think critically and strive to balance the best solution with the need to get things done for each engineering challenge we face.
VentureDive Overview:
Founded in 2012 by veteran technology entrepreneurs from MIT and Stanford, VentureDive is the fastest-growing technology company in the region. It develops and invests in products and solutions that simplify and improve the lives of people worldwide. We aspire to create a technology organization and an entrepreneurial ecosystem in the region that is recognized as second to none in the world.
Responsibilities & Requirements:

Make an impact from the design phase through development and operation of the Data Platform on Kubernetes clusters and their ecosystem on AWS.
Build core services and tooling, and create technical processes that simplify and enable the work of engineers across multiple services.
Identify, automate, and scale system configurations without compromising on security and reliability.
Participate in on-call rotations and help improve incident response.

Education and Experience:
BS/MS in Computer Science or Equivalent (7+ years of software development or production operations experience in a large-scale environment)
Qualification Requirements:

Strong sense of ownership and integrity demonstrated through clear communication and collaboration
Experience in architecting, developing, operating, and troubleshooting Kubernetes clusters and/or other highly available systems at scale.
Proficiency with the architecture, deployment, performance tuning, and troubleshooting of open-source data analytics technologies, especially Apache Spark, Trino and related software in a large-scale environment.
The ability to design, author, and release code in languages like Go, Python, or Java
Acute drive to automate manual operations and to improve them through repeated iteration.
Understanding of the Linux Operating System, standard networking protocols, and components
Experience with cloud-native services on AWS/GCP
Hands-on experience managing large numbers of diverse systems with configuration management or software delivery platforms (such as Terraform, CloudFormation, ArgoCD, and Flux)
Experience with deploying, supporting, and monitoring new and existing services, platforms, and application stacks.
Excellent troubleshooting and problem-solving skills
Experience with scale testing, disaster recovery, and capacity planning.
Effective communication and collaboration skills, including the ability to drive and promote technical partnerships across teams.
Incident response and/or incident management experience

In order to thrive at VentureDive, you:
are intellectually smart and curious
have passion for and take pride in your work
believe in VentureDive's mission, vision, and values
have a no-frills attitude
are a collaborative team player
are ethical and honest
Are you ready to put your ideas into products and solutions that will be used by millions?
You will find VentureDive to be a fast-paced, high-standards, fun, and rewarding place to work. Not only will your work reach millions of users worldwide, you will also be rewarded with competitive salaries and benefits. If you think you have what it takes to be a VenDian, come join us ... we're having a ball.