Sr. Site Reliability Engineer

2 weeks ago


Mountain View CA United States CENTRL Full time

We are seeking an experienced and dynamic professional to join our organization as the Director of Cloud/Infrastructure Operations. In this leadership role, you will oversee the strategic direction, planning, and execution of our cloud and infrastructure operations to ensure the high availability, scalability, and performance of our IT systems.

Job Responsibilities

  • Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault-finding
  • Partner with development teams to improve services through rigorous testing and release procedures
  • Participate in system design consulting, platform management, and capacity planning
  • Create sustainable systems and services through automation and uplift
  • Balance feature development speed and reliability with well-defined service level objectives

Minimum Qualifications

  • Bachelor’s degree in computer science or another highly technical scientific discipline
  • Experience with cloud service providers (AWS, GCP, Azure)
  • Experience with distributed storage technologies like NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN)
  • A proactive approach to spotting problems, areas for improvement, and performance bottlenecks

  • Mountain View, United States Muon Space Full time

    About the role Muon Space is looking for a Sr. Site Reliability Engineer to join our Platform Software team. Our team provides cloud infrastructure for Muon’s Satellite Operations systems and Data Platform, as well as development and test systems for engineers across the company. As a Cloud Engineer on the team, you will bring expertise in building and...


  • Mountain View, United States Atlassian Full time

    Site Reliability Engineer Intern, 2025 Summer U.S. Site Reliability Engineering | Mountain View, United States or Remote. Join Atlassian as an intern and spend your summer with us having an impact on how millions of users collaborate and use software. We're in the business of developing software to help teams everywhere get amazing ideas on the ground and...


  • Mountain View, CA, United States CENTRL Inc. Full time

    CENTRL is a super-fast-growing Silicon Valley technology company focused on third-party risk, due diligence, cyber risk, and security. It has offices in the SF Bay Area, NY, Australia, and India. CENTRL’s clients include leading companies around the world including several Fortune 500 firms. CENTRL is led by a highly experienced management team with a...


  • Hawthorne, CA, United States SPACE EXPLORATION TECHNOLOGIES CORP Full time

    SR. SITE RELIABILITY ENGINEER - TOP SECRET CLEARANCE As a Senior Site Reliability Engineer, you will architect, develop, and test key aspects of the infrastructure for an in-house solution for analysis, simulation, prototyping, and operation of software in support of all SpaceX flight systems. You will have full ownership of the automation and technical...


  • Mountain View, United States Idexcel Full time

    Job Title: Site Reliability Engineer. Location: Mountain View, CA/Onsite. Duration: 1 year. Job Overview: We are seeking a highly skilled and experienced Senior Infrastructure Engineer to join our team. The successful candidate will be responsible for designing, implementing, maintaining, and optimizing our infrastructure systems across multiple platforms,...


  • Mountain View, United States VLink Inc Full time

    Senior Site Reliability Engineer. Only local to Mountain View, CA or Bellevue, WA. Only USC/GC/EAD. W2 only, no C2C. Contract. Remote. Job Description: Primary: Ability to code in Python or Go; Linux Admin (System Administration & Network Configuration); Debugging & Troubleshooting (Application and Infrastructure) production performance issues; Knowledge of MQ (Message Queue...


  • Sunnyvale, CA, United States Natcast, Inc. Full time

    Natcast (short for The National Center for the Advancement of Semiconductor Technology) is a new, purpose-built, non-profit entity created to operate the National Semiconductor Technology Center (NSTC) consortium, established by the CHIPS Act of the U.S. government. Working at Natcast represents an opportunity to help extend America’s leadership in...


  • Plainsboro Township, NJ, United States Integra LifeSciences Full time

    Changing lives. Building Careers. Joining us is a chance for you to do important work that creates change and shapes the future of healthcare. Thinking differently is what we do best. To us, change equals opportunity. Every day, more than 4,000 of us are challenging what’s possible and making headway to help improve outcomes. Position: Sr. Reliability...


  • Dallas, TX, United States Sygna LLC Full time

    Job Title: Sr. Site Reliability Engineer. Ready to apply? Before you do, make sure to read all the details pertaining to this job in the description below. Contract Type: Contract to hire. Location: Hybrid (Dallas, TX). Must Have and Metrics Technical Skills: Years of experience: 7+. Ability to collaborate with cross-functional teams, troubleshoot...


  • McLean, VA, United States Root Center For Advanced Recovery Full time

    Overview Design. Disrupt. Repeat. Be an agent of change on a team committed to achieving client-focused, mission-driven excellence. Steampunk is looking for an experienced Site Reliability Engineer with an appetite for taking on new challenges. Who We Are Steampunk is the explosive collision of human-centered design and traditional government contracting. An...


  • McLean, VA, United States GameStop Full time

    Overview Design. Disrupt. Repeat. Be an agent of change on a team committed to achieving client-focused, mission-driven excellence. Steampunk is looking for an experienced Site Reliability Engineer with an appetite for taking on new challenges. Who We Are Steampunk is the explosive collision of human-centered design and traditional government...


  • Redwood City, CA, United States C3 AI Full time

    We are looking for an Associate Site Reliability Engineer / Site Reliability Engineer to join our team at our HQ in Redwood City, CA. Responsibilities: Maximize system uptime and availability, ensuring functional and performance SLAs. Establish end-to-end monitoring and alerting on all critical aspects. Solve complex problems for critical services...


  • Sunnyvale, CA, United States Apple Inc. Full time

    Imagine what you could do here. At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. The people here at Apple don’t just create products —...


  • Sunnyvale, CA, United States Microsoft Full time

    There has never been a more exciting time to be working in healthcare at Microsoft. Our Health & Life Sciences Solutions organization is an interdisciplinary team of product managers, designers, engineers, and clinicians who are designing, developing and deploying next-generation healthcare solutions powered by the Microsoft Cloud for healthcare...


  • Mountain View, CA, United States Reliable Robotics Full time

    We're building safety-enhancing technology for aviation that will save lives. Automated aviation systems will enable a future where air transportation is safer, more convenient and fundamentally transformative to the way goods — and eventually people — move around the planet. We are a team of mission-driven engineers with experience across aerospace,...


  • Mountain View, CA, United States Intershop Communications AG Full time

    (Senior) Site Reliability Engineer (m/f/d) Jena permanent Full time Senior We are Intershop - We're built to boost your business! As an e-commerce pioneer, we have been setting standards in the development of software for digital commerce for almost 30 years. With our cloud offering and as a Microsoft partner, we transform the challenges of the...


  • Chicago, IL, United States Datamaxis Full time

    Location: Chicago, IL. Position Type: Full-time (3 days a week (Tue, Wed & Thu) onsite or more if needed). Salary: $125,000 to $140,000 (10% yearly bonus). Responsibilities: Manage and monitor systems and infrastructure hosted on-premises and Cloud. Good understanding of different layers of an application and system design - networking concepts, cloud...


  • Chicago, IL, United States WEX, Inc. Full time

    The WEX Site Reliability Engineering (SRE) team is seeking an entry-level Site Reliability Engineer Level 1 who is passionate about learning and growing in the field of software development and solutions focused on observability, incident response, reliability and performance, operational excellence, and compliance. The team will be part of the Benefits...


  • Mountain View, United States Elastic Full time

    Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale - unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the...