Site Reliability Engineer

2 months ago


Houston, United States Octagos Health | The Future of Digital Health Full time

About Octagos Health:


Octagos Health is a dynamic and rapidly growing healthcare technology company dedicated to improving patient outcomes through AI-driven solutions. We are seeking a skilled Cloud Platform Site Reliability Engineer (SRE) to join our team and ensure the reliability, scalability, and performance of our cloud-based production systems.


Position Summary:


The Cloud Platform Site Reliability Engineer (SRE) will be responsible for maintaining and improving the reliability and performance of our cloud-based production systems. This role involves building and maintaining cloud infrastructure, monitoring system health, automating processes, and responding to production incidents. The ideal candidate will be an experienced engineer with a deep understanding of cloud architectures and a passion for ensuring high availability and performance in a production environment.


Key Responsibilities:


  • System Reliability: Ensure the reliability and performance of cloud-based production systems by developing and implementing effective monitoring and incident response strategies.
  • Infrastructure Management: Build, maintain, and improve cloud infrastructure to support scalable and highly available services.
  • Automation: Automate repetitive tasks and processes to improve efficiency and reduce the potential for human error in a production environment.
  • Incident Management: Respond to production incidents, troubleshoot issues, and implement solutions to prevent recurrence.
  • Performance Tuning: Analyze system performance in the cloud and make recommendations for improvements.
  • Collaboration: Work closely with development, operations, and security teams to ensure seamless integration and deployment of systems and applications.
  • Capacity Planning: Monitor cloud system capacity and plan for future growth to ensure scalability.
  • Documentation: Maintain comprehensive documentation of cloud system architecture, processes, and procedures.
  • Security: Implement and maintain security best practices to protect data and systems in the cloud.
  • Continuous Improvement: Stay updated on industry trends and emerging cloud technologies to continuously improve system reliability and performance.


Qualifications:


  • Education: Bachelor’s degree in Computer Science, Engineering, or a related field; or equivalent work experience.
  • Experience: Minimum of 5 years of experience in site reliability engineering or a similar role, with a focus on cloud-based platforms.


Skills:

  • Strong understanding of cloud architectures, networking, and security principles.
  • Proficiency in scripting languages (e.g., Python, Bash) and automation tools (e.g., Ansible, Terraform).
  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
  • Expertise in cloud platforms (e.g., AWS, Azure, Google Cloud) and containerization technologies (e.g., Docker, Kubernetes).
  • Excellent problem-solving and troubleshooting skills.
  • Strong communication and collaboration skills.
  • Ability to work in a fast-paced, dynamic production environment.


Why Join Octagos Health?


  • Innovative Environment: Be part of a forward-thinking company that is shaping the future of healthcare.
  • Growth Opportunities: Take advantage of opportunities for professional growth and advancement.
  • Collaborative Culture: Work in a supportive and collaborative environment where your contributions are valued.
  • Competitive Compensation: Receive a competitive salary and benefits package.