SRE / Site Reliability Engineer

4 weeks ago


Fort Smith, United States Bitquery Full time

Bitquery Overview

Bitquery is an API-first product company dedicated to solving blockchain data problems using ground truth and on-chain data. We extract and present valuable data via APIs, delivering solutions to multiple verticals such as Decentralized Finance (DeFi), DEX Arbitrage Analytics, and Crypto Surveillance & Forensics across major blockchains such as Bitcoin, Ethereum, EOS, and Tezos. Our international team of software developers analyzes decentralized data from 40+ chains.

We're Hiring: SRE Engineer

We are currently looking for a full-time Site Reliability Engineer (SRE) to further develop, monitor, and support infrastructure while automating various processes. This role may require shift work.

Roles & Responsibilities:

Ensure smooth operation of software, environments, and company services.

Analyze and improve performance and availability of products.

Identify bottlenecks in architecture and infrastructure.

Improve system alerting and incident management.

Enhance monitoring systems based on SLIs (Prometheus, Icinga, Grafana, etc.).

Formalize SLIs in line with the main business requirements.

Establish SLOs for services and infrastructure.

Minimize system recovery time and data loss (RTO and RPO).

Analyze incidents in the production environment.

Manage capacity effectively.

Requirements:

5+ years of experience in implementing, troubleshooting, and supporting infrastructure software and distributed systems.

Support experience with Golang, Python, and Ruby.

Proficiency with virtualization and containerization technologies (containerd, Docker, k8s).

Experience setting up CI/CD pipelines (e.g., Jenkins).

Experience creating and maintaining fault-tolerant systems with comprehensive log coverage, monitoring, and alerting.

Understanding of the "infrastructure as code" principle and the ability to test such code (Ansible, Terraform).

Knowledge of network security principles (IPsec, WAF, IPS).

Experience maintaining blockchain nodes.

Availability during US time zone working hours.

Our Tech Stack:

Infrastructure: Bare-metal / AWS

Databases: ClickHouse / MySQL

SCM: Git / GitHub

Message broker: Kafka

Repository: Nexus

CI/CD: Jenkins

Monitoring: Icinga 2, Grafana, Prometheus, VictoriaMetrics, ELK

Orchestration: k8s, Ansible, Terraform

Containers: LXC, Docker

Scripting: Python, Golang, Ruby, Groovy

OS: Debian/Ubuntu

Others: Docker Compose, IPsec

Benefits:

Work with a truly global team across 5 countries.

Remote work flexibility.

Choose your own work hours.

Yearly trip with the Bitquery team to a remote destination.

Fast interview process within 1-2 weeks.

Flat hierarchy empowering individuals to deliver results.

Join a great culture and build Bitquery with us.




  • Fort Worth, United States M2S Tech Solutions Full time

    Role: Senior site reliability engineer. Location: Fort Worth, TX (Hybrid). Visa: USC. Mandatory Skills: Azure DevOps, Azure Cloud Native Security, SRE, GITHUB, Infra CI CD Pipelines, Dynatrace. Senior site reliability engineer Requirements & Skills: Hands on experience as SRE; Experience with Azure cloud; Experience with APM tools Dynatrace SaaS, Mezmo (LogDNA) and...


  • Fort Wayne, United States Sentara Healthcare Full time

    Role Overview: Site reliability engineers (SREs) are responsible for improving system reliability and resilience to make it faster and easier to develop and deploy new software capabilities. SREs focus especially on building automation to reduce manual effort and prevent operations incidents. Key Responsibilities: Work with stakeholders such as product owners...


  • Fort Worth, United States Cynet Systems Full time

    Job Description: Responsibilities: Oversee the design and maintenance of geospatial databases to ensure optimal performance and reliability. Implement and manage Azure DevOps practices for geospatial project lifecycle, enhancing collaboration and efficiency. Utilize GITHUB for version control and collaboration across geospatial development projects. ...


  • Fort Washington, United States JR Technologies Full time

    At JR Technologies, our vision is to create the new customer-centric distribution landscape of tomorrow. Working with us offers many opportunities to experienced professionals who are interested in joining a strong team, learning and mentoring in a dynamic environment, honing professional and technical abilities, and who thrive on new challenges. We provide...


  • Fort Bragg, United States Venatore Llc Full time

    Job Description What You'll Get to Do: As a Cloud Site Reliability Engineer (SRE), you’ll help ensure the mission is never interrupted. As an SRE you will help ensure today is safe and tomorrow is smarter. Our work depends on talented people joining our team to help transition legacy technologies to cloud infrastructure in an efficient and...


  • Senior SRE Engineer

    3 weeks ago


    Fort Worth, United States RingCentral Full time

    Say hello to possibilities. It's not every day that you consider starting a new career. We're RingCentral, and we're happy that someone as talented as you is considering this role. First, a little about us: we're the $2 billion global leader in cloud-based communications and collaboration software. We are fundamentally changing the nature of human...


  • Fort Lauderdale, United States Chewy Full time

    Our Opportunity: We are looking for a Site Reliability Engineer II at our facility in Plantation, FL to focus on service stability and reliability by working with application owners to set SLOs, "Error Budget" and backup and DR strategies. What You'll Do: Define application monitoring and alerting strategy. Perform capacity planning and production readiness...


  • Ft. Smith, United States Mars Full time

    Job Description: Reliability/Maintenance Engineer - Ft. Smith, AR. The Reliability Engineer will have a strong passion and drive to eliminate all maintenance losses in the site by using proven technical tools to develop maintenance strategies for both new and existing equipment. This role must be a leader in maintenance improvement tools (PM Optimization,...


  • Senior SRE Engineer

    2 weeks ago


    Fort Worth, United States RingCentral Full time

    Say hello to possibilities. It’s not every day that you consider starting a new career. We’re RingCentral, and we’re happy that someone as talented as you is considering this role. First, a little about us: we’re the $2 billion global leader in cloud-based communications and collaboration software. We are fundamentally changing the nature of human...


  • Fort Liberty, United States Booz Allen Hamilton Full time

    Job Description Location: Fort Bragg, NC, US Remote Work: No Job Number: R0188764 Site Reliability Engineer The Opportunity: Everyone is trying to “harness the power of the cloud,” but not everyone knows how. As a Kubernetes platform engineer, you know how to build resilient platforms that meet customer needs and take advantage of the power of...

  • Reliability Engineer

    2 months ago


    Fort Wayne, United States Lozier Full time

    Job Description - Reliability Engineer (24000264) COMPANY OVERVIEW: Lozier Corporation is an industry leader in providing store fixtures to major retailers across the U.S. and around the world. Headquartered in Omaha, Nebraska, Lozier began manufacturing fixtures in 1956, and originated the basics of today’s shelving...


  • Fort Bragg, United States Booz Allen Hamilton Full time

    Site Reliability Engineer The Opportunity: Everyone is trying to “harness the power of the cloud,” but not everyone knows how. As a Site Reliability Engineer, you know how to build resilient platforms that meet customer needs and take advantage of the power of containerization both in the cloud and on premises. What if you could use your Kubernetes...

  • Reliability Engineer

    3 weeks ago


    Fort Collins, United States Cps4jobs Full time

    Seeking a Reliability Engineer for an exciting opportunity near Fort Collins, CO. This is a great time to join a leading manufacturer with excellent benefits and endless growth opportunities. Responsibilities for the Reliability Engineer include: developing reliability management techniques; leading the risk and reliability function for critical machines at...

  • Reliability Engineer

    3 weeks ago


    Fort Worth, United States QuEST Global Full time

    62936BR Title: Reliability Engineer Job Description: Quest Global is an organization at the forefront of innovation and one of the world’s fastest growing engineering services firms with deep domain knowledge and recognized expertise in the top OEMs across seven industries. We are a twenty-five-year-old company on a journey to becoming a centenary one,...