Senior Systems Reliability Engineer


Chicago IL United States DRW Full time

DRW is a diversified trading firm with over three decades of experience bringing sophisticated technology and exceptional people together to operate in markets around the world. We value autonomy and the ability to quickly pivot to capture opportunities, so we operate using our own capital and trade at our own risk.

Headquartered in Chicago with offices throughout the U.S., Canada, Europe, and Asia, we trade a variety of asset classes including Fixed Income, ETFs, Equities, FX, Commodities and Energy across all major global markets. We have also leveraged our expertise and technology to expand into three non-traditional strategies: real estate, venture capital and cryptoassets.

We operate with respect, curiosity and open minds. The people who thrive here share our belief that it’s not just what we do that matters – it’s how we do it. DRW is a place of high expectations, integrity, innovation and a willingness to challenge consensus.

We are seeking a Systems Reliability Engineer to join our Fixed Income Commodities and Currency Options (FICCO) and Cumberland team in either Chicago or London. In this role, you will be responsible for designing and supporting highly available systems within a technologically diverse stack used for global research and trading of FICCO and Cryptoassets. Leveraging tools such as AWS, Docker, Kubernetes, CI/CD, Python, Prometheus and Grafana, you will develop a repeatable and supportable tech stack to meet the demanding needs of our business.

Core Responsibilities:

  1. Collaborate with our FICCO and Cumberland technology and trading teams regarding their CI/CD processes.
  2. Collaborate with development teams to troubleshoot software build issues and optimize packaging processes.
  3. Automate deployment processes to improve efficiency and reduce manual intervention.
  4. Implement and manage infrastructure as code tools such as Terraform and Ansible.
  5. Maintain, design, and troubleshoot our observability stack.
  6. Drive initiatives to modernize environments by developing and optimizing processes using appropriate cloud and container tools, such as AWS and Kubernetes.
  7. Consistently challenge the norm and advocate for change.

Skills and Qualifications:

  1. Proven experience as a DevOps Engineer, Site Reliability Engineer, or similar software engineering role.
  2. Strong expertise with observability tools such as Prometheus, Alertmanager, Grafana, Sentry, and OpenTelemetry.
  3. Proficiency with Python, Java, and C++ software builds and packaging.
  4. Hands-on experience with CI/CD tools like TeamCity, Concourse, Argo Workflows, and/or GitHub Actions.
  5. Solid understanding of Infrastructure as Code (IaC) tools such as Terraform, Terragrunt, and Ansible.
  6. Skills in Python for troubleshooting and maintaining environment dependencies.
  7. Proficient with Docker for image creation, networking, and execution.
  8. Experienced with Kubernetes, including deployment and management of applications.
  9. Knowledge of Argo CD, Helm, and Kustomize for Kubernetes application management.
  10. Fundamental understanding of Git and familiarity with Git repository tools such as GitHub and GitLab.
  11. Linux experience with Debian and Red Hat-based systems.
  12. Excellent organizational skills, with the ability to effectively plan and prioritize tasks.
  13. Strong collaborative team spirit and communication skills.

Preferred Qualifications:

  1. Bachelor’s degree in Computer Science, Engineering, or a relevant field.
  2. Experience using Conda, including environment management and conda-build for creating conda packages.
  3. Experience deploying and maintaining CI/CD pipelines in a large-scale production environment.
  4. Hands-on experience with cloud platforms and services, such as AWS, GCP, or Azure.
  5. Experience supporting the infrastructure and systems that facilitate electronic trading functions or other high-performance computing environments.
  6. Experience in consolidating diverse and redundant approaches to common problems.