Lead Site Reliability Engineer

2 weeks ago


Chicago, Illinois, United States DASH2 Full time

Overview

DASH2 is seeking skilled technical professionals at various levels who are eager to challenge themselves in delivering top-tier SaaS solutions. We provide a stimulating environment that encourages growth, adaptability, and the consistent application of your expertise. Our clients depend on us during critical moments, and our engineering team is committed to fulfilling that promise.

As a Principal Engineer, your role will involve guiding the development of SaaS software solutions for clients primarily engaged with regulatory bodies such as the SEC. Our offerings are essential as they address regulatory challenges faced by our customers. In this capacity, you will oversee projects from inception to completion, set coding standards, gain a comprehensive understanding of our operational systems, and ensure that our technology evolves without accumulating technical debt.

Key Responsibilities

  • Promote and establish a culture of Site Reliability Engineering (SRE) to uphold a robust platform infrastructure.
  • Implement comprehensive application and infrastructure monitoring and alerting systems to avert client-impacting incidents, ensuring system availability, performance, and scalability to meet SLOs and SLAs.
  • Enhance application performance at scale.
  • Automate all processes, including operational runbooks.
  • Define and support continuous integration and deployment (CI/CD) pipelines aligned with branching and quality assurance strategies.
  • Engage deeply with technology, staying abreast of the latest tools, technologies, and methodologies; assist in evaluating, prototyping, and integrating them into workflows.
  • Work independently to achieve project milestones and tasks on schedule while providing regular updates on progress.
  • Foster strong relationships with SRE and software engineering teams to uphold quality standards.
  • Commit to continuous learning and apply insights gained.
  • Advocate for best practices, eliminate bottlenecks, and enhance processes.

Required Qualifications

  • Proven experience in designing tiered web and mobile applications.
  • Expertise in developing secure code and Azure infrastructure, adhering to security standards, and engaging in financial systems security practices using tools like Cloud Defender and Checkmarx.
  • Over 8 years of experience in software development using modern programming languages such as C# .NET or Java.
  • More than 8 years of experience in creating automated deployments with tools like Azure DevOps Pipelines, Ansible, or Jenkins, or with scripting languages, for managing infrastructure and software deployment in a CI/CD environment.
  • 8+ years of scripting experience in PowerShell or Python/Bash for automating system operations as runbooks for both Windows and Linux environments.
  • 8+ years of experience in implementing best practices for production performance, availability, and scalability monitoring and alerting using tools such as New Relic, Dynatrace, Datadog, or AppDynamics.
  • 8+ years of experience as a global administrator of Azure, including cloud cost management.
  • 8+ years of experience supporting public-facing, revenue-generating systems.
  • Strong focus on DevOps with experience in building and deploying Infrastructure as Code using Terraform or similar technologies.
  • Experience in monitoring and preventing issues with databases and queries (SQL, Cosmos) using tools like SolarWinds Database Performance Analyzer or Idera SQL Diagnostic Manager.
  • Experience in planning, coordinating, developing, and executing all stages of test scripts.
  • Experience in securing Windows or Linux systems in a 24/7 production environment.
  • Familiarity with containerization and managing Kubernetes clusters.
  • Knowledge of common networking, firewall, and load balancing protocols.
  • Bachelor's degree in Computer Science or equivalent professional experience.


