Lead Site Reliability Engineer
2 weeks ago
Overview
DASH2 is seeking skilled technical professionals at various levels who are eager to challenge themselves in delivering top-tier SaaS solutions. We provide a stimulating environment that encourages growth, adaptability, and the consistent application of your expertise. Our clients depend on us during critical moments, and our engineering team is committed to fulfilling that promise.
As a Principal Engineer, your role will involve guiding the development of SaaS software solutions for clients primarily engaged with regulatory bodies such as the SEC. Our offerings are essential as they address regulatory challenges faced by our customers. In this capacity, you will oversee projects from inception to completion, set coding standards, gain a comprehensive understanding of our operational systems, and ensure that our technology evolves without accumulating technical debt.
Key Responsibilities
- Promote and establish a culture of Site Reliability Engineering (SRE) to uphold a robust platform infrastructure.
- Implement comprehensive application and infrastructure monitoring and alerting systems to avert client-impacting incidents, ensuring system availability, performance, and scalability to meet SLOs and SLAs.
- Enhance application performance at scale.
- Automate all processes, including operational runbooks.
- Define and support continuous integration and deployment (CI/CD) pipelines aligned with branching and quality assurance strategies.
- Engage deeply with technology, staying abreast of the latest tools, technologies, and methodologies; assist in evaluating, prototyping, and integrating them into workflows.
- Work independently to achieve project milestones and tasks on schedule while providing regular updates on progress.
- Foster strong relationships with SRE and software engineering teams to uphold quality standards.
- Commit to continuous learning and apply insights gained.
- Advocate for best practices, eliminate bottlenecks, and enhance processes.
Required Qualifications
- Proven experience in designing tiered web and mobile applications.
- Expertise in developing secure code and Azure infrastructure that adhere to security standards, and in applying financial-systems security practices with tools such as Cloud Defender and Checkmarx.
- Over 8 years of experience in software development using modern programming languages such as C# .NET or Java.
- More than 8 years of experience creating automated deployments with tools such as Azure DevOps Pipelines, Ansible, or Jenkins, or with scripting languages, to manage infrastructure and software deployment in a CI/CD environment.
- 8+ years of scripting experience in PowerShell or Python/Bash for automating system operations as runbooks for both Windows and Linux environments.
- 8+ years of experience implementing best practices for monitoring and alerting on production performance, availability, and scalability, using tools such as New Relic, Dynatrace, Datadog, or AppDynamics.
- 8+ years of experience as a global administrator of Azure, including cloud cost management.
- 8+ years of experience supporting public-facing, revenue-generating systems.
- Strong focus on DevOps with experience in building and deploying Infrastructure as Code using Terraform or similar technologies.
- Experience monitoring databases and queries (SQL, Cosmos) and preventing issues with them, using tools such as SolarWinds Database Performance Analyzer or Idera SQL Diagnostic Manager.
- Experience planning, coordinating, developing, and executing test scripts through all stages of testing.
- Experience in securing Windows or Linux systems in a 24/7 production environment.
- Familiarity with containerization and managing Kubernetes clusters.
- Knowledge of common networking, firewall, and load balancing protocols.
- Bachelor's degree in Computer Science or equivalent professional experience.
-
Lead Site Reliability Engineer
7 days ago
Chicago, Illinois, United States DASH2 Full time
Overview: DASH2 is seeking experienced technical professionals who are eager to excel in delivering top-tier SaaS solutions. We provide a stimulating environment that encourages growth, adaptability, and the consistent application of your skills. Our clients depend on us during critical moments, and our engineering team is committed to fulfilling that...
-
Site Reliability Engineer
2 weeks ago
Chicago, Illinois, United States Calabitek Full time
Job Description. Position: Site Reliability Engineer. Location: Remote. Experience: 10+ years. This position is responsible for ensuring application observability, maintenance, and support. The role involves identifying and implementing proactive preventive measures, evaluating, and recommending techniques, practices, or technologies that align with business...
-
Site Reliability Engineer
2 weeks ago
Chicago, Illinois, United States Calabitek Full time
Job Overview. Position: Site Reliability Engineer. Location: Chicago, IL (Local Candidates Preferred). Experience: 10+ Years. This position is crucial for ensuring application observability, ongoing maintenance, and robust support. The role involves identifying and implementing proactive preventive measures, as well as evaluating and recommending techniques,...
-
Senior Site Reliability Engineer
2 weeks ago
Chicago, Illinois, United States The Hartford Full time
Senior Site Reliability Engineer. At The Hartford, we are committed to making a significant impact as an insurance provider that transcends traditional coverages and policies. Being part of our team means you have the opportunity to achieve your professional aspirations while assisting others in reaching theirs. Join us as we work towards shaping the...
-
Site Reliability Engineering Manager
1 week ago
Chicago, Illinois, United States National Black MBA Association Full time
About the Role: This is a strategic and transformation-focused role within the National Black MBA Association's Global Technology organization. As a Manager of Site Reliability Engineering, you will play a key part in ensuring the reliable and efficient operation of our security services. Key Responsibilities: Design and drive monitoring, alerting, and ticket...
-
Site Reliability Engineering Manager
1 week ago
Chicago, Illinois, United States National Black MBA Association Full time
About the Role: This is a strategic and transformation-focused role within the National Black MBA Association's Global Technology organization. As a Manager of Site Reliability Engineering, you will play a key part in ensuring the reliable and efficient operation of our security services. Key Responsibilities: Design and drive monitoring, alerting, and...
-
Site Reliability Engineer
7 days ago
Chicago, Illinois, United States Oak Street Health Full time
Transformative Role at Oak Street Health: We are seeking a skilled Site Reliability Engineer to collaborate with our software engineering teams in implementing monitoring and alerting solutions, designing performance tests, and automating tasks to enhance efficiency. Key Responsibilities: Design and implement telemetry, monitoring, and alerting systems to ensure...
-
Staff Site Reliability Engineer
2 days ago
Chicago, Illinois, United States Circle Full time
About Circle: Circle is a pioneering financial technology company at the forefront of the emerging internet of money, where value can flow freely, globally, and instantly, revolutionizing the way we think about payments, commerce, and markets. Our cutting-edge infrastructure, including the blockchain-based USDC, empowers businesses, institutions, and...
-
Staff Site Reliability Engineer
4 weeks ago
Chicago, Illinois, United States Gusto Full time
About Gusto: Gusto is a modern, online people platform that helps small businesses take care of their teams. On top of full-service payroll, Gusto offers health insurance, 401(k)s, expert HR, and team management tools. Today, Gusto offices in Denver, San Francisco, and New York serve more than 300,000 businesses nationwide. Our mission is to create a world...
-
Site Reliability Engineer
7 days ago
Chicago, Illinois, United States Donato Technologies, Inc Full time
Job Overview. Position Title: DevOps Engineer. Company: Donato Technologies, Inc. Work Model: Hybrid. Onsite Days: Tuesday - Thursday. Contract Duration: 6 Months. Position Summary: We are in search of a skilled DevOps Engineer to partner with our Application Development teams in delivering innovative business solutions through agile methodologies while effectively...
-
Site Reliability Engineer
7 hours ago
Chicago, Illinois, United States Stardom Employment Consultants Full time
Job Description: As a Site Reliability Engineer at Stardom Employment Consultants, you will be responsible for maintaining and improving the reliability, availability, and performance of our systems. You will collaborate closely with development, operations, and security teams to build and automate scalable infrastructure, monitor system health, and address...
-
Chicago, Illinois, United States TEKsystems Full time
Position Overview: This Site Reliability Engineering (SRE) team is responsible for facilitating in-depth advisory sessions, establishing SRE program leadership internally, and recruiting and nurturing talent for client projects. The Practice Architect will be strategic, generating innovative SRE methodologies in areas such as observability, production...
-
Site Reliability Engineer
3 weeks ago
Chicago, Illinois, United States Itron, Inc. Full time
Itron is revolutionizing how utilities and cities manage energy and water. We are committed to creating a more sustainable, resourceful world. Join us. Job Family Summary: Plans, designs, develops and tests software systems or applications for software enhancements and new products including cloud-based or internet-related tools. Evaluates reliability of...
-
Senior Azure Reliability Engineer
2 weeks ago
Chicago, Illinois, United States Jobot Full time
Remote Azure Site Reliability Engineer Opportunity with a Leading Tech Consulting Firm. About Us: We are a dynamic consulting organization seeking a seasoned Cloud Site Reliability Engineer with a strong foundation in Azure Cloud technologies. This fully remote position is pivotal in implementing Site Reliability Engineering (SRE) methodologies across our...
-
Reliability Engineer
2 days ago
Chicago, Illinois, United States Outdefine Full time
About the Job: Outdefine is a leading-edge talent community that connects top professionals with innovative companies and enterprises globally. As a Site Reliability Engineer, you will play a crucial role in ensuring the reliability and efficiency of our cloud infrastructure. Key Responsibilities: Design and implement scalable and reliable cloud infrastructure...
-
Reliability Engineer
4 days ago
Chicago, Illinois, United States The Hartford Full time
About The Hartford: The Hartford is a leading insurance company that goes beyond traditional coverages and policies. We're committed to making a difference and proud to be an organization that values innovation and excellence. Job Summary: We're seeking a highly skilled Staff Reliability Engineer to join our Reliability Engineering Team. As a key member of our...
-
Senior Azure Reliability Engineer
2 weeks ago
Chicago, Illinois, United States Jobot Full time
Remote Azure Site Reliability Engineer Opportunity. This position is hosted by Jobot Consulting. About Us: We are a dynamic tech consulting firm seeking a Senior Cloud Site Reliability Engineer with a strong background in Azure Cloud. In this role, you will play a key part in implementing Site Reliability Engineering (SRE) practices across our enterprise...
-
Reliability Engineer
4 weeks ago
Chicago, Illinois, United States GATX Full time
Overview: Founded in 1898 and headquartered in Chicago, IL, GATX Corporation (NYSE: GATX) is an industry leader with 125+ years of success, success that is powered by our people. We are proud of our high-performance culture, hard-working and enthusiastic management team, and beautiful office space in the Willis Tower. At GATX, we hire the best and offer our...
-
Senior Data Reliability Engineer
4 days ago
Chicago, Illinois, United States CME Group Full time
Job Overview: A critical role in CME Group's Cloud data transformation, the Data SRE will be aligned to data product pods ensuring the firm's data infrastructure is reliable, scalable, and efficient as the GCP data footprint expands rapidly. Key Responsibilities: Optimize data pipelines to ensure efficient data processing and reduce latency. Ensure data integrity...
-
Component Reliability Engineer
2 weeks ago
Chicago, Illinois, United States GATX Full time
Position Overview: Founded in 1898, GATX Corporation is a leading organization with over 125 years of industry success, driven by our dedicated workforce. We take pride in our vibrant culture, committed management team, and modern office environment. At GATX, we prioritize hiring top talent and fostering a dynamic, collaborative workplace that empowers...