Lead SRE

3 weeks ago


Chicago, United States CareerBuilder Full time

About the job Lead SRE

This is a 1099-based role, open only to US-based candidates with valid work authorization.

A unicorn startup is seeking a Site Reliability Engineer (SRE) to join our team, focusing on the reliability and performance of our trading enclave product and service lines. The ideal candidate will bring a deep understanding of SRE principles, including Incident Management, combined with expertise in DevOps practices and software development. This role demands strong technical skills in monitoring and observability tools such as Dynatrace, Splunk, and Grafana, coupled with exceptional Root Cause Analysis and troubleshooting abilities.

Specialization in networking, including Cisco, Arista, and AVI, and proficiency with network debugging tools like Wireshark are crucial for success in this position.

Responsibilities:

- Lead the incident management process to ensure high availability and performance of trading enclave services
- Implement SRE practices to enhance system reliability
- Utilize Dynatrace, Splunk, and Grafana for system health monitoring and optimization
- Conduct thorough root cause analysis to prevent recurrence of incidents and outages
- Collaborate with development and operations teams to streamline CI/CD pipelines
- Provide expertise in networking technologies (Cisco, Arista, AVI) and tools like Wireshark
- Mentor junior engineers and advocate for innovative solutions for system reliability and performance

Qualifications:

- Bachelor's or Master's degree in Computer Science or related field
- 10+ years of SRE/DevOps experience in high-availability systems, preferably in trading or financial services
- Strong expertise in Dynatrace, Splunk, and Grafana
- Deep understanding and hands-on experience with networking principles and technologies (Cisco, Arista, AVI)
- Proficiency in network debugging and analysis tools, including Wireshark
- Certifications in networking, SRE, or DevOps practices are highly desirable
- Solid understanding of on-prem and hybrid cloud infrastructure, container orchestration, MongoDB, Kafka, and IBM mainframe DB2 (preferred)
- Certifications in relevant technologies (Dynatrace, Splunk) are a plus
- Excellent communication, leadership, problem-solving, and root-cause analysis skills


  • Senior Manager

    1 week ago


    Chicago, United States United Airlines Full time

    Description There’s never been a more exciting time to join United Airlines. We’re on a path towards becoming the best airline in the history of aviation. Our shared purpose – Connecting People, Uniting the World – is about more than getting people from one place to another. It also means that as a global company that operates in hundreds of...


  • Chicago, United States CME Group Full time

    Senior Data Reliability Engineer (Data SRE) CME Group is the world's leading and most diverse derivatives marketplace. But who we are goes deeper than that, here you can impact markets worldwide, transform industries and build a career shaping tomorrow. We invest in your success and you own it, all the while working alongside a team of leading experts who...


  • Chicago, Illinois, United States Georgia IT Inc Full time

    Role: Machine Learning Engineer/SRE Location: Chicago, IL or Remote Duration: 12 Months Rate: DOE US Citizens and Green cards & GC-EAD Only. No Third-party C2C available for this job. We are seeking a highly skilled and motivated Machine Learning Engineer who possesses expertise in developing, deploying, and managing machine learning models. In this role, you will...


  • Chicago, United States Chicago Mercantile Exchange Inc. Full time

    Description Position Overview: Data System Reliability Engineer (dSRE) CME Group: Where Futures Are Made CME Group is the world's leading and most diverse derivatives marketplace. But who we are goes deeper than that, here you can impact markets worldwide, transform industries and build a career shaping tomorrow. We invest in your success and you own it, all...


  • Chicago, United States CME Group Full time

    Description Position Overview: Data System Reliability Engineer (dSRE) CME Group: Where Futures Are Made CME Group is the world's leading and most diverse derivatives marketplace. But who we are goes deeper than that, here you can impact markets worldwide, transform industries and build a career shaping tomorrow. We invest in your success and you own...


  • Chicago, United States CME Group Full time

    Description This role is hybrid and requires 2 days on site in our Chicago office. This role does not allow working outside the state of Illinois. Position Overview: Data System Reliability Engineer (dSRE) CME Group: Where Futures Are Made CME Group is the world's leading and most diverse derivatives marketplace. But who we are goes deeper than...


  • Chicago, United States JobRialto Full time

    Top 3 requirements: Ecommerce experience (think Nordstrom, Target, where you purchase a product); Java Spring Boot; Kubernetes. Pluses: Azure Kubernetes preferred. Description: Client is looking for a forward-thinking, energetic Site Reliability Engineering Manager to join our team. Client serves the ecommerce needs of leading and growing grocery retailers...


  • Chicago, IL, United States CME Group Full time

    Description Position Overview: Data System Reliability Engineer (dSRE) CME Group: Where Futures Are Made CME Group is the world's leading and most diverse derivatives marketplace. But who we are goes deeper than that, here you can impact markets worldwide, transform industries and build a career shaping tomorrow. We invest in your success and you own...


  • Chicago, IL, United States CME Group Full time

    Description This role is hybrid and requires 2 days on site in our Chicago office. This role does not allow working outside the state of Illinois. Position Overview: Data System Reliability Engineer (dSRE) CME Group: Where Futures Are Made CME Group is the world's leading and most diverse derivatives marketplace. But who we are goes deeper than...


  • Chicago, United States R2 Global Full time

    Our client, a financial services giant, is looking for a Principal SRE professional to join the team and lead observability efforts throughout a major cloud project and beyond. This role will work 3x's a week in the Downtown Chicago area onsite. Key Responsibilities: Lead and mentor a team of site reliability engineers, fostering a culture of collaboration,...


  • Chicago, United States Motion Recruitment Full time

    Job Description Our client is a leading global financial services firm that provides a comprehensive range of solutions for wealth management, asset servicing, asset management, and banking services. With a focus on serving corporations, institutions, affluent families, and individuals worldwide, they offer trust and investment management services, global...


  • Chicago, United States Cleo Full time

    Site Reliability Engineer At Cleo, we make doing business easy! Cleo is an established software company with a start-up feel. We have awesome products, which go hand in hand with our awesome culture! We are devoted to our people and pride ourselves on creating a fun, laid-back, but fast-paced work environment. Not only do we work hard, we play hard. We have...


  • Chicago, United States Rose International Full time

    Date Posted: 05/09/2024 Hiring Organization: Rose International Position Number: 463795 Job Title: Application-Production Support Analyst Job Location: Chicago, IL, USA, 60604 Work Model: Hybrid Employment Type: Temporary Estimated Duration (In months): 7 Min Hourly Rate ($): 50.00 Max Hourly Rate ($): 55.00 Must Have Skills/Attributes: ...


  • Chicago, United States eTek IT Services, Inc. Full time

    Job Description PRINCIPAL RESPONSIBILITIES: Lead production stability efforts by preventing production issues and improving production stability. • Track/Manage Incidents/Changes/Problems for assigned applications. • Provide regular and high-quality updates to all the stakeholders on the progress of the incidents, including SLA...

  • Database SRE

    4 weeks ago


    Chicago, United States Oxford Knight Full time

    Salary: Market-Leading Summary This is one of the world's top algorithmic trading firms, looking for Database Site Reliability Engineers to join the Trading Systems Infrastructure team. A community of self-starters from multiple tech backgrounds - math, computer science, statistics, physics, engineering - they have built one of the world's most...


  • Chicago, United States Apex Systems Full time

    Lead production stability efforts by preventing production issues and improving production stability. • Track/Manage Incidents/Changes/Problems for assigned applications. • Provide regular and high-quality updates to all the stakeholders on the progress of the incidents, including SLA risks/breaches. • Identify measures to improve applications stability and...