Reliability Engineering Lead
4 days ago
Trumid is a pioneering fintech that's revolutionizing fixed income trading. Our cutting-edge electronic solutions are empowering us to grow rapidly, and we're seeking exceptional talent to redefine the intersection of technology and finance.
The Opportunity
We're looking for a Lead Site Reliability Engineer (SRE) to ensure our systems' reliability, scalability, and performance as we continue to scale. This role offers a unique opportunity to shape our firm's reliability practices and infrastructure. You'll be instrumental in optimizing our existing infrastructure, implementing new technologies, and enhancing our incident response capabilities.
Key Responsibilities:
- Transform the SRE function to evolve, simplify, and scale existing solutions.
- Drive improvements in system reliability, scalability, and performance through innovative solutions and industry best practices.
- Lead incident response efforts, including troubleshooting, resolution, and conducting post-mortem analysis to prevent future incidents.
- Automate repetitive tasks to reduce manual intervention and improve operational efficiency.
- Collaborate closely with software development, DevOps, and infrastructure teams to embed reliability into the development lifecycle.
SRE expert with foundational knowledge of SRE best practices. Demonstrated hands-on experience managing large-scale, highly available cloud-based systems. Deep understanding of cloud components in at least one of the major cloud providers (e.g., AWS, GCP, Azure), including infrastructure, services, and tooling. Expertise in containerization and orchestration tools (e.g., Docker, Kubernetes) and experience with deployment strategies such as blue-green and canary deployments.
Requirements:
- Strong scripting and programming skills in Python, Bash, Go, or similar languages.
- Excellent problem-solving skills, focusing on diagnosing complex issues in large-scale distributed systems.
- Strong communication and collaboration skills, capable of working effectively with cross-functional teams in a fast-paced environment.
- Bachelor's degree in computer science (or equivalent) and at least 10 years of professional experience at a fast-paced tech-oriented company.
Benefits:
- Highly competitive compensation: $220,000 - $300,000 per year.
- Fully paid medical, dental, and vision coverage.
- Remote work options.
- A team-oriented and collaborative company culture.
- Equal-opportunity employer.
-
Lead Site Reliability Engineer
4 weeks ago
New York, New York, United States Tenth Mountain Full time
At Tenth Mountain, we're committed to helping veterans transition into rewarding civilian careers. As a Lead Site Reliability Engineer, you'll play a critical role in ensuring the reliability and availability of our Payments infrastructure.
Key Responsibilities:
Provide 24/5 round-the-clock support for the Payments team, covering...
-
Lead Reliability Engineer
3 weeks ago
New York, New York, United States Capital One Full time
Job Summary
Capital One is seeking a highly skilled Reliability Engineer to join our team. As a Reliability Engineer, you will be responsible for designing, developing, and implementing technical solutions to ensure the reliability and availability of our systems. You will work closely with cross-functional teams to identify and prioritize opportunities to...
-
Lead Reliability Engineer
3 weeks ago
New York, New York, United States Capital One Services, LLC Full time
About the Role:
We are seeking a highly skilled Reliability Engineer to join our team at Capital One Services, LLC. As a Reliability Engineer, you will be responsible for designing, developing, testing, implementing, and supporting technical solutions in full-stack development tools and technologies.
Key Responsibilities:
Collaborate with Agile teams to design,...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States Citadel Enterprise Americas Services LLC Full time
Job Summary
Citadel Enterprise Americas Services LLC is seeking a skilled Site Reliability Engineer to join our team. As a key member of our technical operations team, you will be responsible for ensuring the reliability and performance of our trading applications. This is a challenging and rewarding role that requires a strong understanding of software...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States Insight Global Full time
Job Summary
We are seeking a highly skilled Site Reliability Engineer to join our team at Insight Global. As a Site Reliability Engineer, you will be responsible for ensuring the uptime and reliability of our production and non-production environments. You will work closely with our development teams to build and maintain the infrastructure and applications...
-
Product Reliability Engineer
3 weeks ago
New York, New York, United States Palantir Technologies Full time
A World-Changing Company
Palantir builds the world's leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more.
The Role
Reliability Engineers are the driving forces of...
-
Site Reliability Engineer
3 weeks ago
New York, New York, United States TikTok Full time
About Site Reliability Engineering at TikTok
TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. As a Site Reliability Engineer at TikTok, you will play a critical role in ensuring the reliability and scalability of our systems.
Responsibilities
Develop and maintain automation procedures to...
-
Senior Software Reliability Engineer
1 month ago
New York, New York, United States Oakland Search Full time
Senior Site Reliability Engineer
About the Role
We are seeking a highly skilled Senior Site Reliability Engineer to join our team in New York City. As a key member of our engineering team, you will be responsible for designing, implementing, and maintaining our software systems to ensure high availability, scalability, and performance.
Key...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States Citadel Securities Americas Services LLC Full time
Job Summary
Citadel Securities Americas Services LLC is seeking a skilled Reliability Specialist to join our team. As a key member of our infrastructure support team, you will be responsible for ensuring the smooth operation of our trading applications. This includes collaborating with cross-functional teams to identify and resolve production issues,...
-
Senior Site Reliability Engineer
3 weeks ago
New York, New York, United States Podium Full time
At Podium, our mission is to empower local businesses to succeed. We achieve this by providing a comprehensive platform that streamlines lead conversion, communication, and sales. Our platform, powered by AI and integrations, helps local businesses thrive in a competitive market. Our team is dedicated to fostering a culture that values exceptional talent and...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States Hebbia Full time
About Hebbia
Hebbia is a cutting-edge technology company that empowers users to collaborate with AI on each step and validate responses. Our mission is to put capable AI in the hands of 1 billion people by 2030.
Job Description
We are seeking a highly skilled Site Reliability Engineer to contribute to building systems that optimize the uptime and reliability of...
-
Reliability Engineering Expert
3 weeks ago
New York, New York, United States Capital One Full time
About the Role:
We are seeking a highly skilled Reliability Engineering Expert to join our team at Capital One. As a Reliability Engineering Expert, you will be responsible for designing, developing, and implementing technical solutions to improve the reliability and scalability of our systems.
Key Responsibilities:
Collaborate with Agile teams to design,...
-
Site Reliability Engineer
3 weeks ago
New York, New York, United States Peloton Full time
About the Role
Peloton is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will play a critical role in ensuring the reliability and scalability of our platform. You will work closely with our engineering teams to design, implement, and operate scalable systems that meet the needs of our users.
Key...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States Huntress Full time
Job Overview
We are seeking a highly skilled Site Reliability Engineer to join our team. As a key member of our infrastructure team, you will be responsible for ensuring the reliability and scalability of our distributed systems. Your primary focus will be on designing, implementing, and maintaining our cloud infrastructure, ensuring that it meets the needs of...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States City National Bank Full time
Job Summary
City National Bank is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the reliability, scalability, and maximum uptime of our systems in the Data Center or Cloud Platform.
Key Responsibilities
Implement solutions that improve stability, security, scalability,...
-
Senior Site Reliability Engineer, Compute
4 weeks ago
New York, New York, United States Squarespace Full time
About the Role
We're seeking an experienced software engineer to join our Infrastructure Engineering team as a Senior Site Reliability Engineer, Compute. As a key member of our team, you'll play a crucial role in ensuring the reliability and performance of our systems, working closely with product teams to maintain the stability of our hybrid data centers and...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States TikTok Full time
About the Role
TikTok is seeking a skilled Site Reliability Engineer to join our U.S. Data Security team. As a key member of our team, you will be responsible for ensuring the reliability and scalability of our software systems.
Responsibilities:
Collaborate with infrastructure, product, and platform engineering teams to design and deploy scalable and secure...
-
Site Reliability Engineer
4 weeks ago
New York, New York, United States TikTok Full time
Job Summary
TikTok is seeking a highly skilled Site Reliability Engineer to join our team. As a Site Reliability Engineer, you will be responsible for ensuring the scalability and reliability of our cloud infrastructure. You will work closely with our infrastructure, product, and platform engineering teams to design, deploy, and maintain scalable and secure...
-
Senior Site Reliability Engineer
3 weeks ago
New York, New York, United States Clear Corporate Services LLC Full time
At CLEAR, we're pushing the boundaries of digital and biometric identification, making it easier for our members to navigate the world. We're seeking a Senior Site Reliability Engineer to spearhead our SRE function, driving innovation in our identity platform. This role will involve leading reliability-focused practices, collaborating with the Software...
-
Staff Site Reliability Engineer
4 weeks ago
New York, New York, United States Betterment Full time
About Betterment
Betterment is a leading technology-driven financial services company that offers investing and retirement solutions for retail investors and investment advisors, as well as financial wellness solutions for small and medium-sized businesses. Our team is passionate about our mission: making people's lives better.
About the Role
As a Staff Site...