Lead Site Reliability Engineer

2 months ago


New York, New York, United States | Wells Fargo | Full time
About this position:

We are seeking a Principal Engineer specializing in Site Reliability who is passionate about solving intricate problems with innovative solutions that drive substantial change across a diverse landscape. You will join a team dedicated to Application Support and Site Reliability Engineering (SRE) that is strengthening the SRE framework across numerous applications and business sectors within the organization. The team will spearhead technological transformation and the adoption of SRE-aligned enterprise capabilities and products, implement new tooling, automate the resolution of complex issues, and integrate cutting-edge technology.

Key Responsibilities:
  • Establish and cultivate Site Reliability Engineering and AIOps capabilities within the organization, fostering a culture of excellence and leading by example. Collaborate in training skilled engineers and enhancing the practice within the enterprise.
  • Promote and advance the adoption of enterprise tools and innovative solutions to enhance availability in a multi-cloud environment, focusing on observability, monitoring, logging, synthetic monitoring, and chaos engineering.
  • Advance AIOps by introducing self-healing and autonomic capabilities to address complex operational challenges, including process automation and leveraging AI/ML to enhance product availability.
  • Automate the measurement of critical SRE metrics and IT Service Operations processes, including customer impact assessments, availability of essential business functions, SLO/SLI compliance, error budgets, and recovery times.
  • Share support responsibilities for vital applications and customer journeys, leading the technical resolution of high-priority incidents in collaboration with cross-functional teams, conducting blameless post-mortems, and performing root cause analyses to foster continuous improvement.
  • Work closely with application development teams and other organizations to influence and enhance stability and SRE-aligned capabilities.
  • Advise leadership on developing or influencing applications, networks, information security, databases, operating systems, and web technologies to meet complex business and technical requirements across multiple divisions.
  • Lead the strategy and resolution of unique and complex challenges requiring comprehensive evaluation across various enterprise areas, delivering long-term, scalable solutions that necessitate creativity, innovation, and advanced analytical thinking.
Required Qualifications:
  • 10+ years of engineering experience or equivalent demonstrated through a combination of work experience, training, military experience, or education.
  • 7+ years of experience in Java, C#, Python, or other object-oriented programming languages.
  • 5+ years of experience in engineering and support roles on Linux/Unix and Windows servers.
  • 3+ years of experience with cloud technologies.
  • 3+ years of experience supporting complex enterprise-level applications and platforms in production environments.
  • 5+ years of experience designing and implementing complex observability solutions using industry-standard tools or custom-built solutions.
  • 5+ years of experience with configuration and monitoring technologies such as Ansible, Grafana, Elastic, Splunk, and Prometheus.
  • Excellent verbal, written, and interpersonal communication skills.
Desired Qualifications:
  • A Master's degree or higher in computer science or engineering.
  • Experience in the design, implementation, and governance of Artificial Intelligence, Natural Language Processing, or Machine Learning architectures.
  • Familiarity with Agile Scrum methodologies and Kanban practices.
Job Expectations:
  • Willingness to travel up to 10%.
  • In-office presence expected for three days per week.
Compensation:

The base pay range for this position is $144,000 - $300,000. Pay within this range reflects the skills, experience, and achievements of the candidate.

Benefits:

The organization offers eligible employees a comprehensive benefits package, including but not limited to:
  • Health benefits
  • 401(k) Plan
  • Paid time off
  • Disability benefits
  • Life insurance and critical illness insurance
  • Parental leave
  • Tuition reimbursement
  • Scholarships for dependent children
Diversity and Inclusion:

The organization values diversity, equity, and inclusion in the workplace and welcomes applications from all qualified candidates, regardless of various protected characteristics.