Site Reliability Engineer

4 weeks ago


New York City, United States Aegistech Full time

We are hiring for a Cloud SRE – Windows Hybrid role located in NYC.


The role sits within the client’s Cloud and Platform group, a global team responsible for maintaining and supporting the infrastructure systems used within the client, a global investment bank. The team plays a critical role and works closely with global counterparts in maintaining the production infrastructure. The candidate should have strong technical, functional, and analytical skills, with good experience in automation, supporting critical infrastructure, and troubleshooting Windows systems. This position will contribute to supporting and driving infrastructure implementations to completion and will serve as a subject matter expert to the user community across Infrastructure, Platform, and Software as a Service (IaaS/PaaS/SaaS). The team operates in a follow-the-sun support model.


The function provides a variety of services to our stakeholders, including hardware specification advice, Operational Readiness of new solutions, and implementation of new Windows servers.



THE DAY-TO-DAY RESPONSIBILITIES:

  • Support and manage MS Windows systems and implement change requests.
  • Create scripts to increase the efficiency of daily support, and update runbooks and support procedures.
  • Collaborate actively with the Global Operations and Engineering teams to implement key projects within the Cloud environments.
  • Provide assistance and support to transformation programs for applications and services looking to move to the cloud environment.
  • Identify ways to improve and automate SRE concerns: availability, latency, performance, efficiency, and capacity planning.
  • Troubleshoot system performance issues.
  • Handle trouble tickets, user requests, and proactive maintenance.
  • Support weekend BCP/DR tests and weekend on-call production support on a rotation basis.
  • Assist application teams with post-deployment configuration of new servers.
  • A good understanding of ITIL and Change Management policies is desired.
  • Coordination with Infrastructure teams and Business IT managers to deliver projects on schedule.
  • Work with the Engineering team on Operational Readiness and implement engineered solutions to improve efficiency and stability of the infrastructure.
  • Provide documentation for the 1st-line Operations team and maintain runbooks.
  • Investigate and determine root causes for major incidents with the help of vendors and internal infrastructure teams, providing a detailed RCA and plan for remediation.
  • Attend to escalations during Follow-The-Sun support hours.
  • Contribute towards BAU Projects.
  • Work with the Incident Management team to provide RCAs for incidents.


THE SKILLS YOU NEED TO GET THE ROLE:

  • Solid experience as a Windows Systems Administrator in a large-scale, global and distributed environment
  • Experience with cloud tools such as Ansible, Git, Kubernetes, and Terraform
  • Experience with virtual (VMware) and physical networking configuration is a plus
  • Ability to deploy and support MS Windows Clusters
  • Ability to create scripts using PowerShell, Python, VBScript, or JScript/JavaScript is a plus
  • Knowledge of Site Reliability Engineering components is a must
  • Experience working with VMware virtualization
  • Understanding of Active Directory and how enterprise-class identity and access management (IAM) is extended from an on-premises environment to the public cloud is a plus
  • Ability to troubleshoot issues and provide resolution
  • Written and verbal communication skills are a must
  • Ability to work independently as well as in a team
  • Previous experience supporting a banking infrastructure is preferred
  • Prior experience in a global enterprise
  • Experience working with offshore IT teams

