Incident Management Lead

2 weeks ago


Austin, Texas, United States Amazon Full time
Job Title: Incident Management Lead

Amazon Stores Central Technical Operations Services (CTOS) is the first line of defense for maintaining high availability of the Amazon Retail Website. We make customer-impacting events shorter, less frequent, and less severe by providing large-scale event and incident management. The Amazon Retail Website has hundreds of millions of customers globally who can be impacted by these types of incidents; the work we do to mitigate them helps real people at a tremendous scale. Our automated tooling quickly identifies the cause of an issue and helps mitigate the impact, and much of our engineers' time is spent on projects to improve the tooling, automation, and processes to avoid future occurrences. We help direct the resolution of an issue to the relevant service teams, and dive deep into those events retrospectively to drive improvements to our process. It's an exciting time to be part of our team as we are rapidly growing and expanding our offerings globally.

Key Responsibilities:
  • Lead, mentor, and manage a highly adaptable and skilled Incident Response Team.
  • Develop and maintain Incident response plans, procedures, and runbooks to effectively respond to Corporate and Online Production incidents.
  • Coordinate and oversee Incident response activities, including Detection, Localization, Mitigation, and Recovery efforts.
  • Serve as Incident Commander and primary point of contact for Corporate and Online Production incidents. Coordinate communication and collaboration with internal and external stakeholders.
  • Lead investigations into Corporate and Online Production incidents to determine the root cause, scope, and impact of the Incident.
  • Provide leadership and guidance during outage situations, including coordinating cross-functional teams and communicating with senior leadership and stakeholders.
  • Conduct post-incident actions such as analysis, lessons learned, and reviews to identify areas for improvement to the Incident and Problem Management process.
  • Prepare and deliver Incident response reports, presentations, and other communications to senior leadership and stakeholders.
  • Incident Management: Perform the Incident Commander role for outage calls. Be the gold standard for "what excellent looks like."
  • Regularly audit documentation to ensure it is up to date and applicable to current processes.
  • Lead reviews of Incident calls to ensure every Incident Manager is consistently raising the bar.
  • Work with the team to generate reporting and derive insights from past incidents to identify trends in outage behaviors.
A Day in the Life:
  • Conduct stand-ups with the Incident Management team
  • Lead the discussion around Incident Analytics and leverage the team to identify the pain points of the Incident process
  • Lead documentation reviews
  • Conduct one-on-one meetings with each Incident Manager who reports to you.
  • Track the performance of each team member against the Role Guidelines expectations.
  • Identify and create KPIs to raise the bar of the CTOS Incident Management practice.
  • Generate career plans with the team to ensure that they have guidance on where to learn and grow.
Basic Qualifications:
  • Bachelor's degree in an IT-related field (e.g. Information Systems, IT Management, Engineering, Computer Science), or equivalent experience.
  • 5+ years of experience managing high-velocity teams in the Incident Management space
  • 5+ years of experience with networking, infrastructure, and cloud systems, and with incident and event management tools such as PagerDuty, Datadog, Moogsoft, SolarWinds, Nagios, etc.
  • Proven experience (5+ years) in Incident response, with a focus on managing and leading incident response efforts.
  • Knowledge of, and/or a completed certification in, incident response frameworks and best practices, such as the ITIL 4 Foundation, Professional, or Strategic Leader certifications.
Preferred Qualifications:
  • Strong leadership, communication, and interpersonal skills.
  • Strong understanding of Incident Management principles, technologies and platforms around detection, and a strong understanding of the Problem Management process.
  • Ability to work effectively in a fast-paced, dynamic environment and manage competing priorities.
  • Strong analytical and problem-solving capability, with the ability to analyze complex incidents and identify effective response strategies.

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.


