Operations and Incident Manager

1 month ago


Washington, United States Leidos Full time

**Description**

The Operations and Incident Manager (OIM) is responsible for meeting the following performance requirements:

- Infrastructure Operations: The OIM will work to optimize and minimize the cost of infrastructure operations and identify and implement opportunities for improvement. The OIM will work with OCIO management to define infrastructure support initiatives and solutions for improving service efficiency and effectiveness. The OIM is responsible for the development and continuous maintenance of standard operating procedures for all aspects of infrastructure operations.
- Service Level Management: The OIM will research and recommend courses of action to prevent problems and to maintain agreed infrastructure service levels for the user community. The OIM will utilize metrics, key performance indicators, and service level agreements to improve IT service delivery and maintain service delivery standards.
- Incident Management and Response: The OIM is responsible for overall management of contractor incident response. This includes timely response to technical issues; resolution and remediation of outages, incidents, service requests, and problems; and timely support for security incident investigations, including provision of data and metrics to system Information System Security Officers (ISSOs).
- Service Transition: The OIM is responsible for the ongoing execution of the ITIL-compliant Service Transition process. The OIM must ensure that all infrastructure changes are in accordance with the process.

This position requires clear and concise communication with internal team members and across multiple task areas and clients, as well as external organizations (e.g., sub-contractors, vendors, etc.) and customer executives. The position also works to influence project/team leaders regarding solution design, process and/or approaches to continuously improve and modernize the customer’s infrastructure.

Primary Responsibilities:

- Lead the day-to-day operations of an IT environment for a federal customer supporting approximately 1,500 end users.
- Ensure continuous monitoring of the IT environment. Ensure that services are up and operational, meeting or exceeding required service level agreements (SLAs). Utilize metrics, key performance indicators, and service level agreements to drive the performance of IT service delivery. React with urgency to identified environmental issues.
- Appropriately communicate status of the operational environment and ongoing initiatives to multiple levels of leadership within the customer team including but not limited to technical leads, Contracting Officer’s Representative (COR), internal consumers of IT services, and executive leaders including the CIO.
- Coordinate and execute ongoing infrastructure operations, management, and maintenance.
- Manage incident responses and ensure rapid restoration of services that fail. Work cooperatively with the End User Support Services (EUSS) Incident & Problem Manager to ensure proper focus on problem resolution and effective communications to the FTC user community.
- Proactively research and recommend courses of action to prevent problems and to maintain high infrastructure service levels for the user community. Support strategic improvements to the environment including the migration of components and services to the cloud. Coordinate with OCIO management to establish priorities and plans of action.
- Track and respond to customer requests in a timely manner.
- Optimize and minimize the cost of infrastructure operations.
- Ensure that all infrastructure changes and deployment of new services comply with the customer-prescribed change management process. Ensure all changes are well documented with test plans, architecture and design documentation, roll-back plans, and any other documentation required by the change process.
- Ensure development and continuous maintenance of documentation describing the infrastructure environment including, but not limited to, architecture diagrams, designs, standard operating procedures (SOPs), and tracking lists necessary to accurately maintain the server environment.
- Direct timely support of security incident investigations and provide data and metrics to Information System Security Officers (ISSOs) as required.
- Manage a team of three direct reports and approximately 20 onsite team members and 15 remote team members.
- Interact with vendors as required to support troubleshooting of issues, coordinate changes, and develop strategic improvements.
- Ensure security of the environment through adherence to security standards and device configuration in accordance with FTC security requirements.
- Use the customer's ticket management solution (ServiceNow) to log and track all activities. Support development of dashboards and capabilities to constantly monitor the environment.
- Ensure accurate tracking of infrastructure assets and a perpetually accurate inventory of the environment.
- Serve as the point of escalation for on-call and after-hours support requirements.


  • Incident Manager

    1 week ago


    Washington, United States JCD Staffing Full time

    We are seeking a highly skilled and experienced Incident and Release Manager, who will be responsible for the Incident, Problem, and Release processes within the program. For Incident and Problem Management this would include Incident Triage methodologies, Impact Assessments, Troubleshooting, Stakeholder and Communications Management, and After-Action Root...


  • Washington, United States JCD STAFFING LLC Full time

    Job Description: We are seeking a highly skilled and experienced Incident and Release Manager, who will be responsible for the Incident, Problem, and Release processes within the program. For Incident and Problem Management this would include Incident Triage methodologies, Impact Assessments, Troubleshooting, Stakeholder and...


  • Washington, United States ShorePoint Inc Full time

    Who we are: ShorePoint is a fast-growing, industry recognized, and award-winning cybersecurity services firm with a focus on high-profile, high-threat, private and public-sector customers who demand experience and proven security models to protect their data. ShorePoint subscribes to a “work hard, play hard”...


  • Washington, United States Network Designs Full time

    Job Description: We are seeking a highly skilled and experienced Incident and Release Manager, who will be responsible for the Incident, Problem, and Release processes within the program. For Incident and Problem Management this would include Incident Triage methodologies, Impact Assessments, Troubleshooting, Stakeholder and...


  • Washington, United States Network Designs Full time

    Job Description: We are seeking a highly skilled and experienced Incident and Release Manager, who will be responsible for the Incident, Problem, and Release processes within the program. For Incident and Problem Management this would include Incident Triage methodologies, Impact Assessments, Troubleshooting, Stakeholder and Communications Management, and...


  • Washington, United States Highlighttech Full time

    Overview: Highlight is looking for an Incident Response Manager to play a crucial role in managing incidents, ensuring smooth operations, and continuously improving the incident management process to meet customer expectations and maintain a high level of service quality in a 24/7/365 environment. Responsibilities: Provide Incident Management services to meet...


  • Washington DC, United States JCD STAFFING LLC Full time

    Job Description: We are seeking a highly skilled and experienced Incident and Release Manager, who will be responsible for the Incident, Problem, and Release processes within the program. For Incident and Problem Management this would include Incident Triage methodologies, Impact Assessments, Troubleshooting, Stakeholder...


  • Washington, United States Network Designs Inc. Full time

    Job Description: We are seeking a highly skilled and experienced Incident and Release Manager, who will be responsible for the Incident, Problem, and Release processes within the program. For Incident and Problem Management this would include Incident Triage methodologies, Impact Assessments, Troubleshooting, Stakeholder and...


  • Incident Manager

    1 month ago


    Washington, United States Trusted Advisors Full time

    Job Description: On behalf of a bank subsidiary, we are seeking an Incident Manager with at least 5 years of experience in an IT production department. Main duties and responsibilities: IT incident management. Track tickets with a high customer impact (major or moderate incidents). Carry out the...


  • Washington, United States DAn Solutions Inc Full time

    REQUIRES AN EXISTING/ACTIVE TS/SCI WITH CI POLYGRAPH - NO REMOTE WORK, MUST WORK ON SITE. Job Description: The Security Incident Analyst (SIA) is responsible for the mitigation of security incidents on DIA information systems. The SIA investigates incidents involving information technology (IT) assets and DIA personnel to the DIA Computer Network Defense Center...


  • Washington, United States DAN Solutions Full time

    REQUIRES AN EXISTING/ACTIVE TS/SCI WITH CI POLYGRAPH - NO REMOTE WORK, MUST WORK ON SITE. Job Description: The Security Incident Analyst (SIA) is responsible for the mitigation of security incidents on DIA information systems. The SIA investigates incidents involving information technology (IT) assets and DIA personnel to the DIA...


  • Washington, United States Emerging Tech, LLC Full time

    **Must be authorized to work in the US** **100% Remote** Support the Department of Veterans Affairs (VA) Electronic Health Record Modernization Integration Office (EHRM-IO) for operational activities, including, but not limited to incident management tracking, briefing development, and maintenance/verification of dashboards - Leverage technical...

  • Incident Manager

    1 month ago


    Washington, United States Trusted Advisors Full time

    Job Description: On behalf of a subsidiary of an international bank, based in Casablanca, we are seeking an Incident Manager. Missions: Manage critical production incidents and handle the related communication. Track the production of post-mortems and the closure of the actions identified in the action plans. Ensure the...


  • Washington, United States Gridiron IT Full time

    **Role Description**: - Support the development of staff schedules and staffing forecasts for approval - Ensure shift members follow the appropriate incident escalation and reporting procedures - Ingest, triage, prioritize, assign, track, document, and manage incidents and results - Provide technical support in response to computer security incidents -...


  • Washington, United States Base One Technologies Full time

    Our DC Metro-based client is looking for a Senior Incident Response Analyst. If you are qualified for this position, please email your updated resume in Word format. Primary Responsibilities: • In-depth knowledge of each phase of the Incident Response life cycle • Expertise in operating systems (Windows/Linux) operations and artifacts • Understanding of...