Major Incident Manager (night shift)
Posted 2025-04-06
ALTA IT Services is a wholly owned subsidiary of System One, a leading provider of specialized workforce solutions and integrated services. ALTA is an established leader in IT Staffing and Services for both government and commercial enterprises across the United States, specializing in Program & Project Management, Application Development, Cybersecurity, Data & Advanced Analytics, and Agile Transformation Services.
Major Incident Manager
100% Remote
Shift Details:
- Night Shift (6:00 PM - 6:00 AM)
Rotating Two-Week Schedules:
- Week 1 - Work: Mon-Tues, Off: Wed-Thurs, Work: Fri-Sun
- Week 2 - Off: Mon-Tues, Work: Wed-Thurs, Off: Fri-Sun
Weekends and Holidays:
- Contractors will be expected to work every other weekend and on some holidays
Job Description
Major Incident Management is responsible for driving the coordination and recovery efforts of major outages at the client. When issues impact the client's services or systems, major outages may occur, which result in serious interruptions to business and member activities. The Major Incident Management team operates 24x7 to ensure that impacted services are restored as efficiently and effectively as possible. The team actively monitors systems and services, documents recovery efforts and maintains their timelines, manages and coordinates various support team activities, and notifies business units of potential impacts and ongoing recovery efforts. The team is also responsible for providing continual process improvement suggestions for the major incident management service, and for monitoring weekend change activities and military pay days.
Major Responsibilities
- Monitors Service Desk ticket queues, system alerts, and escalation methods to identify possible trends or outages
- Serves as the main point of contact for all incident and service issue escalations directed to the Major Incident Management team
- Ensures that incident management processes are efficiently and effectively followed
- Determines the impact and priority of incidents based on affected customers and/or business units
- Communicates operational issues to respective IT management, support teams, and incident communication managers
- Provides outage notification and recovery effort updates to business units via the Status Page
- Engages various support teams and resources on major incident bridges
- Manages and coordinates troubleshooting and recovery efforts between support teams and vendors
- Ensures continuous collaboration with IT Operations Management and other areas or teams
- Documents initial issues, recovery activities, and resolution steps taken via MIM timelines
- Ensures prompt resolution and coordination of incident management activities during recovery efforts
- Updates and validates outage information in availability management tools for reporting and tracking purposes
- Makes recommendations, proposals, and suggestions for improvement within the service to reduce severity and frequency of incidents
- Attends Post Incident Review Meetings, or reviews the meeting notes once the meetings conclude, to ensure compliance with service improvement initiatives
- Attends and participates in TCABs (technical change advisory board meetings) to review, discuss, and approve or reject upcoming changes or releases to the environment
- Coordinates, communicates, and manages Sunday Maintenance Windows for weekend scheduled activities
- Works with Problem Management and Change Management to resolve incidents
- Coordinates, communicates, and manages Military Pay Bridge activities
- Prepares operational status reports for IT Operations Management
- Updates and publishes Morning Reports
Required Qualifications
- Bachelor's Degree in a related field, or the equivalent combination of education, training, and/or experience
- Extensive IT experience that demonstrates knowledge of hardware and infrastructure protocols used to provide services to customers
- Extensive IT experience in at least one of the following areas: mainframe, networking, middleware (WebSphere), Azure
- Prior experience leading incident bridge calls from initial triage through recovery, maintaining a timeline and ensuring that service is restored as quickly as possible
- Experience in leading or supervising an IT team
- Demonstrated ability to lead others in a challenging and fast-paced large enterprise environment
- Strong research, analytical, and problem-solving skills
- Strong planning, organizational, and multi-tasking skills
- Demonstrated ability in exercising initiative to produce desired results and achieve objectives
- Ability to effectively interface with various levels of employees, management, and vendors
- Excellent interpersonal, verbal, and written communication skills
- Practical incident management work experience
Desired Qualifications
- ITIL v3 or v4 Foundations Certificate
- CCNA / Networking Training and Certificates
- Middleware Training and Certificates
- Azure Training and Certificates
System One, and its divisions and subsidiaries including Joulé, ALTA IT Services, CM Access, and MOUNTAIN, LTD., are leaders in delivering workforce solutions and integrated services across North America. We help clients get work done more efficiently and economically, without compromising quality. System One not only serves as a valued partner for our clients but also offers eligible full-time employees health and welfare benefits coverage options including medical, dental, vision, spending accounts, life insurance, and voluntary plans, as well as participation in a 401(k) plan.
System One is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, age, national origin, disability, family care or medical leave status, genetic information, veteran status, marital status, or any other characteristic protected by applicable federal, state, or local law.