09/26/2025 | News release | Distributed by Public on 09/26/2025 14:43
Nuclear power plant refueling outages are among the most complex phases in a plant's operational cycle.1 During these outages, tens of thousands of activities are conducted simultaneously within a short timeframe. Typically lasting three to four weeks, these operations involve large crews of contractors with diverse skill sets performing tasks ranging from testing and surveillance to maintenance. Outages may extend longer if major backfitting or modernization projects are planned. Consequently, plant outages are expensive, incurring significant operational costs, such as contractor labor and equipment, as well as the loss of generation while the plant is offline. This can easily cost a plant operator more than $1 million a day. Therefore, there is a constant need to mitigate the economic impact on plants by reducing the frequency, duration, and risks associated with these outages.2,3
The Outage Optimization project, part of the Department of Energy's Light Water Reactor Sustainability Program, is developing openly available tools and methods to enhance outage scheduling for nuclear power plants.4,5 These tools and methods assist outage planners in several critical areas. First, they assess schedule resiliency by evaluating how the schedule can accommodate delays and adapt to unexpected activities, such as equipment degradation discovered during planned outage activities. Second, they identify critical points in the outage schedule where potential unplanned issues may arise. Third, these tools and methods help planners allocate available resources to address delays and unexpected events as the outage progresses, thereby reducing the risk of delays and ensuring the outage schedule stays on track. By improving outage predictability and resilience, these tools contribute to national energy security, workforce modernization, and the digital transformation of nuclear operations.
The Outage Optimization project has tackled several use cases, each focusing on different aspects of outage planning and execution. Table 1 summarizes these use cases, highlighting the required data, developed methods, and outcomes to support outage-related decisions during both planning and execution. These use cases are discussed in the following sections. It is important to note that this project does not develop plant outage scheduling methods but rather complements and analyzes critical points in existing schedules created with tools like Primavera P6.
Table 1. Targeted outage-related use cases.
ID | Use case | Required data | Methods | Decision support |
1 | Activity duration variance | Activities performed during past outages (description and actual duration values) | Natural language processing (NLP) methods | Inform outage planners on time/resources required to perform outage activities based on past experience. |
2 | Analysis of unexpected events | Planned activities, actual performed activities, issue reports, and work orders | NLP methods | Inform outage planners on the possible occurrence of unexpected events based on past operational experience. |
3 | Schedule resilience | Planned outage schedule, planned activities (description and actual duration values), and duration variance | Schedule resilience methods | Inform outage planners on the risk of outage delays associated with delays and unexpected events (see use cases 1 and 2). |
4 | Multiresource outage schedule planning | Outage schedule and resource availability | Critical path and combinatorial optimization | Assist outage managers in allocating jobs based on available resources during outage execution. |
5 | Post-outage analysis | Planned and actual activities (description and actual duration values), planned and actual outage schedules | Post-outage analysis | Inform outage managers to measure performance and compare it to past outages. |
The process of scheduling and planning a plant outage is complex and typically begins one to two years before the actual outage.1-3 Despite the availability of computational tools for managing and scheduling the numerous activities to be performed, the process still requires significant manual effort, such as selecting outage activities, identifying dependencies between activities, and creating the outage schedule. Another crucial element in effective outage planning is the actual data required to build the outage schedule. Table 2 provides a subset of the necessary data elements for each activity.
Table 2. Required data for each planned outage activity.
Activity data element | Description |
Duration | The estimated time required for each activity (typically in hours and minutes). |
Resources | Types of resources needed to complete the activity (e.g., electrical, mechanical, scaffolding). |
Dependencies | Tasks that must be completed before starting the specified activity. Dependencies can vary widely and may be driven by system logic, technical specifications, plant risks, or plant conditions. |
Using the data in Table 2, the project's schedule optimization tools can plan daily outage activities and personnel requirements. Tools like Primavera P6 are designed to create efficient outage schedules, aiming to complete required tasks in less time while ensuring all activities are finished.
Fig. 1. Simplified schedule consisting of 8 activities, used throughout this article as an example for the developed use cases listed in Table 1. Activities that are part of the critical path are indicated in red.
The outage scheduling, planning, and optimization process, based on the critical path method (CPM),6 is well established and widely used in the nuclear industry. A schedule optimization algorithm balances activity durations and dependencies against available resources and produces the outage schedule, as illustrated in Fig. 1.
Given the data listed in Table 2, CPM determines the following:
The critical path, which is the longest sequence of dependent activities and determines the full duration of the outage.
Specific start and finish times for each activity. For noncritical path activities, the total float indicates how much an activity can be delayed without affecting the critical path. For critical path activities, drag measures how much an activity can be shortened before it is no longer on the critical path.
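The forward/backward pass at the heart of CPM can be sketched in a few lines of Python. The 8-activity network below is hypothetical (the actual data behind Fig. 1 is not reproduced here); the activity names, durations, and dependencies are illustrative assumptions.

```python
from collections import defaultdict

def critical_path(durations, deps):
    """Classic CPM forward/backward pass.

    durations: {activity: hours}; deps: {activity: [predecessors]}.
    Returns the project completion time, the total float of each
    activity, and the set of critical activities (zero float).
    """
    # Topological order via repeated selection (outage networks are DAGs)
    order, placed = [], set()
    while len(order) < len(durations):
        for a in durations:
            if a not in placed and all(p in placed for p in deps.get(a, [])):
                order.append(a)
                placed.add(a)
    # Forward pass: earliest start/finish
    es, ef = {}, {}
    for a in order:
        es[a] = max((ef[p] for p in deps.get(a, [])), default=0.0)
        ef[a] = es[a] + durations[a]
    project_end = max(ef.values())
    # Backward pass: latest start/finish
    succs = defaultdict(list)
    for a, ps in deps.items():
        for p in ps:
            succs[p].append(a)
    lf, ls = {}, {}
    for a in reversed(order):
        lf[a] = min((ls[s] for s in succs[a]), default=project_end)
        ls[a] = lf[a] - durations[a]
    total_float = {a: ls[a] - es[a] for a in durations}
    critical = {a for a, f in total_float.items() if abs(f) < 1e-9}
    return project_end, total_float, critical

# Hypothetical 8-activity network, durations in hours
durations = {"A": 4, "B": 6, "C": 3, "D": 5, "E": 2, "F": 4, "G": 3, "H": 2}
deps = {"C": ["A"], "D": ["B"], "E": ["B"], "F": ["C", "D"], "G": ["E"], "H": ["F", "G"]}
end, flt, crit = critical_path(durations, deps)
```

For this network, the critical path is B, D, F, H, and an activity such as C can slip by its total float (4 hours) without extending the outage.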
Applying CPM in real contexts can present challenges:
Activity durations are often treated as fixed, but they vary based on past experiences. Uncertainty can be expressed by providing the average duration with its variance or the observed minimum and maximum values.
Duration uncertainties arise from factors like crew size and skills, operational conditions (e.g., weather), and time of day.
Unexpected events may delay activities, affecting completion times.
New activities can arise during an outage and need to be added to the schedule, along with their dependencies.
Besides optimizing for time, other resources (e.g., crews, equipment, space) should be considered to identify critical path activities and measure proximity to the critical path.
These challenges have been addressed by the use cases listed in Table 1, and the next sections describe how the developed methods are tackling these challenges.
The first use case or scenario focuses on computational methods designed to inform outage planners about the time required to perform outage activities based on past outage operational experience. The developed method uses text semantic similarity7,8 to evaluate activity completion times. The goal is to find similar activities from past outages that match the queried task. By gathering and analyzing the historical completion times from a selected subset of past activities, the temporal distribution of the queried activity can be estimated.
An example of textual similarity is shown in Fig. 2, which compares two activities with similar semantic meanings, highlighting the importance of data cleaning and curation. This example suggests that a simple word-to-word similarity comparison between those two activities would show them as very dissimilar. However, if the historical activity were to be cleaned (e.g., through spell-checking and abbreviation identification and expansion), it would be transformed into "[ACC01-B] PRESSURE TRANSMITTER CALIBRATION." Consequently, the two activities would be very similar. These are the elements required for the semantic similarity analysis:
A set of past outage activities, which may be divided across several datasets, one for each outage. Outages of various plant units, different plants, or different utilities can be combined to enhance the analysis results.
A computational method that computes the semantic similarity between two activities (i.e., the queried and historical activities), generating a point value that measures how similar the two activities are. It is important to note that the computational time for such a method must be very short, as the similarity search for a queried activity in a database of tens of thousands of past activities must be performed within minutes.
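The elements above can be illustrated with a minimal similarity search. The sketch below uses a simple bag-of-words cosine similarity as a stand-in for the embedding-based semantic similarity methods the project actually employs, and the activity descriptions are illustrative.

```python
import math
import re
from collections import Counter

def _tokens(text):
    """Lowercase word counts for a short activity description."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two activity descriptions.
    A simplified stand-in: real semantic methods also capture synonyms
    and word order."""
    va, vb = _tokens(a), _tokens(b)
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

def most_similar(query, history, k=3):
    """Rank historical activities by similarity to the queried one."""
    ranked = sorted(history, key=lambda h: cosine_similarity(query, h), reverse=True)
    return ranked[:k]

# Illustrative historical activity descriptions
history = [
    "PRESSURE TRANSMITTER CALIBRATION",
    "MOTOR OPERATED VALVE STROKE TEST",
    "CALIBRATION OF PRESSURE TRANSMITTER LOOP",
]
top = most_similar("PRESSURE TRANSMITTER CALIBRATION", history, k=1)
```

Because the similarity score is a single point value per pair, the search over tens of thousands of past activities reduces to one scoring pass plus a sort.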
Figure 2. Example of semantic similarity between a queried and a historical outage activity.
Word, sentence, and document similarity analyses are a significant aspect of recent NLP method development and are essential in text analytics, including text summarization and representation, text categorization, and knowledge discovery. These analyses can leverage sentence syntactic structure (i.e., negation, conjecture, and syntactic dependencies) to guide the similarity measurement process.
Once the historical plant outage data has been cleaned (including data sanitization, spell-checking, and abbreviation handling), the similarity value between the queried activity and each historical activity is then determined. It is important to note that if the queried activity has never been completed in past outages, no similar past activities will be found. This method does not involve conducting any form of regression analysis. Additionally, for activities with cryptic descriptions, such as data filled with unknown acronyms or lacking actual English words, no similarity value can be determined.
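A minimal sketch of such a cleaning step, assuming a hypothetical abbreviation dictionary (real plants maintain much larger, site-specific tables):

```python
# Hypothetical abbreviation table; entries here are assumptions for
# illustration, not an actual plant dictionary.
ABBREVIATIONS = {
    "XMTR": "TRANSMITTER",
    "PRESS": "PRESSURE",
    "CAL": "CALIBRATION",
    "MOV": "MOTOR OPERATED VALVE",
}

def clean_activity(text: str) -> str:
    """Normalize a raw activity description before similarity search:
    uppercase it, keep bracketed asset ID tags as-is, and expand
    known abbreviations."""
    words = []
    for w in text.upper().split():
        if w.startswith("[") and w.endswith("]"):
            words.append(w)  # asset ID tags (e.g., "[ACC01-B]") pass through
        else:
            words.append(ABBREVIATIONS.get(w, w))
    return " ".join(words)

cleaned = clean_activity("[ACC01-B] PRESS XMTR CAL")
```

Under these assumed abbreviations, the raw string above normalizes to "[ACC01-B] PRESSURE TRANSMITTER CALIBRATION", matching the cleaned form discussed for Fig. 2.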
Figure 3. Histogram representing the duration variance in completing the queried activity (i.e., calibration of pressure transmitter) based on identified similar activities. Note that completion time values have been modified to protect proprietary data.
Fig. 4. Sankey plot of planned versus actual activities performed during an outage (actual values are omitted to protect proprietary data).
By selecting the subset of past activities that are "more similar" to the queried one, this process generates a histogram showing the duration variance of the queried activity based on past outage data (see Fig. 3). Given these results, it is possible to assess which portion of the histogram can cause delays to the plant outage based on critical path calculations and to identify possible outliers (activities that took much longer due to unexpected events). Furthermore, it is possible to track the historical trend in activity completion time and assess the impact of the employed human resources on this completion time.
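Once the completion times of similar past activities have been collected, summarizing the distribution and flagging outliers is straightforward. The sketch below uses the interquartile-range rule on illustrative (non-proprietary) duration values:

```python
import statistics

def duration_profile(durations_hr, outlier_k=1.5):
    """Summarize historical completion times for a queried activity
    and flag long-duration outliers via the interquartile-range rule."""
    q1, med, q3 = statistics.quantiles(durations_hr, n=4)
    iqr = q3 - q1
    upper_fence = q3 + outlier_k * iqr
    outliers = [d for d in durations_hr if d > upper_fence]
    return {"median": med, "q1": q1, "q3": q3, "outliers": outliers}

# Illustrative completion times in hours for one queried activity
times = [3.5, 4.0, 4.2, 4.5, 5.0, 5.1, 5.4, 12.0]
profile = duration_profile(times)
```

Here the 12-hour record would be flagged as an outlier, the kind of data point worth cross-checking against issue reports for an unexpected event.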
The second use case focuses on analyzing emergent or unplanned activities performed during an outage due to observed conditions that require immediate attention. Fig. 4 provides a graphical representation of the portion of completed activities during an outage versus the actual planned activities and the emergent activities.
The goal is to digitally capture the nature or cause of emergent activities. This is achieved by extracting information from each emergent activity through NLP methods,5 which are designed to identify entities and their semantic relations. In this context, entities can be of different natures:
Identification IDs-unique IDs for assets, components, work orders, or plant outage activities.
Plant-specific entities-entities indicating assets and component types (e.g., motor-operated valve, centrifugal pump, inverter), degradation phenomena (e.g., corrosion, oil leak), and maintenance operations (e.g., nondestructive evaluation, motor winding test).
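As a simplified illustration of extracting these entity types, the sketch below uses hand-written regular expressions and keyword lists; the production approach relies on trained NLP models, and the ID pattern and vocabularies here are assumptions:

```python
import re

# Hypothetical ID pattern and vocabularies for illustration only
ID_PATTERN = re.compile(r"\b[A-Z]{2,4}\d{2,4}(?:-[A-Z0-9]+)?\b")
COMPONENTS = ["MOTOR-OPERATED VALVE", "CENTRIFUGAL PUMP", "INVERTER", "CIRCUIT BREAKER"]
DEGRADATIONS = ["CORROSION", "OIL LEAK", "CRACK"]

def extract_entities(text: str) -> dict:
    """Rule-based sketch of named entity recognition for an emergent
    activity: asset IDs, component types, and degradation phenomena."""
    up = text.upper()
    return {
        "ids": ID_PATTERN.findall(up),
        "components": [c for c in COMPONENTS if c in up],
        "degradations": [d for d in DEGRADATIONS if d in up],
    }

ents = extract_entities("Oil leak observed on centrifugal pump ACC01-B during inspection")
```

A trained NER model generalizes far beyond fixed lists like these, but the output structure (typed entities tied to one activity record) is the same.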
Fig. 5. Example of named entity recognition (NER) processing of an unexpected activity.
Figure 6. Relations among unexpected outage events, issue/condition reports, and work orders.
An example of NLP applied to an emergent activity that contains several entities and plant-specific IDs is shown in Fig. 5.
An important element in analyzing unexpected activities is capturing the event that triggered the issue, the planned outage activities that may have caused it, and the corresponding work order. Figure 6 illustrates the links between an unexpected outage event, the generated issue report, and the corresponding work order. For example, maintenance staff may perform a clean-and-inspect activity on a circuit breaker cubicle and notice a cracked fuse block that must be replaced before the component can return to service. This creates a new work order to be added to the outage schedule.
This information better informs outage managers about risks from unexpected activities during a planned outage and allows proactive preparation. However, a challenge is that the link between a planned outage activity and the corresponding issue/condition report might be missing.
After processing all emergent activities from past plant outages, the data is stored in a knowledge graph. In this structure, each node represents an entity, and edges represent the semantic relationships between adjacent entities. An outage knowledge graph is designed to contain all relevant outage information (e.g., planned, completed, and unexpected outage activities, outage schedule, condition reports logged during the outage) in a single data structure. A knowledge graph can be queried to identify past unexpected activities by type of component or equipment and to determine planned activities that have the potential to trigger emergent ones.
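The knowledge graph structure described above can be sketched as a minimal triple store; the node names and relation labels below are illustrative, and a production system would use a dedicated graph database with a richer schema:

```python
from collections import defaultdict

class OutageKnowledgeGraph:
    """Minimal triple store: nodes are entities, edges are labeled
    semantic relations between adjacent entities."""

    def __init__(self):
        self.edges = defaultdict(list)  # subject -> [(relation, object)]

    def add(self, subj, rel, obj):
        self.edges[subj].append((rel, obj))

    def query(self, subj, rel):
        """Return all objects linked to `subj` by relation `rel`."""
        return [o for r, o in self.edges[subj] if r == rel]

kg = OutageKnowledgeGraph()
# Encoding the article's example: a clean-and-inspect activity reveals
# a cracked fuse block, generating an issue report and a new work order
# (node labels are hypothetical)
kg.add("ACT-CLEAN-INSPECT", "performed_on", "CIRCUIT-BREAKER-CUBICLE")
kg.add("ACT-CLEAN-INSPECT", "triggered", "ISSUE-CRACKED-FUSE-BLOCK")
kg.add("ISSUE-CRACKED-FUSE-BLOCK", "generated", "WO-REPLACE-FUSE-BLOCK")

triggered = kg.query("ACT-CLEAN-INSPECT", "triggered")
```

Queries of this form support the use cases mentioned above, such as listing past unexpected activities by component type or finding planned activities that historically triggered emergent work.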
Fig. 7. Integration of activity duration variance into the outage schedule to assess critical path duration and structure variance.
Once the duration variance associated with a subset of activities in a planned outage schedule has been determined using the previously discussed methods, it can be propagated through the entire outage schedule to assess the duration variance of the full outage schedule. This is performed using a classic Monte Carlo approach, where the statistical distribution of outage completion times is obtained along with distribution of possible critical paths (see Fig. 7). This approach captures the interdependence across outage activities, meaning that extended completion time for one activity can introduce cascading delays through the outage schedule. At this point, the outage analyst can do the following:
Assess the risk of outage delays by comparing the statistical distribution of outage completion times with the completion times calculated using point estimate values of activity completion times.
Determine the conditions under which the critical path structure changes due to completion delays in one or more activities.
Identify the potential emergence of unexpected activities based on historical outage data using developed natural language processing methods.
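The Monte Carlo propagation described above can be sketched as follows. The activity duration distributions here are hypothetical triangular (min, mode, max) values on a small example network; in practice the propagation is performed over the full outage schedule (e.g., with RAVEN):

```python
import random
from statistics import mean

def project_duration(durations, deps):
    """Forward pass: earliest finish time of the whole schedule."""
    ef = {}
    remaining = dict(durations)
    while remaining:
        for a in list(remaining):
            preds = deps.get(a, [])
            if all(p in ef for p in preds):
                start = max((ef[p] for p in preds), default=0.0)
                ef[a] = start + remaining.pop(a)
    return max(ef.values())

def monte_carlo_completion(dist_params, deps, n=5000, seed=42):
    """Sample activity durations from triangular (min, mode, max)
    distributions and collect the resulting outage completion times.
    Dependencies are preserved, so one long activity cascades delays."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        durations = {a: rng.triangular(lo, hi, mode)
                     for a, (lo, mode, hi) in dist_params.items()}
        samples.append(project_duration(durations, deps))
    return samples

# Hypothetical (min, mode, max) durations in hours for a 4-activity schedule
dist = {"A": (3, 4, 6), "B": (5, 6, 9), "C": (2, 3, 5), "D": (1, 2, 4)}
deps = {"C": ["A", "B"], "D": ["C"]}
completion = monte_carlo_completion(dist, deps)
point_estimate = 6 + 3 + 2  # modes along the nominal critical path B -> C -> D
```

Comparing the sampled distribution against the point estimate is exactly the delay-risk assessment in the first bullet above: the mean sampled completion time typically exceeds the deterministic value because delays on parallel branches compound.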
The next use case we have been investigating involves considering the availability of all outage resources on a shift basis. This includes not only time but also crews, equipment, and space in the calculation of the outage critical path. The goal is to optimize use of these resources to ensure the most efficient and effective outage schedule.
For each shift, the approach begins by prioritizing activities that are either part of or close to the critical path. By focusing on these critical activities, we can ensure that the most time-sensitive tasks are completed first, thereby minimizing delays in the overall schedule. This prioritization is based on the current availability of resources, ensuring that time, crews, equipment, and space are allocated most efficiently. Any activities that cannot be accommodated within the current shift's resources are delayed to the following shift to prevent overcommitting resources and to keep the schedule realistic and achievable.
After scheduling the activities for the current shift, we recompute the outage schedule parameters for all activities. This includes recalculating the critical path and total float (the amount of time that activities can be delayed without affecting the overall schedule). This process ensures that the schedule remains accurate and reflects any changes in resource availability or activity timing. Once these calculations are complete, the process is repeated for the next shift. This iterative approach aims to continuously optimize outage resource allocation and maintain the schedule.
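The iterative, shift-based allocation described above can be sketched as a greedy loop over ready activities sorted by total float; the activity names, crew requirements, and float values below are illustrative assumptions:

```python
def schedule_shift(candidates, total_float, crew_available):
    """Greedy allocation for one shift: sort ready activities by total
    float (critical-path activities first), fit them within the crew
    budget, and defer the remainder to the following shift.

    candidates: {activity: crew_required}; total_float: {activity: hours}.
    """
    scheduled, deferred = [], []
    remaining = crew_available
    for act in sorted(candidates, key=lambda a: total_float[a]):
        need = candidates[act]
        if need <= remaining:
            scheduled.append(act)
            remaining -= need
        else:
            deferred.append(act)  # revisit after CPM parameters are recomputed
    return scheduled, deferred

# Hypothetical shift with 10 crew members available
candidates = {"valve test": 4, "pump inspection": 5, "scaffolding": 6}
floats = {"valve test": 0.0, "pump inspection": 2.0, "scaffolding": 8.0}
on_shift, next_shift = schedule_shift(candidates, floats, crew_available=10)
```

After each shift, the critical path and total float would be recomputed (as in the paragraph above) before calling the allocation again, so deferred activities are re-prioritized with fresh float values.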
Refueling outages are a critical period for nuclear power plants. They are characterized by a high volume of simultaneous maintenance and surveillance activities carried out by a large workforce. The significant operational costs and loss of generation during these outages underscore the importance of optimizing outage schedules to minimize economic impacts. While the cost of planned outages is high, the cost of schedule overruns is even higher. By measuring schedule resiliency, identifying critical points, and enabling efficient resource allocation, the methods presented in this article help mitigate delays and manage unexpected events, ultimately ensuring that outage schedules remain on track and the economic burden on plant operators is reduced. The methods presented here are implemented in three open-source computational tools: LOGOS (for schedule optimization), DACKAR (for NLP processing of textual data), and RAVEN (for propagation of activity duration variance through the outage schedule).
Diego Mandelli is an R&D scientist at Idaho National Laboratory. Shawn St. Germain is a distinguished staff scientist and technical lead at INL. Congjian Wang is a nuclear computational scientist at INL. Edward Chen is a digital instrumentation and control engineer at INL. Norman John Mapes is an R&D scientist at INL. Svetlana Lawrence is an R&D pathway lead at INL. Ahmad Al Rashdan is an R&D pathway lead at INL.