Yoshitaka Yuki / ISA Transactions 41 (2002) 383–387

erator’s prompt reaction to unexpected situations. If an operator does not recognize an abnormal situation, it may not only affect product quality, but may also waste time and lead to increased production costs, as well as potentially compromise plant safety. On the other hand, if alarm messages are too frequent, operators are overloaded with alarms and may miss an important message buried among spurious messages. This also may adversely affect quality, productivity, and safety. In this sense, maintaining a well-tuned alarm system which generates an adequate and balanced frequency of messages is essential to reduce costs, improve quality, and maintain a safe operating environment.

The evil of alarm flooding has long been recognized, and most initial designs and startups attempt to address the issue. Some of the countermeasures typically applied include alarm suppression, which bridles messages with conditions; an alarm integration block, which combines several alarm conditions, thereby reducing the number of alarms presented to the operator; and an intelligent alarm function block, which utilizes an expert knowledge base or fuzzy logic. These countermeasures reduce message repetition, thereby freeing operators from the burden of alarm flooding. However, even with the best technology and engineering, the effect of alarm system optimization does not last forever. Whenever there are changes in process, equipment, or system, additional alarm points are usually added. Hence an intelligent alarm can be buried among those newly added alarms and messages.

The continual addition of alarms is a natural result of a plant’s evolution. When alarms are added, it is often done in the context of a work order or a change to a small part of a plant. In these limited scopes, the alarm addition may be approvable. However, when considering a larger context, these changes may not be appropriate. As the number of alarm conditions increases, the operators’ workload also increases. This cyclical addition of alarms makes it difficult to maintain an optimum frequency of alarm notifications.

3. Maintaining optimal alarm frequency

Alarm system optimization must be treated as an ongoing activity in order to escape the creeping addition of alarms that end up masking and degrading previous alarm optimization efforts. As shown in Fig. 1, the ongoing alarm system optimization effort consists of three steps: (1) analysis; (2) countermeasures; and (3) evaluation.

Fig. 1. Spiral improvement cycle.

3.1. Analysis

Finding alarm system problems is not easy. In a typical chemical plant, more than 5000 alarms and events are recorded each day. Operators may notice when and where a rush of alarm notifications occurred. However, looking through an enormous number of messages in a log file is like finding a needle in a haystack.

Focusing on the interrelation of alarm messages and operator actions makes the process of finding problems easier. Alarm messages are sent to operators to prompt them to react in some way. In this sense, they are “process request” messages, by which a plant process asks for the operators’ reaction. The operators usually start their actions by watching the console messages. Therefore the frequency of the operators’ actions should be related to the frequency of message notifications. There are also cases where the alarm message frequency increases when an operator’s action is not adequate. In both cases, the alarm notifications and the operator actions are interrelated. This relation can be quantified by counting message notifications over a time period, and the resulting message frequency can be visualized as a bar chart. The frequency of the operators’ actions can also be quantified in a similar manner. By comparing the alarm message notification frequency and the operator action frequency on an event balance trend graph as shown in Fig. 2, you can easily tell whether the operator was busy dealing with messages. In this example, the alarm frequency had an abrupt increase at 11:00, but the operators’ action frequency was not increasing. In this case, it was found to be a result of unneeded alarm messages. Once the frequency and the source of spurious alarm messages are found, it is possible to apply countermeasures to prevent a recurrence.

Fig. 2. Event balance trend graph.

3.2. Plan

It is important to apply improvement efforts in problem areas where the most benefit is expected. The largest impact can usually be obtained by improving the most frequently occurring imbalances between alarm messages and operator actions. This can be accomplished by finding repetitive spurious alarms and concentrated manual operations that can be automated. For example, in a batch plant, many products are produced repeatedly according to a “recipe.” Each recipe consists of several procedures, and each procedure is executed in a defined order. A repeating event balance pattern for a specific recipe leads to repeating problem areas that can be improved.

Fig. 3 shows an example of an event balance trend of a batch process. It shows each unit procedure’s running duration, and the frequency of alarm/guidance messages and operator actions. By comparing the balance of these two items with the unit recipe’s relative time, a specific batch phase can be found that generates more messages, or requires more manual operation, than expected.

Fig. 3. Batch event balance trend graph (1) [batch ID: 22-9DAP].

Fig. 4 shows another batch balance trend graph for the same product. The procedure (recipe) of this batch is the same as shown in Fig. 3, therefore the balance peak pattern is similar. As shown in these figures, comparing several batch event balance patterns can lead to finding a repeating alarm message or frequent operator operations. With this approach, we could find substantial productivity bottleneck problems in the material transfer phase between reactor 1 and reactor 2, and the material cake removal phase of a centrifuge.

3.3. Countermeasures

The next step is to apply countermeasures for each problem. Countermeasures vary depending on the nature of the problem. Some examples of countermeasures are:

• Set adequate alarm range;
• Tune watchdog timer depending on the procedure;
• Adjust tuning parameters;
• Integrate/combine redundant manual operations;
• Add timely guidance message prior to unstable sequences.

Countermeasures can be determined and applied to each problem. Countermeasures may require trial-and-error iterations, but once a problem area is narrowed down the countermeasures become less complex. Our experience shows that a timely
guidance message sent to an operator before the first warning of an abnormal state can help to prevent a rush of alarm messages caused by an alarm tripping. As shown in Fig. 5, one timely guidance message helps the operator to prevent this type of situation.

Fig. 4. Batch event balance trend graph (2) [batch ID: 23-9DAP].

Fig. 5. An earlier guidance message prevents alarm rush afterwards.

3.4. Evaluate

As already discussed, it is difficult to maintain the effect of countermeasures. To learn whether the effect of a countermeasure continues, a longer-term quantitative analysis is required. A daily alarm notification summary before and after applying a countermeasure is shown in Fig. 6. In this way, it is easy to determine when alarm messages begin to increase. When the total daily alarm message count increases, the event balance trend analysis for a day/batch should be investigated again. Summarizing the count for each week, each month, or each batch also helps to watch whether the effect continues.

Fig. 6. Alarm reduction effort result.

4. Alarm system requisites

In order to make alarm optimization effective, an alarm system should have a comprehensive database. The alarms can then be analyzed from various aspects, and the system should also provide flexibility for easy configuration changes. The following features are required in an optimal alarm system.

(1) Basic alarm and event database features for supporting analysis activity:

• Time stamped event description (message);
• Identifier for showing the origin of the message (tag-ID, etc.);
• Plant hierarchical ID for grouping messages of each unit/area;
• Alarm and event category (process alarm, guidance, tracking record, operation record, etc.).

(2) Flexibility and tolerance for optimization effort:

• Alarm grouping so that only one integrated alarm message is presented to the operator
when many interrelated alarms have been generated in the control package at the same time;
• Intelligent alarm blocks which capture status changes and predict abnormal situations in advance;
• Security system to prevent operators from performing invalid operations, such as privilege protection for acknowledgement and re-alarming when an alarm condition persists.

5. Conclusion

A plant’s alarm and event database is a live record of a production process. It contains key information useful in improving operations productivity by solving production bottlenecks, performing production planning, and optimizing operation. One method to achieve these improvements is to optimize the alarm system. However, achieving a well-tuned alarm system is difficult because of the huge number of daily events and the lack of statistical analysis methods to detect problems. Quantifying and visualizing alarms and events in an event balance trend graph makes it easy to grasp problem areas. Overlaying unit recipe schedule results from a batch process on an event balance trend can identify specific portions of the batch process that can be improved. Once a problem area is identified and analyzed numerically, countermeasures can be applied to eliminate production bottlenecks. It is also important to iterate the cycles of analysis-plan-countermeasure-evaluation in order to maintain the effect of improvement. A reliable alarm system and an efficient analysis tool are indispensable to support this spiral activity.
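
As an illustration of the event balance analysis summarized above, the following Python sketch counts alarm/guidance messages and operator actions per hour, flags hours in which messages far outnumber actions, and produces a daily message count of the kind used for evaluation. It is a minimal sketch under stated assumptions, not the tool used in this work: the record layout `PlantEvent`, the category names, the tag IDs, the hourly bin width, and the thresholds `ratio` and `min_messages` are all hypothetical choices made for the example, and the event log is synthetic.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PlantEvent:
    """One alarm/event record; field names are hypothetical."""
    timestamp: datetime  # time stamp of the event
    tag_id: str          # identifier of the message origin
    unit: str            # plant hierarchical ID (unit/area)
    category: str        # e.g. "process_alarm", "guidance", "operation_record"
    message: str = ""    # event description


# Assumed mapping of categories onto the two sides of the event balance.
MESSAGE_CATEGORIES = {"process_alarm", "guidance"}
ACTION_CATEGORIES = {"operation_record"}


def hourly_event_balance(events):
    """Return {hour: (message_count, action_count)} in chronological order."""
    messages, actions = Counter(), Counter()
    for ev in events:
        hour = ev.timestamp.replace(minute=0, second=0, microsecond=0)
        if ev.category in MESSAGE_CATEGORIES:
            messages[hour] += 1
        elif ev.category in ACTION_CATEGORIES:
            actions[hour] += 1
    hours = sorted(set(messages) | set(actions))
    return {h: (messages[h], actions[h]) for h in hours}


def flag_imbalance(balance, ratio=3.0, min_messages=10):
    """List hours where messages outnumber operator actions by `ratio` or more.

    Both thresholds are illustrative defaults, not values from the paper.
    """
    return [
        hour
        for hour, (n_msg, n_act) in balance.items()
        if n_msg >= min_messages and n_msg >= ratio * max(n_act, 1)
    ]


def daily_message_counts(events):
    """Return {date: message_count} for a longer-term summary."""
    counts = Counter(
        ev.timestamp.date() for ev in events if ev.category in MESSAGE_CATEGORIES
    )
    return dict(sorted(counts.items()))


# Synthetic log: a quiet hour at 10:00 and an alarm rush at 11:00.
day = datetime(2002, 1, 7)
events = (
    [PlantEvent(day.replace(hour=10, minute=m), "TIC-101", "reactor1", "process_alarm")
     for m in range(0, 60, 15)]
    + [PlantEvent(day.replace(hour=10, minute=m), "TIC-101", "reactor1", "operation_record")
       for m in (5, 25, 45)]
    + [PlantEvent(day.replace(hour=11, minute=m), "LIC-202", "reactor2", "process_alarm")
       for m in range(0, 60, 2)]
    + [PlantEvent(day.replace(hour=11, minute=m), "LIC-202", "reactor2", "operation_record")
       for m in (10, 40)]
)

balance = hourly_event_balance(events)
for hour, (n_msg, n_act) in balance.items():
    print(f"{hour:%H:%M}  messages={n_msg:3d}  actions={n_act:3d}")
print("imbalanced hours:", [f"{h:%H:%M}" for h in flag_imbalance(balance)])
print("daily message counts:", daily_message_counts(events))
```

In this synthetic log only the 11:00 hour is flagged, because the message count rises sharply while the operator action count does not. Grouping by `unit` or `tag_id` before counting would narrow a flagged hour down to the source of the messages; for batch plants the same counting could be keyed on time relative to the start of each unit procedure instead of wall-clock hours.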