1. System Engineering
and Management Science
Examples and Fundamental Principles
SHS/ASQ 2010 Conference and Expo
February 26, 2010
Alexander Kolker, PhD
Outcomes Operations Project Manager
Children’s Hospital and Health System
• Main concept and some definitions.
• Typical hospital system as a set of interdependent subsystems:
• Subsystem 1: Emergency Department (ED).
• Subsystem 2: Intensive Care Unit (ICU).
• Subsystem 3: Operating Rooms (OR)- Surgical Department.
• Subsystem 4: Medical/Surgical Nursing Units (Floor_NU).
• Interdependency of subsystems.
• Main take-away.
• Summary of fundamental management engineering principles.
3. This presentation is adapted from
the following System Engineering Publications
Kolker, A, Queuing Theory and Discrete Events Simulation for Healthcare: from Basic
Processes to Complex Systems with Interdependencies. Chapter 20. In: Handbook of
Research on Discrete Event Simulation: Technologies and Applications, 2009, pp. 443
- 483. IGI Global Publishing, Hershey, PA.
Kolker, A, Process Modeling of Emergency Department Patient Flow: Effect of Patient
Length of Stay on ED Diversion. Journal of Medical Systems, 2008, v. 32, N 5, pp. 389 -
Kolker, A, Process Modeling of ICU Patient Flow: Effect of Daily Load Leveling of Elective
Surgeries on ICU Diversion. Journal of Medical Systems, 2009, v. 33, N 1, pp. 27 - 40.
Kolker, A, Norell, B., O’Connor, M., Hoffman, G., Oldham, K., The Use of Predictive
Simulation Modeling for Surgical Capacity Expansion Analysis
Presented at the 2010 SHS/ASQ joint Conference, Atlanta, GA, February 26, 2010 (poster
Kolker, A, Effective Managerial Decision Making in Healthcare Settings: Examples and
Principles. Quality Management Journal, 2009 (submitted).
4. Main Concept
• Modern medicine has achieved great progress in treating individual
patients. This progress is based mainly on hard science: molecular
genetics, biophysics, biochemistry, design and development of
medical devices and imaging.
• However, relatively few resources have been devoted to the proper
functioning of overall healthcare delivery as an integrated system,
in which access to efficient care should be delivered to many
thousands of patients in an economically sustainable way. (Joint report
of National Academy of Engineering and Institute of Medicine, 2005).
A real impact on efficiency and sustainability of the healthcare
system can be achieved only by using healthcare delivery
engineering which is based on hard science such as: probability
theory, forecasting, calculus, stochastic optimization, computer
5. Some Definitions
What is Management?
Management is controlling and leveraging available resources (material,
financial and human) aimed at achieving the performance objectives.
Traditional (Intuitive) Management is based on
• Past experience.
• Intuition or educated guess.
• Static pictures or simple linear projections.
Linear projection assumes that the output is directly proportional to the
input, i.e. the more resources (material and human) thrown in, the more
output produced (and vice versa).
6. What is Management Engineering?
• Management Engineering (ME) is the discipline of
building and using validated mathematical models of
real systems to study their behavior aimed at making
justified business decisions.
• This field is also known as operations research.
Thus, Management Engineering is the application of
mathematical methods to system analysis and
7. Scientific Management is Based On
• A goal that is clearly stated and measurable, so the decision-maker
(manager) always knows if the goal is closer or farther away.
• Identification of available resources that can be leveraged (allocated) in
• Development of mathematical models or numeric computer algorithms
to quantitatively test different decisions for the use of resources and
consequences of these decisions (especially unintended
consequences) before finalizing the decisions.
The Underlying Premise of ME is
• Decisions should be made that best lead to reaching the goal.
• Valid mathematical models lead to better justified decisions than an
educated guess, past experience, and linear extrapolations (traditional
8. Main Steps for System Engineering Analysis
• Large systems are deconstructed into smaller subsystems
using natural breaks in the system.
• Subsystems are modeled, analyzed, and studied separately.
• Subsystems are then reconnected in a way that recaptures
the interdependency between them.
• The entire system is re-analyzed using the output of one
subsystem as the input for another subsystem.
9. High-Level Layout of a Typical Hospital System
ED – Emergency Room
ICU – Intensive Care Unit
Floor NU – Med/Surg Units
OR – Operating Rooms
WR – Waiting Room
10. Step 1
• Deconstruction of the entire hospital system into
• Simulation and Analysis of the Main Subsystems:
   • Subsystem 1: Emergency Department (ED).
   • Subsystem 2: Intensive Care Unit (ICU).
   • Subsystem 3: Operating Rooms (OR).
   • Subsystem 4: Floor Nursing Units (NU).
11. Subsystem 1: Typical Emergency Department (ED)
The high-level layout of
the entire hospital system: ED structure and in-patient units
12. Typical ED Challenges
ED Performance Issues
• ED ambulance diversion is unacceptably high (the sample ED is closed to
new patients about 23% of the time).
• Among many factors that affect ED diversion, patient Length of
Stay in ED (LOS) is one of the most significant factors.
High Level ED Analysis Goal
• Quantitatively predict the relationship between patient LOS
and ED diversion.
• Identify the upper LOS limit (ULOS) that will result in a
significant reduction or elimination of ED diversion.
13. ED simulation model layout
[Figure: typical ED simulation model layout. Labels: ED pre-filled at the ...; wk, DOW, time; Mode of Transportation.]
14. Modeling Approach
• ED diversion (closure) is declared when ED patient census
reaches ED bed capacity.
• ED stays in diversion until some beds become available after
patients are moved out of ED (discharged home, expired, or
admitted as in-patients).
• Upper LOS limits (simulation parameters) are imposed on the
baseline original LOS distributions: A LOS higher than the
limiting value is not allowed in the simulation run.
Baseline LOS distributions should be recalculated as
functions of the upper LOS limits.
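The diversion rule above can be illustrated with a minimal Python sketch. It assumes a census series sampled at equal intervals and a fixed bed capacity; the function name and the numbers below are made up for illustration and are not taken from the study. It reports the fraction of periods in which the census has reached capacity.

```python
def diversion_fraction(census, capacity):
    """Fraction of periods in which census >= capacity (ED on diversion)."""
    if not census:
        raise ValueError("census series is empty")
    closed = sum(1 for c in census if c >= capacity)
    return closed / len(census)


# Hypothetical hourly census for a 30-bed ED (illustrative numbers only).
hourly_census = [22, 25, 28, 30, 30, 29, 27, 30, 26, 24]
print(f"Diversion: {diversion_fraction(hourly_census, 30):.0%} of periods")
```

In a full simulation the census series would come from the model run rather than from a fixed list.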
15. Modeling Approach – continued
Given the original distribution density and the limiting value of the random variable T,
what is the conditional distribution of the restricted random variable T?

$$
f(T, LOS)_{new} = \frac{f(T)_{original}}{\int_0^{LOS} f(T)_{original}\, dT}, \quad \text{if } T \le LOS
$$

$$
f(T)_{new} = 0, \quad \text{if } T > LOS
$$

[Figure, left panel: original unbounded distribution of LOS_home, hrs (3-parameter Gamma). Right panel: re-calculated bounded distribution of LOS_home, hrs, with an imposed LOS limit of 6 hrs. Horizontal axis on both panels: T, hrs (0 to 12).]
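One way to draw LOS values from such a bounded distribution is rejection sampling: sample from the original distribution and redraw whenever the value exceeds the limit. The sketch below assumes a shifted (3-parameter) gamma. The `shape`, `scale`, and `shift` values are hypothetical placeholders, not the parameters fitted in the study, and the function name is invented for illustration.

```python
import random


def sample_bounded_los(limit_hrs, shape=2.0, scale=1.5, shift=0.5, rng=random):
    """Draw one LOS value from a shifted gamma restricted to T <= limit_hrs.

    Rejection sampling: redraw until the value is within the limit. Accepted
    draws follow the original density divided by its probability mass below
    the limit, i.e. the conditional (bounded) distribution.
    shape, scale and shift are hypothetical, not fitted to study data.
    """
    if limit_hrs <= shift:
        raise ValueError("limit must exceed the distribution's shift")
    while True:
        t = shift + rng.gammavariate(shape, scale)
        if t <= limit_hrs:
            return t


rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [sample_bounded_los(6.0, rng=rng) for _ in range(10_000)]
print(f"max LOS = {max(samples):.2f} hrs, "
      f"mean LOS = {sum(samples) / len(samples):.2f} hrs")
```

Rejection sampling is used here only because it is short and obviously correct; inverse-CDF sampling would avoid the redraws when the limit is tight.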
16. Simulation Summary and Model Validation
Scenario/option | LOS for discharged home NOT more than | LOS for admitted NOT more than | Predicted ED diversion, % | Note
Current, 07 | 24 hrs | 24 hrs | 23.7% | Actual ED
1 | 5 hrs (currently 17% with LOS more than 5 hrs) | 6 hrs (currently 24% with LOS more than 6 hrs) | ~0.5% | Practically NO diversion
2 | 6 hrs | 6 hrs | ~2% | Low single digits
3 | 5 hrs | 24 hrs | ~4% | Low single digits

• ED diversion could be negligible (~0.5%) if patients discharged home stay not more
than five hours and admitted patients stay not more than six hours.
• Relaxing these LOS limits results in a low single-digit percent diversion that still could be
acceptable.
17. Simulation Summary – continued
What other combinations of upper LOS limits are possible to get a low single-digit percent ED
diversion?
A full factorial DOE was performed with two factors (ULOS_home and ULOS_adm) at six levels each,
using simulated percent diversion as a response function.
[Figure: simulated Div % as a function of upper LOS limits, hrs. Horizontal axis: ULOS_home, hrs (5, 6, 8, 10, 12, 24). Vertical axis: mean predicted Div %. Annotated region: low single digits % diversion.]
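A full factorial design of this kind can be sketched in a few lines of Python. The six level values are assumed from the axis ticks of the plot, and `placeholder_response` is a hypothetical stand-in for one run of the simulation model; it is not the study's model and its outputs are not results.

```python
from itertools import product

# Six levels per factor; values assumed from the plot's axis ticks.
LEVELS_HRS = [5, 6, 8, 10, 12, 24]


def run_full_factorial(response, levels_home=LEVELS_HRS, levels_adm=LEVELS_HRS):
    """Evaluate response(ulos_home, ulos_adm) at every combination of levels."""
    return {
        (home, adm): response(home, adm)
        for home, adm in product(levels_home, levels_adm)
    }


def placeholder_response(ulos_home, ulos_adm):
    """Hypothetical stand-in for one simulation run; NOT the study's model."""
    return 0.4 * ulos_home + 0.6 * ulos_adm


results = run_full_factorial(placeholder_response)
print(len(results), "design points")
```

With two factors at six levels each, the design has 6 x 6 = 36 points, one simulated response per point.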
18. Conclusions for Subsystem 1:
• ED diversion can be negligible (less than 1%) if hospital-
admitted patients stay in ED not more than six hours.
• Currently 24% of hospital-admitted patients in study
hospital stay longer than this limit, up to 20 hours.
• This long LOS for a large percentage of patients results in
19. Subsystem 2: Typical Intensive Care Unit (ICU)
Patients move between the units:
• If no beds in CIC, move to SIC
• If no beds in MIC, move to CIC, else SIC, else NIC
• If no beds in SIC, move to CIC
• If no beds in NIC, move to CIC, else SIC
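The overflow rules above map naturally onto a lookup table of fallback units. In this minimal sketch, the function name and bed counts are hypothetical, and treating "no bed in any permitted unit" as a diversion event is an assumption of the example.

```python
# Fallback units in order of preference, transcribed from the rules above.
OVERFLOW = {
    "CIC": ["SIC"],
    "MIC": ["CIC", "SIC", "NIC"],
    "SIC": ["CIC"],
    "NIC": ["CIC", "SIC"],
}


def place_patient(home_unit, free_beds):
    """Return the unit that receives the patient, or None if none has a bed."""
    for unit in [home_unit] + OVERFLOW[home_unit]:
        if free_beds.get(unit, 0) > 0:
            free_beds[unit] -= 1  # occupy one bed in the chosen unit
            return unit
    return None  # assumed here to mean ICU diversion


free = {"CIC": 0, "MIC": 0, "SIC": 1, "NIC": 2}  # hypothetical bed counts
print(place_patient("MIC", free))
```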
20. Typical ICU Challenges
ICU Performance Issues
• Elective surgeries are usually scheduled for Operating Room block times
without taking into account the competing demand from emergency and
add-on surgeries for ICU resources.
• This practice results in:
   • Increased ICU diversion due to ‘no ICU beds’.
   • Increased rate of medical and quality issues due to staff overload and capacity
   • Decreased patient throughput and hospital revenue.
High Level ICU Analysis Goal
• Establish a relationship between daily elective surgeries schedule,
emergency and add-on cases and ICU diversion.
• Given the number of the daily scheduled elective surgeries and the number
of unscheduled emergency and add-on admissions, predict ICU diversion
due to lack of available beds.
21. Baseline – Existing Number of Elective Cases
Elective surgeries current pattern - No daily cap
Closed due to No ICU beds: 10.5 % of time
[Figure: ICU census over 17 weeks (wk1 to wk17, hours 0 to 3024). Annotation: critical census limit exceeded.]
22. Conclusions for Subsystem 2:
Intensive Care Unit
• There is a significant variation in the number of scheduled
elective cases between the same days of the different weeks
(Monday to Monday, Tuesday to Tuesday, and so on).
• Smoothing the number of elective cases over time (daily load
leveling) is a very significant factor which strongly affects ICU
closure time due to ‘no ICU beds.’
• Simulation demonstrated that daily load leveling of elective cases
to not more than 4 cases per day results in a very significant
reduction of closure time due to ‘no ICU beds’ (from ~10.5% down
to ~1%).
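Daily load leveling with a cap can be sketched as follows. The cap of 4 cases per day comes from the conclusion above. The rule that excess cases roll over to the next day is an assumption of this example, as are the function name and the weekly case counts.

```python
def level_daily_load(scheduled, daily_cap=4):
    """Cap elective cases per day; excess cases roll over to the next day."""
    leveled, carry = [], 0
    for cases in scheduled:
        total = cases + carry
        done = min(total, daily_cap)
        leveled.append(done)
        carry = total - done
    return leveled, carry  # carry = cases still unscheduled after the last day


week = [7, 2, 6, 1, 4]  # hypothetical Mon-Fri elective case counts
leveled, leftover = level_daily_load(week)
print(leveled, leftover)
```

The total number of cases is unchanged; only the day-to-day variation is removed, and that variation is what drives the peaks in ICU census.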
23. Subsystem 3: Typical Operating Rooms (OR)
• Is the number of general and specialized operating rooms and
pre/post operative beds adequate to meet the projected patient
flow and volume increases?
• If it is not, how many operating rooms and pre/post operative
beds would be needed?
• Ensure that the renovation cost is under control and maintain a
high level of quality and satisfaction standards for surgical
• Utilize Management Engineering to determine that the number of
operating rooms and pre/post operative beds is not excessive.
24. The following performance criteria
were used for the simulation model
1. Patient delay to be admitted to a preoperative surgical bed should not
exceed 15 minutes.
2. Delay to enter operating room from a preoperative surgical bed should
   General OR – 2 hours
   Urgent OR – 3 hours
   Cardiovascular OR – 5 hours
   Neurosurgery OR – 3 hours
   Orthopedic OR – 2 hours
   Cardiac Cath Lab – 2 hours
3. Percent of patients waiting longer than the acceptable delay to enter
operating room from a preoperative surgical bed should not exceed
4. Delay to enter PACU beds from an operating room should not exceed
5. Average annual utilization of operating rooms should be in the range
of 60% to 90%.
25. The following simulation models
were developed and analyzed
Model 1: Baseline operations - all surgical services function as
currently specified between two floors. Construct two general operating
rooms onto upper level floor to serve otolaryngology, gastroenterology
and pulmonary patient volume from lower level floor.
Model 2: Move gastroenterology and pulmonary patient volume from
upper level to a separate service area.
Model 3: Separate service area for gastroenterology and pulmonary
patient volume that includes 2 to 3 special procedure rooms and 8 to 11
pre/post beds and PACU beds.
Total annual patient volume included in the simulation models is in the range from
15,000 to 17,000.
Decision variables were: The number of pre-operative beds and PACU beds,
number of Operating Rooms and special procedure rooms and their allocation for
27. Conclusions for Subsystem 3:
Operating Rooms (OR)
• Model 3 is selected as the best. Twelve Operating Rooms
and four Special Procedure Rooms will be adequate to
handle patient volume up to the year 2013.
• Cath Lab could become an issue by 2013 with more than
10% of patients waiting longer than acceptable limit 2
• All other performance criteria will be met.
28. Subsystem 4: Medical/Surgical
Nursing Units (NU)
Total number of specialized nursing units: 24
Total number of licensed beds: 380
Patient Length of Stay
(LOS) is in the range from
2 days to 10 days;
The most likely LOS is 5
Census (i) (current period) = census (i-1) (previous period) +
[# admissions (i) – # discharges (i) ]; i = 1, 2, 3, …….
This is a dynamic balance of supply (beds) and demand (admissions).
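The census balance above can be stepped through period by period. In this minimal sketch the 380-bed capacity is from the slide, while the admission and discharge counts are invented. Two rules are assumptions of the example, not from the source: discharges are applied before admissions, and admissions that do not fit are counted as turned away.

```python
def simulate_census(initial, admissions, discharges, capacity=380):
    """Apply census(i) = census(i-1) + admissions(i) - discharges(i).

    Assumptions of this sketch: discharges happen first within a period, and
    admissions that would exceed capacity are counted as turned away.
    """
    census, history, turned_away = initial, [], 0
    for adm, dis in zip(admissions, discharges):
        census -= min(dis, census)              # discharges free beds first
        accepted = min(adm, capacity - census)  # admit only into free beds
        turned_away += adm - accepted
        census += accepted
        history.append(census)
    return history, turned_away


# Hypothetical three periods starting from a census of 370.
history, lost = simulate_census(370, [20, 15, 30], [10, 25, 5])
print(history, lost)
```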
29. Simulated Census. Capacity 380 beds
[Figure: simulated census by hour over one week (Mon to Sun, hours 0 to 168) against the 380-bed capacity.]
Take Away: Percent of time Nursing Units are full (% diversion) is about 16%.
30. Step 2
• Subsystems are reconnected in a way that
recaptures the interdependency between them.
• The entire system is re-analyzed using the output of
one subsystem as the input for another subsystem.
31. Step 2 – continued
• All subsystems are reconnected to each other.
• The output of one subsystem is the input for another subsystem.
32. Hospital System Simulation Summary
Performance Metrics | Current state | Too aggressive ED improvement: patients admitted within 6 hours | Downstream Units: better or worse than current state? | Less aggressive ED improvement: patients admitted within 10 hours | Downstream Units: better or worse than current state?
95% CI of the number of patients waiting to get to ED (ED in) | 25 – 27 | 8 – 10 | Better | 17 – 19 | Better
95% CI of the number of patients waiting hospital admissions (ED out) | 57 – 62 | 64 – 69 | Worse | 57 – 62 | Neutral
Number of patients left not seen (LNS) after waiting more than 2 | 23 – 32 | 0 | Better | 0 – 3 | Better
95% CI for % ED diversion | 22% – 23% | 0.4% – 0.5% | Better | 6.8% – 7.3% | Better
95% CI for % ICU diversion | 28% – 32% | 30% – 34% | Worse | 28% – 32% | Neutral
95% CI for % OR diversion | 12% – 13% | 13% – 15% | Worse | 12% – 13% | Neutral
95% CI for % floor NU diversion | 11% – 12% | 11% – 12% | Neutral | 11% – 12% | Neutral
33. Take-Away from Hospital System
• Too aggressive ED improvement results in worsening
three out of seven hospital system performance metrics.
• Less aggressive ED improvement is more aligned with
the ability of downstream subsystems to handle
increased patient volume.
• This illustrates important Management System
34. Important System Engineering Principles
• Improvement in the separate subsystems (local
optimization or local improvement) should not be
confused with the improvement of the entire system.
• A system of local improvements is not the best system;
it could be a very inefficient system.
• Analysis of an entire complex system is usually
incomplete and can be misleading without taking into
account subsystems’ interdependency.
35. Main Take-Away
Management Engineering helps to address the following typical
pressing hospital issues:
• How many beds are needed for each unit.
• How many procedure rooms are needed for each service.
• How many nurses/physicians should each unit schedule for the particular
day and night.
• How to reduce patient wait time and increase access to care.
• How to develop an efficient outpatient clinic schedule.
And so on, and so on…
And the Ultimate Goal:
How to manage hospital operations to increase profitability (reduce
costs, increase revenue) while keeping high quality, safety and
outcomes standards for patients.
36. Summary of Some Fundamental Management
Engineering Principles
• Systems behave differently than the sum of their independent
• All other factors being equal, combined resources are more efficient
than specialized (dedicated) resources with the same total
• Scheduling appointments (jobs) in the order of their increased duration
variability (from lower to higher variability) results in a lower overall
cycle time and waiting time.
• Size matters. Large units with the same arrival rate (relative to their
size) always have a significantly lower waiting time. Large units can
also function at a much higher utilization % level than small units
with about the same patient waiting time.
• Work load leveling (smoothing) is an effective strategy to reduce
waiting time and improve patient flow.
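The "size matters" principle can be illustrated with the standard Erlang C formula for an M/M/c queue. Modeling a unit as an M/M/c queue (Poisson arrivals, exponential service, beds as servers) is an assumption of this sketch and not necessarily the model behind the slides. The rates are hypothetical.

```python
def erlang_c_wait(arrival_rate, service_rate, servers):
    """Mean queueing delay Wq for an M/M/c queue (Erlang C)."""
    a = arrival_rate / service_rate  # offered load
    rho = a / servers                # utilization
    if rho >= 1:
        raise ValueError("utilization must be below 100%")
    b = 1.0
    for k in range(1, servers + 1):
        b = a * b / (k + a * b)      # Erlang B recursion
    p_wait = b / (1 - rho * (1 - b)) # Erlang C from Erlang B
    return p_wait / (servers * service_rate - arrival_rate)


# Same 80% utilization, hypothetical unit sizes of 5 and 20 servers.
small = erlang_c_wait(arrival_rate=0.8 * 5, service_rate=1.0, servers=5)
large = erlang_c_wait(arrival_rate=0.8 * 20, service_rate=1.0, servers=20)
print(f"small unit Wq = {small:.3f}, large unit Wq = {large:.3f}")
```

At the same utilization the larger unit has the shorter mean wait. This is also why pooled resources outperform dedicated ones of the same total capacity.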
37. Summary of Some Fundamental Management
Engineering Principles – continued
• Because of the variability of patient arrivals and service time, a
reserved capacity (sometimes up to 30%) is usually needed to
avoid regular operational problems due to unavailable beds.
• Generally, the higher the utilization level of a resource (good for the
organization), the longer the waiting time to get this resource
(bad for the patient). A utilization level higher than 80% to 85% results
in a significant increase in waiting time for random patient
arrivals and random service time.
• In a series of dependent activities only a bottleneck defines the
throughput of the entire system. A bottleneck is a resource (or
activity) whose capacity is less than or equal to demand placed
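The utilization principle above can be made concrete with the textbook M/M/1 result: the mean queueing delay, measured in multiples of the mean service time, is rho / (1 - rho). A single-server queue with Poisson arrivals and exponential service is an assumption of this sketch, and real units with several beds behave less sharply, but the shape of the curve is the point.

```python
def mm1_wait_in_service_times(utilization):
    """Mean M/M/1 queueing delay, in multiples of the mean service time."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1 - utilization)


for rho in (0.5, 0.7, 0.8, 0.85, 0.9, 0.95):
    wait = mm1_wait_in_service_times(rho)
    print(f"utilization {rho:.0%}: wait = {wait:.1f} x mean service time")
```

The wait grows slowly up to roughly 80% utilization and then rises steeply, which matches the 80% to 85% threshold quoted above.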
38. Summary of Some Fundamental Management
Engineering Principles – continued
• An appointment backlog can remain stable even if the
average appointment demand is less than appointment
• The time of peak congestion usually lags the time of the
peak arrival rate because it takes time to serve patients
from the previous time periods (service inertia).
• Reduction of process variability is the key to patient flow
improvement, increasing throughput and reducing delays.
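The lag between peak arrivals and peak congestion shows up even in a very simple deterministic queue. The sketch below assumes a fixed service capacity per period and uses invented hourly arrival counts.

```python
def queue_over_time(arrivals, capacity_per_period):
    """Queue length at the end of each period for a fixed service capacity."""
    queue, lengths = 0, []
    for arriving in arrivals:
        queue = max(0, queue + arriving - capacity_per_period)
        lengths.append(queue)
    return lengths


arrivals = [2, 4, 6, 8, 6, 4, 2, 0]  # hypothetical hourly arrivals
lengths = queue_over_time(arrivals, capacity_per_period=5)
peak_arrival_hour = arrivals.index(max(arrivals))
peak_queue_hour = lengths.index(max(lengths))
print(lengths, peak_arrival_hour, peak_queue_hour)
```

The queue keeps growing as long as arrivals exceed capacity, so it peaks one period after the arrival peak in this example.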
40. What is a Simulation Model?
A Simulation Model is a computer model that mimics the behavior of a
real complex system as it evolves over time in order to visualize and
quantitatively analyze its performance in terms of:
• Cycle times.
• Wait times.
• Value added time.
• Throughput capacity.
• Resources utilization.
• Activities utilization.
• Any other custom collected process information.
• The Simulation Model is a tool to perform ‘what-if’ analysis and play
different scenarios of the model behavior as conditions and process
• This allows one to build various experiments on the computer model
and test the effectiveness of various solutions (changes) before
implementing the change.
41. How Does a Typical Simulation Model Work?
A simulation model tracks the movement of entities through the system at distinct points
of time (thus, discrete events). A detailed record is kept of all processing
times and waiting times. At the end, the system’s statistics for entities and
activities are gathered.
Example of Manual Simulation (step by step)
Let’s consider a very simple system that consists of:
• a single patient arrival line.
• a single server.
Suppose that patient inter-arrival time is uniformly distributed (all values equally likely) between
1 min and 3 min. Service time is exponentially distributed with an average of 2.5 min.
(Of course, any statistical distributions or non-random patterns can be used instead).
A few random numbers sampled from these two distributions are, for example:
Inter-arrival time, min Service time, min
and so on… and so on….
42. We will be tracking any change (or event) that happened in the
system. A summary of what is happening in the system looks
Event # | Time | Event that happened in the system
1 | 2.6 | First patient arrives. Service starts that should end at time = 4.
2 | 4 | Service ends. Server waits for patient.
3 | 4.8 | Second patient arrives. Service starts that should end at time = 13.6. Server was idle for 0.8 minutes.
4 | 6.2 | Third patient arrives. Joins the queue waiting for service.
5 | 8.6 | Fourth patient arrives. Joins the queue waiting for service.
6 | 13.6 | Second patient (from event 3) service ends. Third patient at the head of the queue (first in, first out) starts service that should end at time 22.7.
7 | 22.7 | Third patient service ends. Patient #4 starts service…and so on.
In this particular example, we were tracking events at discrete points in time
t = 2.6, 4.0, 4.8, 6.2, 8.6, 13.6, 22.7
DES models are capable of tracking hundreds of individual entities, each with its own unique set of
attributes, enabling one to simulate the most complex systems with interacting events and component
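The manual walk-through above can be reproduced with a few lines of Python. The arrival times are the event times from the table. The three service times are inferred from the table's start and end times (4 - 2.6, 13.6 - 4.8, 22.7 - 13.6) rather than listed in the source, and the fourth patient's service time is not given, so only three patients are scheduled. The function name is hypothetical.

```python
def single_server_fifo(arrival_times, service_times):
    """Return (start, end) of service for each patient, first in, first out."""
    schedule, server_free_at = [], 0.0
    for arrive, service in zip(arrival_times, service_times):
        start = max(arrive, server_free_at)  # wait if the server is busy
        end = round(start + service, 1)      # table uses one decimal place
        schedule.append((start, end))
        server_free_at = end
    return schedule


arrivals = [2.6, 4.8, 6.2, 8.6]  # event times from the table
services = [1.4, 8.8, 9.1]       # inferred from the table, see text above
for n, (start, end) in enumerate(single_server_fifo(arrivals, services), 1):
    print(f"patient {n}: service starts at {start}, ends at {end}")
```

The third patient arrives at 6.2 but cannot start until the server is free at 13.6, which is the queueing delay recorded in events 4 and 6.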
43. Basic Elements of a Simulation Model
• Flow chart of the process: Diagram that depicts the logical flow of a process
from its inception to its completion.
• Entities: Items to be processed (e.g. patients, documents, customers, etc.)
• Activities: Tasks performed on entities (e.g. medical procedures, document
approval, customer checkout, etc.)
• Resources: Agents used to perform activities and move entities (e.g. service
personnel, operators, equipment, nurses, physicians.)
• Entity arrivals: They define process entry points, time and quantities of
the entities that enter the system to begin processing.
• Entity routings: They define directions and logical condition flows for
entities (e.g. percent routing, conditional routing, routing on demand, etc.)
44. Typical Data Inputs Required to Feed the Model
• Entities, their quantities and arrival times
Periodic, random, scheduled, daily pattern, etc.
• Time the entities spend in the activities
This is usually not a fixed time but a statistical distribution. The wider
the time distribution, the higher the variability of the system behavior.
• The capacity of each activity
The maximum number of entities that can be processed concurrently in
• The size of input and output queues for the activities (if needed).
• The routing type or the logical conditions for a specific routing.
• Resource Assignments
The number of resources, their availability, and/or resources shift