
Nudell Research Proposal

This evolving presentation describes my dissertation research project. It will continue to change in time, until I'm done!

Published in: Government & Nonprofit

  1. Developing a real-time predictive algorithm for emergency medical dispatch
     Nick Nudell, MS, NRP
     Dakota State University
     paramedicfoundation.org twitter.com/paramedicfound facebook.com/ParamedicFoundation
  2. Introduction
     • Approximately 7 billion emergency calls annually
     • Police – Fire – EMS
  3. Real World Problem
     • What action to take?
     • Where to do it?
     • Who should do it?
     • How quickly does it need to be done?
     • Why was it done?
     • Decisions!
  4. Data Sources
     • Caller phone number: call routing information, mobile/fixed, single/multiple user (like an IP address), GPS/tower, eCall/Automatic Crash Notification
     • Resources/system status: what people, vehicles, equipment, etc.
     • Environment: weather, crowding & traffic (granular to the device), street corner/high rise/wilderness, ferry/train/plane schedules
     • Call center, paramedics, hospital, police records, fire records, public health
     • Social media: Twitter, Facebook, Instagram, etc.
  5. Existing research
     • 50 years of Operations Research / Management
     • 25 years of decision tool/tree validation
     • 10 years of clinical registry prediction tool validation
     • 15 years of decision support in emergency calling “appropriateness”
     • 6 months of deep data mining exploratory work
  6. Why is it so complex?
     • Chinese city with 9 million residents
       • 2.5 calls per resident over 5 years (0.5/person/year)
       • Repeat callers average 2.09 calls per year
     • USA with 320 million residents
       • 240 million 911 calls per year (0.75/person/year)
       • 41,000 calls per Public Safety Answering Point
       • $4.51 per call, just to maintain the ICT & dispatching system
     • 10,000+ ICD-10 diagnosis codes
     • 19,000 EMS services across 50 states & 6 territories
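The per-person rates on this slide follow directly from the quoted totals. The short check below reproduces them and, as an illustration only, derives two figures the slide does not state: an implied number of answering points and an implied annual system cost. Both derived values assume the per-PSAP and per-call figures are simple national averages, which the slide does not confirm; all variable names are illustrative.

```python
# Figures quoted on the slide (variable names are illustrative).
china_calls_per_resident_5y = 2.5
china_years = 5
us_residents = 320_000_000
us_calls_per_year = 240_000_000
calls_per_psap = 41_000
cost_per_call_usd = 4.51

# Rates stated on the slide.
china_rate = china_calls_per_resident_5y / china_years  # calls/person/year
us_rate = us_calls_per_year / us_residents              # calls/person/year

# Derived values NOT on the slide; they assume simple national averages.
implied_psaps = us_calls_per_year / calls_per_psap
annual_system_cost = us_calls_per_year * cost_per_call_usd

print(f"China: {china_rate} calls/person/year, USA: {us_rate} calls/person/year")
print(f"Implied PSAPs: {implied_psaps:,.0f}")
print(f"Implied annual ICT & dispatch cost: ${annual_system_cost:,.0f}")
```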
  7. Categorization
     • Started in 1978…
     • 36 families of problem types
     • Level of urgency: Hot or Not
     • Omega, Alpha, Bravo, Charlie, Delta, Echo
     • Nuanced descriptors help determine what kind of first-aid instructions are to be given
  8. FDNY Example
     • 1120 * 8 = 8,960 hours of coverage
     • Two-level capability
     • 138,116 total calls
       • 5,730 high priority (Cardiac Arrest & Choking)
       • 53,481 life threatening
       • 78,905 non-life threatening
  9. Decision Tree – Manual Deductive Reasoning
     • Dispatching priority relies on standardized keywords compared to a known list of static scenarios
     • IF
       • Shooting THEN
         • Urgently send police, apply tourniquet, stop bleeding
       • Not breathing/pulseless THEN
         • Start CPR, urgently send paramedics
       • Cardiac history THEN
         • Urgently send paramedics, take aspirin, stay calm
     • Known as clustering in computer science
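The IF/THEN scheme on this slide can be sketched as a lookup of keywords against a static rule list. This is a minimal illustration, not an actual dispatch protocol: the `Rule` type, the `dispatch` function, the simple substring matching, and the fallback action are all assumptions added here; only the three scenarios and their actions come from the slide.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    keywords: tuple[str, ...]  # phrases that trigger the rule
    actions: tuple[str, ...]   # instructions taken from the slide


# The three static scenarios from the slide, in slide order.
RULES = (
    Rule(("shooting",),
         ("Urgently send police", "Apply tourniquet", "Stop bleeding")),
    Rule(("not breathing", "pulseless"),
         ("Start CPR", "Urgently send paramedics")),
    Rule(("cardiac history",),
         ("Urgently send paramedics", "Take aspirin", "Stay calm")),
)

# Hypothetical fallback; the slide does not say what happens on no match.
DEFAULT_ACTIONS = ("Continue questioning caller",)


def dispatch(call_text: str) -> tuple[str, ...]:
    """Return the actions of the first rule whose keyword appears in the text."""
    text = call_text.lower()
    for rule in RULES:
        if any(keyword in text for keyword in rule.keywords):
            return rule.actions
    return DEFAULT_ACTIONS


print(dispatch("My husband collapsed and he is NOT BREATHING"))
print(dispatch("There is a cat stuck in a tree"))
```

Because the first matching rule wins, rule order acts as an implicit priority, which is one reason static lists of this kind are hard to maintain as scenarios multiply.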
  10. Questions / Prioritization / Instructions
     • Priorities are purposefully designed to over-triage, rather than to increase specificity, as a risk management tool
       • Lots of vehicles / fewer vehicles
       • Lights & Sirens / no L&S
     • Queuing theory using probabilistic expected delays for paramedics, police, or fire department responders
       • Targeting the slowest delay possible because time = money
     • Knowledge discovery opportunities are overlooked!
       • Crowdsource trained people for faster response
       • Electronic medical records describe historical risk
       • Caller behavior, word choice, history, location, etc. are untapped indicators
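The trade-off behind purposeful over-triage can be made concrete with sensitivity and specificity. In the sketch below, every call gets a hypothetical urgency score and is flagged as high priority when the score meets a threshold. The scores, labels, and thresholds are invented for illustration and do not come from the presentation.

```python
def sensitivity_specificity(scores, truly_urgent, threshold):
    """Flag calls with score >= threshold; return (sensitivity, specificity)."""
    tp = fp = tn = fn = 0
    for score, urgent in zip(scores, truly_urgent):
        flagged = score >= threshold
        if flagged and urgent:
            tp += 1   # urgent call correctly prioritized
        elif flagged and not urgent:
            fp += 1   # over-triage: non-urgent call prioritized
        elif urgent:
            fn += 1   # under-triage: urgent call missed
        else:
            tn += 1   # non-urgent call correctly left at low priority
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical urgency scores: four truly urgent calls, six non-urgent ones.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2, 0.1, 0.35]
truly_urgent = [True, True, True, True,
                False, False, False, False, False, False]

strict = sensitivity_specificity(scores, truly_urgent, threshold=0.65)
cautious = sensitivity_specificity(scores, truly_urgent, threshold=0.35)
print("strict threshold   (sensitivity, specificity):", strict)
print("cautious threshold (sensitivity, specificity):", cautious)
```

Lowering the threshold catches every urgent call in this toy data at the cost of sending a high-priority response to more non-urgent calls, which is the risk management choice the slide describes.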
  11. Queuing Theory – Planning to Disappoint
     • Operations Research, Management Science, & Computer Science disciplines rely on probabilistic calculations
     • A model is constructed so that queue lengths and waiting times can be predicted
     • Interarrival times and service times are independent random variables
     • Designed to select the next task to perform
     • The most commonly used laws are:
       • FIFO - First In First Out: who comes earlier leaves earlier
       • LIFO - Last In First Out: who comes later leaves earlier
       • RS - Random Service: the customer is selected randomly
       • Priority
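The four service disciplines listed above differ only in which waiting call is selected next. The sketch below shows one way to express each with the Python standard library. The sample calls and their numeric priorities are hypothetical, and the convention that a lower number means more urgent is an assumption made for this example.

```python
import heapq
import random
from collections import deque


def serve_fifo(calls):
    """First In First Out: serve in arrival order."""
    queue = deque(calls)
    return [queue.popleft() for _ in range(len(queue))]


def serve_lifo(calls):
    """Last In First Out: serve the most recent arrival first."""
    stack = list(calls)
    return [stack.pop() for _ in range(len(stack))]


def serve_random(calls, seed=0):
    """Random Service: serve in a random order (seeded for repeatability)."""
    pool = list(calls)
    random.Random(seed).shuffle(pool)
    return pool


def serve_priority(calls):
    """Priority: lowest priority number first, ties broken by arrival order."""
    heap = [(priority, order, name)
            for order, (priority, name) in enumerate(calls)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


# Hypothetical waiting calls as (priority, description), in arrival order.
calls = [(3, "sprained ankle"), (1, "cardiac arrest"),
         (2, "chest pain"), (1, "choking")]

print("FIFO:    ", [name for _, name in serve_fifo(calls)])
print("LIFO:    ", [name for _, name in serve_lifo(calls)])
print("Random:  ", [name for _, name in serve_random(calls)])
print("Priority:", serve_priority(calls))
```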
  12. Erlang Call Center Algorithm
     • Estimate how many agents you need in your call center for each hour during an eight hour day…
     • How many taxis for a particular time of day?
     • How many hospital beds? Fire trucks? Paramedics? Police?
     Source: http://www.erlang.com/calculator/call/
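Staffing estimates of this kind are commonly based on the Erlang C formula, which gives the probability that an arriving call has to wait when a given number of agents handle a given offered load. The sketch below assumes that model; the slide does not say which formula the linked calculator uses. The function names, the 180-second handling time, the 20-second answer target, the 80% service level, and the hourly call volumes are all hypothetical.

```python
import math


def erlang_c_wait_probability(agents: int, load: float) -> float:
    """Erlang C: probability a call must wait, for a load given in erlangs."""
    if agents <= load:
        return 1.0  # unstable queue: every call waits
    top = (load ** agents / math.factorial(agents)) * (agents / (agents - load))
    bottom = sum(load ** k / math.factorial(k) for k in range(agents)) + top
    return top / bottom


def agents_needed(calls_per_hour, avg_handle_s, target_wait_s, service_level):
    """Smallest agent count answering `service_level` of calls within target."""
    load = calls_per_hour * avg_handle_s / 3600.0  # offered load in erlangs
    agents = max(1, math.ceil(load))
    while True:
        if agents > load:
            p_wait = erlang_c_wait_probability(agents, load)
            answered_in_time = 1.0 - p_wait * math.exp(
                -(agents - load) * target_wait_s / avg_handle_s)
            if answered_in_time >= service_level:
                return agents
        agents += 1


# Hypothetical call volumes for each hour of an eight hour day.
hourly_calls = [40, 60, 100, 120, 110, 90, 70, 50]
for hour, volume in enumerate(hourly_calls, start=1):
    staff = agents_needed(volume, avg_handle_s=180,
                          target_wait_s=20, service_level=0.8)
    print(f"hour {hour}: {volume} calls -> {staff} agents")
```

The same calculation applies to the other resources the slide asks about, with "agents" replaced by vehicles or beds and the handling time replaced by the time each resource is occupied.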
  13. Natural Language Processing
     • Machine learning to determine semantic meaning
     • Based on ontologies and probabilistic decisions
     • “Understanding” of words, meanings, intents
     • Better suited for structured, grouped or otherwise trained text such as physician narratives or same language categorization
     • Excels at spelling, grammar, and Named Entity Recognition, which are relatively structured attributes
     • Well suited for classifying/parsing simple or common statements
     • Generally “trained” by humans (expensive)
     • Handling unstructured data: stemming, bag of words, TF-IDF, topic modeling
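Two of the techniques named in the last bullet, bag of words and TF-IDF, can be illustrated in a few lines. The sketch below uses one common TF-IDF variant (term frequency times the natural log of inverse document frequency); other weightings exist. The sample call transcripts are invented, and stemming and topic modeling are not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    bags = [Counter(tokenize(doc)) for doc in documents]  # bag of words
    n_docs = len(bags)
    # Number of documents each term appears in (iterating a Counter
    # yields each distinct term once).
    doc_freq = Counter(term for bag in bags for term in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in bag.items()
        })
    return weights


# Hypothetical caller statements.
calls = [
    "he is not breathing",
    "he is bleeding from the leg",
    "she is not responding",
]
for text, scores in zip(calls, tf_idf(calls)):
    top_term = max(scores, key=scores.get)
    print(f"{text!r}: most distinctive term = {top_term!r}")
```

Words that appear in every statement, such as "is", receive a weight of zero, so the distinctive terms stand out.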
  14. Machine Learning - Inductive
     • Learns from the information itself
     • Classifier accuracy is similar to human experts
     • Common algorithm types
       • K-nearest neighbors (KNN)
       • Linear regression
       • Logistic regression
       • Naive Bayes
       • Decision trees, bagged trees, boosted trees, boosted stumps
       • Random Forests
       • AdaBoost
       • Neural networks
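As a small example of learning "from the information itself", the sketch below implements the first algorithm in the list, K-nearest neighbors, with a majority vote over Euclidean distance. The two numeric features and the "low"/"high" urgency labels are hypothetical placeholders rather than data from the presentation.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: ((feature_1, feature_2), urgency label).
train = [
    ((1.0, 1.0), "low"), ((1.2, 0.8), "low"), ((0.9, 1.1), "low"),
    ((3.0, 3.2), "high"), ((3.1, 2.9), "high"), ((2.8, 3.0), "high"),
]

print(knn_predict(train, (2.9, 3.1)))
print(knn_predict(train, (1.0, 0.9)))
```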
  15. Comparing Supervised Learning Algorithms

     | Algorithm | Problem type | Results interpretable by you? | Easy to explain algorithm to others? | Average predictive accuracy | Training speed | Prediction speed | Amount of parameter tuning needed (excluding feature selection) | Performs well with small number of observations? | Handles lots of irrelevant features well (separates signal from noise)? | Automatically learns feature interactions? | Gives calibrated probabilities of class membership? | Parametric? | Features might need scaling? |
     |---|---|---|---|---|---|---|---|---|---|---|---|---|---|
     | KNN | Either | Yes | Yes | Lower | Fast | Depends on n | Minimal | No | No | No | Yes | No | Yes |
     | Linear regression | Regression | Yes | Yes | Lower | Fast | Fast | None (excluding regularization) | Yes | No | No | N/A | Yes | No (unless regularized) |
     | Logistic regression | Classification | Somewhat | Somewhat | Lower | Fast | Fast | None (excluding regularization) | Yes | No | No | Yes | Yes | No (unless regularized) |
     | Naive Bayes | Classification | Somewhat | Somewhat | Lower | Fast (excluding feature extraction) | Fast | Some for feature extraction | Yes | Yes | No | No | Yes | No |
     | Decision trees | Either | Somewhat | Somewhat | Lower | Fast | Fast | Some | No | No | Yes | Possibly | No | No |
     | Random Forests | Either | A little | No | Higher | Slow | Moderate | Some | No | Yes (unless noise ratio is very high) | Yes | Possibly | No | No |
     | AdaBoost | Either | A little | No | Higher | Slow | Fast | Some | No | Yes | Yes | Possibly | No | No |
     | Neural networks | Either | No | No | Higher | Slow | Fast | Lots | No | Yes | Yes | Possibly | No | Yes |

     Source: https://docs.google.com/spreadsheets/d/16i47Wmjpj8k-mFRk-NnXXU5tmSQz8h37YxluDV8Zy9U/edit#gid=0
  16. Support Vector Machine (SVM)
     Nadkarni, P. M., Ohno-Machado, L., & Chapman, W. W. (2011). Natural language processing: an introduction. Journal of the American Medical Informatics Association: JAMIA, 18(5), 544–551. http://doi.org/10.1136/amiajnl-2011-000464
  17. Algorithm Quality
     • Very similar level of accuracy between algorithms
     • Will use similar attributes for scoring
     • May vary with categorical vs. continuous data
     • Primary difference is in efficiency
     • Big-O notation is a relative representation of the complexity of an algorithm
  18. Random Forest
     • Advantages
       • It has been widely shown that random forests are one of the most accurate existing classification methods
       • It can deal with a huge number of features
       • It runs efficiently on large datasets
       • It can help estimate which variables are important in classification
       • It can be extended to an unsupervised version to work with unlabeled data
       • It is relatively robust to noise
     • Disadvantages
       • They tend to overfit noisy data
       • Not as intuitive as some other classification methods
       • Might take a while to build the forest (but once it's built, classification is very fast)
  19. The Turing Test
     • In 1950 Alan Turing asked ‘Can machines think?’
     • Proposed The Imitation Game
     • Interrogator and two players, one human and one computer
     • Based on typewritten responses, the interrogator was to guess which player was the computer
     • He believed having adequate storage was the primary limiting factor, with speed being next
     • A learning machine is like a child being taught
     Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.
  20. Research Questions
     • Can an a priori algorithmic, inductive-reasoning-based approach be developed to:
       • improve the speed of the decision making process during emergency call taking and dispatching?
       • improve the accuracy of the resource assignment for emergency call dispatching?
  21. Discussion – Present Considerations
     • The flowchart/tree decision model does not factor in:
       • the veracity of the reporting party
       • socio-economic and demographic factors of the patient/victim
       • the capability of the responding unit
       • the quality of services provided by the responding individual
       • the specificity of the dispatching algorithm itself
  22. Discussion – Future Considerations & Research
     • Future research: develop an AI/ML-based approach.
     • Obtain detailed 911 call records and electronic Patient Care Records for approximately five million patients where an outcome is identified.
       • Outcomes: unfounded/no merit, patient treated but not transported, patient treated and transported, and patient transferred to another responder.
     • The clinical condition at the time of the outcome will be determined based on standard paramedic coding practices.
     • Data split by randomization into a training dataset and a test dataset.
     • A Random Forest model built from the training dataset, then applied to the test dataset.
     • Comparative statistics to evaluate the resource assignments, reduced demand, and potential savings of the new model.
     • The new knowledge model is a dynamic and real-time application.
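The split, train, and apply steps in this plan can be sketched with the standard library alone. The ensemble below is a bagged set of one-level decision stumps with a randomly chosen feature per stump, which is a heavily simplified stand-in for a Random Forest and not the model the author proposes to build. The two integer features and the synthetic records are invented and deliberately trivially separable, so the resulting accuracy says nothing about real dispatch data; only the two outcome labels echo the slide.

```python
import random
from collections import Counter


def split_dataset(rows, test_fraction=0.3, seed=0):
    """Randomly split rows into (training, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - round(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_stump(rows, feature):
    """Best single-threshold rule on one feature.

    Returns (feature, threshold, label_if_below_or_equal, label_if_above).
    """
    values = sorted({x[feature] for x, _ in rows})
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2  # midpoint between neighboring values
        left = Counter(y for x, y in rows if x[feature] <= threshold)
        right = Counter(y for x, y in rows if x[feature] > threshold)
        left_label = left.most_common(1)[0][0]
        right_label = right.most_common(1)[0][0]
        errors = len(rows) - left[left_label] - right[right_label]
        if best is None or errors < best[0]:
            best = (errors, (feature, threshold, left_label, right_label))
    if best is None:  # feature is constant in this sample: predict majority
        majority = Counter(y for _, y in rows).most_common(1)[0][0]
        return (feature, float("inf"), majority, majority)
    return best[1]


def fit_forest(rows, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # sample with replacement
        forest.append(fit_stump(sample, rng.randrange(n_features)))
    return forest


def predict(forest, x):
    """Majority vote across all stumps."""
    votes = Counter(left if x[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


def accuracy(forest, rows):
    return sum(predict(forest, x) == y for x, y in rows) / len(rows)


# Synthetic records: ((hypothetical score 1, hypothetical score 2), outcome).
data_rng = random.Random(42)
rows = [((data_rng.randint(0, 2), data_rng.randint(0, 2)),
         "treated but not transported") for _ in range(50)]
rows += [((data_rng.randint(5, 7), data_rng.randint(5, 7)),
          "treated and transported") for _ in range(50)]

train, test = split_dataset(rows)
forest = fit_forest(train)
print(f"{len(train)} training rows, {len(test)} test rows")
print(f"test accuracy: {accuracy(forest, test):.2f}")
```

Holding the test rows out of training is what lets the comparative statistics in the plan reflect performance on calls the model has not seen.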
  23. Contact
     Nikiah Nudell, MS, NRP
     (760) 405-6869
     nnudell@paramedicfoundation.org
     http://twitter.com/runmedic
     https://www.linkedin.com/in/medicnick
