ICD-Tracker, a semantic clinical records analyzer
RTSI2016 - Smarter city and Health care panel - Bologna - September 7th, 2016
Annamaria Chiasera, Lorenzo Verna, Tefo Toai, Dario Betti
3. Outline
• issue: reporting in healthcare is a complex, tedious
and error-prone activity
• proposed solution: semantic analysis of EHRs
• lessons learned
4. Italian NHS budget
[Figure 4: Current healthcare expenditure and NHS funding, 2008-2014. Source: NSIS and State-Regions agreements, as published by the Agenzia Nazionale per i Servizi Sanitari Regionali (values in billions of euros). Series plotted per year: funding under the State-Regions agreement, total net revenues from the income statement (CE), total costs.]
Note: the spending level (total costs) is gross of the passive mobility balance; total revenues are gross of the active mobility balance.
Regions not under a deficit recovery plan: Lombardia, Veneto, Liguria, Emilia Romagna, Toscana, Umbria, Marche, Basilicata.
7. Quality and audit control
More than 3,000* anomalies notified by
the Region every 6 months
3 FTE-months (10 minutes per clinical record) for
clinical and administrative control
*Data from the pilot in Cuneo (Piemonte)
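The two figures on this slide can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: the 8-hour working day and 21 working days per month are assumptions, not figures from the pilot.

```python
# Figures taken from the slide (Cuneo pilot).
ANOMALIES_PER_SEMESTER = 3000
MINUTES_PER_RECORD = 10

# Assumptions, not from the slide: length of a working day and of a working month.
HOURS_PER_DAY = 8
WORKING_DAYS_PER_MONTH = 21

total_hours = ANOMALIES_PER_SEMESTER * MINUTES_PER_RECORD / 60
fte_months = total_hours / (HOURS_PER_DAY * WORKING_DAYS_PER_MONTH)

print(f"{total_hours:.0f} hours, about {fte_months:.1f} FTE-months")
```

Under these assumptions, 3,000 records at 10 minutes each come to 500 hours, which is consistent with the roughly 3 months of full-time work quoted above.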
8. How can semantics help?
1. ICD-9-CM coding choice check
2. Assisted ICD-9-CM coding
3. DRG audit
13. About
Tykli is an Advanced Analytics company
that provides skills, technology and platform
to its customers to solve their complex data challenges.
Tykli pioneered the use of Network Science
and Topological Data Analysis to simplify
and accelerate complex data analysis.
Tykli Edge is the company's flagship software
application and powers our solutions for Financial
Services, Manufacturing,
Healthcare, Media and Communications.
29. Lessons learned
• Automatic training process requires dedicated teams at
hospitals to provide “certified” feedback
• Devise quality indicators that balance precision and accuracy with
stakeholders’ expectations and end users’ perception
• Careful use of user feedback to refine analysis rules without
compromising code identification capability: code prioritization
• Train the engine with clinical documents specific to a branch
or ward to improve accuracy and recall on specific clinical terms
• Automate the cleaning procedure that anonymizes
sensitive clinical documents before semantic analysis
35. Solution Architecture
[Architecture diagram. Box labels: Management, Control, Start Audit control, BPManager, Control Viewer, Physician, Data Layer, Semantic Engine (Tykli), Audit control, Control Manager, SemanticServiceManager, BI, BIManager, Other services, Wrapper]
Editor's Notes
My name is Dario Betti and I’m here as a representative of CRG, the Research Center of GPI Group, an ICT company that develops solutions for the health sector.
I’m here as a member of the ICD-Tracker project team, which also includes my colleagues Annamaria Chiasera and Tefo Toai. On my left is Lorenzo Verna, CEO of Tykli, our partner, whose semantic technology we used in the project.
In the first part of the presentation I’ll outline the problem, then Lorenzo will talk about the solution, and finally we will say a few words about lessons learned.
Let’s start by defining the problem.
As you probably know, the national health system is one of the most significant items of Italian public spending: over the past five years its average level has been around 112 billion euros (it is the second biggest expenditure item after social security).
The main funding source is national and regional taxation, while citizens' co-payments (the so-called “compartecipazione”, paid through tickets) contribute a very small share, below 5 billion euros.
How does the NHS funding flow work?
For the sake of simplicity, we can say that the State and the Regions reimburse healthcare providers on the basis of the activities they report.
The hospital or clinic reports the services and activities it performed to the ASL (Azienda Sanitaria Locale, the Local Health Authority) in order to get a refund, and the ASL in turn asks the Region for reimbursement in the same way.
Now I’ll try to give you an overview of how a typical health service report is generated.
The ward doctor who discharges the patient has to consult all the available clinical documentation, which is of course written in natural language, in order to fill out the so-called SDO (Scheda di dimissione ospedaliera, hospital discharge form).
The SDO is a mandatory collection of data about individual patients, based on a nationwide standard; it is the synthesis of the information contained in the medical records.
The SDO basically consists of two sections:
The first contains 12 items of patient information (e.g. age, sex, date and place of birth, citizenship).
The second contains information strictly related to the single hospitalization episode. There are 21 items: main diagnosis, secondary diagnoses, main procedure and secondary procedures performed, admission date, discharge date, nursing department, etc.
The SDO dataset is then processed by an automated algorithm that calculates the Diagnosis-Related Group code. The Diagnosis-Related Group, abbreviated as DRG, is a system for classifying a patient’s hospital stay that standardizes prospective payment to hospitals and encourages cost containment.
So DRGs are groups of clinically comparable hospitalizations with similar expected costs: a kind of “standard cost” list based on the care given to, and the resources used by, a "typical" patient within the group.
Every DRG code corresponds to a certain refund.
****************
So, where are the issues?
The first depends on the quantity and quality of the clinical documentation available. Unfortunately, clinical records lack a standard template, and sometimes they also lack informative content. Furthermore, medical language uses many acronyms, and terms are used differently from doctor to doctor.
The second depends on the fact that the physician has to code diagnoses and procedures with a standardized “dictionary” (ICD-9-CM). The ICD-9-CM hierarchy contains 15,000 diagnosis codes and 4,000 procedure codes. Some hospitals have dedicated electronic tools for searching and exploring the hierarchy (with free-text search or guided by questions). In most cases, however, doctors can rely only on printed manuals, which are hard to explore and easily lead to errors. Even an expert doctor takes many minutes to correctly identify all the ICD codes describing the patient's stay. Furthermore, certain wards are particularly difficult, for example general medicine, because of the wide variety of pathologies treated there and because the patient’s history may be far more complex (e.g. because the patient arrives from other departments such as surgery). In that case it can be difficult to identify all the codes describing the whole patient history. Obviously, what is not represented by a code won’t be reflected in the DRG and consequently won’t be paid.
ICD-9 coding is definitely a complex, tedious and error-prone activity: it is sometimes difficult to pick the “right” ICD codes, and a wrong choice of ICD codes results in a wrong DRG.
Sometimes physicians are encouraged by their ward to use opportunistic codings.
Complications, procedures performed and principal diagnosis are the variables on which fraud is most frequent, because they have a huge impact on the DRG and on the refund amount.
The Region's auditors audit samples of the DRGs sent periodically by each hospital (by law, 10% of SDOs must be inspected, either internally by the hospital or externally by the auditor).
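To make the 10% rule concrete, here is a minimal sketch of drawing an audit sample. It assumes plain uniform random sampling and made-up SDO identifiers; the actual selection criteria used by the Region or the hospital may well be different (for example, targeted at specific DRGs).

```python
import math
import random


def audit_sample(sdo_ids, percent=10, seed=None):
    """Pick at least `percent`% of the SDO identifiers for inspection."""
    ids = list(sdo_ids)
    # Round up so that the sample never falls below the required share.
    sample_size = math.ceil(len(ids) * percent / 100)
    return random.Random(seed).sample(ids, sample_size)


# 95 made-up identifiers, for illustration only.
sdo_ids = [f"SDO-{n:04d}" for n in range(1, 96)]
selected = audit_sample(sdo_ids, seed=2016)
print(len(selected), "SDOs selected out of", len(sdo_ids))
```

Rounding up means that 95 forms yield 10 inspections rather than 9, so the sample stays at or above the legal minimum.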
Anomalies notified by the Region must be answered with a justification drawn from the clinical documentation. This typically requires doctors, both for their domain knowledge and because of the legal responsibility attached to producing clinical documentation. If corrections are made, the new SDO must be printed and signed again by the doctor.
This audit activity takes effort from both doctors and administrative staff, from 5 to 15 minutes per case depending on its complexity.
If the anomaly concerns only administrative data (e.g. a missing date of birth), creating a justification is fast and may require only administrative staff; if the anomaly is in the diagnoses and procedures, or in an excessive length of stay, producing a justification requires the doctor to re-examine the clinical documentation (essentially the discharge letter and, in some cases, the clinical exams).
How can semantics help?
We found at least three scenarios.
Scenario 1: semantics can enable a coherence check between the clinical records and the ICD codes picked by the physician, to filter out inconsistencies before the SDO is sent to the GROUPER software to produce the DRG.
In the 2nd scenario, a semantic analysis of the clinical documentation is performed on the fly to suggest ICD-9 codes to the physician, in order to save time and avoid mistakes.
In the 3rd scenario, semantic technology may power tools for DRG audit activity.
In this scenario the ICD codes are extracted from the SDO associated with the DRG under verification; the tool can then automatically perform a semantic analysis of the clinical record to find a justification for the codes.
In fact, we tested a similar tool in Cuneo and found a substantial reduction in effort for both physicians and administrative staff.
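The coherence check of the first scenario can be pictured as a comparison between two sets of codes. The sketch below is only an illustration: `check_coherence` and `CoherenceReport` are hypothetical names, and the codes "extracted" from the text are hard-coded here, whereas in the project they would come from the semantic engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoherenceReport:
    unsupported: frozenset  # codes on the SDO with no support found in the text
    missing: frozenset      # codes found in the text but absent from the SDO

    @property
    def is_coherent(self) -> bool:
        return not self.unsupported and not self.missing


def check_coherence(sdo_codes, extracted_codes) -> CoherenceReport:
    """Compare the physician's codes with the codes found in the clinical text."""
    sdo = frozenset(sdo_codes)
    extracted = frozenset(extracted_codes)
    return CoherenceReport(unsupported=sdo - extracted, missing=extracted - sdo)


# Illustrative ICD-9-CM codes: 428.0 heart failure, 250.00 diabetes, 401.9 hypertension.
report = check_coherence(
    sdo_codes=["428.0", "250.00"],
    extracted_codes=["428.0", "401.9"],
)
print(sorted(report.unsupported), sorted(report.missing), report.is_coherent)
```

Both kinds of mismatch matter: an unsupported code may be challenged by the auditor, while a missing code is care that was given but will not be reflected in the DRG.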
Now I give the floor to Lorenzo Verna.
The world is made of different kinds of information, some structured, some unstructured. Our philosophy, which translates into our technology, is to take all this information and
SIOPE
The proposed solution includes the following functionalities:
1 – Simulated DRG calculation with free-text search (for customers without Code Finder).
2 – Support for SDO editing, through a page integrated with the application the doctor uses to prepare the discharge letter and the SDO (the ICD Tracker page opens automatically when the doctor writes the SDO in IE-Doc).
The picture shows scenario 2, support for SDO editing.
This is the main screen of the assisted ICD code identification. On the right side you can see the content of the discharge letter, and on the left side the suggested codes, divided into diagnoses and procedures and further into first code and secondary codes (codes with less importance and consequently less impact on the DRG value).
Notice how each code suggested by the semantic analyzer is associated with a weight and a set of keywords from the discharge letter that “support” the suggested code (see the Tykli presentation for details). When the user selects a suggested code, the associated keywords are shown in grey. The sentences that matter most in supporting the suggested code are highlighted in yellow by an algorithm that considers the scores of the codes. This lets the user focus on the paragraphs of the discharge letter that are most useful for putting the suggested code in context, and gather all the information in the clinical documentation that supports that code choice.
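As an illustration of the idea of weighted codes with supporting keywords, the sketch below ranks the sentences of a discharge letter by how many of a code's keywords they contain, scaled by the code's weight. This is not the algorithm used in the product, which is not described here; `SuggestedCode`, `rank_sentences`, the weight and the sample letter are all made up for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class SuggestedCode:
    code: str
    weight: float
    keywords: list


def split_sentences(text):
    # Naive split on sentence-ending punctuation followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def rank_sentences(text, suggestion):
    """Return (score, sentence) pairs for sentences that support the code."""
    scored = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        hits = sum(1 for kw in suggestion.keywords if kw.lower() in lowered)
        if hits:
            scored.append((hits * suggestion.weight, sentence))
    # Highest score first; these are the candidates for highlighting.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


letter = (
    "Patient admitted with dyspnea. "
    "Echocardiogram confirmed congestive heart failure with reduced ejection fraction. "
    "Discharged in good condition."
)
suggestion = SuggestedCode("428.0", 0.8, ["heart failure", "ejection fraction", "dyspnea"])
ranked = rank_sentences(letter, suggestion)
for score, sentence in ranked:
    print(f"{score:.1f}  {sentence}")
```

In this example the echocardiogram sentence ranks first because it contains two of the three keywords, and the closing sentence is not ranked at all.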
The learning phase that improves the semantic analyzer is performed on the SDOs compiled and approved by the doctor. In this way only official and correct SDO-code associations are considered, and the “noise” induced by unsystematic feedback is avoided (as experienced in the experimental phase, when users could remove from the list of suggested codes those they considered incorrect or useless).
The user can change the order of the codes in the list and even change the first diagnosis or procedure. The DRG can then be computed with the new set of codes. The cost of the patient’s hospitalization is computed from the DRG and MDC values, the length of stay and the rate tables specific to the region.
The DRG is NOT recomputed automatically as the user changes the selected codes. This avoids encouraging opportunistic behavior, i.e. searching for the most profitable configuration of codes even if it is not fully reflected in the clinical documentation. The clinical documentation should always drive code selection because, in case of a dispute with the auditor, it is the only evidence on which the content of the SDO can be argued.
As a future evolution, it may be useful to filter or post-prioritize the suggested codes, meaning that their order is changed according to rules based on code frequencies or other preferences specific to a department or discipline (e.g. in urology, raise the score, and consequently the rank, of codes related to diseases of the urological system).
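A possible shape for such a post-prioritization step is sketched below. It is a hypothetical rule, not an implemented feature: the `prioritize` function, the boost factor and the sample scores are invented, and the mapping of urology to the ICD-9-CM genitourinary chapter (categories 580 to 629) is an assumption made for the example.

```python
# Assumption: urology is mapped to the ICD-9-CM genitourinary chapter.
UROLOGY_RANGE = (580, 629)


def prioritize(suggestions, boosted_range, boost=1.5):
    """Re-rank (code, score) pairs, boosting codes in the department's range."""
    low, high = boosted_range

    def adjusted_score(code, score):
        category = code.split(".")[0]
        # V and E codes are not numeric and are left untouched.
        in_range = category.isdigit() and low <= int(category) <= high
        return score * boost if in_range else score

    rescored = [(code, adjusted_score(code, score)) for code, score in suggestions]
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Made-up analyzer output: (ICD-9-CM code, score).
suggestions = [("428.0", 0.70), ("592.0", 0.60), ("V45.81", 0.55), ("401.9", 0.50)]
reranked = prioritize(suggestions, UROLOGY_RANGE)
print([code for code, _ in reranked])
```

Because the rule runs on the analyzer's output, it changes only the order in which codes are presented and leaves the analyzer's ability to identify codes untouched.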
------------
See, for example, records
201113000
201118377
The notifications section is not enabled because notifications coming from the Region are not handled (in practice, only the assisted-compilation scenario is supported).
When the user reaches the compilation page, the main diagnosis and procedure are not yet populated. If the analyzer is able to identify codes, the main diagnosis and procedure are populated.
This final slide summarizes the findings and lessons learned from the experimentation with the prototype solution and proposes some future work.
- The domain is quite complex and requires the support of clinical experts to gradually build the knowledge base on which the semantic analyzer is trained. The process is incremental, as it requires iterating the cycle “text analysis”, “code verification”, “feedback collection”, “training”.
- User feedback (especially the deletion of wrong or useless codes) should be used to prioritize or filter out codes only after the text analysis phase, so that suggestions can be customized to the user’s preferences without compromising the analyzer's ability to identify codes.
- [TO BE REVIEWED BY VERNA] The semantic analysis engine will benefit from extensive training sessions in which a large number of documents specific to a certain code branch (e.g. urology) are used for selective training, to improve the recognition of recurring clinical terms.
- Precision and accuracy, as measures of the effectiveness of the semantic analysis, do not reflect user “happiness” with the suggested codes. More suitable metrics should be devised that also consider the user’s satisfaction (are the proposed codes what users expect or need to do their job?).
- Finally, privacy is a key issue because of the sensitivity of the clinical documents analyzed. An automatic cleaning procedure has been developed to filter out identifying information from the discharge letter before the semantic analysis. The procedure should be further improved to deal with different types of documents, different privacy rules and different deployment architectures (to minimize the sensitive information flowing out of the owner’s systems).
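To give an idea of what a cleaning step can look like, here is a minimal regex-based sketch. It is not the procedure developed in the project: the `anonymize` function, the placeholder tokens and the sample sentence are invented, and real de-identification needs far broader coverage (addresses, phone numbers, names of relatives and staff, other date formats).

```python
import re

# Standard 16-character layout of the Italian fiscal code (special cases ignored).
FISCAL_CODE = re.compile(r"\b[A-Z]{6}\d{2}[A-Z]\d{2}[A-Z]\d{3}[A-Z]\b")
# Dates written as dd/mm/yyyy; other formats are not handled in this sketch.
DATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def anonymize(text, known_names=()):
    """Replace identifying tokens before the text leaves the owner's systems."""
    text = FISCAL_CODE.sub("[FISCAL_CODE]", text)
    text = DATE.sub("[DATE]", text)
    # Names are assumed to come from the structured part of the record.
    for name in known_names:
        text = re.sub(re.escape(name), "[NAME]", text, flags=re.IGNORECASE)
    return text


letter = "Mario Rossi (RSSMRA80A01H501U), born 01/01/1980, was admitted for renal colic."
cleaned = anonymize(letter, known_names=["Mario Rossi"])
print(cleaned)
```

Note that blanking every date also removes clinically relevant ones, such as admission and procedure dates, so a production rule set would have to be more selective.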
http://www.salute.gov.it/imgs/C_17_pubblicazioni_2270_allegato.pdf Rapporto sull’attività di ricovero ospedaliero Dati SDO Primo semestre 2014
http://www.salute.gov.it/imgs/C_17_pubblicazioni_2258_allegato.pdf Relazione sullo Stato Sanitario del Paese 2012-2013
The overall number of forms with at least one compilation error falls considerably, from 33.9% in 2013 to 29.4% of the forms received in the first half of 2014, a difference of 4.5 percentage points.
The distribution of errors also improves: the mean number of errors per form falls from 0.5 to 0.4 (while the median number of errors per form remains zero), and the standard deviation of the number of errors per form falls from 0.8 to 0.7 (cf. Table 1.3).
The SDO form contains 45 variables: in 2013, 9,843,992 forms were received, for a total of 442,979,640 distinct pieces of information collected and an overall error percentage of 1.1%, while in the first half of 2014, 4,532,720 forms were received, for a total of 203,972,400 pieces of information and 1,848,665 errors, i.e. an error percentage of 0.9%.
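The figures quoted from the ministry report are mutually consistent, as a quick check shows. All inputs below are taken from the note above; only the variable names are mine.

```python
VARIABLES_PER_SDO = 45

forms_2013 = 9_843_992
forms_h1_2014 = 4_532_720
errors_h1_2014 = 1_848_665

# Total pieces of information = number of forms x variables per form.
fields_2013 = forms_2013 * VARIABLES_PER_SDO
fields_h1_2014 = forms_h1_2014 * VARIABLES_PER_SDO

error_rate_h1_2014 = errors_h1_2014 / fields_h1_2014
errors_per_form_h1_2014 = errors_h1_2014 / forms_h1_2014

print(fields_2013, fields_h1_2014)
print(f"{error_rate_h1_2014:.1%} of fields, {errors_per_form_h1_2014:.1f} errors per form")
```

The products reproduce the quoted totals, and the two ratios round to the 0.9% error percentage and the mean of 0.4 errors per form reported for the first half of 2014.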
Scenario 1. Coherence check DRG/SDO. Analyse SDO data to identify inconsistencies with the produced DRG.
Scenario 2. Assisted DRG identification. Analysis of the clinical documentation (SDO) to suggest ICD-9-CM codes at DRG calculation.
Scenario 3. Post-consistency check DRG/SDO. Check ICD-9-CM codes and DRG against the clinical documentation after the fact and report incoherent codes.
An overview of the sites in which the preliminary prototype and the consolidated product have been tested.
WHY
Healthcare provider funding with respect to
- service type/volume
- patient population complexity
DRGs (Diagnosis Related Groups) are groups of clinically comparable hospitalizations with similar expected costs (homogeneous resource consumption pattern).
Hospital costs are used by the Region or the Ministry of Health to decide the future budget. A wrong estimation of such costs has an impact (even if not a direct one) on the hospital's economic resources.
A precise cost estimation is especially important for treatments delegated to accredited institutions and for patients coming from other regions (passive mobility). In these cases “real” money must flow from the patient's region of origin to the hosting region, or from the Region to the accredited institution. An accurate estimation of hospitalization costs is clearly fundamental to prevent losses both for the paying institution (Region or Ministry of Health), through deliberately or mistakenly increased reimbursement, and for the hospital (reduced reimbursement).
There are two core goals: 1) give a cost estimation that is as close as possible to the actual costs (this depends on the estimation method used, which in Italy is based on DRGs) and 2) apply the estimation method correctly, avoiding errors (intentional or not) that result in a wrong (increased or reduced) reimbursed amount.
Point 1 cannot be changed, as it is decided at national level. Point 2 is where the project has an impact.
How can the costs of a patient’s hospitalization be measured, taking into account the costs of medical and surgical treatments, the use of medical devices and medical staff, and the length of stay?
DRGs are assigned by a "grouper" program developed by 3M, based on ICD-9-CM (the clinical modification of the International Classification of Diseases, devised in the US) diagnoses, procedures, age, sex, discharge status, and the presence of complications or comorbidities. The generated value is used to request reimbursement from the Region or the Italian Ministry of Health and is a basis on which the Ministry of Health plans the annual budget assigned to hospitals.
http://www.agenas.it/images/agenas/monitoraggio/spesa_sanitaria/monitoraggio_spesa/2008_2014/aggiornamento_spesa_sanitaria_%20anni%202008_2014_PDF.pdf
National healthcare spending grows in 2014 (112,672,629 thousand euros) by 0.89% compared with 2013.
Mobility moves about 3.8 billion euros every year (2012: 3.873 billion; 2013: 3.973 billion).
http://healthcarefunding.ca/key-issues/activity-based-funding/
http://www.who.int/bulletin/volumes/91/10/12-115931/en/