Extractive Evidence Based Medicine Summarisation Based on Sentence-Specific Statistics

Abeed Sarker (1), Diego Mollá (1), Cécile Paris (2)
(1) Centre for Language Technology, Macquarie University, Sydney
(2) CSIRO ICT Centre, Sydney

CBMS 2012, Rome
Background

Method

Evaluation

Contents

Background
  Evidence Based Medicine
Method
  Corpus
  Generation of Statistics
Evaluation

EBM Summarisation

Abeed Sarker, Diego Mollá, Cécile Paris

Evidence Based Medicine

http://laikaspoetnik.wordpress.com/2009/04/04/evidence-based-medicine-the-facebook-of-medicine/

EBM and Natural Language Processing
http://hlwiki.slais.ubc.ca/index.php?title=Five_steps_of_EBM

NLP tasks
Question analysis and classification
Information retrieval
Classification and re-ranking
Information extraction
Question answering
Summarisation

General Approach

In a Nutshell
1. Gather statistics from the best 3-sentence extracts.
Exhaustive search to find these best extracts.

2. Build three classifiers, one per sentence in the final extract.
Classifier 1 based on statistics from best 1st sentence.
Classifier 2 based on statistics from best 2nd sentence.
Classifier 3 based on statistics from best 3rd sentence.
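
The exhaustive search in step 1 can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes sentences are already split, and it scores each candidate extract with a simple LCS-based F1 as a stand-in for a full ROUGE-L implementation. The function names `lcs_len`, `lcs_f1` and `best_extract` are hypothetical.

```python
from itertools import combinations


def lcs_len(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_f1(candidate, reference):
    """LCS-based F1 between two strings (simplified stand-in for ROUGE-L)."""
    c, r = candidate.lower().split(), reference.lower().split()
    if not c or not r:
        return 0.0
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


def best_extract(sentences, target, k=3):
    """Try every k-sentence combination (kept in source order) and
    return the indices of the highest-scoring one."""
    k = min(k, len(sentences))
    best = max(
        combinations(range(len(sentences)), k),
        key=lambda idx: lcs_f1(" ".join(sentences[i] for i in idx), target),
    )
    return list(best)


# Toy abstract (invented): the target is covered by sentences 0, 2 and 4.
toy = ["a b c", "x y z", "d e f", "q r s", "g h i"]
print(best_extract(toy, "a b c d e f g h i"))
```

The number of candidates grows as n choose 3, which stays manageable for abstracts of a few dozen sentences.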

Journal of Family Practice’s “Clinical Inquiries”

The XML Contents

<record id="7843">
  <url>http://www.jfponline.com/Pages.asp?AID=7843&amp;issue=September_2009&amp;UID=</url>
  <question>Which treatments work best for hemorrhoids?</question>
  <answer>
    <snip id="1">
      <sniptext>Excision is the most effective treatment for thrombosed
        external hemorrhoids.</sniptext>
      <sor type="B">retrospective studies</sor>
      <long id="1_1">
        <longtext>A retrospective study of 231 patients treated
          conservatively or surgically found that the 48.5% of patients
          treated surgically had a lower recurrence rate than the
          conservative group (number needed to treat [NNT]=2 for
          recurrence at mean follow-up of 7.6 months) and earlier
          resolution of symptoms (average 3.9 days compared with 24 days
          for conservative treatment).</longtext>
        <ref id="15486746" abstract="Abstracts/15486746.xml">Greenspon
          J, Williams SB, Young HA, et al. Thrombosed external
          hemorrhoids: outcome after conservative or surgical
          management. Dis Colon Rectum. 2004;47:1493-1498.</ref>
      </long>
      <long id="1_2">
        <longtext>A retrospective analysis of 340 patients who underwent
          outpatient excision of thrombosed external hemorrhoids under
          local anesthesia reported a low recurrence rate of 6.5% at a
          mean follow-up of 17.3 months.</longtext>
        <ref id="12972967" abstract="Abstracts/12972967.xml">Jongen J,
          Bach S, Stubinger SH, et al. Excision of thrombosed external
          hemorrhoids under local anesthesia: a retrospective evaluation
          of 340 patients. Dis Colon Rectum. 2003;46:1226-1231.</ref>
      </long>
      <long id="1_3">
        <longtext>A prospective, randomized controlled trial (RCT) of 98
          patients treated nonsurgically found improved pain relief with a
          combination of topical nifedipine 0.3% and lidocaine 1.5% compared
          with lidocaine alone. The NNT for complete pain relief at 7 days was
          3.</longtext>
        <ref id="11289288" abstract="Abstracts/11289288.xml">Perrotti P,
          Antropoli C, Molino D, et al. Conservative treatment of acute
          thrombosed external hemorrhoids with topical nifedipine. Dis
          Colon Rectum. 2001;44:405-409.</ref>
      </long>
    </snip>
  </answer>
</record>
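
A record in this layout can be read with Python's standard `xml.etree.ElementTree`. The sketch below is only an illustration: `SAMPLE` is an abridged copy of the record shown on the slide (one answer, one explanation), and the function name `read_record` and its return shape are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the slide's record, kept short for illustration.
SAMPLE = """<record id="7843">
  <question>Which treatments work best for hemorrhoids?</question>
  <answer>
    <snip id="1">
      <sniptext>Excision is the most effective treatment for thrombosed external hemorrhoids.</sniptext>
      <sor type="B">retrospective studies</sor>
      <long id="1_2">
        <longtext>A retrospective analysis of 340 patients who underwent outpatient excision of thrombosed external hemorrhoids under local anesthesia reported a low recurrence rate of 6.5% at a mean follow-up of 17.3 months.</longtext>
        <ref id="12972967" abstract="Abstracts/12972967.xml">Jongen J, Bach S, Stubinger SH, et al.</ref>
      </long>
    </snip>
  </answer>
</record>"""


def read_record(xml_text):
    """Return (question, snips), where snips is a list of
    (answer text, [(evidence text, [abstract paths])])."""
    record = ET.fromstring(xml_text)
    question = record.findtext("question", default="").strip()
    snips = []
    for snip in record.iter("snip"):
        longs = []
        for long_el in snip.findall("long"):
            refs = [ref.get("abstract") for ref in long_el.findall("ref")]
            longs.append((long_el.findtext("longtext", default="").strip(), refs))
        snips.append((snip.findtext("sniptext", default="").strip(), longs))
    return question, snips


question, snips = read_record(SAMPLE)
print(question)
for answer, longs in snips:
    for evidence, abstracts in longs:
        print(abstracts, evidence[:40])
```

Each (question, abstract path, evidence text) triple obtained this way corresponds to one summarisation instance: the question and abstract are the input, and the evidence text is the target.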

Corpus Statistics

Size
456 questions (“records”).
Over 1,100 distinct answers (“snips”).
3,036 text explanations (“longs”).
2,707 references.

Summarisation Using This Corpus
Input
Question.
Document Abstract.

Output
Extractive summary that answers the question.
Target summary is the annotated evidence text (“long”).
Evaluated using ROUGE-L.

The Statistics Gathered

1. Source sentence position.
2. Sentence length.
3. Sentence similarity.
4. Sentence type.

1. Source Sentence Position

Compute relative positions.
Create normalised frequency histograms $f_1, f_2, \ldots, f_{10}$.
Score all relative positions of bin $i$ with its bin frequency:
$$S_{pos}(i) = f_{bin(i)}$$
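
A minimal sketch of this position score, under stated assumptions: the relative position is taken to be the sentence index divided by the index of the last sentence, and the histogram has ten equal-width bins. Neither detail is spelled out on the slide, and the function names are hypothetical.

```python
def relative_position(index, n_sentences):
    """Position of a sentence within its abstract, scaled to [0, 1] (assumed definition)."""
    return index / (n_sentences - 1) if n_sentences > 1 else 0.0


def bin_of(rel_pos, n_bins=10):
    """Map a relative position in [0, 1] to a bin index 0..n_bins-1."""
    return min(int(rel_pos * n_bins), n_bins - 1)


def position_histogram(best_positions, n_bins=10):
    """Normalised frequency histogram of the relative positions
    at which best-extract sentences were found."""
    counts = [0] * n_bins
    for rel in best_positions:
        counts[bin_of(rel, n_bins)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def position_score(index, n_sentences, hist):
    """S_pos: the frequency of the bin the sentence's relative position falls in."""
    return hist[bin_of(relative_position(index, n_sentences), len(hist))]


# Toy data (invented): best sentences cluster at the start and end of abstracts.
hist = position_histogram([0.0, 0.05, 0.95, 1.0])
print(position_score(9, 10, hist), position_score(4, 10, hist))
```

To obtain sentence-specific statistics, one histogram would be built per summary slot by passing only the positions of the best 1st, 2nd or 3rd sentences.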

2. Sentence Length

Reward longer sentences and penalise shorter sentences:

Normalised sentence length
$$S_{len}(i) = \frac{l_s - l_{avg}}{l_d}$$

$l_s$: sentence length
$l_{avg}$: average sentence length in the corpus
$l_d$: document length
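
As a small worked sketch of the normalised length score, assuming all three lengths are measured in the same unit (words here, which the slide does not specify):

```python
def length_score(sentence_len, avg_len, doc_len):
    """S_len = (l_s - l_avg) / l_d.
    Positive for sentences longer than the corpus average, negative otherwise."""
    if doc_len <= 0:
        raise ValueError("doc_len must be positive")
    return (sentence_len - avg_len) / doc_len


# Invented numbers: corpus average of 20 words, abstract of 200 words.
print(length_score(30, 20, 200))  # longer than average: positive score
print(length_score(10, 20, 200))  # shorter than average: negative score
```

Dividing by the document length keeps the score small for long abstracts, so a 10-word surplus counts for less in a 400-word abstract than in a 100-word one.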

3. Sentence Similarity
Sentence Similarity
Lowercase, stem, remove stop words.
Build a vector of tf.idf values from the remaining words and UMLS semantic types.
$$\mathrm{CosSim}(X, Y) = \frac{X \cdot Y}{|X|\,|Y|}$$

Maximal Marginal Relevance (Carbonell & Goldstein, 1998)
Reward sentences similar to the query and penalise those similar to other summary sentences.
$$\mathrm{MMR} = \lambda\,\mathrm{CosSim}(S_i, Q) - (1 - \lambda) \max_{S_j \in S} \mathrm{CosSim}(S_i, S_j)$$
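
The two formulas can be sketched in a few lines. This is an illustration under simplifying assumptions: vectors are plain term-count dictionaries rather than tf.idf weights over stemmed words and UMLS semantic types, and the default `lam=0.1` is taken from the λ value reported later in the deck. The function names are hypothetical.

```python
import math
from collections import Counter


def cos_sim(x, y):
    """Cosine similarity of two sparse vectors given as {term: weight} mappings."""
    dot = sum(w * y.get(t, 0.0) for t, w in x.items())
    norm = math.sqrt(sum(w * w for w in x.values())) * math.sqrt(
        sum(w * w for w in y.values())
    )
    return dot / norm if norm else 0.0


def mmr(candidate, query, selected, lam=0.1):
    """lam * CosSim(candidate, query)
    - (1 - lam) * max CosSim(candidate, s) over already selected sentences."""
    redundancy = max((cos_sim(candidate, s) for s in selected), default=0.0)
    return lam * cos_sim(candidate, query) - (1 - lam) * redundancy


# Toy vectors (invented), using raw counts instead of tf.idf.
query = Counter("pain relief".split())
sent_a = Counter("pain relief improved".split())
sent_b = Counter("recurrence rate low".split())

print(mmr(sent_a, query, []))        # nothing selected yet: pure query similarity
print(mmr(sent_a, query, [sent_a]))  # duplicate of a selected sentence: penalised
print(mmr(sent_b, query, [sent_a]))  # unrelated to both: zero
```

With a small λ the redundancy penalty dominates, so a sentence that repeats an already selected one scores well below an unrelated sentence.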

4. PIBOSO (Kim et al. 2011) I

1. Classify all sentences into PIBOSO types (a variant of PICO).
2. Generate normalised frequency histograms of resulting
PIBOSO types.

4. PIBOSO (Kim et al. 2011) II

Position independent
$$S_{PIPS}(i) = \frac{P_{best}}{P_{all}}$$

Position dependent
$$S_{PDPS}(i) = \frac{P_{pos}}{P_{best}}$$

$P_{best}$: proportion of this PIBOSO type among all best summary sentences.
$P_{all}$: proportion of this PIBOSO type among all sentences.
$P_{pos}$: proportion of this PIBOSO type among best summary sentences at this position.
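
A minimal sketch of both ratios, assuming each sentence already carries a PIBOSO label from the classifier. The label lists below are invented toy data and the function names are hypothetical.

```python
from collections import Counter


def proportions(labels):
    """Share of each label in a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def pips(label, best_labels, all_labels):
    """Position-independent score: P_best / P_all."""
    p_all = proportions(all_labels).get(label, 0.0)
    return proportions(best_labels).get(label, 0.0) / p_all if p_all else 0.0


def pdps(label, best_labels_at_pos, best_labels):
    """Position-dependent score: P_pos / P_best."""
    p_best = proportions(best_labels).get(label, 0.0)
    return proportions(best_labels_at_pos).get(label, 0.0) / p_best if p_best else 0.0


# Toy labels (invented): 10 sentences overall, 4 of them in best extracts,
# 2 of those in the third summary slot.
all_labels = ["Background"] * 4 + ["Outcome"] * 4 + ["Other"] * 2
best_labels = ["Outcome"] * 3 + ["Background"]
best_third = ["Outcome", "Outcome"]

print(pips("Outcome", best_labels, all_labels))
print(pdps("Outcome", best_third, best_labels))
```

A ratio above 1 means the type is over-represented among best sentences (or at that slot) compared with its baseline frequency.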

Classification
Edmundsonian Formula
$$SS_i = \alpha S_{rpos_i} + \beta S_{len_i} + \gamma S_{PIPS_i} + \delta S_{PDPS_i} + \epsilon S_{MMR_i}$$
MMR is replaced with cosine similarity for the first sentence.
In case of ties, the longest sentence is chosen.
Parameters are fine-tuned through exhaustive search on the training set.

$\alpha = 1.0$, $\beta = 0.8$, $\gamma = 0.1$, $\delta = 0.8$, $\epsilon = 0.1$, $\lambda = 0.1$.
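
The weighted combination and the tie-break can be sketched as below. The per-feature scores are assumed to be computed elsewhere and passed in as a dictionary; the weight on the MMR term is assumed to be 0.1; the names `WEIGHTS`, `combined_score` and `pick_sentence` are hypothetical.

```python
# Weights from the slide; the MMR weight of 0.1 is an assumption.
WEIGHTS = {"rpos": 1.0, "len": 0.8, "pips": 0.1, "pdps": 0.8, "mmr": 0.1}


def combined_score(features, weights=WEIGHTS):
    """Weighted sum of the per-sentence feature scores."""
    return sum(weights[name] * features[name] for name in weights)


def pick_sentence(candidates):
    """candidates: list of (features dict, sentence length).
    Returns the index of the highest-scoring sentence; ties go to the longest."""
    return max(
        range(len(candidates)),
        key=lambda i: (combined_score(candidates[i][0]), candidates[i][1]),
    )


# Toy candidates (invented): the first two tie on score, the second is longer.
tied = {"rpos": 0.5, "len": 0.0, "pips": 1.0, "pdps": 1.0, "mmr": 0.0}
weak = {"rpos": 0.1, "len": 0.0, "pips": 1.0, "pdps": 1.0, "mmr": 0.0}
print(pick_sentence([(tied, 12), (tied, 20), (weak, 40)]))
```

Running the same selection three times, each time with the statistics for that summary slot and with the previously chosen sentences feeding the MMR term, would yield a three-sentence extract.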

Percentile-based Evaluation (Ceylan et al. 2010) I

We compare against all possible 3-sentence extracts in the test set.
1. Bin the scores of all possible three-sentence combinations of each abstract (1,000 bins).
2. Normalise the resulting histograms.
3. Combine all histograms by convolution.
4. The result approximates the probability density distribution of all three-sentence summaries in all abstracts.
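
The four steps can be sketched in plain Python. This is a simplified illustration, not the procedure of Ceylan et al. as published: it assumes the scores of all three-sentence extracts of each abstract are already available as numbers in [0, 1], it treats the combined distribution as that of the summed bin indices, and it reads the percentile as the probability mass at or below the system's summed bin index. All function names are hypothetical.

```python
def score_histogram(scores, n_bins):
    """Normalised histogram of scores in [0, 1] for one abstract."""
    if not scores:
        raise ValueError("at least one score is required")
    hist = [0.0] * n_bins
    for s in scores:
        hist[min(int(s * n_bins), n_bins - 1)] += 1
    total = sum(hist)
    return [h / total for h in hist]


def convolve(p, q):
    """Discrete convolution: distribution of the sum of two independent binned scores."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        if a:
            for j, b in enumerate(q):
                out[i + j] += a * b
    return out


def percentile(system_bin_sum, histograms):
    """Percentage of the combined distribution at or below the system's summed bin index."""
    dist = histograms[0]
    for h in histograms[1:]:
        dist = convolve(dist, h)
    return 100.0 * sum(dist[: system_bin_sum + 1])


# Toy example (invented): two abstracts, two bins each.
h1 = score_histogram([0.1, 0.9], n_bins=2)
h2 = score_histogram([0.2, 0.8], n_bins=2)
print(convolve(h1, h2))
print(percentile(1, [h1, h2]))
```

With 1,000 bins and hundreds of abstracts, the nested loops above become slow; an FFT-based convolution would be the usual replacement.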

Percentile-based Evaluation (Ceylan et al. 2010) II

Systems

L3 Last three sentences.
O3 Last three PIBOSO outcome sentences.
R Random.
O All outcome sentences.
PI Sentence position independent.
PD Sentence position dependent (our proposal).

Results

System   F-Score   95% CI         Percentile (%)
L3       0.159     0.155–0.163    60.3
O3       0.161     0.158–0.165    77.5
R        0.158     0.154–0.161    50.3
O        0.159     0.155–0.164    60.3
PI       0.160     0.157–0.164    69.4
PD       0.166     0.162–0.170    97.3

Questions?

Further Information
http://web.science.mq.edu.au/~diego/medicalnlp/

EBM Summarisation

Abeed Sarker, Diego Moll´, C´cile Paris
a e

28/28

More Related Content

Viewers also liked

Overview of the 2014 ALTA Shared Task: Identifying Expressions of Locations i...
Overview of the 2014 ALTA Shared Task: Identifying Expressions of Locations i...Overview of the 2014 ALTA Shared Task: Identifying Expressions of Locations i...
Overview of the 2014 ALTA Shared Task: Identifying Expressions of Locations i...
Diego Molla-Aliod
 
Impact of Citing Papers for Summarisation of Clinical Documents
Impact of Citing Papers for Summarisation of Clinical DocumentsImpact of Citing Papers for Summarisation of Clinical Documents
Impact of Citing Papers for Summarisation of Clinical Documents
Diego Molla-Aliod
 
Document Distance for the Automated Expansion of Relevance Judgements for Inf...
Document Distance for the Automated Expansion of Relevance Judgements for Inf...Document Distance for the Automated Expansion of Relevance Judgements for Inf...
Document Distance for the Automated Expansion of Relevance Judgements for Inf...
Diego Molla-Aliod
 
The Power of C2C Recommendations for the Financial Services Sector
The Power of C2C Recommendations for the Financial Services SectorThe Power of C2C Recommendations for the Financial Services Sector
The Power of C2C Recommendations for the Financial Services Sector
RewardStream Inc
 
Power of C2C Recommendations for the Telecommunications Sector
Power of C2C Recommendations for the Telecommunications SectorPower of C2C Recommendations for the Telecommunications Sector
Power of C2C Recommendations for the Telecommunications Sector
RewardStream Inc
 
The Power of C2C Recommendations for the Retail Sector
The Power of C2C Recommendations for the Retail SectorThe Power of C2C Recommendations for the Retail Sector
The Power of C2C Recommendations for the Retail Sector
RewardStream Inc
 
Looking Beyond the Transaction: Customer Loyalty in the Age of Social & Mobile
Looking Beyond the Transaction: Customer Loyalty in the Age of Social & MobileLooking Beyond the Transaction: Customer Loyalty in the Age of Social & Mobile
Looking Beyond the Transaction: Customer Loyalty in the Age of Social & Mobile
RewardStream Inc
 
Multi-Objective Optimization for Clustering of Medical Publications
Multi-Objective Optimization for Clustering of Medical PublicationsMulti-Objective Optimization for Clustering of Medical Publications
Multi-Objective Optimization for Clustering of Medical Publications
Diego Molla-Aliod
 
Text Mining for Evidence Based Medicine
Text Mining for Evidence Based MedicineText Mining for Evidence Based Medicine
Text Mining for Evidence Based Medicine
Diego Molla-Aliod
 

Viewers also liked (10)

Overview of the 2014 ALTA Shared Task: Identifying Expressions of Locations i...
Overview of the 2014 ALTA Shared Task: Identifying Expressions of Locations i...Overview of the 2014 ALTA Shared Task: Identifying Expressions of Locations i...
Overview of the 2014 ALTA Shared Task: Identifying Expressions of Locations i...
 
Impact of Citing Papers for Summarisation of Clinical Documents
Impact of Citing Papers for Summarisation of Clinical DocumentsImpact of Citing Papers for Summarisation of Clinical Documents
Impact of Citing Papers for Summarisation of Clinical Documents
 
Prezentare ts1
Prezentare ts1Prezentare ts1
Prezentare ts1
 
Document Distance for the Automated Expansion of Relevance Judgements for Inf...
Document Distance for the Automated Expansion of Relevance Judgements for Inf...Document Distance for the Automated Expansion of Relevance Judgements for Inf...
Document Distance for the Automated Expansion of Relevance Judgements for Inf...
 
The Power of C2C Recommendations for the Financial Services Sector
The Power of C2C Recommendations for the Financial Services SectorThe Power of C2C Recommendations for the Financial Services Sector
The Power of C2C Recommendations for the Financial Services Sector
 
Power of C2C Recommendations for the Telecommunications Sector
Power of C2C Recommendations for the Telecommunications SectorPower of C2C Recommendations for the Telecommunications Sector
Power of C2C Recommendations for the Telecommunications Sector
 
The Power of C2C Recommendations for the Retail Sector
The Power of C2C Recommendations for the Retail SectorThe Power of C2C Recommendations for the Retail Sector
The Power of C2C Recommendations for the Retail Sector
 
Looking Beyond the Transaction: Customer Loyalty in the Age of Social & Mobile
Looking Beyond the Transaction: Customer Loyalty in the Age of Social & MobileLooking Beyond the Transaction: Customer Loyalty in the Age of Social & Mobile
Looking Beyond the Transaction: Customer Loyalty in the Age of Social & Mobile
 
Multi-Objective Optimization for Clustering of Medical Publications
Multi-Objective Optimization for Clustering of Medical PublicationsMulti-Objective Optimization for Clustering of Medical Publications
Multi-Objective Optimization for Clustering of Medical Publications
 
Text Mining for Evidence Based Medicine
Text Mining for Evidence Based MedicineText Mining for Evidence Based Medicine
Text Mining for Evidence Based Medicine
 

Similar to Extractive Evidence Based Medicine Summarisation Based on Sentence-Specific Statistics

ISBNPA_2016_JM
ISBNPA_2016_JMISBNPA_2016_JM
ISBNPA_2016_JM
Joseph Murphy
 
MCCIA 014
MCCIA 014MCCIA 014
Kostogryzov-for china-2013
 Kostogryzov-for china-2013 Kostogryzov-for china-2013
Kostogryzov-for china-2013
Mathmodels Net
 
E xact micro 10 photometer v4
E xact micro 10 photometer v4E xact micro 10 photometer v4
E xact micro 10 photometer v4
Ronnie Lewis
 
Acute stroke imaging and intervention-dr. n khandelwal
Acute stroke  imaging and intervention-dr. n khandelwalAcute stroke  imaging and intervention-dr. n khandelwal
Acute stroke imaging and intervention-dr. n khandelwal
Teleradiology Solutions
 
Blood pressure
Blood pressureBlood pressure
Blood pressure
Luke Britton
 
Financial systems and banking crises, an assessment
Financial systems and banking crises, an assessmentFinancial systems and banking crises, an assessment
A Deep Dive into the Socio-Technical Aspects of Delays in Security Patching
A Deep Dive into the Socio-Technical Aspects of Delays in Security PatchingA Deep Dive into the Socio-Technical Aspects of Delays in Security Patching
A Deep Dive into the Socio-Technical Aspects of Delays in Security Patching
CREST
 
Certificates and Credentials new
Certificates and Credentials newCertificates and Credentials new
Certificates and Credentials new
sidharthbiswas9185
 
Scanned by CamScannerG o o d w M P r e p a id r e n t.docx
Scanned by CamScannerG o o d w M  P r e p a id  r e n t.docxScanned by CamScannerG o o d w M  P r e p a id  r e n t.docx
Scanned by CamScannerG o o d w M P r e p a id r e n t.docx
kenjordan97598
 
Reflective Teaching: Improving Library Instruction Through Self-Reflection
Reflective Teaching: Improving Library Instruction Through Self-ReflectionReflective Teaching: Improving Library Instruction Through Self-Reflection
Reflective Teaching: Improving Library Instruction Through Self-Reflection
Mandi Goodsett
 
Dissertation
DissertationDissertation
Dissertation
Amie Nevin
 
erc_rapport_2009-2014
erc_rapport_2009-2014erc_rapport_2009-2014
erc_rapport_2009-2014
Petra Uittenbogaard
 
Strategic Cartography: Identifying IL Intersections Across the Curriculum
Strategic Cartography: Identifying IL Intersections Across the CurriculumStrategic Cartography: Identifying IL Intersections Across the Curriculum
Strategic Cartography: Identifying IL Intersections Across the Curriculum
char booth
 
Introduction to Energy Efficiency, EMS and Energy Audit
Introduction to Energy Efficiency, EMS and Energy AuditIntroduction to Energy Efficiency, EMS and Energy Audit
Introduction to Energy Efficiency, EMS and Energy Audit
eecfncci
 
Training developers in user research af Søeren Vinther Færch, AAU
Training developers in user research af Søeren Vinther Færch, AAUTraining developers in user research af Søeren Vinther Færch, AAU
Training developers in user research af Søeren Vinther Færch, AAU
InfinIT - Innovationsnetværket for it
 
5 1 6 T o w a r d A l t e r n a t i v e s i n H e a l t h .docx
5 1 6  T o w a r d  A l t e r n a t i v e s  i n  H e a l t h .docx5 1 6  T o w a r d  A l t e r n a t i v e s  i n  H e a l t h .docx
5 1 6 T o w a r d A l t e r n a t i v e s i n H e a l t h .docx
alinainglis
 
SC4 Workshop 1: Helena Gellerman: data analyses in transport
SC4 Workshop 1: Helena Gellerman: data analyses in transport SC4 Workshop 1: Helena Gellerman: data analyses in transport
SC4 Workshop 1: Helena Gellerman: data analyses in transport
BigData_Europe
 
System suitability parameters assessment by HPLC
System suitability parameters assessment by HPLCSystem suitability parameters assessment by HPLC
System suitability parameters assessment by HPLC
Anirban Barik
 
Us case-study
Us case-studyUs case-study

Similar to Extractive Evidence Based Medicine Summarisation Based on Sentence-Specific Statistics (20)

ISBNPA_2016_JM
ISBNPA_2016_JMISBNPA_2016_JM
ISBNPA_2016_JM
 
MCCIA 014
MCCIA 014MCCIA 014
MCCIA 014
 
Kostogryzov-for china-2013
 Kostogryzov-for china-2013 Kostogryzov-for china-2013
Kostogryzov-for china-2013
 
E xact micro 10 photometer v4
E xact micro 10 photometer v4E xact micro 10 photometer v4
E xact micro 10 photometer v4
 
Acute stroke imaging and intervention-dr. n khandelwal
Acute stroke  imaging and intervention-dr. n khandelwalAcute stroke  imaging and intervention-dr. n khandelwal
Acute stroke imaging and intervention-dr. n khandelwal
 
Blood pressure
Blood pressureBlood pressure
Blood pressure
 
Financial systems and banking crises, an assessment
Financial systems and banking crises, an assessmentFinancial systems and banking crises, an assessment
Financial systems and banking crises, an assessment
 
A Deep Dive into the Socio-Technical Aspects of Delays in Security Patching
A Deep Dive into the Socio-Technical Aspects of Delays in Security PatchingA Deep Dive into the Socio-Technical Aspects of Delays in Security Patching
A Deep Dive into the Socio-Technical Aspects of Delays in Security Patching
 
Certificates and Credentials new
Certificates and Credentials newCertificates and Credentials new
Certificates and Credentials new
 
Scanned by CamScannerG o o d w M P r e p a id r e n t.docx
Scanned by CamScannerG o o d w M  P r e p a id  r e n t.docxScanned by CamScannerG o o d w M  P r e p a id  r e n t.docx
Scanned by CamScannerG o o d w M P r e p a id r e n t.docx
 
Reflective Teaching: Improving Library Instruction Through Self-Reflection
Reflective Teaching: Improving Library Instruction Through Self-ReflectionReflective Teaching: Improving Library Instruction Through Self-Reflection
Reflective Teaching: Improving Library Instruction Through Self-Reflection
 
Dissertation
DissertationDissertation
Dissertation
 
erc_rapport_2009-2014
erc_rapport_2009-2014erc_rapport_2009-2014
erc_rapport_2009-2014
 
Strategic Cartography: Identifying IL Intersections Across the Curriculum
Strategic Cartography: Identifying IL Intersections Across the CurriculumStrategic Cartography: Identifying IL Intersections Across the Curriculum
Strategic Cartography: Identifying IL Intersections Across the Curriculum
 
Introduction to Energy Efficiency, EMS and Energy Audit
Introduction to Energy Efficiency, EMS and Energy AuditIntroduction to Energy Efficiency, EMS and Energy Audit
Introduction to Energy Efficiency, EMS and Energy Audit
 
Training developers in user research af Søeren Vinther Færch, AAU
Training developers in user research af Søeren Vinther Færch, AAUTraining developers in user research af Søeren Vinther Færch, AAU
Training developers in user research af Søeren Vinther Færch, AAU
 
5 1 6 T o w a r d A l t e r n a t i v e s i n H e a l t h .docx
5 1 6  T o w a r d  A l t e r n a t i v e s  i n  H e a l t h .docx5 1 6  T o w a r d  A l t e r n a t i v e s  i n  H e a l t h .docx
5 1 6 T o w a r d A l t e r n a t i v e s i n H e a l t h .docx
 
SC4 Workshop 1: Helena Gellerman: data analyses in transport
SC4 Workshop 1: Helena Gellerman: data analyses in transport SC4 Workshop 1: Helena Gellerman: data analyses in transport
SC4 Workshop 1: Helena Gellerman: data analyses in transport
 
System suitability parameters assessment by HPLC
System suitability parameters assessment by HPLCSystem suitability parameters assessment by HPLC
System suitability parameters assessment by HPLC
 
Us case-study
Us case-studyUs case-study
Us case-study
 

Recently uploaded

Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 

Recently uploaded (20)

Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices

Extractive Evidence Based Medicine Summarisation Based on Sentence-Specific Statistics

  • 1. Extractive Evidence Based Medicine Summarisation Based on Sentence-Specific Statistics. Abeed Sarker (1), Diego Mollá (1), Cécile Paris (2). (1) Centre for Language Technology, Macquarie University, Sydney; (2) CSIRO ICT Centre, Sydney. CBMS 2012, Rome
  • 2. Contents
      Background
        Evidence Based Medicine
      Method
        Corpus
        Generation of Statistics
      Evaluation
    EBM Summarisation. Abeed Sarker, Diego Mollá, Cécile Paris
  • 6. EBM and Natural Language Processing. http://hlwiki.slais.ubc.ca/index.php?title=Five_steps_of_EBM
      NLP tasks:
        Question analysis and classification
        Information retrieval
        Classification and re-ranking
        Information extraction
        Question answering
        Summarisation
  • 8. General Approach. In a Nutshell:
      1. Gather statistics from the best 3-sentence extracts. An exhaustive search finds these best extracts.
      2. Build three classifiers, one per sentence in the final extract.
         Classifier 1 is based on statistics from the best 1st sentence.
         Classifier 2 is based on statistics from the best 2nd sentence.
         Classifier 3 is based on statistics from the best 3rd sentence.
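The exhaustive search in step 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the toy word-overlap scorer are assumptions made for the example (the talk scores extracts with ROUGE-L against the annotated evidence text).

```python
from itertools import combinations

def best_extract(sentences, score, k=3):
    """Try every k-sentence combination and return the indices of the best one."""
    return max(
        combinations(range(len(sentences)), k),
        key=lambda idx: score([sentences[i] for i in idx]),
    )

def toy_overlap(target):
    """Hypothetical scorer: number of distinct target words covered by the extract."""
    target_words = set(target.lower().split())
    def score(extract):
        return len(set(" ".join(extract).lower().split()) & target_words)
    return score

abstract = [
    "We studied 231 patients",
    "Surgery lowered recurrence",
    "Follow-up was 7.6 months",
    "Funding was public",
]
target = "surgery lowered recurrence in 231 patients at 7.6 months"
best = best_extract(abstract, toy_overlap(target))
print(best)
```

The positions of the three sentences in each best extract are what the later statistics are collected from, one set of statistics per summary slot.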
  • 10. Journal of Family Practice’s “Clinical Inquiries”
  • 11. The XML Contents I
    <record id="7843">
      <url>http://www.jfponline.com/Pages.asp?AID=7843&amp;issue=September_2009&amp;UID=</url>
      <question>Which treatments work best for hemorrhoids?</question>
      <answer>
        <snip id="1">
          <sniptext>Excision is the most effective treatment for thrombosed external hemorrhoids.</sniptext>
          <sor type="B">retrospective studies</sor>
          <long id="1_1">
            <longtext>A retrospective study of 231 patients treated conservatively or surgically found that the 48.5% of patients treated surgically had a lower recurrence rate than the conservative group (number needed to treat [NNT]=2 for recurrence at mean follow-up of 7.6 months) and earlier resolution of symptoms (average 3.9 days compared with 24 days for conservative treatment).</longtext>
            <ref id="15486746" abstract="Abstracts/15486746.xml">Greenspon J, Williams SB, Young HA, et al. Thrombosed external hemorrhoids: outcome after conservative or surgical management. Dis Colon Rectum. 2004;47:1493-1498.</ref>
          </long>
          <long id="1_2">
            <longtext>A retrospective analysis of 340 patients who underwent outpatient excision of thrombosed external hemorrhoids under local anesthesia reported a low recurrence rate of 6.5% at a mean follow-up of 17.3 months.</longtext>
  • 12. The XML Contents II
            <ref id="12972967" abstract="Abstracts/12972967.xml">Jongen J, Bach S, Stubinger SH, et al. Excision of thrombosed external hemorrhoids under local anesthesia: a retrospective evaluation of 340 patients. Dis Colon Rectum. 2003;46:1226-1231.</ref>
          </long>
          <long id="1_3">
            <longtext>A prospective, randomized controlled trial (RCT) of 98 patients treated nonsurgically found improved pain relief with a combination of topical nifedipine 0.3% and lidocaine 1.5% compared with lidocaine alone. The NNT for complete pain relief at 7 days was 3.</longtext>
            <ref id="11289288" abstract="Abstracts/11289288.xml">Perrotti P, Antropoli C, Molino D, et al. Conservative treatment of acute thrombosed external hemorrhoids with topical nifedipine. Dis Colon Rectum. 2001;44:405-409.</ref>
          </long>
        </snip>
      </answer>
    </record>
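A record like the one on these two slides could be read with Python's standard `xml.etree.ElementTree`. The sketch below is an assumption about how one might pair each question with its evidence text and abstract file; the element and attribute names come from the slide, while the function name and the abridged record text are illustrative only.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the record shown on the slides (texts shortened for the example).
RECORD = """<record id="7843">
  <question>Which treatments work best for hemorrhoids?</question>
  <answer>
    <snip id="1">
      <sniptext>Excision is the most effective treatment.</sniptext>
      <sor type="B">retrospective studies</sor>
      <long id="1_1">
        <longtext>A retrospective study of 231 patients (abridged).</longtext>
        <ref id="15486746" abstract="Abstracts/15486746.xml">Greenspon J, et al.</ref>
      </long>
    </snip>
  </answer>
</record>"""

def iter_pairs(record_xml):
    """Yield (question, evidence text, abstract path) for every reference in a record."""
    record = ET.fromstring(record_xml)
    question = record.findtext("question")
    for long_el in record.iter("long"):
        evidence = long_el.findtext("longtext")
        for ref in long_el.iter("ref"):
            yield question, evidence, ref.get("abstract")

pairs = list(iter_pairs(RECORD))
for question, evidence, abstract_path in pairs:
    print(question, "|", abstract_path)
```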
  • 13. Corpus Statistics. Size: 456 questions (“records”); over 1,100 distinct answers (“snips”); 3,036 text explanations (“longs”); 2,707 references.
  • 14. Summarisation Using This Corpus
      Input: the question and the document abstract.
      Output: an extractive summary that answers the question.
      The target summary is the annotated evidence text (“long”).
      Evaluated using ROUGE-L.
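ROUGE-L is based on the longest common subsequence (LCS) between a candidate summary and the target. The following is a simplified sketch, assuming whitespace tokenisation, a single LCS over the whole text, and an F-measure with `beta = 1`; the official ROUGE toolkit differs in details such as summary-level union LCS, so this is not the exact evaluation configuration used in the talk.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l(candidate, reference, beta=1.0):
    """LCS-based F-measure between a candidate extract and the target text."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)

print(rouge_l("a b c d", "a c d e"))
```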
  • 16. The Statistics Gathered
      1. Source sentence position.
      2. Sentence length.
      3. Sentence similarity.
      4. Sentence type.
  • 17. 1. Source Sentence Position
      Compute relative positions.
      Create normalised frequency histograms $f_1, f_2, \ldots, f_{10}$.
      Score all relative positions of bin $i$ with its bin frequency: $S_{pos}(i) = f_{bin(i)}$.
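The position score can be sketched as a 10-bin histogram lookup. The definition of relative position used here (sentence index divided by the number of sentences in the abstract) and all function names are assumptions for illustration; the slide only states that relative positions are binned and scored by bin frequency.

```python
from collections import Counter

N_BINS = 10

def bin_of(index, n_sentences, n_bins=N_BINS):
    """Map a 0-based sentence index to a relative-position bin (integer arithmetic)."""
    return min(index * n_bins // n_sentences, n_bins - 1)

def position_histogram(best_positions, n_bins=N_BINS):
    """best_positions: (sentence index, abstract length) pairs of best-extract sentences."""
    counts = Counter(bin_of(i, n, n_bins) for i, n in best_positions)
    total = sum(counts.values())
    return [counts[b] / total for b in range(n_bins)]

def s_pos(index, n_sentences, hist):
    """Score a candidate sentence with the frequency of its position bin."""
    return hist[bin_of(index, n_sentences, len(hist))]

# Toy training data: where the best sentences for one summary slot were found.
training = [(9, 10), (8, 10), (4, 5), (0, 10)]
hist = position_histogram(training)
print(hist, s_pos(16, 20, hist))
```

In the approach described in the talk, one such histogram would be built per summary slot (1st, 2nd, 3rd sentence), which is what makes the statistics sentence-specific.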
  • 18. 2. Sentence Length
      Reward longer sentences and penalise shorter ones using the normalised sentence length:
      $S_{len}(i) = \frac{l_s - l_{avg}}{l_d}$
      $l_s$: sentence length; $l_{avg}$: average sentence length in the corpus; $l_d$: document length.
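As a small worked example of the length score, the sketch below measures all lengths in whitespace-separated words. That unit is an assumption (the slide does not say whether lengths are counted in words or characters), and the function name is illustrative.

```python
def s_len(sentence, document_sentences, corpus_avg_len):
    """Normalised sentence length: (l_s - l_avg) / l_d, with lengths in words."""
    l_s = len(sentence.split())
    l_d = sum(len(s.split()) for s in document_sentences)
    return (l_s - corpus_avg_len) / l_d

document = ["a b c d e f", "a b", "a b"]  # 10 words in total
long_score = s_len(document[0], document, corpus_avg_len=4.0)
short_score = s_len(document[1], document, corpus_avg_len=4.0)
print(long_score, short_score)
```

Sentences longer than the corpus average get a positive score and shorter ones a negative score.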
  • 19. 3. Sentence Similarity
      Sentence similarity:
        Lowercase, stem, remove stop words.
        Build a vector of tf.idf values with the remaining words and UMLS semantic types.
        $CosSim(X, Y) = \frac{X \cdot Y}{|X||Y|}$
      Maximal Marginal Relevance (Carbonell & Goldstein, 1998):
        Reward sentences similar to the query and penalise those similar to other summary sentences.
        $MMR = \lambda\, CosSim(S_i, Q) - (1 - \lambda) \max_{S_j \in S} CosSim(S_i, S_j)$
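The two formulas on this slide can be sketched with sparse vectors stored as dictionaries. For brevity the example uses raw term counts instead of tf.idf weights and skips stemming, stop-word removal and UMLS semantic types; those simplifications, and the helper names, are assumptions of the sketch.

```python
import math
from collections import Counter

def vec(text):
    """Toy sentence vector: raw term counts (stand-in for tf.idf weights)."""
    return Counter(text.lower().split())

def cos_sim(x, y):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * y.get(term, 0.0) for term, w in x.items())
    norm = math.sqrt(sum(w * w for w in x.values())) * math.sqrt(sum(w * w for w in y.values()))
    return dot / norm if norm else 0.0

def mmr(candidate, query, selected, lam=0.1):
    """lam * sim(candidate, query) - (1 - lam) * max sim(candidate, already selected)."""
    redundancy = max((cos_sim(candidate, s) for s in selected), default=0.0)
    return lam * cos_sim(candidate, query) - (1 - lam) * redundancy

query = vec("hemorrhoid treatment")
s1 = vec("excision treatment")
print(mmr(s1, query, []), mmr(s1, query, [s1]))
```

With nothing selected yet, the score reduces to the weighted query similarity; once a near-identical sentence is in the summary, the redundancy term dominates and the score drops sharply.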
  • 20. 4. PIBOSO (Kim et al. 2011) I
      1. Classify all sentences into PIBOSO types (a variant of PICO).
      2. Generate normalised frequency histograms of the resulting PIBOSO types.
  • 21. 4. PIBOSO (Kim et al. 2011) II
      Position independent: $S_{PIPS}(i) = \frac{P_{best}}{P_{all}}$
      Position dependent: $S_{PDPS}(i) = \frac{P_{pos}}{P_{best}}$
      $P_{best}$: proportion of this PIBOSO type among all best summary sentences.
      $P_{all}$: proportion of this PIBOSO type among all sentences.
      $P_{pos}$: proportion of this PIBOSO type among all best summary sentences at this position.
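Both PIBOSO scores are ratios of label proportions, which the following sketch computes from lists of sentence labels. The data layout (a dictionary from summary position to the labels of the best sentences at that position), the function names and the toy labels are assumptions for illustration.

```python
from collections import Counter

def proportions(labels):
    """Proportion of each label in a list."""
    counts = Counter(labels)
    return {label: c / len(labels) for label, c in counts.items()}

def piboso_scores(all_labels, best_labels_by_position, position):
    """Return (S_PIPS, S_PDPS) per PIBOSO type for one summary position."""
    p_all = proportions(all_labels)
    best_all = [t for labels in best_labels_by_position.values() for t in labels]
    p_best = proportions(best_all)
    p_pos = proportions(best_labels_by_position[position])
    s_pips = {t: p_best[t] / p_all[t] for t in p_best}
    s_pdps = {t: p_pos.get(t, 0.0) / p_best[t] for t in p_best}
    return s_pips, s_pdps

# Toy data: labels of all sentences, and labels of best sentences per summary slot.
all_labels = ["background"] * 4 + ["outcome"] * 4 + ["other"] * 2
best_by_pos = {
    1: ["background", "outcome"],
    2: ["outcome", "outcome"],
    3: ["outcome", "outcome"],
}
s_pips, s_pdps = piboso_scores(all_labels, best_by_pos, position=1)
print(s_pips, s_pdps)
```

In the toy data, "background" is rare among best sentences overall but concentrated in the first slot, so its position-dependent score for position 1 is high.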
  • 22. Classification
      Edmundsonian formula: $SS_i = \alpha S_{rpos_i} + \beta S_{len_i} + \gamma S_{PIPS_i} + \delta S_{PDPS_i} + \epsilon S_{MMR_i}$
      MMR is replaced with cosine similarity for the first sentence.
      In case of ties, the longest sentence is chosen.
      Parameters are fine-tuned through exhaustive search on the training set:
      $\alpha = 1.0$, $\beta = 0.8$, $\gamma = 0.1$, $\delta = 0.8$, $\epsilon = 0.1$, $\lambda = 0.1$.
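The weighted combination and the tie-breaking rule can be sketched as below. The weights are the values listed on the slide; the feature dictionary layout and the function names are assumptions, and the feature values in the example are made up.

```python
# Weights from the slide: position, length, PIPS, PDPS and MMR terms.
WEIGHTS = {"rpos": 1.0, "len": 0.8, "pips": 0.1, "pdps": 0.8, "mmr": 0.1}

def sentence_score(features, weights=WEIGHTS):
    """Weighted linear combination of the per-sentence statistics."""
    return sum(weights[name] * features[name] for name in weights)

def pick_sentence(candidates):
    """candidates: (sentence text, feature dict) pairs; ties go to the longest sentence."""
    return max(candidates, key=lambda c: (sentence_score(c[1]), len(c[0].split())))[0]

features = {"rpos": 0.5, "len": 0.2, "pips": 2.0, "pdps": 0.6, "mmr": 0.05}
candidates = [
    ("short one", features),
    ("a much longer sentence here", features),
]
print(sentence_score(features), pick_sentence(candidates))
```

For the first summary sentence, the caller would put the plain cosine similarity to the question in the `mmr` slot, since no sentence has been selected yet.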
  • 24. Percentile-based Evaluation (Ceylan et al. 2010) I
      We compare against all possible 3-sentence extracts in the test set.
      1. Bin all possible three-sentence combinations of each abstract, using 1,000 bins.
      2. Normalise the resulting histograms.
      3. Combine all histograms by convolution.
      4. The result approximates the probability density distribution of all three-sentence summaries in all abstracts.
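Steps 2 to 4 can be sketched in pure Python: normalising each per-abstract histogram and convolving them gives the distribution of the summed score bins over all abstracts. The example uses 3 bins instead of 1,000, and the `percentile` helper (the probability mass at or below a system's summed bin) is an assumption about how a system would be placed on that distribution.

```python
def normalise(counts):
    """Turn raw bin counts into probabilities."""
    total = sum(counts)
    return [c / total for c in counts]

def convolve(p, q):
    """Discrete convolution: distribution of the sum of two independent bin indices."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def combined_distribution(histograms):
    """Convolve the normalised histograms of all abstracts."""
    dist = [1.0]
    for hist in histograms:
        dist = convolve(dist, normalise(hist))
    return dist

def percentile(dist, summed_bin):
    """Probability mass of extract combinations scoring at or below summed_bin."""
    return sum(dist[: summed_bin + 1])

# Two toy abstracts: counts of 3-sentence extracts falling in each score bin.
dist = combined_distribution([[1, 2, 1], [2, 2, 0]])
print(dist, percentile(dist, 2))
```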
  • 25. Percentile-based Evaluation (Ceylan et al. 2010) II
  • 26. Systems
      L3: last three sentences.
      O3: last three PIBOSO outcome sentences.
      R: random.
      O: all outcome sentences.
      PI: sentence position independent.
      PD: sentence position dependent (our proposal).
  • 28. Questions? Further information: http://web.science.mq.edu.au/~diego/medicalnlp/