1. Feedback-Driven Radiology Exam Report Retrieval with Semantics
Sarasi Lalithsena
Amit Sheth
Kno.e.sis Center
Wright State University
Luis Tari
Steven Gustafson
GE Global Research
Ann von Reden
Benjamin Wilson
Brian Kolowitz
UPMC Enterprises
John Kalafut
GE Healthcare
2. Motivation
A radiologist will be interested in knowing the background of the current exam:
• Reason for exam
• Secondary diagnoses related by anatomical region or disease type
• Follow-up exam for an existing condition
• Specific hypothesis or inquiry mentioned
3. Problem
Reason for exam: Foot pain
Vast amount of data: EMR, radiology reports
[Figure: a long list of prior reports R1 … Rn, of which only a few, e.g. "R3: History: Diabetes, Smoker", are relevant to the reason for exam]
4. Challenges
• Capture the semantics needed to find relevant data
[Figure: for Patient X with reason for exam "Foot Pain", the prior reports "Lower Extremity Venous Insufficiency Doppler Ultrasound" and "History: Diabetes, Former Smoker, History of Hypertension" are connected through the relations "Foot partOf Lower Extremity" and "Foot Pain associatedTo Diabetes", while "Generalized Abdominal Pain" is not]
• Lexical similarity measures fall behind when there is no lexical similarity
• Knowledge-based similarity measures use only taxonomical relationships
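As a toy illustration of the two bullets above (not code from the paper), lexical overlap between the reason for exam and a clinically related prior report can be zero, while even a tiny set of non-taxonomic relations of the kind shown on the slide recovers the connection. The term sets and the `kg` triples are illustrative assumptions:

```python
# Lexical similarity: token overlap between reason for exam and a prior report
reason = {"foot", "pain"}
prior = {"history", "diabetes", "smoker"}

jaccard = len(reason & prior) / len(reason | prior)
# No shared tokens, so lexical similarity is 0.0 despite clinical relatedness

# Hypothetical knowledge-graph triples with non-taxonomic relations,
# mirroring the relations shown on the slide
kg = {
    ("foot pain", "associatedTo", "diabetes"),
    ("foot", "partOf", "lower extremity"),
}
semantically_related = any(
    s == "foot pain" and o == "diabetes" for s, _, o in kg
)
# The associatedTo relation links the reason for exam to the prior report
```

A purely taxonomical (is-a) hierarchy would still miss the `associatedTo` link, which is why the slide calls out non-taxonomic relations.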
5. Challenges
• Personalize relevancy based on the radiologist's needs
[Figure: for a current study with reason for exam "XR Chest, Rib Fracture", a neuro specialist and a musculoskeletal specialist each find a different prior report (Abdomen X-ray vs. MRI Chest Wall) more relevant]
10. Our Approach: Explicit Relevance Feedback
• Explicit relevance feedback is used in information retrieval to improve the relevance of the results
• Useful when users have a general conception of what they are looking for
• Adapted the existing Rocchio algorithm to work with the ratings provided by users
[Figure: relevant records (*) and non-relevant records (!) in vector space; the original query vector is moved toward the relevant records to produce the modified query vector]
11. Our Approach: Explicit Relevance Feedback
Modified Rocchio query vector:

q_m = a·q_o + b·(1/|D_r|) ∑_{i=1}^{5} w_i ∑_{d_j ∈ D_i} d_j − c·(1/|D_nr|) ∑_{d_k ∈ D_nr} d_k

A = D_1 ∪ D_2 ∪ D_3 ∪ D_4 ∪ D_5

q_m – modified query vector
q_o – original query vector
D_r – relevant patient records
D_nr – non-relevant patient records
D_i – patient records rated as i
a = 1, b = 0.6, c = 0.1
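The rating-weighted update can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name, the data layout, and the concrete values of the rating weights w_i are assumptions, since the slides only give a = 1, b = 0.6, c = 0.1.

```python
import numpy as np

def rocchio_modified(q_o, rated_docs, nonrel_docs, a=1.0, b=0.6, c=0.1,
                     weights=(0.2, 0.4, 0.6, 0.8, 1.0)):
    """Rating-weighted Rocchio update (illustrative sketch).

    q_o         -- original query vector (1-D NumPy array)
    rated_docs  -- dict mapping rating i (1..5) to a list of relevant
                   document vectors the user rated i
    nonrel_docs -- list of non-relevant document vectors
    weights     -- assumed per-rating weights w_1..w_5 (not from the slides)
    """
    # Weighted sum over all rated-relevant documents, normalized by |D_r|
    n_rel = sum(len(docs) for docs in rated_docs.values())
    rel_term = np.zeros_like(q_o)
    for i, docs in rated_docs.items():
        for d in docs:
            rel_term += weights[i - 1] * d
    if n_rel:
        rel_term /= n_rel

    # Centroid of the non-relevant documents, i.e. (1/|D_nr|) * sum d_k
    nr_term = np.zeros_like(q_o)
    if nonrel_docs:
        nr_term = np.mean(nonrel_docs, axis=0)

    # Move toward the relevant documents, away from the non-relevant ones
    return a * q_o + b * rel_term - c * nr_term
```

For example, with q_o = (1, 0), one document rated 5 at (0, 1), and one non-relevant document at (1, 1), the modified query is 1·(1, 0) + 0.6·(0, 1) − 0.1·(1, 1) = (0.9, 0.5).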
Short reason for exam; limited interaction with the patient.
[Add one sentence about the personalization aspect]
Radiologists typically spend only 1–5 minutes on an X-ray and 20–30 minutes on MRI data.
The more time they spend on the EMR, the less time they have for interpreting the results.
The EMR tends to be oriented toward producing physician and specialty reports, not toward radiology workflow needs.
[Some explanation]
This simulates the process of the user changing the original query after looking at the results.
The user's search query is revised to include an arbitrary percentage of relevant and non-relevant documents as a means of increasing the search engine's recall, and possibly its precision as well.
[Can we make this a little more clear]
The idea is to modify the query vector in a direction closer to related documents and farther away from non-related documents.
Values for b and c should be increased or decreased in proportion to the set of documents classified by the user.
If the user decides that the modified query should not contain terms from the original query, the related documents, or the non-related documents, then the corresponding weight (a, b, or c) should be set to 0.
This is similar to the nearest centroid classifier in classification models.
As the weight for a particular category of documents is increased or decreased, the modified vector's coordinates move closer to, or farther away from, the centroid of that document collection. Thus, if the weight for related documents is increased, the modified query vector moves closer to the centroid of the related documents.
The optimal query is the vector difference between the centroids of the relevant and non-relevant documents.
Limitation: this optimal query cannot be computed in practice, because we do not know the full set of relevant documents.
R-precision is the precision after R documents have been retrieved, where R is the number of relevant documents for the topic.
Equivalently, R-precision is the precision obtained on the precision/recall curve by looking at the first R answers for a topic with R relevant documents.
The rank cutoff R is the point at which precision and recall are equal, since at that point both equal r/R, where r is the number of relevant documents among the top R results.
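The definition above is short enough to sketch directly. The names `ranked_ids` and `relevant_ids` are illustrative, not from the slides:

```python
def r_precision(ranked_ids, relevant_ids):
    """Precision after R retrieved documents, where R = |relevant_ids|.

    At cutoff R, precision and recall coincide: both equal r/R, with r
    the number of relevant documents among the top R results.
    """
    R = len(relevant_ids)
    if R == 0:
        return 0.0
    r = sum(1 for doc in ranked_ids[:R] if doc in relevant_ids)
    return r / R
```

For a topic with 2 relevant documents {a, c}, the ranking [a, b, c, d] yields r = 1 in the top 2, so R-precision = 0.5.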