Multi-Label Modality Classification for Figures in Biomedical Literature
Athanasios Lagopoulos, PhD Student
School of Informatics, Aristotle University of Thessaloniki
lathanag@csd.auth.gr
Anestis Fachantidis (afa@csd.auth.gr), Grigorios Tsoumakas (greg@csd.auth.gr)
30th International Symposium on Computer-Based Medical Systems - IEEE CBMS 2017
The Case
PubMed Central (PMC):
• >4 million figures available
• Great source of information for biomedical research, education, and clinical decision-making.
Lack of associated meta-data
Inaccessible information
30 different modalities/categories
as proposed by ImageCLEF
Simple vs. Compound
• Simple figures: 60%
• Compound figures (multi-panel format): 40%
The Standard Approach
[Diagram: Compound Figure Detection → simple figure → Multi-class Model; compound figure → Figure Separation Algorithm → subfigures → Multi-class Model]
Figure Separation is not perfect (~85%)
Figure Isolation - Information Loss
Our Approach: Multi-Label Classification
No use of figure separation algorithm
Three different multi-label learning approaches:
• Simple
• Standard
• Extended
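To make the multi-class versus multi-label distinction concrete, here is a minimal sketch of how a figure's set of modalities could be encoded as a binary indicator vector with scikit-learn's MultiLabelBinarizer. The modality codes and label sets are illustrative placeholders, not values from the slides.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Illustrative modality codes (assumed for this example only).
figure_labels = [
    {"DRXR"},          # simple figure: exactly one modality
    {"DMLI", "GFIG"},  # compound figure: one label per distinct subfigure modality
    {"GFIG"},
]

binarizer = MultiLabelBinarizer()
Y = binarizer.fit_transform(figure_labels)

print(binarizer.classes_)  # column order (sorted modality codes)
print(Y)                   # one row per figure, one column per modality
```

A multi-label model predicts a whole row of this matrix at once, so a compound figure never has to be cut into subfigures first.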
Simple multi-label approach
[Diagram, Training and Prediction: a single Multi-label model handles both compound and simple figures]
Standard multi-label approach
[Diagram, Training and Prediction panels. Components: Compound Figure Detection, Multi-class Model, Multi-label Model. Figure types: Simple, Compound]
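A rough sketch of how such a two-model pipeline could be wired, assuming the detector routes simple figures to the multi-class model and compound figures to the multi-label model, as the diagram labels suggest. The data is synthetic and all names (detector, multiclass_model, multilabel_model, predict_modalities) are hypothetical, not the authors' code.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_features, n_modalities = 16, 3  # toy sizes, assumed for the example

# Synthetic stand-ins for figure features (not real data).
X_simple = rng.normal(size=(60, n_features))
y_simple = rng.integers(0, n_modalities, size=60)         # one modality each
X_compound = rng.normal(size=(40, n_features)) + 1.0
Y_compound = rng.integers(0, 2, size=(40, n_modalities))  # label sets

# Training: the detector learns compound (1) vs. simple (0);
# each downstream model is fitted only on its own kind of figure.
detector = LinearSVC().fit(
    np.vstack([X_simple, X_compound]),
    np.r_[np.zeros(60, dtype=int), np.ones(40, dtype=int)],
)
multiclass_model = LinearSVC().fit(X_simple, y_simple)
multilabel_model = OneVsRestClassifier(LinearSVC()).fit(X_compound, Y_compound)


def predict_modalities(x):
    """Return a binary indicator vector over modalities for one figure."""
    x = x.reshape(1, -1)
    if detector.predict(x)[0] == 1:                # compound: multi-label model
        return multilabel_model.predict(x)[0]
    indicator = np.zeros(n_modalities, dtype=int)  # simple: multi-class model
    indicator[multiclass_model.predict(x)[0]] = 1
    return indicator


print(predict_modalities(X_compound[0]))
print(predict_modalities(X_simple[0]))
```

Both branches return the same indicator format, so downstream evaluation does not need to know which model produced a prediction.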
Extended multi-label approach
[Diagram, Training and Prediction panels. Components: Compound Figure Detection, Multi-class Model, Multi-label Model. Figure types: Simple, Compound]
Model Training
Feature Extraction from JPEG
• BVLC model - Caffe [1]
• Deep learning (1.2 million images)
• 4096 visual features/figure
Linear Support Vector Machines (SVMs)
• scikit-learn [2]
• One-vs-Rest transformation (multiple binary classifiers)
[1] http://caffe.berkeleyvision.org/
[2] http://scikit-learn.org/
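The One-vs-Rest setup with linear SVMs can be sketched in scikit-learn as follows. This is a minimal illustration on random stand-in data with the dimensions mentioned in the slides (4096 features, 30 modalities); feature extraction with Caffe is not shown, and the hyperparameters are library defaults rather than the authors' settings.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_figures, n_features, n_modalities = 200, 4096, 30

# Random stand-ins for CNN features and modality label sets (not real data).
X = rng.normal(size=(n_figures, n_features))
Y = (rng.random(size=(n_figures, n_modalities)) < 0.1).astype(int)

# One-vs-Rest: one binary linear SVM is fitted per modality.
model = OneVsRestClassifier(LinearSVC())
model.fit(X, Y)

print(len(model.estimators_))      # number of binary classifiers
print(model.predict(X[:2]).shape)  # (figures, modalities) indicator matrix
```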
ImageCLEF 2016 dataset
20,985 figures
1,568 compound
No simple figures annotated with categories
Extracted subfigures used as simple figures
Split 40% / 60% (compound / subfigures) in order to resemble the distribution of PMC
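One way such a 40% / 60% mix could be obtained is sketched below, assuming every compound figure is kept and subfigures are sampled at random to fill the remaining share. The slides do not say how the split was drawn, so sample_to_ratio, the identifiers, and the subfigure pool size are all hypothetical.

```python
import random


def sample_to_ratio(compound, subfigures, compound_share=0.4, seed=0):
    """Keep all compound figures and draw subfigures so that compound
    figures make up `compound_share` of the resulting set."""
    n_sub = round(len(compound) * (1 - compound_share) / compound_share)
    if n_sub > len(subfigures):
        raise ValueError("not enough subfigures for the requested ratio")
    rng = random.Random(seed)
    return list(compound), rng.sample(list(subfigures), n_sub)


compound_ids = [f"c{i}" for i in range(1568)]
subfigure_ids = [f"s{i}" for i in range(5000)]  # placeholder pool size
kept_compound, kept_sub = sample_to_ratio(compound_ids, subfigure_ids)

share = len(kept_compound) / (len(kept_compound) + len(kept_sub))
print(len(kept_compound), len(kept_sub), share)
```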
Results
Approach                F1-Macro   F1-Micro   F1-Samples
Standard                0.3569     0.7786     0.7912
Simple multi-label      0.3139     0.7581     0.7215
Standard multi-label    0.3270     0.7667     0.7726
Extended multi-label    0.3309     0.7666     0.7728
Perfect (100%) figure separation
Compound Figure Detection: 88.83% (Balanced Accuracy)
Multi-class model: 79.54% (F1-micro)
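For readers unfamiliar with the three F1 variants in the table, the sketch below computes them with scikit-learn's f1_score on a made-up 4-figure, 3-modality example. The numbers are toy values chosen for illustration and are unrelated to the reported results.

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy ground truth and predictions: 4 figures x 3 modalities (made-up values).
Y_true = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 0], [1, 0, 1]])
Y_pred = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0], [1, 0, 0]])

scores = {}
for average in ("macro", "micro", "samples"):
    # macro: mean of per-modality F1; micro: F1 over pooled label decisions;
    # samples: mean of per-figure F1.
    scores[average] = f1_score(Y_true, Y_pred, average=average, zero_division=0)
    print(average, round(scores[average], 4))
```

In this toy case the third modality is never predicted, which drags the macro average down while the micro and samples averages stay high. Macro-F1 weights every modality equally, so rare modalities affect it the most.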
The System
Web app
Weekly updates from PMC
Extended multi-label approach
Easy search & filtering by modality
Built with Apache Solr & AngularJS
Available @
atypon.csd.auth.gr/medieval/
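Filtering by modality maps naturally onto a Solr filter query (fq). The sketch below only builds a request URL for Solr's standard /select handler; the core name "figures", the field name "modality", the host, and the modality codes are assumptions for illustration and do not describe the actual deployment.

```python
from urllib.parse import urlencode


def build_search_url(base_url, text, modalities):
    """Build a Solr /select URL with an optional modality filter query."""
    params = [("q", text), ("wt", "json")]
    if modalities:
        # One fq clause restricting results to any of the chosen modalities.
        joined = " OR ".join(f'"{m}"' for m in modalities)
        params.append(("fq", f"modality:({joined})"))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local Solr core and field names.
url = build_search_url(
    "http://localhost:8983/solr/figures", "lung tumor", ["DRXR", "DRCT"]
)
print(url)
```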
Conclusion
No information loss
Model redundancy
Promising results
Web application
Future Work
Backend:
• Textual features + Visual features
Frontend:
• User feedback, crowdsourcing
• Active learning
THANK YOU!
intelligence.csd.auth.gr
Questions?
atypon.csd.auth.gr/medieval
Partially funded by Atypon Systems Inc.
