SlideShare a Scribd company logo
Scientific Table Type
Classification in Digital Library
Seongchan Kim, Keejun Han, Ying Liu D
ept. of Knowledge Service Engineering K
AIST, Korea
Soon Young Kim
Dept. of Overseas Information
KISTI, Korea
Sept. 6, 2012 (3:35-3:55)
Outline
Introduction
Table Type Taxonomy
IMRAD-Base
d Fine-Graine
d
Table Type Distribution
Classification
Conclusion
Outline
Introduction
Table Type Taxonomy
IMRAD-Base
d Fine-Graine
d
Classification
Conclusion
4 / 20
Introduction
• Are there any special types of tables in papers in scientific papers?
If yes? What are they?
5 / 20
Introduction
• Are there any special types of tables in papers in scientific papers?
If yes? What are they?
Outline
Introduction
Table Type Taxonomy
IMRAD-Base
d Fine-Graine
d
Classification
Conclusion
7 / 20
Table Type Taxonomy
2,500 tables randomly from 25 randomly selected scie
ntific journals published by Springer from 2006 to 2010
Biomedical and Life Science, Chemistry and Materials S
cience, Computer Science, Electrical Engineering, and
Medicine
TableSeer
We found
 IMRAD-based table taxonomy
 Fine-grained table taxonomy
IMRAD-Based Table Taxonomy
Considering the structural position of tables within a
document
Table type is simply decided by the location of the
table
 E.g) if a table is in the introduction part of the paper
 Introduction table Other functions
Scientific
Table
Introduction
Table
Method
Table
Resul
t Tabl
e
Discussio
n Table
8 / 20
Fine-Grained Table Taxonomy
Table type is decided by table contents and purposes
Scientific
Table
Definition
Table
Statistic
Table
Survey
Table
Example
Table
Procedure
Table
Experiment
Setting T
able
Experiment
Result T
able
9 / 20
Fine-Grained Table Taxonomy
Definition Tables
 Consist of defining terms
and their explanations
 Usually appears before
the experiment
10 / 20
Fine-Grained Table Taxonomy
Statistics/Distribution
Tables
 Common statistical or
distribution data
 Not related with the c
urrent experiment being c
arried out in the paper
11 / 20
Fine-Grained Table Taxonomy
Survey Question/Result Table
 Contain questionnaires of the
those questionnaires
survey and the results of
12 / 20
Fine-Grained Table Taxonomy
Example Table
 Show instances that introduce and emphasize something
that needs to be explained clearly
13 / 20
Fine-Grained Table Taxonomy
Procedure Tables
 Describe the sequence
,
methods
step, flow, or schedule of the
14 / 20
Fine-Grained Table Taxonomy
Experiment Setting Tables
 Describe items required for the experiment
 configurations, parameters, data, apparatus, etc.
15 / 20
Fine-Grained Table Taxonomy
Experiment Setting Tables
 accompanied with a summary describing the output of the
experiment
 Some are shown comparing the other results
16 / 20
Table Type Distribution
By IMRAD Taxonomy
3.7
%
20.2
%
74.6
%
1.5
%
Introduction
Methods
Result
s
Discussion
2.9% 5.0%
0.3%
3.3%
1.6%
14.9%
72.0%
Definition St
atistics Surv
ey Example
Procedure
Exp. Setting
Exp. Result
17 / 20
By Fine-Grained Taxonomy
Annotation
 2,380 and 2,324 tables that had agreed on labeling from
more that two annotators out of 2,500 tables
 Inter-Annotator Agreement
• = 0.64 for IMRAD annotation
• = 0.53 for fine-grained annotation
Outline
Introduction
Table Type Taxonomy
IMRAD-Base
d Fine-Graine
d
Classification
Conclusion
19 / 20
Experiment
A preliminary classification
 Only textual feature from Table
• Table Caption
• Table Reference Text
 Textual Information obtained from metadata of TableSeer [ ]
Settings
 DataSet
• 2,380 tables for IMRAD classification
• 2,324 tables for fine-grained classification
 10-fold Cross validation
 SVM and Decision Tree in Weka toolkit (default setting)
Experiment
20 / 20
C1: T1, T
2
C2: T3, T
4
W1 : appears T1 and T2
W2 : appears T1 and T3
21 / 20
Experiment
Result
Performance of IMRAD Classification by Features
Performance of Fine-grained Classification by Features
Features SVM Decision Tree
P R F P R F
Cap.(Baseline) 0.836 0.506 0.543 0.947 0.550 0.792
Ref. 0.875 0.705 0.761 0.930 0.639 0.73
Cap.+Ref. 0.967 0.784 0.866 0.938 0.746 0.831
Features SVM Decision Tree
P R F P R F
Cap.(Baseline) 0.627 0.333 0.397 0.522 0.271 0.302
Ref. 0.707 0.615 0.649 0.790 0.673 0.716
Cap.+Ref. 0.701 0.657 0.668 0.764 0.62 0.671
22 / 20
Experiment
Result
Performance of IMRAD Classification by Types
Type SVM Decision Tree
P R F P R F
Introduction 0.968 0.6 0.741 0.907 0.68 0.777
Methods 0.901 0.996 0.943 0.912 0.992 0.95
Results 1 1 1 1 1 1
Discussion 1 0.543 0.704 0.933 0.314 0.44
Macro Avg. 0.967 0.784 0.866 0.938 0.746 0.831
Micro Avg. 0.977 0.976 0.973 0.973 0.975 0.972
23 / 20
Experiment
Result
Performance of Fine-grained Classification by Types
Type SVM Decision Tree
P R F P R F
Definition 0.689 0.609 0.646 0.837 0.522 0.643
Statistics 0.699 0.879 0.779 0.898 0.681 0.775
Survey 0 0 0 0 0 0
Example 0.716 0.725 0.72 0.875 0.525 0.656
Procedure 0.9 0.486 0.632 1 0.649 0.787
Exp. Setting 0.905 0.9 0.902 0.74 0.963 0.837
Exp. Result 1 1 1 1 1 1
Macro Avg. 0.701 0.657 0.668 0.764 0.62 0.671
Micro Avg. 0.947 0.947 0.946 0.94 0.936 0.987
Outline
Introduction
Table Type Taxonomy
IMRAD-Base
d Fine-Graine
d
Classification
Conclusion
25 / 20
Conclusion
Introduced our study of table types and classifications
in scientific papers
 IMRAD-based Taxonomy
 Fine-Grained Taxonomy
Future Work
 Developing various features from table layout and contents

More Related Content

Similar to Scientific table type classification in digital library (DocEng 2012)

FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSISFUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
Irene Pochinok
 
Extending BM25 with multiple query operators
Extending BM25 with multiple query operatorsExtending BM25 with multiple query operators
Extending BM25 with multiple query operators
Roi Blanco
 
Efficient blocking method for a large scale citation matching
Efficient blocking method for a large scale citation matchingEfficient blocking method for a large scale citation matching
Efficient blocking method for a large scale citation matching
Mateusz Fedoryszak
 
Lesson03
Lesson03Lesson03
Lesson03
Ning Ding
 
Data collection,tabulation,processing and analysis
Data collection,tabulation,processing and analysisData collection,tabulation,processing and analysis
Data collection,tabulation,processing and analysis
RobinsonRaja1
 
ADaM - Where Do I Start?
ADaM - Where Do I Start?ADaM - Where Do I Start?
ADaM - Where Do I Start?
Dr.Sangram Parbhane
 
ADaM
ADaMADaM
ictir2016
ictir2016ictir2016
ictir2016
Tetsuya Sakai
 
Error Propagation in Forest Planning Models
Error Propagation in Forest Planning ModelsError Propagation in Forest Planning Models
Error Propagation in Forest Planning Models
KR Walters Consulting Services
 
Lecture 1 - Overview.pptx
Lecture 1 - Overview.pptxLecture 1 - Overview.pptx
Lecture 1 - Overview.pptx
DrAnisFatima
 
Pasolli-Improving_active_learning_methods_using_spatial_information.ppt
Pasolli-Improving_active_learning_methods_using_spatial_information.pptPasolli-Improving_active_learning_methods_using_spatial_information.ppt
Pasolli-Improving_active_learning_methods_using_spatial_information.ppt
grssieee
 
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS

FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS

FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS

Maxim Kazantsev
 
Measures of Dispersion.pptx
Measures of Dispersion.pptxMeasures of Dispersion.pptx
Measures of Dispersion.pptx
Melba Shaya Sweety
 
CDISCs_SDTM_basics.ppt
CDISCs_SDTM_basics.pptCDISCs_SDTM_basics.ppt
CDISCs_SDTM_basics.ppt
krishnacherukuri1
 
HCLSIG$$Drug_Safety_and_Efficacy$CDISCs_SDTM_basics.ppt
HCLSIG$$Drug_Safety_and_Efficacy$CDISCs_SDTM_basics.pptHCLSIG$$Drug_Safety_and_Efficacy$CDISCs_SDTM_basics.ppt
HCLSIG$$Drug_Safety_and_Efficacy$CDISCs_SDTM_basics.ppt
MadeeshShaik
 
Sophia He
Sophia HeSophia He
Sophia He
Sophia He
 
AS/A2 AQA Biology and Human Biology
AS/A2 AQA Biology and Human BiologyAS/A2 AQA Biology and Human Biology
AS/A2 AQA Biology and Human Biology
TeachersFirst
 
Intro to SPSS.ppt
Intro to SPSS.pptIntro to SPSS.ppt
Intro to SPSS.ppt
HasanGilani3
 
BRM Unit 3 Data Analysis-1.pptx
BRM Unit 3 Data Analysis-1.pptxBRM Unit 3 Data Analysis-1.pptx
BRM Unit 3 Data Analysis-1.pptx
VikasRai405977
 
Statistic report
Statistic reportStatistic report
Statistic report
Sonali Gupta
 

Similar to Scientific table type classification in digital library (DocEng 2012) (20)

FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSISFUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
 
Extending BM25 with multiple query operators
Extending BM25 with multiple query operatorsExtending BM25 with multiple query operators
Extending BM25 with multiple query operators
 
Efficient blocking method for a large scale citation matching
Efficient blocking method for a large scale citation matchingEfficient blocking method for a large scale citation matching
Efficient blocking method for a large scale citation matching
 
Lesson03
Lesson03Lesson03
Lesson03
 
Data collection,tabulation,processing and analysis
Data collection,tabulation,processing and analysisData collection,tabulation,processing and analysis
Data collection,tabulation,processing and analysis
 
ADaM - Where Do I Start?
ADaM - Where Do I Start?ADaM - Where Do I Start?
ADaM - Where Do I Start?
 
ADaM
ADaMADaM
ADaM
 
ictir2016
ictir2016ictir2016
ictir2016
 
Error Propagation in Forest Planning Models
Error Propagation in Forest Planning ModelsError Propagation in Forest Planning Models
Error Propagation in Forest Planning Models
 
Lecture 1 - Overview.pptx
Lecture 1 - Overview.pptxLecture 1 - Overview.pptx
Lecture 1 - Overview.pptx
 
Pasolli-Improving_active_learning_methods_using_spatial_information.ppt
Pasolli-Improving_active_learning_methods_using_spatial_information.pptPasolli-Improving_active_learning_methods_using_spatial_information.ppt
Pasolli-Improving_active_learning_methods_using_spatial_information.ppt
 
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS

FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS

FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS

 
Measures of Dispersion.pptx
Measures of Dispersion.pptxMeasures of Dispersion.pptx
Measures of Dispersion.pptx
 
CDISCs_SDTM_basics.ppt
CDISCs_SDTM_basics.pptCDISCs_SDTM_basics.ppt
CDISCs_SDTM_basics.ppt
 
HCLSIG$$Drug_Safety_and_Efficacy$CDISCs_SDTM_basics.ppt
HCLSIG$$Drug_Safety_and_Efficacy$CDISCs_SDTM_basics.pptHCLSIG$$Drug_Safety_and_Efficacy$CDISCs_SDTM_basics.ppt
HCLSIG$$Drug_Safety_and_Efficacy$CDISCs_SDTM_basics.ppt
 
Sophia He
Sophia HeSophia He
Sophia He
 
AS/A2 AQA Biology and Human Biology
AS/A2 AQA Biology and Human BiologyAS/A2 AQA Biology and Human Biology
AS/A2 AQA Biology and Human Biology
 
Intro to SPSS.ppt
Intro to SPSS.pptIntro to SPSS.ppt
Intro to SPSS.ppt
 
BRM Unit 3 Data Analysis-1.pptx
BRM Unit 3 Data Analysis-1.pptxBRM Unit 3 Data Analysis-1.pptx
BRM Unit 3 Data Analysis-1.pptx
 
Statistic report
Statistic reportStatistic report
Statistic report
 

Recently uploaded

Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
David Brossard
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Project Management Semester Long Project - Acuity
Project Management Semester Long Project - AcuityProject Management Semester Long Project - Acuity
Project Management Semester Long Project - Acuity
jpupo2018
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
fredae14
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 

Recently uploaded (20)

Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Project Management Semester Long Project - Acuity
Project Management Semester Long Project - AcuityProject Management Semester Long Project - Acuity
Project Management Semester Long Project - Acuity
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
Recommendation System using RAG Architecture
Recommendation System using RAG ArchitectureRecommendation System using RAG Architecture
Recommendation System using RAG Architecture
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 

Scientific table type classification in digital library (DocEng 2012)

  • 1. Scientific Table Type Classification in Digital Library Seongchan Kim, Keejun Han, Ying Liu D ept. of Knowledge Service Engineering K AIST, Korea Soon Young Kim Dept. of Overseas Information KISTI, Korea Sept. 6, 2012 (3:35-3:55)
  • 2. Outline Introduction Table Type Taxonomy IMRAD-Base d Fine-Graine d Table Type Distribution Classification Conclusion
  • 3. Outline Introduction Table Type Taxonomy IMRAD-Base d Fine-Graine d Classification Conclusion
  • 4. 4 / 20 Introduction • Are there any special types of tables in papers in scientific papers? If yes? What are they?
  • 5. 5 / 20 Introduction • Are there any special types of tables in papers in scientific papers? If yes? What are they?
  • 6. Outline Introduction Table Type Taxonomy IMRAD-Base d Fine-Graine d Classification Conclusion
  • 7. 7 / 20 Table Type Taxonomy 2,500 tables randomly from 25 randomly selected scie ntific journals published by Springer from 2006 to 2010 Biomedical and Life Science, Chemistry and Materials S cience, Computer Science, Electrical Engineering, and Medicine TableSeer We found  IMRAD-based table taxonomy  Fine-grained table taxonomy
  • 8. IMRAD-Based Table Taxonomy Considering the structural position of tables within a document Table type is simply decided by the location of the table  E.g) if a table is in the introduction part of the paper  Introduction table Other functions Scientific Table Introduction Table Method Table Resul t Tabl e Discussio n Table 8 / 20
  • 9. Fine-Grained Table Taxonomy Table type is decided by table contents and purposes Scientific Table Definition Table Statistic Table Survey Table Example Table Procedure Table Experiment Setting T able Experiment Result T able 9 / 20
  • 10. Fine-Grained Table Taxonomy Definition Tables  Consist of defining terms and their explanations  Usually appears before the experiment 10 / 20
  • 11. Fine-Grained Table Taxonomy Statistics/Distribution Tables  Common statistical or distribution data  Not related with the c urrent experiment being c arried out in the paper 11 / 20
  • 12. Fine-Grained Table Taxonomy Survey Question/Result Table  Contain questionnaires of the those questionnaires survey and the results of 12 / 20
  • 13. Fine-Grained Table Taxonomy Example Table  Show instances that introduce and emphasize something that needs to be explained clearly 13 / 20
  • 14. Fine-Grained Table Taxonomy Procedure Tables  Describe the sequence , methods step, flow, or schedule of the 14 / 20
  • 15. Fine-Grained Table Taxonomy Experiment Setting Tables  Describe items required for the experiment  configurations, parameters, data, apparatus, etc. 15 / 20
  • 16. Fine-Grained Table Taxonomy Experiment Setting Tables  accompanied with a summary describing the output of the experiment  Some are shown comparing the other results 16 / 20
  • 17. Table Type Distribution By IMRAD Taxonomy 3.7 % 20.2 % 74.6 % 1.5 % Introduction Methods Result s Discussion 2.9% 5.0% 0.3% 3.3% 1.6% 14.9% 72.0% Definition St atistics Surv ey Example Procedure Exp. Setting Exp. Result 17 / 20 By Fine-Grained Taxonomy Annotation  2,380 and 2,324 tables that had agreed on labeling from more that two annotators out of 2,500 tables  Inter-Annotator Agreement • = 0.64 for IMRAD annotation • = 0.53 for fine-grained annotation
  • 18. Outline Introduction Table Type Taxonomy IMRAD-Base d Fine-Graine d Classification Conclusion
  • 19. 19 / 20 Experiment A preliminary classification  Only textual feature from Table • Table Caption • Table Reference Text  Textual Information obtained from metadata of TableSeer [ ] Settings  DataSet • 2,380 tables for IMRAD classification • 2,324 tables for fine-grained classification  10-fold Cross validation  SVM and Decision Tree in Weka toolkit (default setting)
  • 20. Experiment 20 / 20 C1: T1, T 2 C2: T3, T 4 W1 : appears T1 and T2 W2 : appears T1 and T3
  • 21. 21 / 20 Experiment Result Performance of IMRAD Classification by Features Performance of Fine-grained Classification by Features Features SVM Decision Tree P R F P R F Cap.(Baseline) 0.836 0.506 0.543 0.947 0.550 0.792 Ref. 0.875 0.705 0.761 0.930 0.639 0.73 Cap.+Ref. 0.967 0.784 0.866 0.938 0.746 0.831 Features SVM Decision Tree P R F P R F Cap.(Baseline) 0.627 0.333 0.397 0.522 0.271 0.302 Ref. 0.707 0.615 0.649 0.790 0.673 0.716 Cap.+Ref. 0.701 0.657 0.668 0.764 0.62 0.671
  • 22. 22 / 20 Experiment Result Performance of IMRAD Classification by Types Type SVM Decision Tree P R F P R F Introduction 0.968 0.6 0.741 0.907 0.68 0.777 Methods 0.901 0.996 0.943 0.912 0.992 0.95 Results 1 1 1 1 1 1 Discussion 1 0.543 0.704 0.933 0.314 0.44 Macro Avg. 0.967 0.784 0.866 0.938 0.746 0.831 Micro Avg. 0.977 0.976 0.973 0.973 0.975 0.972
  • 23. 23 / 20 Experiment Result Performance of Fine-grained Classification by Types Type SVM Decision Tree P R F P R F Definition 0.689 0.609 0.646 0.837 0.522 0.643 Statistics 0.699 0.879 0.779 0.898 0.681 0.775 Survey 0 0 0 0 0 0 Example 0.716 0.725 0.72 0.875 0.525 0.656 Procedure 0.9 0.486 0.632 1 0.649 0.787 Exp. Setting 0.905 0.9 0.902 0.74 0.963 0.837 Exp. Result 1 1 1 1 1 1 Macro Avg. 0.701 0.657 0.668 0.764 0.62 0.671 Micro Avg. 0.947 0.947 0.946 0.94 0.936 0.987
  • 24. Outline Introduction Table Type Taxonomy IMRAD-Base d Fine-Graine d Classification Conclusion
  • 25. 25 / 20 Conclusion Introduced our study of table types and classifications in scientific papers  IMRAD-based Taxonomy  Fine-Grained Taxonomy Future Work  Developing various features from table layout and contents