Using Text Comprehension Model for
Learning Concepts, Context, and Topic
of Web Content
11th International Conference on Semantic Computing
IEEE ICSC 2017 - San Diego, California, USA
Jan 30-Feb 1, 2017
Ismael Ali, Naser Al Madi, Austin Melton
Department of Computer Science
Kent State University
Outline
• Text Comprehension
• System Architecture and Workflow
• Semantic Learning
– Semantic Network Construction
– Mathematical Foundation
– Domain Concept Learning
– Topic Learning
– Context Learning
• Experimental Design
• Evaluation Strategy
• Results
• Conclusion and Future Work
Abstract
• Learning semantics (concepts, contexts, and topics) from web documents supports:
– semantic-based structuring and retrieval
• We present a novel approach for domain-independent
semantic learning.
• Our approach uses a computational version of the
Construction-Integration (CI) model of text comprehension.
Text Comprehension
• Comprehension is a cognitive-based learning process
• Comprehension produces mental representations:
– perceptual
– verbal
– semantic
• The CI model simulates the incremental and dynamic task of comprehending text, which leads to the construction of a semantic network (SN)
CI as a Cognitive Model of Text Comprehension
[Figure: Surface Model, Text-Base Model, Situation Model. From Cathleen Wharton and Walter Kintsch, 1991, ACM SIGART Bulletin.]
Situation Model
• Time of acquisition
• Recognizing main concepts
• Integrating them with background knowledge
System Architecture and Workflow
Using Stanford CoreNLP
1. Text tokenization
2. Lemmatization
3. Sentence splitting
- To get the Surface Model.
4. Part of Speech Tagging
5. Anaphora Resolution
Running the computational CI model to produce a weighted semantic network
Analysis and filtering of the weighted semantic networks
Semantic Network Construction
• Sentences are presented as single units of time (a reading
episode)
• “Knowledge is a familiarity. Awareness or understanding of
something. Such as facts.”
Fig. 2. Sample Concept Network (after running the CI model). Legend: recognized concepts, neglected concepts, recognized associations, neglected associations.
• Episodes {e_1, e_2, ..., e_i} are background knowledge for episode e_{i+1}
• Weights on edges represent the semantic association strength
1. The concept recognition threshold (S) is 7 for Fig. 2
– s(“something”) = 6
– e1 + e2 < S
– s(“Awareness”) = 12
– e3 + e4 > S
2. The association recognition threshold (I) is 5 for Fig. 2
– i(“Knowledge”, “facts”) < I
– i(“Knowledge”, “Awareness”) > I
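The threshold rules for Fig. 2 can be sketched as a small classification step. This is a minimal sketch, not the authors' implementation: the function name `split_by_threshold` and the two association weights (3 and 8) are hypothetical, chosen only to fall below and above I = 5. The concept strengths and both thresholds come from the slide. Treating a value exactly equal to the threshold as neglected is also an assumption.

```python
def split_by_threshold(strengths, threshold):
    """Partition items into (recognized, neglected) by a recognition threshold.

    Assumption: an item is recognized only when its strength is strictly
    greater than the threshold.
    """
    recognized = {k for k, v in strengths.items() if v > threshold}
    neglected = set(strengths) - recognized
    return recognized, neglected


S = 7  # concept recognition threshold from Fig. 2
I = 5  # association recognition threshold from Fig. 2

# Concept strengths stated on the slide.
concept_strength = {"something": 6, "Awareness": 12}

# Hypothetical association weights: the slide only states "< I" and "> I".
association_strength = {
    ("Knowledge", "facts"): 3,
    ("Knowledge", "Awareness"): 8,
}

recognized_concepts, neglected_concepts = split_by_threshold(concept_strength, S)
recognized_assocs, neglected_assocs = split_by_threshold(association_strength, I)
```

With these inputs, "Awareness" and the Knowledge/Awareness association are recognized, while "something" and the Knowledge/facts association are neglected.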
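Semantic Network Construction: Semantic Association Graph
1. An Associative Matrix is generated from the Text-Base model
2. Each sentence forms an Individual Concept Network (ICN)
3. All ICN graphs are combined to create the Base Semantic Network (BSN)
Associative Matrix layout:
- Header row: C1-Sent-ID, C2-Sent-ID, C3-Sent-ID, C4-Sent-ID, ..., Cn-Sent-ID, where each entry is the ID of the sentence in which that concept first occurred (e.g. C2-Sent-ID is the sentence in which C2 first occurred)
- Columns 1, 2, 3, 4, ..., n are labeled C1, C2, C3, C4, ..., Cn
- Rows 1, 2, 3, 4, ..., n are labeled C1, C2, C3, C4, ..., Cn
- Each cell, e.g. (row C3, column C2), holds the sentence ID of the first episode in which the two concepts (C3 and C2) co-occur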
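The matrix layout above can be sketched as two lookups: the first sentence in which each concept occurs, and the first sentence in which each pair of concepts co-occurs. This is a minimal sketch under stated assumptions: the function name `build_associative_matrix`, the dictionary representation, and the concept lists for the three example sentences are hypothetical, and co-occurrence is assumed to mean appearing in the same sentence.

```python
from itertools import combinations


def build_associative_matrix(episodes):
    """Record first-occurrence and first co-occurrence sentence IDs.

    episodes: list of concept lists, one per sentence (reading episode).
    Sentence IDs start at 1, matching the matrix layout.
    """
    first_seen = {}      # concept -> ID of the sentence where it first occurred
    first_together = {}  # frozenset({ci, cj}) -> ID of first shared sentence
    for sent_id, concepts in enumerate(episodes, start=1):
        unique = sorted(set(concepts))
        for concept in unique:
            first_seen.setdefault(concept, sent_id)
        for ci, cj in combinations(unique, 2):
            first_together.setdefault(frozenset((ci, cj)), sent_id)
    return first_seen, first_together


# Hypothetical concept lists for the three example sentences.
episodes = [
    ["knowledge", "familiarity"],
    ["awareness", "understanding", "something"],
    ["fact"],
]
first_seen, first_together = build_associative_matrix(episodes)
```

Using `setdefault` keeps only the earliest sentence ID, so later repetitions of a concept or a pair do not overwrite the first occurrence.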
Semantic Network Construction: Mathematical Foundation
- Finding weights and thresholds:
4. The BSN shows which concepts and associations were recognized and which were neglected
6. The BSN semantic network is represented as a set of inequalities:
- The inequalities set upper and lower bounds for the concept (S) and association (I) recognition thresholds
- Linear programming finds suitable values for all variables to satisfy the inequalities
7. Values for the variable vector X that satisfy the inequalities are found by solving a minimization problem, where:
- f is the linear objective function
- A is the left-hand side of the inequalities
- B is the right-hand side of the inequalities
- LB is the lower bound of the solution
- UB is the upper bound of the solution
- The resulting variable vector contains weights for nodes and associations, along with the individual threshold values (S) and (I) for recognizing concepts and associations.
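The components listed above (f, A, B, LB, UB, and the variable vector X) match the standard form of a bounded linear program. Assuming that form, which is a reconstruction and not a copy of the slide's equation, the minimization problem can be written as:

```latex
\min_{X} \; f^{\top} X
\quad \text{subject to} \quad
A X \le B, \qquad LB \le X \le UB
```

Here the rows of A and B would encode the recognized/neglected inequalities derived from the BSN, and the entries of X would hold the node weights, association weights, and the thresholds S and I.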
Domain Concepts Learning
• The variable vector is used to construct the semantic network G_i = (C_i, E_i)
• Concept filtering is then performed to learn domain concepts
• Domain concepts for web document d_i are the concepts in a subgraph G*_i of its semantic network G_i:
- G*_i = (C*_i, E*_i), where C*_i ⊂ C_i and E*_i ⊂ E_i
• Filtering mechanisms:
(1) statistical-based filtering: mean threshold and median threshold
(2) positive-based filtering: suggested for the proposed cognitive-based semantic learning approach
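The three filtering mechanisms can be illustrated on a set of concept weights. This is a minimal sketch, not the authors' implementation: the function name `filter_concepts`, the example weights, the strict "greater than" comparison, and the reading of positive-based filtering as "keep weights above zero" are all assumptions.

```python
from statistics import mean, median


def filter_concepts(weights, strategy):
    """Return the concepts kept by one filtering strategy.

    weights: dict mapping concept -> weight from the variable vector.
    strategy: "mean", "median", or "positive".
    Assumptions: a concept is kept when its weight is strictly above the
    cut-off, and "positive" means a cut-off of zero.
    """
    if strategy == "mean":
        cutoff = mean(weights.values())
    elif strategy == "median":
        cutoff = median(weights.values())
    elif strategy == "positive":
        cutoff = 0.0
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return {c for c, w in weights.items() if w > cutoff}


# Hypothetical concept weights for one document.
weights = {
    "ecology": 10.0,
    "organism": 3.0,
    "environment": 2.0,
    "page": 1.0,
    "edit": -1.0,
}
```

On this example the mean cut-off (3.0) is the strictest and keeps only "ecology", the median cut-off (2.0) also keeps "organism", and the positive filter keeps every concept except "edit".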
Topic Learning
• For each domain concept c_i ∈ C*_i in d_j, calculate the Topic Identification Weight (Tiw):
– CIw(c_i): the weight calculated by the computational CI model
– Eigenvector(c_i): the eigenvector centrality of c_i, a function of the centralities of its neighbors
– e(c_i): the episode in which the given concept c_i first appeared
• Topic Identification:
– The topic concept of d_i is the concept with the highest Tiw weight
– It is the most influential node in the semantic network G*_i of the domain concept set
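The eigenvector-centrality ingredient and the final "highest weight wins" step can be sketched as follows. The formula that combines CIw, Eigenvector, and e into Tiw is not given in the text, so this sketch does not compute Tiw; using centrality alone as the score is an assumption made only for illustration. The function name, the example graph, and the shifted power iteration (x ← x + Ax, a common way to avoid oscillation on bipartite graphs) are also assumptions.

```python
import math


def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Approximate eigenvector centrality by shifted power iteration.

    adj: dict mapping node -> dict of neighbor -> weight (symmetric).
    """
    nodes = list(adj)
    if not nodes:
        return {}
    x = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # x + A x has the same dominant eigenvector as A x but cannot
        # oscillate between two states on bipartite graphs.
        new = {n: x[n] + sum(w * x[m] for m, w in adj[n].items()) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        delta = sum(abs(new[n] - x[n]) for n in nodes)
        x = new
        if delta < tol:
            break
    return x


# Hypothetical domain-concept network: a triangle (a, b, c) plus d attached to a.
graph = {
    "a": {"b": 1.0, "c": 1.0, "d": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 1.0, "b": 1.0},
    "d": {"a": 1.0},
}
centrality = eigenvector_centrality(graph)

# Stand-in for the Tiw arg-max: pick the node with the highest score.
topic = max(centrality, key=centrality.get)
```

In this graph node "a" is connected to every other node, so it receives the highest centrality and is selected.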
Context Learning
• The context of d_i is the set of all nearest neighbors (nodes at distance k = 1) of the topic concept
• Thus the context includes:
– the concepts most semantically associated with the topic concept
– a normally distributed selection of concepts from different sections of the text
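Selecting the distance-1 neighbors of the topic concept can be sketched directly on an adjacency map. This is a minimal sketch: the function name `learn_context`, the example network, and the ordering by association weight are assumptions made for illustration.

```python
def learn_context(adj, topic):
    """Return the topic's distance-1 neighbors, strongest association first.

    adj: dict mapping concept -> dict of neighbor -> association weight.
    """
    neighbors = adj.get(topic, {})
    # Sort by descending weight, then alphabetically to break ties.
    return sorted(neighbors, key=lambda c: (-neighbors[c], c))


# Hypothetical filtered semantic network for one document.
network = {
    "ecology": {"organism": 5.0, "environment": 7.0},
    "organism": {"ecology": 5.0, "species": 2.0},
    "environment": {"ecology": 7.0},
    "species": {"organism": 2.0},
}
context = learn_context(network, "ecology")
```

Here "species" is two hops away from "ecology", so it is not part of the context.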
Experimental Design
• A diverse set of ten randomly selected web documents
from Wikipedia
– astronomy, brain, cognition, ecology, knowledge, law,
literacy, robotic, virus and tennis
• Testing the openness (domain independence) property of our approach in learning the semantics of web content
Evaluation Strategies
• Results of the filtering mechanisms are evaluated by a human judgment strategy [4]:
1. A set of seven human judges (domain experts) was selected from KSU
2. The judges were asked to evaluate the list(s) of all potential concepts learned from the CI model for each web document
3. They were then asked to identify whether each concept belonged to the given domain or not
4. Next, the domain concepts identified by the domain experts were compared against the domain concepts identified by each concept filtering strategy
5. Then the quality of each concept filtering strategy was evaluated
• The evaluation was performed using the binary evaluation measures from IR: Precision, Recall, and F1
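The comparison in steps 4 and 5 can be sketched with the standard set-based definitions of Precision, Recall, and F1. This is a minimal sketch: the function name and the two example concept sets are hypothetical and are not results from the study.

```python
def precision_recall_f1(predicted, relevant):
    """Compute Precision, Recall, and F1 for two sets of concepts.

    predicted: concepts kept by a filtering strategy.
    relevant: concepts the domain experts marked as domain concepts.
    """
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical expert judgments and filter output for one document.
expert = {"ecology", "organism", "environment", "species"}
filtered = {"ecology", "organism", "page"}
precision, recall, f1 = precision_recall_f1(filtered, expert)
```

In this example two of the three filtered concepts are confirmed by the experts (precision 2/3), and two of the four expert concepts are found (recall 1/2).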
Domain Concepts Analysis
[Figure: domain concepts for the Ecology web document]
Context and Topic Analysis
[Figure: context for the Ecology web document]
[Figure: topic concept for the Ecology web document]
Conclusion and Future Work
• We investigated a novel approach for open learning of the concepts, contexts, and topics of web content.
• Our approach is based on the Construction-Integration (CI) model of text comprehension, which mimics the way humans learn the semantic components of a web document.
• We also highlighted the use of cognitive science results in learning semantics from web content.
• Our work is a step toward our future research on cognition-based and open:
– Ontology Learning
– Ontology Selection
Thank you.

Using Text Comprehension Model for Learning Concepts, Context, and Topic of Web Content

  • 1. Using Text Comprehension Model for Learning Concepts, Context, and Topic of Web Content 11th International Conference on Semantic Computing IEEE ICSC 2017 - San Diego, California, USA Jan 30-Feb 1, 2017 Ismael Ali, Naser Al Madi, Austin Melton Department of Computer Science Kent State University
  • 2. Outline • Text Comprehension • System Architecture and Workflow • Semantic Learning – Semantic Network Construction – Mathematical Foundation – Domain Concept Learning – Topic Learning – Context Learning • Experimental Design • Evaluation Strategy • Results • Conclusion and Future Works
  • 3. Abstract • Role of learning Semantics including concepts, contexts, and topics from web documents – semantic-based structuring and retrieving • We present a novel approach for domain-independent semantic learning. • Our approach uses a computational version of the Construction-Integration (CI) model of text comprehension.
  • 4. Text Comprehension • Comprehension is a cognitive-based learning process • Comprehension produces the mental representations: – perceptual – verbal – semantic representations • CI model simulates the incremental and dynamic task of comprehending the text and it leads to the construction of a semantic network (SN)
  • 5. CI as a Cognitive Model of Text Comprehension This figure from: (Cathleen Wharton and Walter Kintsch, 1991 in ACM SIGART Bulletin) Surface Model Text-Base Model Situation Model Situation Model• Time of acquisition • Recognizing main concepts • Integrating them with background knowledge
  • 6. System Architecture and Workflow Using Stanford CoreNLP 1. Text tokenization 2. Lemmatization 3. Sentence splitting - To get the Surface Model. 4. Part of Speech Tagging 5. Anaphora Resolution Running the computational CI model to produce weighted semantic network Analysis and filtering of the weighted semantic networks
  • 7. Semantic Network Construction • Sentences are presented as single units of time (a reading episode) • “Knowledge is a familiarity. Awareness or understanding of something. Such as facts.” • Fig. 2. Sample Concept Network (after running the CI model). Legend: Recognized Concepts, Neglected Concepts, Recognized Associations, Neglected Associations
  • 8. Semantic Network Construction • “Knowledge is a familiarity. Awareness or understanding of something. Such as facts.” • Episodes {e1, e2, ..., ei} are background knowledge for episode {ei+1} • Weights on edges represent the semantic association strength • Fig. 2. Sample Concept Network (after running the CI model) 1. The concept recognition threshold (S) is 7 for Fig. 2 – s(“something”) = 6 – e1 + e2 < S – s(“Awareness”) = 12 – e3 + e4 > S 2. The association recognition threshold (I) is 5 for Fig. 2 – i(“Knowledge”, “facts”) < I – i(“Knowledge”, “Awareness”) > I
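The two recognition thresholds on slide 8 can be illustrated with a minimal Python sketch. The thresholds S = 7 and I = 5 and the strengths of “something” (6) and “Awareness” (12) come from the slide. The two association strengths are hypothetical values, chosen only so that they satisfy the inequalities stated on the slide, and treating a strength equal to the threshold as neglected is also an assumption.

```python
# Thresholds stated on slide 8 for Fig. 2.
CONCEPT_THRESHOLD_S = 7
ASSOCIATION_THRESHOLD_I = 5

# Concept strengths given on the slide.
concept_strength = {"something": 6, "Awareness": 12}

# Hypothetical association strengths: the slide only states that the first
# is above I and the second is below I, not their actual values.
association_strength = {
    ("Knowledge", "Awareness"): 6,
    ("Knowledge", "facts"): 3,
}


def split_by_threshold(strengths, threshold):
    """Return (recognized, neglected) keys; recognized means strength > threshold."""
    recognized = [key for key, s in strengths.items() if s > threshold]
    neglected = [key for key, s in strengths.items() if s <= threshold]
    return recognized, neglected


recognized_concepts, neglected_concepts = split_by_threshold(
    concept_strength, CONCEPT_THRESHOLD_S
)
recognized_links, neglected_links = split_by_threshold(
    association_strength, ASSOCIATION_THRESHOLD_I
)

print("recognized concepts:", recognized_concepts)
print("neglected concepts:", neglected_concepts)
print("recognized associations:", recognized_links)
print("neglected associations:", neglected_links)
```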
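  • 9. Semantic Network Construction: Semantic Association Graph 1. An Associative Matrix is generated from the Text-base model 2. Each sentence forms an Individual Concept Network (ICN) 3. All ICN graphs are combined to create the Base Semantic Network (BSN) • Associative Matrix layout: rows and columns are both indexed by the concepts C1, C2, C3, C4, ..., Cn (positions 1, 2, 3, 4, ..., n) • The header entry for each concept (C1-Sent-ID, C2-Sent-ID, ..., Cn-Sent-ID) is the ID of the sentence in which that concept first occurred • Each cell holds the sentence ID of the first episode in which its two concepts co-occur; for example, the cell for C3 and C2 holds the sentence ID of the first episode in which C3 and C2 co-occur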
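The bookkeeping behind the associative matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_association_records` is hypothetical, the concepts per sentence are hand-picked from the sample text rather than produced by CoreNLP, and the matrix is stored as two dictionaries instead of a 2-D array.

```python
from itertools import combinations


def build_association_records(sentences):
    """Record first-occurrence and first-co-occurrence sentence IDs.

    sentences: list of concept lists, one per sentence (episode), in reading order.
    Returns (first_seen, first_cooccurrence), both using 1-based sentence IDs.
    Concept pairs are keyed as alphabetically sorted tuples.
    """
    first_seen = {}
    first_cooccurrence = {}
    for sentence_id, concepts in enumerate(sentences, start=1):
        unique = sorted(set(concepts))
        for concept in unique:
            # Keep only the earliest sentence in which the concept occurs.
            first_seen.setdefault(concept, sentence_id)
        for a, b in combinations(unique, 2):
            # Keep only the earliest sentence in which the pair co-occurs.
            first_cooccurrence.setdefault((a, b), sentence_id)
    return first_seen, first_cooccurrence


# Hand-picked concepts (an assumption) for the three sample sentences:
# "Knowledge is a familiarity. Awareness or understanding of something. Such as facts."
episodes = [
    ["knowledge", "familiarity"],
    ["awareness", "understanding", "something"],
    ["facts"],
]

first_seen, first_cooccurrence = build_association_records(episodes)
print(first_seen)
print(first_cooccurrence)
```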
  • 10. Semantic Network Construction: Mathematical Foundation - Finding weights and thresholds: 4. The BSN shows which concepts and associations were recognized and which were neglected 6. The BSN semantic network is represented as a set of inequalities: - The inequalities set upper and lower bounds for the concept (S) and association (I) recognition thresholds - Linear programming finds suitable values for all variables to satisfy the inequalities 7. Values for the variable vector X that satisfy the inequalities are found by minimizing $f^{T} X$ subject to $A X \le B$ and $LB \le X \le UB$, where: - f is the linear objective function - A is the left-hand side of the inequalities - B is the right-hand side of the inequalities - LB is the lower bound of the solution - UB is the upper bound of the solution - The resulting variable vector contains weights for nodes and associations, along with individual threshold values (S) and (I) for recognizing concepts and associations.
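To make the inequality formulation concrete, the sketch below encodes the two inequalities from slide 8 (e1 + e2 < S and e3 + e4 > S) in the f, A, B, LB, UB form described above. Several parts are assumptions: the objective f (sum of all variables), the margin of 1 that turns strict inequalities into non-strict ones, and the bounds 0 to 3 are all hypothetical. An exhaustive search over a small integer grid stands in for a linear programming solver, so this is not the method used by the authors.

```python
from itertools import product

# Variable vector X = (e1, e2, e3, e4, S).
f = [1, 1, 1, 1, 1]      # hypothetical linear objective: minimize the sum
MARGIN = 1               # hypothetical margin replacing strict inequalities

A = [
    [1, 1, 0, 0, -1],    # e1 + e2 - S <= -MARGIN, i.e. e1 + e2 < S
    [0, 0, -1, -1, 1],   # S - e3 - e4 <= -MARGIN, i.e. e3 + e4 > S
]
B = [-MARGIN, -MARGIN]
LB, UB = 0, 3            # hypothetical bounds applied to every variable


def dot(row, x):
    """Inner product of one coefficient row with the variable vector."""
    return sum(r * v for r, v in zip(row, x))


def is_feasible(x):
    """True when every inequality A x <= B holds."""
    return all(dot(row, x) <= b for row, b in zip(A, B))


# Enumerate every integer vector within the bounds and keep the cheapest feasible one.
candidates = product(range(LB, UB + 1), repeat=len(f))
best = min((x for x in candidates if is_feasible(x)), key=lambda x: dot(f, x))

print("X =", best, "objective =", dot(f, best))
```

For realistic problem sizes a proper solver is needed; SciPy's `scipy.optimize.linprog`, for example, accepts the same ingredients as an objective vector, `A_ub`, `b_ub`, and `bounds`.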
  • 11. Domain Concept Learning • The variable vector is used to construct the semantic network Gi = (Ci, Ei) • Concept filtering is then performed to learn the domain concepts • The domain concepts for web document di are the concepts in a subgraph G*i of its semantic network Gi: - G*i = (C*i, E*i), where C*i ⊂ Ci and E*i ⊂ Ei • Filtering mechanisms: (1) statistical-based filtering: mean threshold and median threshold (2) positive-based filtering: suggested for the proposed cognitive-based semantic learning approach
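The filtering mechanisms named on slide 11 can be sketched as follows. The concept weights are hypothetical, and two readings are assumptions rather than facts from the slides: a concept is kept when its weight is strictly greater than the threshold, and "positive-based" filtering is read as keeping concepts with a weight above zero.

```python
from statistics import mean, median


def filter_concepts(weights, strategy):
    """Return the set of concepts whose weight exceeds the strategy's threshold."""
    if strategy == "mean":
        threshold = mean(weights.values())
    elif strategy == "median":
        threshold = median(weights.values())
    elif strategy == "positive":
        threshold = 0  # assumed reading of "positive-based" filtering
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return {concept for concept, w in weights.items() if w > threshold}


# Hypothetical CI weights for concepts of one web document.
weights = {"ecosystem": 9.0, "organism": 6.0, "species": 4.0, "page": 1.0, "edit": -2.0}

for name in ("mean", "median", "positive"):
    print(name, sorted(filter_concepts(weights, name)))
```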
  • 12. Topic Learning • For each domain concept ci ∈ C*i in dj, calculate the Topic Identification Weight (Tiw), where: – CIw(ci): the weight calculated by the computational CI model – Eigenvector(ci): the eigenvector centrality of ci, which is a function of the centralities of its neighbors – e(ci): the episode in which the given concept ci first appeared • Topic Identification: – The topic concept of di is the concept with the highest Tiw weight – It is the most influential node in the semantic network G*i of the domain concept set
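The transcript does not preserve how Tiw combines its three ingredients, so the sketch below covers only one of them: eigenvector centrality, computed by power iteration on a small weighted, undirected graph. The graph and its edge weights are hypothetical. Each step adds the node's own score to the neighbor sum (iterating with A + I instead of A), a common way to keep the iteration from oscillating on bipartite graphs; it does not change the resulting ranking.

```python
from math import sqrt


def eigenvector_centrality(edges, iterations=200, tol=1e-10):
    """Power iteration for eigenvector centrality on a weighted, undirected graph.

    edges: dict mapping (u, v) pairs to a positive weight.
    Returns a dict of node -> centrality, normalized to unit Euclidean length.
    """
    adjacency = {}
    for (u, v), w in edges.items():
        adjacency.setdefault(u, {})[v] = w
        adjacency.setdefault(v, {})[u] = w

    score = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        # A node's new score is its own score plus the weighted scores of its neighbors.
        new = {
            node: score[node] + sum(w * score[nbr] for nbr, w in nbrs.items())
            for node, nbrs in adjacency.items()
        }
        norm = sqrt(sum(x * x for x in new.values()))
        new = {node: x / norm for node, x in new.items()}
        converged = max(abs(new[n] - score[n]) for n in adjacency) < tol
        score = new
        if converged:
            break
    return score


# Hypothetical association weights between domain concepts.
edges = {
    ("knowledge", "awareness"): 6,
    ("knowledge", "understanding"): 4,
    ("awareness", "understanding"): 3,
    ("knowledge", "facts"): 2,
}

centrality = eigenvector_centrality(edges)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality)
```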
  • 13. Context Learning • The context of di is the set of all nearest neighbors (nodes at distance k = 1) of the topic concept • Thus the context includes: – the concepts most semantically associated with the topic concept – a normally distributed selection of concepts from different sections of the text
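Selecting the distance-1 neighbors of the topic concept is a short graph lookup. In this minimal sketch the function name `context_of`, the edge list, and the choice of "ecology" as topic concept are all hypothetical.

```python
def context_of(topic, edges):
    """Return every node directly connected (distance k = 1) to the topic concept."""
    neighbors = set()
    for u, v in edges:
        if u == topic:
            neighbors.add(v)
        elif v == topic:
            neighbors.add(u)
    return neighbors


# Hypothetical undirected edges of a filtered semantic network.
edges = [
    ("ecology", "ecosystem"),
    ("ecology", "organism"),
    ("ecosystem", "species"),
    ("organism", "species"),
]

context = context_of("ecology", edges)
print(sorted(context))
```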
  • 14. Experimental Design • A diverse set of ten randomly selected web documents from Wikipedia – astronomy, brain, cognition, ecology, knowledge, law, literacy, robotic, virus and tennis • Testing the openness (domain independence) of our approach in learning the semantics of web content
  • 15. Evaluation Strategies • Results of the filtering mechanisms are evaluated by a human judgment strategy [4]: 1. A set of seven human judges (domain experts) was selected at KSU 2. The judges were asked to evaluate the list(s) of all potential concepts learned from the CI model for each web document 3. They were then asked to identify whether the concepts belonged to a given domain or not 4. Next, the domain concepts identified by the domain experts were compared against the domain concepts identified by each concept filtering strategy 5. Then the quality of each concept filtering strategy was evaluated • The evaluation was performed using the binary evaluation measures from IR: Precision, Recall and F1
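The comparison in steps 4 and 5 reduces to set overlap between the concepts kept by a filter and the concepts accepted by the judges. The sketch below uses the standard definitions of Precision, Recall and F1; the two concept sets are hypothetical and are not results from the study.

```python
def precision_recall_f1(predicted, relevant):
    """Binary IR measures for a predicted set against a reference (judged) set."""
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: concepts kept by one filter vs. concepts the judges accepted.
filtered = {"ecosystem", "organism", "species", "page"}
judged = {"ecosystem", "organism", "species", "habitat"}

precision, recall, f1 = precision_recall_f1(filtered, judged)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```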
  • 16. Domain Concepts Analysis • Figure: domain concepts for the Ecology web document
  • 17. Context and Topic Analysis • Figure: context for the Ecology web document • Figure: topic concept for the Ecology web document
  • 18. Conclusion and Future Work • We investigated a novel approach for open learning of the concepts, contexts, and topics of web content. • Our approach is based on the Construction-Integration (CI) model of text comprehension, which mimics the way humans learn the semantic components of a web document. • We also highlighted the use of cognitive science results in learning semantics from web content. • Our work is a step toward our future research on cognition-based and open: – Ontology Learning – Ontology Selection