Using Text Comprehension Model for
Learning Concepts, Context, and Topic
of Web Content
11th International Conference on Semantic Computing
IEEE ICSC 2017 - San Diego, California, USA
Jan 30-Feb 1, 2017
Ismael Ali, Naser Al Madi, Austin Melton
Department of Computer Science
Kent State University
Outline
• Text Comprehension
• System Architecture and Workflow
• Semantic Learning
– Semantic Network Construction
– Mathematical Foundation
– Domain Concept Learning
– Topic Learning
– Context Learning
• Experimental Design
• Evaluation Strategy
• Results
• Conclusion and Future Work
Abstract
• Learning semantics (concepts, contexts, and topics) from web documents plays a role in:
– semantic-based structuring and retrieval
• We present a novel approach for domain-independent
semantic learning.
• Our approach uses a computational version of the
Construction-Integration (CI) model of text comprehension.
Text Comprehension
• Comprehension is a cognitive-based learning process
• Comprehension produces mental representations:
– perceptual
– verbal
– semantic
• The CI model simulates the incremental and dynamic task of comprehending text, which leads to the construction of a semantic network (SN)
CI as a Cognitive Model of Text Comprehension
Figure from Cathleen Wharton and Walter Kintsch, 1991, in ACM SIGART Bulletin.
• Surface Model
• Text-Base Model
• Situation Model
– Time of acquisition
– Recognizing main concepts
– Integrating them with background knowledge
System Architecture and Workflow
Using Stanford CoreNLP
1. Text tokenization
2. Lemmatization
3. Sentence splitting
- To get the Surface Model.
4. Part of Speech Tagging
5. Anaphora Resolution
Running the computational CI model to produce a weighted semantic network
Analysis and filtering of the weighted semantic networks
Semantic Network Construction
• Sentences are presented as single units of time (reading episodes)
• “Knowledge is a familiarity. Awareness or understanding of something. Such as facts.”
Fig. 2. Sample Concept Network (after running the CI model). Legend: recognized concepts, neglected concepts, recognized associations, neglected associations.
• Episodes $\{e_1, e_2, \ldots, e_i\}$ are background knowledge for episode $e_{i+1}$
• Weights on edges represent the semantic association strength
1. The concept recognition threshold (S) is 7 for Fig. 2
– s(“something”) = 6
– $e_1 + e_2 < S$
– s(“Awareness”) = 12
– $e_3 + e_4 > S$
2. The association recognition threshold (I) is 5 for Fig. 2
– i(“Knowledge”, “facts”) < I
– i(“Knowledge”, “Awareness”) > I
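
The recognition rule for Fig. 2 can be sketched in a few lines of Python. The thresholds S = 7 and I = 5 and the strengths 6 and 12 are taken from the slide. The individual edge weights and the two association weights are hypothetical values chosen only to satisfy the stated inequalities. Treating a concept's strength as the sum of its incident edge weights is an assumption read off the edge-sum inequalities above.

```python
# Thresholds quoted on the slide for Fig. 2.
S = 7  # concept recognition threshold
I = 5  # association recognition threshold

# Hypothetical edge weights: only the sums 6 and 12 appear on the slide.
incident_weights = {
    "something": [2, 4],   # e1 + e2 = 6
    "Awareness": [5, 7],   # e3 + e4 = 12
}

# Hypothetical association weights, chosen to fall on either side of I.
association_weights = {
    ("Knowledge", "facts"): 3,
    ("Knowledge", "Awareness"): 8,
}


def concept_strength(weights):
    """Assumed definition: strength is the sum of incident edge weights."""
    return sum(weights)


def recognized_concepts(incident, threshold):
    """Return the concepts whose strength exceeds the threshold."""
    return {c for c, w in incident.items() if concept_strength(w) > threshold}


def recognized_associations(assoc, threshold):
    """Return the concept pairs whose weight exceeds the threshold."""
    return {pair for pair, w in assoc.items() if w > threshold}


concepts = recognized_concepts(incident_weights, S)
associations = recognized_associations(association_weights, I)
```

With these values, "something" and the Knowledge/facts link are neglected, while "Awareness" and the Knowledge/Awareness link are recognized, matching the inequalities listed for Fig. 2.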
Semantic Network Construction: Semantic Association Graph
1. An associative matrix is generated from the Text-base model
2. Each sentence forms an Individual Concept Network (ICN)
3. All ICN graphs are combined to create the Base Semantic Network (BSN)
Associative matrix layout:
- Rows and columns are indexed 1 ... n by the concepts C1 ... Cn
- Header entry Ci-Sent-ID: the ID of the sentence in which concept Ci first occurred (e.g. C2-Sent-ID is the sentence in which C2 first occurred)
- Cell (Ci, Cj), e.g. (C3, C2): the sentence ID of the first episode in which Ci and Cj co-occur
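
As a rough illustration of the matrix layout described on this slide, the sketch below records the sentence ID of each concept's first occurrence and, for every pair of concepts, the ID of the first sentence in which both appear. The function name, the use of 0 for "never co-occur", and the sample concept lists are assumptions made for this example. The ICN and BSN construction steps are not reproduced.

```python
from itertools import combinations


def build_association_matrix(episodes):
    """Return (concepts, first_seen, matrix) for a list of episodes.

    episodes: list of lists of concept strings, one list per sentence.
    first_seen[c]: 1-based ID of the sentence in which c first occurred.
    matrix[i][j]: 1-based ID of the first sentence in which concepts i and j
    co-occur, or 0 if they never share a sentence (an assumed convention).
    """
    concepts = []
    first_seen = {}
    for sent_id, episode in enumerate(episodes, start=1):
        for concept in episode:
            if concept not in first_seen:
                first_seen[concept] = sent_id
                concepts.append(concept)

    index = {c: k for k, c in enumerate(concepts)}
    n = len(concepts)
    matrix = [[0] * n for _ in range(n)]
    for sent_id, episode in enumerate(episodes, start=1):
        for a, b in combinations(sorted(set(episode)), 2):
            i, j = index[a], index[b]
            if matrix[i][j] == 0:  # keep only the first co-occurrence
                matrix[i][j] = matrix[j][i] = sent_id
    return concepts, first_seen, matrix


# Hypothetical, already preprocessed concept lists (one per sentence).
episodes = [
    ["knowledge", "familiarity"],
    ["knowledge", "awareness"],
    ["familiarity", "knowledge", "facts"],
]
concepts, first_seen, matrix = build_association_matrix(episodes)
```

Here "knowledge" and "familiarity" keep the value 1 even though they meet again in the third sentence, because only the first shared episode is stored.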
Semantic Network Construction: Mathematical Foundation
- Finding weights and thresholds:
4. The BSN shows which concepts and associations were recognized and which were neglected
6. The BSN semantic network is represented as a set of inequalities:
- The inequalities set upper and lower bounds for the concept (S) and association (I) recognition thresholds
- Linear programming finds suitable values for all variables to satisfy the inequalities
7. Values for the variable vector X that satisfy the inequalities are found by solving a minimization problem, where:
- f is the linear objective function
- A is the left-hand side of the inequalities
- B is the right-hand side of the inequalities
- LB is the lower bound of the solution
- UB is the upper bound of the solution
- The resulting variable vector contains weights for nodes and associations, along with individual threshold values (S) and (I) for recognizing concepts and associations.
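
The formula on this slide was an image and is not part of the extracted text. The symbols defined above (f, A, B, LB, UB, and the variable vector X) match the standard form of a linear program, so one plausible reading, offered as an assumption and not as the authors' exact statement, is:

```latex
\begin{aligned}
\min_{X}\quad & f^{\top} X \\
\text{subject to}\quad & A X \le B, \\
& LB \le X \le UB .
\end{aligned}
```

Under this reading, each row of $A X \le B$ would encode one of the recognized or neglected inequalities from the BSN, and the entries of $X$ would hold the node weights, association weights, and the thresholds S and I.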
Domain Concepts Learning
• The variable vector is used to construct the semantic network $G_i = (C_i, E_i)$
• Concept filtering is then performed to learn the domain concepts
• Domain concepts for web document $d_i$ are the concepts in a subgraph $G_i^*$ of its semantic network $G_i$:
- $G_i^* = (C_i^*, E_i^*)$, where $C_i^* \subset C_i$ and $E_i^* \subset E_i$
• Filtering mechanisms:
(1) statistical-based filtering: mean threshold and median threshold
(2) positive-based filtering: suggested for the proposed cognitive-based semantic learning approach
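
A minimal sketch of the three filtering mechanisms, under stated assumptions: each concept carries a single numeric weight from the CI model, the statistical filters keep concepts at or above the cutoff, and positive-based filtering keeps concepts with a weight greater than zero. The slides do not spell out these details, and the weights below are hypothetical.

```python
from statistics import mean, median


def filter_by_mean(weights):
    """Keep concepts whose weight is at least the mean weight."""
    cutoff = mean(weights.values())
    return {c for c, w in weights.items() if w >= cutoff}


def filter_by_median(weights):
    """Keep concepts whose weight is at least the median weight."""
    cutoff = median(weights.values())
    return {c for c, w in weights.items() if w >= cutoff}


def filter_positive(weights):
    """Assumed reading of positive-based filtering: keep weights above zero."""
    return {c for c, w in weights.items() if w > 0}


# Hypothetical concept weights, for illustration only.
weights = {
    "ecology": 9.0,
    "organism": 4.0,
    "environment": 2.0,
    "habitat": 1.0,
    "also": 0.0,
    "page": -1.0,
}

by_mean = filter_by_mean(weights)
by_median = filter_by_median(weights)
by_positive = filter_positive(weights)
```

On this toy input the mean cutoff (2.5) is the strictest and the positive filter the most permissive, which shows how the choice of mechanism changes the size of the learned concept set.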
Topic Learning
• For each domain concept $c_i \in C_i^*$ in $d_j$, calculate the Topic Identification Weight (Tiw) from the following quantities:
– $CI_w(c_i)$: the weight calculated by the computational CI model
– $Eigenvector(c_i)$: the value of the eigenvector centrality measure, a function of the centralities of its neighbors
– $e(c_i)$: the episode in which the given concept $c_i$ first appeared
• Topic Identification:
– The topic concept of $d_i$ is the concept with the highest Tiw weight
– It is the most influential node in the semantic network $G_i^*$ of the domain concept set
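
The Tiw formula itself is not in the extracted text, so the sketch below covers only one of its ingredients: eigenvector centrality, computed here by plain power iteration on a weighted, undirected graph. The function name, the edge weights, and the shift by the identity matrix (used to avoid oscillation on bipartite graphs) are choices made for this example. How the centrality is combined with the CI weight and the first-appearance episode is not reproduced.

```python
def eigenvector_centrality(edges, iterations=200, tol=1e-10):
    """Power iteration on a weighted, undirected, connected graph.

    edges: dict mapping (u, v) pairs to positive weights.
    Iterating with A + I instead of A keeps the same eigenvectors but
    avoids oscillation on bipartite graphs such as stars and paths.
    Scores are scaled so that the largest one equals 1.
    """
    neighbors = {}
    for (u, v), w in edges.items():
        neighbors.setdefault(u, {})[v] = w
        neighbors.setdefault(v, {})[u] = w

    score = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        updated = {
            node: score[node] + sum(w * score[nb] for nb, w in nbrs.items())
            for node, nbrs in neighbors.items()
        }
        norm = max(updated.values())
        updated = {node: value / norm for node, value in updated.items()}
        change = max(abs(updated[n] - score[n]) for n in score)
        score = updated
        if change < tol:
            break
    return score


# Hypothetical filtered network with illustrative edge weights.
edges = {
    ("ecology", "organism"): 3.0,
    ("ecology", "environment"): 2.0,
    ("organism", "habitat"): 1.0,
}
centrality = eigenvector_centrality(edges)
most_central = max(centrality, key=centrality.get)
```

In this toy network "ecology" is the most central node. In the approach described above, the topic would be chosen by the highest Tiw, not by centrality alone.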
Context Learning
• The context of $d_i$ is the set of all nearest neighbors (nodes at distance k = 1) of the topic concept
• Thus the context includes:
– the concepts most semantically associated with the topic concept
– a normal distribution of concept selection from different sections of the text
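
Collecting the distance-1 neighbors of the topic concept is straightforward once the filtered network is available as an edge list. The sketch below assumes an undirected network. The function name and the sample edges are hypothetical.

```python
def context_of(topic, edges):
    """Return the concepts directly connected to the topic (distance k = 1)."""
    context = set()
    for u, v in edges:
        if u == topic:
            context.add(v)
        elif v == topic:
            context.add(u)
    context.discard(topic)  # ignore a self-loop, if any
    return context


# Hypothetical edge list of a filtered semantic network.
edges = [
    ("ecology", "organism"),
    ("ecology", "environment"),
    ("organism", "habitat"),
]
context = context_of("ecology", edges)
```

Here "habitat" is excluded from the context of "ecology" because it is two steps away.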
Experimental Design
• A diverse set of ten randomly selected web documents
from Wikipedia
– astronomy, brain, cognition, ecology, knowledge, law,
literacy, robotic, virus and tennis
• Testing the openness (domain independence) property of our approach in learning the semantics of web content
Evaluation Strategies
• Results of the filtering mechanisms are evaluated by a human judgment strategy [4]:
1. A set of seven human judges (domain experts) was selected (KSU)
2. The judges were asked to evaluate the list(s) of all potential concepts learned from the CI model for each web document
3. They were then asked to identify whether each concept belonged to the given domain or not
4. Next, the domain concepts identified by the domain experts were compared against the domain concepts identified by each concept filtering strategy
5. Then the quality of each concept filtering strategy was evaluated
• The evaluation was performed using the binary evaluation measures from IR: Precision, Recall, and F1
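
The comparison in steps 4 and 5 can be sketched as a set-based computation of precision, recall, and F1, treating the experts' selection as the reference set. The function name and both concept sets are hypothetical and do not come from the reported experiments.

```python
def precision_recall_f1(predicted, relevant):
    """Compare a filter's output with the expert-selected domain concepts."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sets: expert judgment vs. one filtering strategy.
expert = {"ecology", "organism", "environment", "habitat"}
filtered = {"ecology", "organism", "page"}
precision, recall, f1 = precision_recall_f1(filtered, expert)
```

For these sets, two of the three filtered concepts are correct and two of the four expert concepts are found, so precision is 2/3, recall is 1/2, and F1 is 4/7.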
Domain Concepts Analysis
Domain concepts for web document of Ecology
Context and Topic Analysis
Context for web document of Ecology
Topic-Concept for web document of Ecology
Conclusion and Future Work
• We investigated a novel approach for open learning of the concepts, contexts, and topics of web content.
• Our approach is based on the Construction-Integration (CI) model of text comprehension, which mimics the way humans learn the semantic components of a web document.
• We also highlighted the use of cognitive science results in learning semantics from web content.
• Our work is a step toward our future research on cognition-based and open:
– Ontology Learning
– Ontology Selection
Thank you.

Using Text Comprehension Model for Learning Concepts, Context, and Topic of Web Content

  • 1. Using Text Comprehension Model for Learning Concepts, Context, and Topic of Web Content 11th International Conference on Semantic Computing IEEE ICSC 2017 - San Diego, California, USA Jan 30-Feb 1, 2017 Ismael Ali, Naser Al Madi, Austin Melton Department of Computer Science Kent State University
  • 2. Outline • Text Comprehension • System Architecture and Workflow • Semantic Learning – Semantic Network Construction – Mathematical Foundation – Domain Concept Learning – Topic Learning – Context Learning • Experimental Design • Evaluation Strategy • Results • Conclusion and Future Works
  • 3. Abstract • Role of learning Semantics including concepts, contexts, and topics from web documents – semantic-based structuring and retrieving • We present a novel approach for domain-independent semantic learning. • Our approach uses a computational version of the Construction-Integration (CI) model of text comprehension.
  • 4. Text Comprehension • Comprehension is a cognitive-based learning process • Comprehension produces the mental representations: – perceptual – verbal – semantic representations • CI model simulates the incremental and dynamic task of comprehending the text and it leads to the construction of a semantic network (SN)
  • 5. CI as a Cognitive Model of Text Comprehension [Figure from Cathleen Wharton and Walter Kintsch, 1991, ACM SIGART Bulletin: Surface Model, Text-Base Model, Situation Model] Situation Model: • Time of acquisition • Recognizing main concepts • Integrating them with background knowledge
  • 6. System Architecture and Workflow Using Stanford CoreNLP: 1. Text tokenization 2. Lemmatization 3. Sentence splitting - To get the Surface Model. 4. Part-of-speech tagging 5. Anaphora resolution • Running the computational CI model to produce a weighted semantic network • Analysis and filtering of the weighted semantic networks
  • 7. Semantic Network Construction • Sentences are presented as single units of time (a reading episode) • “Knowledge is a familiarity. Awareness or understanding of something. Such as facts.” Fig. 2. Sample Concept Network (after running the CI model). Legend: recognized concepts, neglected concepts, recognized associations, neglected associations.
  • 8. Semantic Network Construction • “Knowledge is a familiarity. Awareness or understanding of something. Such as facts.” • Episodes {e1, e2, ..., ei} are background knowledge for episode {ei+1} • Weights on edges represent the semantic association strength Fig. 2. Sample Concept Network (after running the CI model). 1. The concept recognition threshold (S) is 7 for Fig. 2 – s(“something”) = 6 – e1 + e2 < S – s(“Awareness”) = 12 – e3 + e4 > S 2. The association recognition threshold (I) is 5 for Fig. 2 – i(“Knowledge”, “facts”) < I – i(“Knowledge”, “Awareness”) > I
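The threshold checks on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a concept's strength is the sum of the weights on its incident edges (one reading of "e1 + e2 < S"), and the individual edge weights are hypothetical values chosen only to reproduce the totals 6 and 12 given for Fig. 2.

```python
# Thresholds stated on the slide for Fig. 2.
S = 7  # concept recognition threshold
I = 5  # association recognition threshold

# Hypothetical edge weights (not from the slide); they are chosen so that
# s("something") = 6 and s("Awareness") = 12, as in the example.
edges = {
    ("Knowledge", "Awareness"): 7,
    ("Awareness", "understanding"): 5,
    ("understanding", "something"): 4,
    ("something", "facts"): 2,
    ("Knowledge", "facts"): 3,
}


def concept_strength(concept, edges):
    """Sum the weights of all edges that touch the concept (assumed definition)."""
    return sum(weight for pair, weight in edges.items() if concept in pair)


def recognized_concepts(edges, threshold):
    """Concepts whose strength exceeds the concept threshold."""
    concepts = {c for pair in edges for c in pair}
    return {c for c in concepts if concept_strength(c, edges) > threshold}


def recognized_associations(edges, threshold):
    """Edges whose weight exceeds the association threshold."""
    return {pair for pair, weight in edges.items() if weight > threshold}


concepts_kept = recognized_concepts(edges, S)
associations_kept = recognized_associations(edges, I)
```

With these illustrative weights, "something" falls below S and is neglected, while the association between "Knowledge" and "Awareness" exceeds I and is recognized.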
  • 9. Semantic Network Construction: Semantic Association Graph 1. The Associative Matrix is generated from the Text-base model 2. Each sentence forms an Individual Concept Network (ICN) 3. All ICN graphs are combined to create the Base Semantic Network (BSN) Associative Matrix layout: rows and columns are both indexed 1..n by the concepts C1..Cn; the header cell for each concept Ck holds Ck-Sent-ID, the ID of the sentence in which Ck first occurred; an inner cell such as (C3, C2) holds the Sentence-ID of the first episode in which C3 and C2 co-occur.
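One way to picture the Associative Matrix is as two lookups: the sentence in which each concept first occurs, and the sentence in which each pair of concepts first co-occurs. The sketch below assumes sentence-level co-occurrence and takes already-extracted concept lists as input; the function name and the concept lists for the sample text are hypothetical.

```python
def build_associative_matrix(episodes):
    """Record first-occurrence and first-co-occurrence sentence IDs.

    episodes: list of concept lists, one per sentence, in reading order.
    Returns (first_seen, first_together), both keyed by concept names.
    """
    first_seen = {}      # concept -> ID of the sentence where it first occurs
    first_together = {}  # frozenset({a, b}) -> ID of first co-occurrence
    for sent_id, concepts in enumerate(episodes, start=1):
        unique = sorted(set(concepts))
        for concept in unique:
            first_seen.setdefault(concept, sent_id)
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                first_together.setdefault(frozenset((a, b)), sent_id)
    return first_seen, first_together


# Hypothetical concept lists for the three sample sentences.
episodes = [
    ["knowledge", "familiarity"],
    ["awareness", "understanding", "something"],
    ["facts"],
]
first_seen, first_together = build_associative_matrix(episodes)
```

Using `setdefault` keeps only the earliest sentence ID, which matches the "1st occurred" and "1st episode" wording of the matrix description.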
  • 10. Semantic Network Construction: Mathematical Foundation - Finding weights and thresholds: 4. The BSN shows which concepts and associations were recognized and which were neglected 6. The BSN semantic network is represented as a set of inequalities: - The inequalities set upper and lower bounds for the concept (S) and association (I) recognition thresholds - Linear programming finds suitable values for all variables to satisfy the inequalities 7. Values for the variable vector X that satisfy the inequalities are found by solving a linear minimization problem, where: - f is the linear objective function - A is the left-hand side of the inequalities - B is the right-hand side of the inequalities - LB is the lower bound of the solution - UB is the upper bound of the solution - The resulting variable vector contains weights for nodes and associations, along with individual threshold values (S) and (I) for recognizing concepts and associations.
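The symbols listed on this slide (f, A, B, LB, UB) match the standard inequality-constrained linear program, so the minimization problem can plausibly be written as below. This is a reconstruction from the symbol definitions, not the formula shown on the original slide; x stands for the variable vector X.

```latex
\min_{x} \; f^{\mathsf{T}} x
\quad \text{subject to} \quad
A\,x \le B, \qquad LB \le x \le UB
```

Each row of A and B would encode one inequality from the BSN, for example a concept's edge weights compared against the threshold S.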
  • 11. Domain Concepts Learning • The variable vector is used to construct the semantic network Gi = (Ci, Ei) • Concept filtering is then performed to learn the domain concepts • Domain concepts for web document di are the concepts in a subgraph G*i of its semantic network Gi: - G*i = (C*i, E*i), where C*i ⊂ Ci and E*i ⊂ Ei • Filtering mechanisms: (1) statistical-based filtering: mean threshold and median threshold (2) positive-based filtering: suggested for the proposed cognitive-based semantic learning approach
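The two statistical filters can be sketched as a cutoff at the mean or the median of the concept weights. This is a minimal sketch: keeping concepts at or above the cutoff is an assumption, and the function name and sample weights are hypothetical.

```python
from statistics import mean, median


def filter_concepts(weights, strategy="mean"):
    """Keep concepts whose weight is at or above the mean or median weight.

    weights: dict mapping concept -> weight from the semantic network.
    The ">=" comparison is an assumption; the slides do not state it.
    """
    if strategy == "mean":
        cutoff = mean(weights.values())
    elif strategy == "median":
        cutoff = median(weights.values())
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return {concept for concept, weight in weights.items() if weight >= cutoff}


# Hypothetical weights for illustration only.
weights = {"ecosystem": 12.0, "species": 5.0, "energy": 4.0, "page": 1.0, "edit": 1.0}
kept_by_mean = filter_concepts(weights, "mean")      # cutoff 4.6
kept_by_median = filter_concepts(weights, "median")  # cutoff 4.0
```

On skewed weight distributions like this one, the median cutoff is lower than the mean cutoff, so it keeps more concepts.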
  • 12. Topic Learning • For each domain concept ci ∈ C*i in dj, calculate the Topic Identification Weight (Tiw): – CIw(ci): the weight calculated by the computational CI model – Eigenvector(ci): the value of the eigenvector centrality measure, a function of the centralities of the concept's neighbors – e(ci): the episode in which the given concept ci first appeared • Topic Identification: – The topic concept of di is the concept with the highest Tiw weight – It is the most influential node in the semantic network G*i of the domain concept set
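The selection step can be sketched as follows. The transcript does not preserve the Tiw formula, so `placeholder_combine` is a hypothetical stand-in, not the authors' formula, and is passed in as a parameter so it can be swapped out. The eigenvector centrality is computed by power iteration on A + I (the identity shift avoids oscillation on bipartite graphs); the graph, weights, and episode numbers are made up for illustration.

```python
def eigenvector_centrality(adjacency, iterations=200):
    """Power iteration on A + I for a weighted, undirected adjacency dict."""
    scores = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        updated = {
            node: scores[node] + sum(w * scores[nbr] for nbr, w in nbrs.items())
            for node, nbrs in adjacency.items()
        }
        norm = max(updated.values()) or 1.0
        scores = {node: value / norm for node, value in updated.items()}
    return scores


def placeholder_combine(ci_weight, centrality, first_episode):
    """HYPOTHETICAL combination: favors heavy, central, early concepts."""
    return ci_weight * centrality / first_episode


def topic_concept(ci_weights, adjacency, first_episode, combine=placeholder_combine):
    """Return the concept with the highest combined weight."""
    centrality = eigenvector_centrality(adjacency)
    tiw = {
        c: combine(ci_weights[c], centrality[c], first_episode[c])
        for c in adjacency
    }
    return max(tiw, key=tiw.get)


# Hypothetical domain-concept network for illustration only.
graph = {
    "ecosystem": {"species": 1.0, "energy": 1.0, "organism": 1.0},
    "species": {"ecosystem": 1.0, "organism": 1.0},
    "organism": {"ecosystem": 1.0, "species": 1.0},
    "energy": {"ecosystem": 1.0},
}
ci_weights = {"ecosystem": 12.0, "species": 5.0, "organism": 4.0, "energy": 3.0}
first_episode = {"ecosystem": 1, "species": 2, "organism": 2, "energy": 3}
topic = topic_concept(ci_weights, graph, first_episode)
```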
  • 13. Context Learning • The context of di is the set of all nearest neighbors (nodes at distance k=1) of the topic concept • Thus the context includes: – the concepts most semantically associated with the topic concept – a normal distribution of concepts selected from different sections of the text
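Given a topic concept, collecting its context is a one-hop neighborhood lookup. The sketch below assumes the filtered semantic network is stored as a weighted adjacency dict; the graph and the function name are hypothetical.

```python
def context_of(topic, adjacency):
    """Return the distance-1 neighbors of the topic, strongest association first."""
    neighbors = adjacency.get(topic, {})
    return sorted(neighbors, key=neighbors.get, reverse=True)


# Hypothetical filtered semantic network for illustration only.
graph = {
    "ecosystem": {"species": 6.0, "energy": 4.0, "organism": 5.0},
    "species": {"ecosystem": 6.0, "organism": 3.0},
    "organism": {"ecosystem": 5.0, "species": 3.0},
    "energy": {"ecosystem": 4.0},
}
context = context_of("ecosystem", graph)
```

Sorting by edge weight is a presentation choice; the context itself is simply the set of neighbors.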
  • 14. Experimental Design • A diverse set of ten randomly selected web documents from Wikipedia – astronomy, brain, cognition, ecology, knowledge, law, literacy, robotic, virus, and tennis • Testing the openness (domain independence) of our approach in learning the semantics of web content
  • 15. Evaluation Strategies • The results of the filtering mechanisms are evaluated using a human judgment strategy [4]: 1. A set of seven human judges (domain experts) was selected (KSU) 2. The judges were asked to evaluate the list(s) of all potential concepts learned from the CI model for each web document 3. They were then asked to identify whether each concept belonged to a given domain or not 4. Next, the domain concepts identified by the domain experts were compared against the domain concepts identified by each concept filtering strategy 5. Then the quality of each concept filtering strategy was evaluated • The evaluation was performed using the binary evaluation measures from IR: Precision, Recall, and F1
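The comparison in steps 4 and 5 reduces to set overlap between the concepts a filter keeps and the concepts the judges accept. A minimal sketch of the three measures follows; the concept sets are hypothetical, not results from the study.

```python
def precision_recall_f1(predicted, relevant):
    """Binary IR measures for a predicted set against a reference set."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sets for illustration only.
filter_output = {"ecosystem", "species", "energy", "page"}
judged_domain = {"ecosystem", "species", "energy", "organism", "habitat"}
precision, recall, f1 = precision_recall_f1(filter_output, judged_domain)
```

Here three of the four kept concepts are accepted by the judges (precision 0.75) and three of the five accepted concepts are kept (recall 0.6).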
  • 16. Domain Concepts Analysis [Figure: domain concepts for the Ecology web document]
  • 17. Context and Topic Analysis [Figures: context for the Ecology web document; topic concept for the Ecology web document]
  • 18. Conclusion and Future Work • We investigated a novel approach for open learning of the concepts, contexts, and topics of web content. • Our approach is based on the Construction-Integration (CI) model of text comprehension, which mimics the way humans learn the semantic components of a web document. • We also highlighted the use of cognitive science results in learning semantics from web content. • Our work is a step toward our future research on cognition-based and open: – Ontology Learning – Ontology Selection