Automatic Key Term Extraction from Spoken Course Lectures
Yun-Nung (Vivian) Chen, Yu Huang, Sheng-Yi Kong, Lin-Shan Lee
National Taiwan University, Taiwan
Definition
•Key term
 • Higher term frequency
 • Core content
•Two types
 • Keyword
 • Key phrase
•Advantages
 • Indexing and retrieval
 • Capturing the relations between key terms and segments of documents
Introduction

[Figure: word cloud of key terms extracted from a lecture, e.g. "acoustic model", "language model", "HMM", "n-gram", "bigram", "phone", "hidden Markov model"]

Target: extract key terms from course lectures
Automatic Key Term Extraction

[Figure: system flow — archive of spoken documents → ASR (speech signal → ASR transcriptions) → Phrase Identification (Branching Entropy) → Key Term Extraction (Feature Extraction + Learning Methods: 1) K-means Exemplar, 2) AdaBoost, 3) Neural Network) → key terms, e.g. "entropy", "acoustic model", ...]

First, branching entropy is used to identify phrases; then learning methods extract key terms from the candidates based on a set of features.
Branching Entropy

How to decide the boundary of a phrase? Consider "hidden Markov model":
• "hidden" is almost always followed by the same word ("Markov")
• "hidden Markov" is almost always followed by the same word ("model")
• "hidden Markov model" is preceded by many different words ("represent", "is", "can", ...) and followed by many different words ("is", "of", "in", ...)

Branching entropy is defined to detect such boundaries.

• Definition of Right Branching Entropy
 • Probability of the children x_i of X, where a child x_i is X extended by one more word:
   P(x_i|X) = C(x_i) / C(X), with C(·) the corpus frequency
 • Right branching entropy of X:
   H_r(X) = -Σ_i P(x_i|X) log P(x_i|X)

• Decision of Right Boundary
 • Find the right boundary located between X and x_i where the entropy rises, i.e. H_r(x_i) > H_r(X)
• Decision of Left Boundary
 • Apply the same criterion to the reversed word sequence (X: "model Markov hidden"): find the left boundary located between X and x_i where the left branching entropy rises, H_l(x_i) > H_l(X)
 • A PAT tree is used for the implementation

• Implementation in the PAT tree
 • The children x_i of a node X are the nodes one word below it, so P(x_i|X) and the right branching entropy H_r(X) come directly from node counts
 • [Figure: PAT tree over n-grams such as "hidden Markov model", "hidden Markov chain", "hidden state", "hidden variable", with X = "hidden Markov", x1 = "hidden Markov model", x2 = "hidden Markov chain"]

A sketch of the whole boundary-detection step follows below.
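A minimal Python sketch of the boundary detection above. This is not the authors' PAT-tree implementation: it uses plain n-gram counting, and the rise test H_r(x_i) > H_r(X) follows the criterion stated above.

```python
from collections import Counter, defaultdict
from math import log

def right_branching_entropy(tokens, max_n=5):
    """H_r(X) = -sum_i P(x_i|X) log P(x_i|X), where the children x_i of
    an n-gram X are its one-word extensions and P(x_i|X) = C(x_i)/C(X)."""
    children = defaultdict(Counter)  # X -> counts of the next word
    for i in range(len(tokens)):
        for n in range(1, max_n + 1):
            if i + n >= len(tokens):
                break
            children[tuple(tokens[i:i + n])][tokens[i + n]] += 1
    h_r = {}
    for x, nxt in children.items():
        total = sum(nxt.values())
        h_r[x] = -sum(c / total * log(c / total) for c in nxt.values())
    return h_r

def phrase_candidates(tokens, max_n=5):
    """Hypothesize a right boundary after x_i whenever the entropy rises,
    H_r(x_i) > H_r(X); left boundaries come from running the same
    procedure on the reversed token sequence."""
    h_r = right_branching_entropy(tokens, max_n)
    return [" ".join(x) for x in h_r
            if len(x) > 1 and x[:-1] in h_r and h_r[x] > h_r[x[:-1]]]
```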
After phrase identification, prosodic, lexical, and semantic features are extracted for each candidate term.
Feature Extraction

•Prosodic features
 • For each candidate term at its first occurrence

Feature Name       Feature Description
Duration (I - IV)  normalized duration (max, min, mean, range)
Pitch (I - IV)     F0 (max, min, mean, range)
Energy (I - IV)    energy (max, min, mean, range)

 • Duration: speakers tend to use longer duration to emphasize key terms; the duration of each phone (e.g. phone "a") is normalized by the average duration of that phone, and the four values are taken over the phones of the term (see the sketch below)
 • Pitch: higher pitch may signal significant information
 • Energy: higher energy emphasizes important information
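A minimal sketch of the Duration I-IV features. The `phones` pair list and the `avg_dur` lookup are assumed data structures (e.g. from a forced alignment), not the paper's exact interface.

```python
def duration_features(phones, avg_dur):
    """phones: list of (phone_label, duration_sec) pairs for the term's
    first occurrence; avg_dur: average corpus duration per phone label."""
    normalized = [d / avg_dur[p] for p, d in phones]
    return {"Duration I (max)": max(normalized),
            "Duration II (min)": min(normalized),
            "Duration III (mean)": sum(normalized) / len(normalized),
            "Duration IV (range)": max(normalized) - min(normalized)}
```

Pitch and energy features are built the same way from per-frame F0 and energy tracks instead of per-phone durations.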
•Lexical features
 • Some well-known lexical features for each candidate term (a sketch follows below)

Feature Name  Feature Description
TF            term frequency
IDF           inverse document frequency
TFIDF         TF * IDF
PoS           the PoS tag
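A minimal TF-IDF sketch, with one score dictionary per spoken document; the exact weighting variant used in the paper is not specified, so this is a standard formulation.

```python
from collections import Counter
from math import log

def tfidf_scores(docs):
    """docs: list of token lists, one per (transcribed) spoken document."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))           # document frequency per term
    n = len(docs)
    return [{t: c * log(n / df[t]) for t, c in Counter(doc).items()}
            for doc in docs]
```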
•Semantic features
 • Probabilistic Latent Semantic Analysis (PLSA): each document D_i is modeled as a mixture of latent topics T_k, with topic weights P(T_k|D_i) and topic-dependent term distributions P(t_j|T_k)
 • Key terms tend to focus on limited topics

[Figure: PLSA graphical model — documents D_1 ... D_N linked to latent topics T_1 ... T_K through P(T_k|D_i); topics linked to terms t_1 ... t_n through P(t_j|T_k)]

 • Latent Topic Probability (LTP): the topic distribution of a key term peaks on a few topics, while that of a non-key term is close to uniform; its mean, variance, and standard deviation describe the distribution
 • Latent Topic Significance (LTS): the within-topic to out-of-topic frequency ratio,
   LTS_tj(T_k) = [Σ_i n(t_j, D_i) P(T_k|D_i)] / [Σ_i n(t_j, D_i) (1 - P(T_k|D_i))]
   where n(t_j, D_i) is the frequency of term t_j in document D_i
 • Latent Topic Entropy (LTE): the entropy of the term's topic distribution,
   LTE(t_j) = -Σ_k P(T_k|t_j) log P(T_k|t_j)
   A key term concentrates on few topics and has lower LTE; a non-key term has higher LTE

Feature Name   Feature Description
LTP (I - III)  Latent Topic Probability (mean, variance, standard deviation)
LTS (I - III)  Latent Topic Significance (mean, variance, standard deviation)
LTE            term entropy over latent topics

A sketch of these features follows below.
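A minimal sketch of the three PLSA-based features, assuming a trained PLSA model. The array names and the estimate of P(T_k|t_j) from term counts weighted by P(T_k|D_i) are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def latent_topic_features(p_topic_given_doc, counts, j):
    """p_topic_given_doc: (K, N) array of P(T_k|D_i) from a trained PLSA
    model; counts: (V, N) array with counts[j, i] = n(t_j, D_i)."""
    within = p_topic_given_doc @ counts[j]           # within-topic freq per topic
    outside = (1.0 - p_topic_given_doc) @ counts[j]  # out-of-topic freq per topic
    lts = within / np.maximum(outside, 1e-12)
    ltp = within / within.sum()                      # estimate of P(T_k | t_j)
    lte = -np.sum(ltp * np.log(ltp + 1e-12))
    return {"LTP (I-III)": (ltp.mean(), ltp.var(), ltp.std()),
            "LTS (I-III)": (lts.mean(), lts.var(), lts.std()),
            "LTE": lte}
```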
With the features in place, unsupervised and supervised approaches are used to extract the key terms.
Learning Methods

•Unsupervised learning
 • K-means Exemplar (see the sketch below)
  • Transform each candidate term into a vector in LTS (Latent Topic Significance) space
  • Run K-means
  • Take the term closest to the centroid of each cluster as a key term
 • The terms in the same cluster focus on a single topic; the candidate terms in a cluster are related to its key term, and the key term can represent that topic
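A minimal sketch of the K-means Exemplar step; the matrix layout and the use of scikit-learn are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_exemplars(terms, lts_vectors, n_key_terms):
    """terms: list of candidate-term strings; lts_vectors: (n_terms, K)
    matrix of LTS values. Returns one exemplar key term per cluster."""
    km = KMeans(n_clusters=n_key_terms, n_init=10).fit(lts_vectors)
    exemplars = []
    for c in range(n_key_terms):
        members = np.flatnonzero(km.labels_ == c)
        # exemplar = the actual term nearest to the cluster centroid
        dists = np.linalg.norm(
            lts_vectors[members] - km.cluster_centers_[c], axis=1)
        exemplars.append(terms[members[np.argmin(dists)]])
    return exemplars
```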
•Supervised learning
 • Adaptive Boosting (AdaBoost)
 • Neural Network
 • Both automatically adjust the weights of the features to produce a classifier (a sketch follows below)
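A minimal sketch of the two supervised learners using scikit-learn; the hyperparameters are illustrative, not from the paper.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.neural_network import MLPClassifier

def train_key_term_classifier(X, y, method="nn"):
    """X: feature matrix of candidate terms (prosodic + lexical +
    semantic); y: 1 if the candidate is a reference key term, else 0."""
    if method == "ab":
        clf = AdaBoostClassifier(n_estimators=100)
    else:
        clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)
    return clf.fit(X, y)
```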
Experiments
•Corpus
 • NTU lecture corpus
  • Mandarin Chinese with embedded English words
  • Single speaker
  • 45.2 hours
 • Example: "我們的solution是viterbi algorithm" (Our solution is the Viterbi algorithm)
•ASR Accuracy

Language      Mandarin  English  Overall
Char Acc (%)  78.15     53.44    76.26

[Figure: ASR setup — a bilingual acoustic model (CH + EN) obtained by adapting a speaker-independent (SI) model with some data from the target speaker, and an adaptive trigram language model interpolating out-of-domain (background) corpora with the in-domain corpus]
•Reference Key Terms
 • Annotations from 61 students who had taken the course (see the sketch below)
  • If the k-th annotator labeled N_k key terms, each of them received a score of 1/N_k, and all other terms a score of 0
  • Rank the terms by the sum of the scores given by all annotators
  • Choose the top N terms from the list (N is the average of N_k)
 • N = 154 key terms
  • 59 key phrases and 95 keywords
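A minimal sketch of building the reference list from the annotations described above.

```python
from collections import defaultdict

def reference_key_terms(annotations):
    """annotations: one list of labeled key terms per annotator. Each of
    annotator k's N_k terms scores 1/N_k; terms are ranked by summed
    score and the top N are kept, N being the average N_k."""
    score = defaultdict(float)
    for labeled in annotations:
        for term in labeled:
            score[term] += 1.0 / len(labeled)
    n = round(sum(len(a) for a in annotations) / len(annotations))
    return sorted(score, key=score.get, reverse=True)[:n]
```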
•Evaluation
 • Unsupervised learning: the number of extracted key terms is set to N
 • Supervised learning: 3-fold cross validation
 • Results are reported as F-measure (a sketch of the computation follows below)
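A minimal sketch of the F-measure computation against the reference key terms.

```python
def f_measure(extracted, reference):
    """F1 of the extracted key terms against the reference list."""
    extracted, reference = set(extracted), set(reference)
    hits = len(extracted & reference)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```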
•Feature Effectiveness
 • Neural network for keywords from ASR transcriptions

Features        F-measure (%)
Pr (Prosodic)   20.78
Lx (Lexical)    42.86
Sm (Semantic)   35.63
Pr + Lx         48.15
Pr + Lx + Sm    56.55

 • Each feature set alone gives an F1 between 20% and 42%
 • Prosodic and lexical features are additive
 • All three feature sets are useful
•Overall Performance (F-measure, %)

Approach                 Manual  ASR
Baseline                 23.38   20.78
U: TFIDF                 51.95   43.51
U: K-means Exemplar      55.84   52.60
S: AdaBoost (AB)         62.39   57.68
S: Neural Network (NN)   67.31   62.70

 • Baseline: conventional TFIDF scores without branching entropy, using stop word removal and PoS filtering
 • Branching entropy performs well
 • K-means Exemplar outperforms TFIDF
 • Supervised approaches are better than unsupervised approaches
 • Performance on ASR transcriptions is slightly worse than on manual transcriptions, but reasonable
 • Supervised learning with a neural network gives the best results
Conclusion
•We proposed a new approach to extract key terms from course lectures
•The performance is improved by
 • identifying phrases with branching entropy
 • combining prosodic, lexical, and semantic features
•The results are encouraging

Thanks to the reviewers for their valuable comments.
NTU Virtual Instructor: http://speech.ee.ntu.edu.tw/~RA/lecture
50
Key Term Extraction, National Taiwan University

More Related Content

What's hot

Natural Language Processing: L02 words
Natural Language Processing: L02 wordsNatural Language Processing: L02 words
Natural Language Processing: L02 words
ananth
 
co:op-READ-Convention Marburg - Roger Labahn
co:op-READ-Convention Marburg - Roger Labahnco:op-READ-Convention Marburg - Roger Labahn
co:op-READ-Convention Marburg - Roger Labahn
ICARUS - International Centre for Archival Research
 
Automated Abstracts and Big Data
Automated Abstracts and Big DataAutomated Abstracts and Big Data
Automated Abstracts and Big Data
Sameer Wadkar
 
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Chunyang Chen
 
Taking into account communities of practice’s specific vocabularies in inform...
Taking into account communities of practice’s specific vocabularies in inform...Taking into account communities of practice’s specific vocabularies in inform...
Taking into account communities of practice’s specific vocabularies in inform...
inscit2006
 
Frontiers of Natural Language Processing
Frontiers of Natural Language ProcessingFrontiers of Natural Language Processing
Frontiers of Natural Language Processing
Sebastian Ruder
 
2010 INTERSPEECH
2010 INTERSPEECH 2010 INTERSPEECH
2010 INTERSPEECH
WarNik Chow
 
Processing short-message communications in low-resource languages
Processing short-message communications in low-resource languages�Processing short-message communications in low-resource languages�
Processing short-message communications in low-resource languages
Robert Munro
 
Enriching Word Vectors with Subword Information
Enriching Word Vectors with Subword InformationEnriching Word Vectors with Subword Information
Enriching Word Vectors with Subword Information
Seonghyun Kim
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Bhaskar Mitra
 
Latent Topic-semantic Indexing based Automatic Text Summarization
Latent Topic-semantic Indexing based Automatic Text SummarizationLatent Topic-semantic Indexing based Automatic Text Summarization
Latent Topic-semantic Indexing based Automatic Text Summarization
Elaheh Barati
 
Topic Extraction on Domain Ontology
Topic Extraction on Domain OntologyTopic Extraction on Domain Ontology
Topic Extraction on Domain Ontology
Keerti Bhogaraju
 
Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...
alessio_ferrari
 
Teldap4 getty multilingual vocab workshop2010
Teldap4 getty multilingual vocab workshop2010Teldap4 getty multilingual vocab workshop2010
Teldap4 getty multilingual vocab workshop2010AAT Taiwan
 
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - Introduction
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - IntroductionOntology Design Patterns for Linked Data Tutorial at ISWC2016 - Introduction
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - Introduction
Aldo Gangemi
 
Languages, Ontologies and Automatic Grammar Generation - Prof. Pedro Rangel H...
Languages, Ontologies and Automatic Grammar Generation - Prof. Pedro Rangel H...Languages, Ontologies and Automatic Grammar Generation - Prof. Pedro Rangel H...
Languages, Ontologies and Automatic Grammar Generation - Prof. Pedro Rangel H...
Facultad de Informática UCM
 
Deep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text ClassificationDeep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text Classification
HPCC Systems
 

What's hot (20)

Natural Language Processing: L02 words
Natural Language Processing: L02 wordsNatural Language Processing: L02 words
Natural Language Processing: L02 words
 
co:op-READ-Convention Marburg - Roger Labahn
co:op-READ-Convention Marburg - Roger Labahnco:op-READ-Convention Marburg - Roger Labahn
co:op-READ-Convention Marburg - Roger Labahn
 
Automated Abstracts and Big Data
Automated Abstracts and Big DataAutomated Abstracts and Big Data
Automated Abstracts and Big Data
 
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
Unsupervised Software-Specific Morphological Forms Inference from Informal Di...
 
Taking into account communities of practice’s specific vocabularies in inform...
Taking into account communities of practice’s specific vocabularies in inform...Taking into account communities of practice’s specific vocabularies in inform...
Taking into account communities of practice’s specific vocabularies in inform...
 
Frontiers of Natural Language Processing
Frontiers of Natural Language ProcessingFrontiers of Natural Language Processing
Frontiers of Natural Language Processing
 
Asr
AsrAsr
Asr
 
2010 INTERSPEECH
2010 INTERSPEECH 2010 INTERSPEECH
2010 INTERSPEECH
 
Processing short-message communications in low-resource languages
Processing short-message communications in low-resource languages�Processing short-message communications in low-resource languages�
Processing short-message communications in low-resource languages
 
Enriching Word Vectors with Subword Information
Enriching Word Vectors with Subword InformationEnriching Word Vectors with Subword Information
Enriching Word Vectors with Subword Information
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)
 
Latent Topic-semantic Indexing based Automatic Text Summarization
Latent Topic-semantic Indexing based Automatic Text SummarizationLatent Topic-semantic Indexing based Automatic Text Summarization
Latent Topic-semantic Indexing based Automatic Text Summarization
 
Topic Extraction on Domain Ontology
Topic Extraction on Domain OntologyTopic Extraction on Domain Ontology
Topic Extraction on Domain Ontology
 
Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...Natural language processing for requirements engineering: ICSE 2021 Technical...
Natural language processing for requirements engineering: ICSE 2021 Technical...
 
Arabic question answering ‫‬
Arabic question answering ‫‬Arabic question answering ‫‬
Arabic question answering ‫‬
 
Teldap4 getty multilingual vocab workshop2010
Teldap4 getty multilingual vocab workshop2010Teldap4 getty multilingual vocab workshop2010
Teldap4 getty multilingual vocab workshop2010
 
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - Introduction
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - IntroductionOntology Design Patterns for Linked Data Tutorial at ISWC2016 - Introduction
Ontology Design Patterns for Linked Data Tutorial at ISWC2016 - Introduction
 
Languages, Ontologies and Automatic Grammar Generation - Prof. Pedro Rangel H...
Languages, Ontologies and Automatic Grammar Generation - Prof. Pedro Rangel H...Languages, Ontologies and Automatic Grammar Generation - Prof. Pedro Rangel H...
Languages, Ontologies and Automatic Grammar Generation - Prof. Pedro Rangel H...
 
Deep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text ClassificationDeep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text Classification
 
ppt
pptppt
ppt
 

Viewers also liked

Detecting Actionable Items in Meetings by Convolutional Deep Structured Seman...
Detecting Actionable Items in Meetings by Convolutional Deep Structured Seman...Detecting Actionable Items in Meetings by Convolutional Deep Structured Seman...
Detecting Actionable Items in Meetings by Convolutional Deep Structured Seman...
Yun-Nung (Vivian) Chen
 
Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken...
Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken...Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken...
Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken...
Yun-Nung (Vivian) Chen
 
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
Yun-Nung (Vivian) Chen
 
Language Empowering Intelligent Assistants (CHT)
Language Empowering Intelligent Assistants (CHT)Language Empowering Intelligent Assistants (CHT)
Language Empowering Intelligent Assistants (CHT)
Yun-Nung (Vivian) Chen
 
Statistical Learning from Dialogues for Intelligent Assistants
Statistical Learning from Dialogues for Intelligent AssistantsStatistical Learning from Dialogues for Intelligent Assistants
Statistical Learning from Dialogues for Intelligent Assistants
Yun-Nung (Vivian) Chen
 
End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager
End-to-End Joint Learning of Natural Language Understanding and Dialogue ManagerEnd-to-End Joint Learning of Natural Language Understanding and Dialogue Manager
End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager
Yun-Nung (Vivian) Chen
 
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Yun-Nung (Vivian) Chen
 
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Yun-Nung (Vivian) Chen
 
Deep Learning for Dialogue Modeling - NTHU
Deep Learning for Dialogue Modeling - NTHUDeep Learning for Dialogue Modeling - NTHU
Deep Learning for Dialogue Modeling - NTHU
Yun-Nung (Vivian) Chen
 
"Sorry, I didn't get that!" - Statistical Learning from Dialogues for Intelli...
"Sorry, I didn't get that!" - Statistical Learning from Dialogues for Intelli..."Sorry, I didn't get that!" - Statistical Learning from Dialogues for Intelli...
"Sorry, I didn't get that!" - Statistical Learning from Dialogues for Intelli...
Yun-Nung (Vivian) Chen
 
Intent-Aware Diversification Using a Constrained PLSA
Intent-Aware Diversification Using a Constrained PLSAIntent-Aware Diversification Using a Constrained PLSA
Intent-Aware Diversification Using a Constrained PLSA
Jacek Wasilewski
 
Pengolahan Limbah Secara Alamiah
Pengolahan Limbah Secara AlamiahPengolahan Limbah Secara Alamiah
Pengolahan Limbah Secara Alamiah
Luthfiana Rahmasari
 
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...
Yun-Nung (Vivian) Chen
 
An Intelligent Assistant for High-Level Task Understanding
An Intelligent Assistant for High-Level Task UnderstandingAn Intelligent Assistant for High-Level Task Understanding
An Intelligent Assistant for High-Level Task Understanding
Yun-Nung (Vivian) Chen
 
큐레이션 세미나 Pt자료(도서관대회 인쇄본)
큐레이션 세미나 Pt자료(도서관대회 인쇄본)큐레이션 세미나 Pt자료(도서관대회 인쇄본)
큐레이션 세미나 Pt자료(도서관대회 인쇄본)
MoonW Lee
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
[D2CAMPUS] Algorithm tips - ALGOS
[D2CAMPUS] Algorithm tips - ALGOS[D2CAMPUS] Algorithm tips - ALGOS
[D2CAMPUS] Algorithm tips - ALGOS
NAVER D2
 
NLP and LSA getting started
NLP and LSA getting startedNLP and LSA getting started
NLP and LSA getting started
Innovation Engineering
 
[D2 CAMPUS] 분야별 모임 '보안' 발표자료
[D2 CAMPUS] 분야별 모임 '보안' 발표자료[D2 CAMPUS] 분야별 모임 '보안' 발표자료
[D2 CAMPUS] 분야별 모임 '보안' 발표자료
NAVER D2
 

Viewers also liked (20)

Detecting Actionable Items in Meetings by Convolutional Deep Structured Seman...
Detecting Actionable Items in Meetings by Convolutional Deep Structured Seman...Detecting Actionable Items in Meetings by Convolutional Deep Structured Seman...
Detecting Actionable Items in Meetings by Convolutional Deep Structured Seman...
 
Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken...
Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken...Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken...
Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken...
 
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
 
Language Empowering Intelligent Assistants (CHT)
Language Empowering Intelligent Assistants (CHT)Language Empowering Intelligent Assistants (CHT)
Language Empowering Intelligent Assistants (CHT)
 
Statistical Learning from Dialogues for Intelligent Assistants
Statistical Learning from Dialogues for Intelligent AssistantsStatistical Learning from Dialogues for Intelligent Assistants
Statistical Learning from Dialogues for Intelligent Assistants
 
End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager
End-to-End Joint Learning of Natural Language Understanding and Dialogue ManagerEnd-to-End Joint Learning of Natural Language Understanding and Dialogue Manager
End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager
 
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
 
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
Unsupervised Learning and Modeling of Knowledge and Intent for Spoken Dialogu...
 
Deep Learning for Dialogue Modeling - NTHU
Deep Learning for Dialogue Modeling - NTHUDeep Learning for Dialogue Modeling - NTHU
Deep Learning for Dialogue Modeling - NTHU
 
"Sorry, I didn't get that!" - Statistical Learning from Dialogues for Intelli...
"Sorry, I didn't get that!" - Statistical Learning from Dialogues for Intelli..."Sorry, I didn't get that!" - Statistical Learning from Dialogues for Intelli...
"Sorry, I didn't get that!" - Statistical Learning from Dialogues for Intelli...
 
Intent-Aware Diversification Using a Constrained PLSA
Intent-Aware Diversification Using a Constrained PLSAIntent-Aware Diversification Using a Constrained PLSA
Intent-Aware Diversification Using a Constrained PLSA
 
Pengolahan Limbah Secara Alamiah
Pengolahan Limbah Secara AlamiahPengolahan Limbah Secara Alamiah
Pengolahan Limbah Secara Alamiah
 
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...
Leveraging Behavioral Patterns of Mobile Applications for Personalized Spoken...
 
An Intelligent Assistant for High-Level Task Understanding
An Intelligent Assistant for High-Level Task UnderstandingAn Intelligent Assistant for High-Level Task Understanding
An Intelligent Assistant for High-Level Task Understanding
 
큐레이션 세미나 Pt자료(도서관대회 인쇄본)
큐레이션 세미나 Pt자료(도서관대회 인쇄본)큐레이션 세미나 Pt자료(도서관대회 인쇄본)
큐레이션 세미나 Pt자료(도서관대회 인쇄본)
 
NLP_session-1
NLP_session-1NLP_session-1
NLP_session-1
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
[D2CAMPUS] Algorithm tips - ALGOS
[D2CAMPUS] Algorithm tips - ALGOS[D2CAMPUS] Algorithm tips - ALGOS
[D2CAMPUS] Algorithm tips - ALGOS
 
NLP and LSA getting started
NLP and LSA getting startedNLP and LSA getting started
NLP and LSA getting started
 
[D2 CAMPUS] 분야별 모임 '보안' 발표자료
[D2 CAMPUS] 분야별 모임 '보안' 발표자료[D2 CAMPUS] 분야별 모임 '보안' 발표자료
[D2 CAMPUS] 분야별 모임 '보안' 발표자료
 

Similar to Automatic Key Term Extraction from Spoken Course Lectures

Artificial Intelligence Notes Unit 4
Artificial Intelligence Notes Unit 4Artificial Intelligence Notes Unit 4
Artificial Intelligence Notes Unit 4
DigiGurukul
 
Extraction Based automatic summarization
Extraction Based automatic summarizationExtraction Based automatic summarization
Extraction Based automatic summarization
Abdelaziz Al-Rihawi
 
SoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming textSoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming text
Sujit Pal
 
Word Frequency Dominance and L2 Word Recognition
Word Frequency Dominance and L2 Word RecognitionWord Frequency Dominance and L2 Word Recognition
Word Frequency Dominance and L2 Word Recognition
Yu Tamura
 
Voice Cloning
Voice CloningVoice Cloning
Voice Cloning
Aatiz Ghimire
 
Eskm20140903
Eskm20140903Eskm20140903
Eskm20140903
Shuhei Otani
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Saurabh Kaushik
 
Information extraction for Free Text
Information extraction for Free TextInformation extraction for Free Text
Information extraction for Free Textbutest
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature survey
Akshay Hegde
 
From Semantics to Self-supervised Learning for Speech and Beyond (Opening Ke...
From Semantics to Self-supervised Learning  for Speech and Beyond (Opening Ke...From Semantics to Self-supervised Learning  for Speech and Beyond (Opening Ke...
From Semantics to Self-supervised Learning for Speech and Beyond (Opening Ke...
linshanleearchive
 
Research_Wu.pptx
Research_Wu.pptxResearch_Wu.pptx
Research_Wu.pptx
Rakesh Pogula
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Mustafa Jarrar
 
haenelt.ppt
haenelt.ppthaenelt.ppt
haenelt.ppt
ssuser4293bd
 
Spoken Content Retrieval - Lattices and Beyond
Spoken Content Retrieval - Lattices and BeyondSpoken Content Retrieval - Lattices and Beyond
Spoken Content Retrieval - Lattices and Beyond
linshanleearchive
 
Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesValentina Paunovic
 
DNN Model Interpretability
DNN Model InterpretabilityDNN Model Interpretability
DNN Model Interpretability
Subhashis Hazarika
 
李宏毅/當語音處理遇上深度學習
李宏毅/當語音處理遇上深度學習李宏毅/當語音處理遇上深度學習
李宏毅/當語音處理遇上深度學習
台灣資料科學年會
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...
RajkiranVeluri
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognition
Richie
 
Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLP
Satyam Saxena
 

Similar to Automatic Key Term Extraction from Spoken Course Lectures (20)

Artificial Intelligence Notes Unit 4
Artificial Intelligence Notes Unit 4Artificial Intelligence Notes Unit 4
Artificial Intelligence Notes Unit 4
 
Extraction Based automatic summarization
Extraction Based automatic summarizationExtraction Based automatic summarization
Extraction Based automatic summarization
 
SoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming textSoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming text
 
Word Frequency Dominance and L2 Word Recognition
Word Frequency Dominance and L2 Word RecognitionWord Frequency Dominance and L2 Word Recognition
Word Frequency Dominance and L2 Word Recognition
 
Voice Cloning
Voice CloningVoice Cloning
Voice Cloning
 
Eskm20140903
Eskm20140903Eskm20140903
Eskm20140903
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
 
Information extraction for Free Text
Information extraction for Free TextInformation extraction for Free Text
Information extraction for Free Text
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature survey
 
From Semantics to Self-supervised Learning for Speech and Beyond (Opening Ke...
From Semantics to Self-supervised Learning  for Speech and Beyond (Opening Ke...From Semantics to Self-supervised Learning  for Speech and Beyond (Opening Ke...
From Semantics to Self-supervised Learning for Speech and Beyond (Opening Ke...
 
Research_Wu.pptx
Research_Wu.pptxResearch_Wu.pptx
Research_Wu.pptx
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing
 
haenelt.ppt
haenelt.ppthaenelt.ppt
haenelt.ppt
 
Spoken Content Retrieval - Lattices and Beyond
Spoken Content Retrieval - Lattices and BeyondSpoken Content Retrieval - Lattices and Beyond
Spoken Content Retrieval - Lattices and Beyond
 
Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositories
 
DNN Model Interpretability
DNN Model InterpretabilityDNN Model Interpretability
DNN Model Interpretability
 
李宏毅/當語音處理遇上深度學習
李宏毅/當語音處理遇上深度學習李宏毅/當語音處理遇上深度學習
李宏毅/當語音處理遇上深度學習
 
Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...Natural Language Processing, Techniques, Current Trends and Applications in I...
Natural Language Processing, Techniques, Current Trends and Applications in I...
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognition
 
Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLP
 

More from Yun-Nung (Vivian) Chen

End-to-End Task-Completion Neural Dialogue Systems
End-to-End Task-Completion Neural Dialogue SystemsEnd-to-End Task-Completion Neural Dialogue Systems
End-to-End Task-Completion Neural Dialogue Systems
Yun-Nung (Vivian) Chen
 
How the Context Matters Language and Interaction in Dialogues
How the Context Matters Language and Interaction in DialoguesHow the Context Matters Language and Interaction in Dialogues
How the Context Matters Language and Interaction in Dialogues
Yun-Nung (Vivian) Chen
 
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...
Yun-Nung (Vivian) Chen
 
Chatbot的智慧與靈魂
Chatbot的智慧與靈魂Chatbot的智慧與靈魂
Chatbot的智慧與靈魂
Yun-Nung (Vivian) Chen
 
Deep Learning for Dialogue Systems
Deep Learning for Dialogue SystemsDeep Learning for Dialogue Systems
Deep Learning for Dialogue Systems
Yun-Nung (Vivian) Chen
 
One Day for Bot 一天搞懂聊天機器人
One Day for Bot 一天搞懂聊天機器人One Day for Bot 一天搞懂聊天機器人
One Day for Bot 一天搞懂聊天機器人
Yun-Nung (Vivian) Chen
 

More from Yun-Nung (Vivian) Chen (6)

End-to-End Task-Completion Neural Dialogue Systems
End-to-End Task-Completion Neural Dialogue SystemsEnd-to-End Task-Completion Neural Dialogue Systems
End-to-End Task-Completion Neural Dialogue Systems
 
How the Context Matters Language and Interaction in Dialogues
How the Context Matters Language and Interaction in DialoguesHow the Context Matters Language and Interaction in Dialogues
How the Context Matters Language and Interaction in Dialogues
 
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information ...
 
Chatbot的智慧與靈魂
Chatbot的智慧與靈魂Chatbot的智慧與靈魂
Chatbot的智慧與靈魂
 
Deep Learning for Dialogue Systems
Deep Learning for Dialogue SystemsDeep Learning for Dialogue Systems
Deep Learning for Dialogue Systems
 
One Day for Bot 一天搞懂聊天機器人
One Day for Bot 一天搞懂聊天機器人One Day for Bot 一天搞懂聊天機器人
One Day for Bot 一天搞懂聊天機器人
 

Recently uploaded

Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
g2nightmarescribd
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 

Recently uploaded (20)

Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 

Automatic Key Term Extraction from Spoken Course Lectures

  • 1. Yun-Nung (Vivian) Chen, Yu Huang, Sheng-Yi Kong, Lin-Shan Lee National Taiwan University, Taiwan
  • 2. 2 Key Term Extraction, National Taiwan University
  • 3. Definition •Key Term • Higher term frequency • Core content •Two types • Keyword • Key phrase •Advantage • Indexing and retrieval • The relations between key terms and segments of documents 3 Key Term Extraction, National Taiwan University
  • 4. Introduction 4 Key Term Extraction, National Taiwan University
  • 5. Introduction 5 Key Term Extraction, National Taiwan University acoustic model language model hmm n gram phonehidden Markov model
  • 6. Introduction 6 Key Term Extraction, National Taiwan University hmm acoustic model language model n gram phonehidden Markov model bigram Target: extract key terms from course lectures
  • 7. 7 Key Term Extraction, National Taiwan University
  • 8. Automatic Key Term Extraction 8 Key Term Extraction, National Taiwan University ▼ Original spoken documents Archive of spoken documents Branching Entropy Feature Extraction Learning Methods 1) K-means Exemplar 2) AdaBoost 3) Neural Network ASR speech signal ASR trans
  • 9. Automatic Key Term Extraction 9 Key Term Extraction, National Taiwan University Archive of spoken documents Branching Entropy Feature Extraction Learning Methods 1) K-means Exemplar 2) AdaBoost 3) Neural Network ASR speech signal ASR trans
  • 10. Automatic Key Term Extraction 10 Key Term Extraction, National Taiwan University Archive of spoken documents Branching Entropy Feature Extraction Learning Methods 1) K-means Exemplar 2) AdaBoost 3) Neural Network ASR speech signal ASR trans
  • 11. Phrase Identification Automatic Key Term Extraction 11 Key Term Extraction, National Taiwan University Archive of spoken documents Branching Entropy Feature Extraction Learning Methods 1) K-means Exemplar 2) AdaBoost 3) Neural Network ASR speech signal First using branching entropy to identify phrases ASR trans
  • 12. Phrase Identification Key Term Extraction Automatic Key Term Extraction 12 Key Term Extraction, National Taiwan University Archive of spoken documents Branching Entropy Feature Extraction Learning Methods 1) K-means Exemplar 2) AdaBoost 3) Neural Network ASR speech signal Key terms entropy acoustic model : Then using learning methods to extract key terms by some features ASR trans
  • 13. Phrase Identification Key Term Extraction Automatic Key Term Extraction 13 Key Term Extraction, National Taiwan University Archive of spoken documents Branching Entropy Feature Extraction Learning Methods 1) K-means Exemplar 2) AdaBoost 3) Neural Network ASR speech signal Key terms entropy acoustic model : ASR trans
  • 14. Branching Entropy 14 Key Term Extraction, National Taiwan University • “hidden” is almost always followed by the same word hidden Markov model How to decide the boundary of a phrase? represent is can : : is of in : :
  • 15. Branching Entropy 15 Key Term Extraction, National Taiwan University • “hidden” is almost always followed by the same word • “hidden Markov” is almost always followed by the same word hidden Markov model How to decide the boundary of a phrase? represent is can : : is of in : :
  • 16. Branching Entropy 16 Key Term Extraction, National Taiwan University hidden Markov model boundary Define branching entropy to decide possible boundary How to decide the boundary of a phrase? represent is can : : is of in : : • “hidden” is almost always followed by the same word • “hidden Markov” is almost always followed by the same word • “hidden Markov model” is followed by many different words
  • 17. Branching Entropy 17 Key Term Extraction, National Taiwan University hidden Markov model • Definition of Right Branching Entropy • Probability of children xi for X • Right branching entropy for X X xi How to decide the boundary of a phrase? represent is can : : is of in : :
  • 18. Branching Entropy 18 Key Term Extraction, National Taiwan University hidden Markov model • Decision of Right Boundary • Find the right boundary located between X and xi where X boundary How to decide the boundary of a phrase? represent is can : : is of in : :
  • 19. Branching Entropy 19 Key Term Extraction, National Taiwan University hidden Markov model How to decide the boundary of a phrase? represent is can : : is of in : :
  • 20. Branching Entropy 20 Key Term Extraction, National Taiwan University hidden Markov model How to decide the boundary of a phrase? represent is can : : is of in : :
  • 21. Branching Entropy 21 Key Term Extraction, National Taiwan University hidden Markov model How to decide the boundary of a phrase? represent is can : : is of in : :
  • 22. Branching Entropy. Decision of Left Boundary: find the left boundary located between X and xi where the left branching entropy becomes higher; here X is the reversed pattern, e.g. "model Markov hidden". A PAT tree is used to implement this.
  • 23. Branching Entropy. Implementation in the PAT tree: every node stores a pattern and its frequency, so the child probabilities p(xi) and the right branching entropy of X can be computed at each node (left entropies are computed on the reversed patterns). [example subtree: X = "hidden Markov", x1 = "hidden Markov model", x2 = "hidden Markov chain"]
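The boundary decision above is straightforward to prototype. Below is a minimal sketch, assuming plain Python dictionaries of n-gram continuation counts rather than the PAT tree the slides use for efficiency; the toy corpus and function names are illustrative, not from the paper.

```python
import math
from collections import defaultdict

def ngram_counts(sentences, max_n=4):
    """Count how often each word follows each prefix (n-gram continuation counts)."""
    follow = defaultdict(lambda: defaultdict(int))
    for words in sentences:
        for i in range(len(words)):
            for n in range(1, max_n + 1):
                if i + n < len(words):
                    prefix = tuple(words[i:i + n])
                    follow[prefix][words[i + n]] += 1
    return follow

def right_entropy(follow, prefix):
    """H_r(X) = -sum_i p(x_i) log p(x_i), with p(x_i) = freq(x_i) / freq(X)."""
    children = follow.get(tuple(prefix), {})
    total = sum(children.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in children.values())

# Toy corpus: the entropy stays low while the phrase continues, then rises at the boundary.
sentences = [
    "the hidden Markov model is a model".split(),
    "a hidden Markov model can represent speech".split(),
    "the hidden Markov model of speech".split(),
    "every hidden Markov chain has states".split(),
]
follow = ngram_counts(sentences)
for prefix in (["hidden"], ["hidden", "Markov"], ["hidden", "Markov", "model"]):
    print(" ".join(prefix), "->", round(right_entropy(follow, prefix), 3))
```

On this toy corpus the entropy is zero after "hidden" (always followed by "Markov") and jumps after "hidden Markov model", which is exactly the signal used to place the right boundary; the left boundary would be found the same way on reversed word order.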
  • 24. Automatic Key Term Extraction, Feature Extraction: extract prosodic, lexical, and semantic features for each candidate term. [same flowchart]
  • 25. Feature Extraction. Prosodic features, computed for the first appearance of each candidate term: Duration (I–IV) = normalized duration (max, min, mean, range); the duration of each phone is normalized by the average duration of that phone, and four values summarize the whole term. Speakers tend to use longer duration to emphasize key terms. (A code sketch follows the prosodic feature list below.)
  • 26.–27. Feature Extraction. Prosodic features: higher pitch may represent significant information. Pitch (I–IV) = F0 (max, min, mean, range).
  • 28.–29. Feature Extraction. Prosodic features: higher energy emphasizes important information. Energy (I–IV) = energy (max, min, mean, range).
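As a concrete illustration of the Duration (I–IV) features, here is a small sketch; it assumes phone-level alignments are already available as (phone, duration) pairs, and all names and numbers are hypothetical. Pitch (I–IV) and Energy (I–IV) would be computed analogously over frame-level F0 and energy values.

```python
from statistics import mean

def duration_features(term_phones, avg_phone_duration):
    """Duration (I-IV): max, min, mean, range of the phone durations in a term,
    each duration normalized by the average duration of that phone."""
    normalized = [dur / avg_phone_duration[ph] for ph, dur in term_phones]
    return {
        "dur_max": max(normalized),
        "dur_min": min(normalized),
        "dur_mean": mean(normalized),
        "dur_range": max(normalized) - min(normalized),
    }

# Hypothetical alignment for the first occurrence of one candidate term.
avg = {"a": 0.09, "k": 0.05, "u": 0.08}
term = [("a", 0.12), ("k", 0.06), ("u", 0.10)]
print(duration_features(term, avg))
```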
  • 30. Feature Extraction. Lexical features, using some well-known features for each candidate term: TF = term frequency; IDF = inverse document frequency; TFIDF = TF × IDF; PoS = the PoS tag.
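These lexical features are standard; a sketch of one common TF/IDF variant is below. The slides do not specify the exact weighting, so the log-IDF form here is an assumption.

```python
import math
from collections import Counter

def tfidf(docs):
    """TF, IDF, and TFIDF for every term; docs is a list of token lists."""
    df = Counter()
    for doc in docs:
        df.update(set(doc))          # document frequency: one count per document
    n_docs = len(docs)
    features = []
    for doc in docs:
        tf = Counter(doc)            # raw term frequency within this document
        features.append({
            t: {"tf": tf[t],
                "idf": math.log(n_docs / df[t]),
                "tfidf": tf[t] * math.log(n_docs / df[t])}
            for t in tf
        })
    return features

docs = [["acoustic", "model", "phone"], ["language", "model", "n-gram"]]
print(tfidf(docs)[0]["acoustic"])
```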
  • 31. Feature Extraction. Semantic features: Probabilistic Latent Semantic Analysis (PLSA) yields latent topic probabilities P(Tk|Di) and P(tj|Tk), where the Di are documents, the Tk latent topics, and the tj terms. Key terms tend to focus on limited topics.
  • 32. Feature Extraction. Semantic features: LTP (I–III) = Latent Topic Probability (mean, variance, standard deviation), describing a term's probability distribution over topics. A non-key term spreads over many topics, while a key term concentrates on a few.
  • 33.–34. Feature Extraction. Semantic features: LTS (I–III) = Latent Topic Significance (mean, variance, standard deviation), a within-topic frequency to out-of-topic frequency ratio.
  • 35.–36. Feature Extraction. Semantic features: LTE = the term's entropy over latent topics; a non-key term has higher LTE, a key term has lower LTE.
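A sketch of the LTP (I–III) and LTE features, assuming the PLSA topic posteriors P(Tk|t) for each term are already estimated; the two example distributions are made up to mirror the slide's non-key-term vs. key-term contrast. The LTS features would additionally use the within-topic to out-of-topic ratio rather than the raw probabilities.

```python
import math
from statistics import mean, pstdev, pvariance

def semantic_features(topic_probs):
    """LTP (I-III) and LTE from a term's PLSA topic distribution.
    topic_probs: list of P(T_k | t) over all K latent topics (sums to 1)."""
    lte = -sum(p * math.log(p) for p in topic_probs if p > 0)  # latent topic entropy
    return {
        "ltp_mean": mean(topic_probs),
        "ltp_var": pvariance(topic_probs),
        "ltp_std": pstdev(topic_probs),
        "lte": lte,
    }

# A term spread over all topics (function-word-like) vs. one focused on few topics.
flat = [0.25, 0.25, 0.25, 0.25]
peaked = [0.85, 0.05, 0.05, 0.05]
print(semantic_features(flat)["lte"], ">", semantic_features(peaked)["lte"])
```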
  • 37. Automatic Key Term Extraction, Key Term Extraction: unsupervised and supervised approaches are used to extract the key terms. [same flowchart]
  • 38. Learning Methods. Unsupervised learning: K-means Exemplar. Transform each term into a vector in LTS (Latent Topic Significance) space, run K-means, and take the centroid of each cluster as a key term: the terms in a cluster focus on a single topic, the candidate terms in the same cluster are related to the key term, and the key term can represent that topic.
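A minimal sketch of K-means Exemplar, using scikit-learn's KMeans as a stand-in for whatever clustering implementation the authors used; the terms and LTS vectors below are invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_exemplars(terms, lts_vectors, n_key_terms):
    """Cluster terms in LTS space and return the term closest to each
    centroid as that cluster's exemplar (key term)."""
    X = np.asarray(lts_vectors)
    km = KMeans(n_clusters=n_key_terms, n_init=10, random_state=0).fit(X)
    exemplars = []
    for k, center in enumerate(km.cluster_centers_):
        members = np.where(km.labels_ == k)[0]
        # pick the cluster member nearest to the centroid
        best = members[np.argmin(np.linalg.norm(X[members] - center, axis=1))]
        exemplars.append(terms[best])
    return exemplars

# Hypothetical 3-topic LTS vectors for five candidate terms.
terms = ["entropy", "uncertainty", "acoustic model", "phone", "hmm"]
vecs = [[2.0, 0.1, 0.1], [1.8, 0.2, 0.1], [0.1, 2.2, 0.3], [0.2, 2.0, 0.2], [0.1, 0.2, 2.5]]
print(kmeans_exemplars(terms, vecs, 3))
```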
  • 39. Learning Methods. Supervised learning: Adaptive Boosting and Neural Network, which automatically adjust the weights of the features to produce a classifier.
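A sketch of the supervised setup, with scikit-learn stand-ins for the paper's AdaBoost and neural-network classifiers and the 3-fold cross validation mentioned in the evaluation; the feature matrix here is synthetic, not the paper's data.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical feature matrix: one row per candidate term
# (prosodic + lexical + semantic features); label 1 = key term.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

for name, clf in [("AdaBoost", AdaBoostClassifier()),
                  ("Neural Network", MLPClassifier(max_iter=1000))]:
    scores = cross_val_score(clf, X, y, cv=3, scoring="f1")  # 3-fold CV, F1 measure
    print(name, scores.mean().round(3))
```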
  • 40. [section divider]
  • 41. Experiments. Corpus: the NTU lecture corpus, Mandarin Chinese embedded with English words, single speaker, 45.2 hours. Example: 我們的solution是viterbi algorithm (Our solution is the Viterbi algorithm).
  • 42. Experiments. ASR accuracy: character accuracy 78.15% for Mandarin, 53.44% for English, 76.26% overall. Acoustic models: bilingual Chinese/English speaker-independent models adapted with some data from the target speaker. Language model: adaptive trigram interpolation of out-of-domain (background) corpora with the in-domain corpus.
  • 43. Experiments. Reference key terms: annotations from 61 students who had taken the course. If the k-th annotator labeled Nk key terms, each of them received a score of 1/Nk and all other terms 0; terms are ranked by the sum of the scores given by all annotators, and the top N terms are chosen from the list (N is the average Nk). N = 154 key terms: 59 key phrases and 95 keywords.
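The reference-list construction can be written down directly; the sketch below assumes the per-annotator score is 1/Nk (so each annotator distributes one unit of weight), matching the normalization described on the slide.

```python
from collections import defaultdict

def reference_key_terms(annotations):
    """Each annotator's Nk labeled terms get a score of 1/Nk; terms are
    ranked by total score and the top N kept, with N = average Nk."""
    scores = defaultdict(float)
    for labeled in annotations:          # one set of labeled terms per annotator
        for term in labeled:
            scores[term] += 1.0 / len(labeled)
    n = round(sum(len(a) for a in annotations) / len(annotations))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]

annotations = [{"entropy", "hmm", "acoustic model"}, {"hmm", "viterbi"}, {"hmm", "entropy"}]
print(reference_key_terms(annotations))
```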
  • 44. Experiments. Evaluation: for unsupervised learning, the number of key terms is set to N; supervised learning uses 3-fold cross validation.
  • 45. Experiments. Feature effectiveness (neural network, keywords, ASR transcriptions; F-measure): Pr 20.78, Lx 42.86, Sm 35.63, Pr+Lx 48.15, Pr+Lx+Sm 56.55 (Pr: prosodic, Lx: lexical, Sm: semantic). Each feature set alone gives an F1 between 20% and 42%; prosodic and lexical features are additive; all three sets of features are useful.
  • 46. Experiments. Overall performance on manual transcriptions (F-measure): Baseline 23.38, U: TFIDF 51.95, U: K-means 55.84, S: AB 62.39, S: NN 67.31 (U: unsupervised, S: supervised, AB: AdaBoost, NN: Neural Network). The baseline is conventional TFIDF scoring without branching entropy, with stop word removal and PoS filtering. Branching entropy performs well; K-means Exemplar outperforms TFIDF; supervised approaches are better than unsupervised ones.
  • 47. Experiments. Overall performance, manual / ASR (F-measure): Baseline 23.38 / 20.78, U: TFIDF 51.95 / 43.51, U: K-means 55.84 / 52.60, S: AB 62.39 / 57.68, S: NN 67.31 / 62.70. Performance on ASR transcriptions is slightly worse than on manual ones but reasonable; supervised learning with a neural network gives the best results.
  • 48. [section divider]
  • 49. Conclusion. We propose a new approach to extract key terms. The performance is improved by identifying phrases with branching entropy and by combining prosodic, lexical, and semantic features. The results are encouraging.
  • 50. We thank the reviewers for their valuable comments. NTU Virtual Instructor: http://speech.ee.ntu.edu.tw/~RA/lecture

Editor's Notes

  1. Hello, everybody. I am Vivian Chen from National Taiwan University. Today I am going to present my work on automatic key term extraction from spoken course lectures using branching entropy and prosodic/semantic features.
  2. First I will define what a key term is. A key term is a term with higher term frequency that carries core content. There are two types of key terms. One type is a phrase, which we call a key phrase; for example, "language model" is a key phrase. The other type is a single word, which we call a keyword, like "entropy". Key term extraction has two advantages: it helps indexing and retrieval, and it lets us construct the relations between key terms and segments of documents. Here's an example.
  3. We can show some key terms related to "acoustic model". If a key term and "acoustic model" co-occur in the same document, they are relevant, so we can show them to users.
  4. Then we can construct a key term graph to represent the relationships between these key terms, like this.
  5. Similarly, we can construct the relations between "language model" and other terms, and then show the whole graph to see how the key terms are organized.
  6. Here's the flow chart. We start with a large archive of spoken documents.
  7. First, with a speech recognition system, we get the ASR transcriptions.
  8. The input of our work includes transcriptions and speech signals.
  9. The first part is phrase identification. We propose branching entropy for this part.
  10. Then we enter the key term extraction part. We first extract features to represent each candidate term, and then use machine learning methods to decide the key terms.
  11. First we explain how to do phrase identification.
  12. The target of this part is to decide the boundary of a phrase, but where is the boundary? We can observe some characteristics first: "hidden" is almost always followed by "Markov".
  13. Similarly, "hidden Markov" is almost always followed by "model".
  14. But at this position, we find that "hidden Markov model" is followed by several different words, so this position is more likely to be a boundary. We then define branching entropy from this observation.
  15. First, I will define the right branching entropy. We assume that X is "hidden" and x_i is a child of X, representing "hidden Markov". We define p(x_i) as the frequency of x_i over the frequency of X, i.e. the probability of child x_i given X. Then we define the right branching entropy as H_r(X) = -Σ_i p(x_i) log p(x_i), the entropy of X's children.
  16. The idea is that the branching entropy is higher at a boundary, so we can find the right boundary where the branching entropy becomes higher, as after "hidden Markov model".
  17. Similarly, the left branching entropy has the same property.
  18. The branching entropy approach finds the boundaries of key phrases; the idea is that the word entropy is higher at a boundary. There are four parts in this approach, and we present each part in detail.
  19. (Same note as the previous slide.)
  20. We also decide the left boundary, by H_l of X bar, where X bar is the reversed pattern, e.g. "model Markov hidden". Branching entropy can be implemented with a PAT tree.
  21. Then, in the PAT tree, we can compute the right branching entropy for each node. As an example of p(x_i): X is the node representing the phrase "hidden Markov", x_1 is X's child "hidden Markov model", and x_2 is another child "hidden Markov chain". With p(x_i) as defined previously, we compute the right branching entropy of this node. We compute H_r(X) for all X in the PAT tree and H_l(X bar) for all X bar in the reversed PAT tree, and can then decide all the phrase boundaries.
  22. With all candidate terms, including words and phrases, we can extract prosodic, lexical, and semantic features for each term.
  23. For each word, we also compute some prosodic features. First, we believe a lecturer uses longer duration to emphasize key terms. For the first appearance of each candidate term, we compute the duration of each phone in the term and normalize it by the average duration of that phone. We then use only four values to represent the term: the maximum, minimum, mean, and range over all phones in the term.
  24. We believe that higher pitch may represent important information, so we extract the pitch contour like this.
  25. The method is like the duration features, but the segment unit is a single frame. We again use the same four values as features.
  26. Similarly, we think that higher energy may represent important information, so we also extract the energy of each frame in a candidate term.
  27. The energy features parallel the pitch features. This completes the first set of features.
  28. The second set of features is lexical features. These are well-known features that may indicate the importance of a term, and we simply use them to represent each term.
  29. The third set of features is semantic features. The assumption is that key terms tend to focus on limited topics. We use PLSA to compute semantic features, obtaining the probability of each topic given a candidate term.
  30. Here are the distributions of two terms. Notice that the first term has similar probability for most topics; it may be a function word, so this candidate term is likely a non-key term. The other term focuses on fewer topics, so it is more likely to be a key term. To describe each distribution, we compute its mean, variance, and standard deviation as the features of the term.
  31. Similarly, we can compute the latent topic significance, the within-topic to out-of-topic ratio: LTS(t_i, T_k) = Σ_j n(t_i, d_j) P(T_k|d_j) / Σ_j n(t_i, d_j) (1 - P(T_k|d_j)), where n(t_i, d_j) is the number of occurrences of t_i in document d_j; the numerator is the within-topic frequency and the denominator is the out-of-topic frequency.
  32. Because the significance of a key term focuses on limited topics, we again compute these three values to represent the distribution.
  33. Then we can also compute the latent topic entropy, LTE(t_i) = -Σ_k P(T_k|t_i) log P(T_k|t_i), because this feature also describes the difference between the two distributions: a non-key term has higher LTE, while a key term has lower LTE.
  34. Finally, we use some machine learning methods to extract key terms.
  35. The first one is unsupervised learning, K-means Exemplar. First, we transform each term into a vector in the latent topic significance space. With these vectors, we run K-means. The terms in the same cluster focus on a single topic, so we extract the centroid of each cluster as a key term, because the terms in that group are related to the key term.
  36. We also use two supervised learning methods, adaptive boosting and a neural network, which automatically adjust the weights of the features to produce a classifier.
  37. Then we run some experiments to evaluate our approach. The corpus is the NTU lecture corpus, which is Mandarin Chinese with embedded English words, as in this example: "我們的solution是viterbi algorithm", which means "our solution is the Viterbi algorithm". The lectures are from a single speaker, and the corpus totals about 45 hours.
  38. In the ASR system, we train two acoustic models, for Chinese and English, and use some data from the target speaker for adaptation, finally obtaining a bilingual acoustic model. The language model is a trigram interpolation of out-of-domain corpora and some in-domain data. Here is the ASR accuracy.
  39. To evaluate our results, we need to generate a reference key term list. The reference key terms come from the annotations of students who had taken the course. We then rank all terms and take the top N as key terms, where N is the average number of key terms labeled per student; N is 154. The reference key term list includes 59 key phrases and 95 keywords.
  40. Finally we evaluate the results. For unsupervised learning, we set the number of key terms to N; supervised learning is evaluated with 3-fold cross validation.
  41. In this experiment, we look at feature effectiveness. The results cover keywords only, from a neural network on ASR transcriptions. Rows a, b, and c give F1 measures from 20% to 42%. Row d shows that prosodic and lexical features are additive, and row e shows that adding semantic features further improves the performance, so all three feature sets are useful.
  42. Finally, we show the overall performance for manual and ASR transcriptions. The baseline is conventional TFIDF without the phrases extracted by branching entropy. The better performance of the other methods shows that branching entropy is very useful; this supports our assumption that a term with higher branching entropy is more likely to be a key term. The best results come from supervised learning with a neural network, achieving F1 measures of 67 and 62.
  43. (Same note as the previous slide.)
  44. From the above experiments, we conclude that the proposed approach extracts key terms effectively. The performance is improved by two ideas: using branching entropy to extract key phrases, and using the three sets of features together.