Kyunghoon Kim
UNIST
Department of Mathematical Sciences
December 12, 2017
kyunghoon@unist.ac.kr
A Mathematical Measurement for Korean Text mining
and its applications
Difficulty of Korean Language
Kyunghoon Kim (UNIST) A Mathematical Measurement for Korean Text mining and its applications Dec 12, 2017 2 / 83
• New Concepts
- Korean Alphabet
- End of a word
- Postposition
- Word order (SOV, …)
- …
Language Destruction
Outline
Korean Text Mining
1. Text Summarization (2013): Korean Language Feature V2; Fuzzy System
2. Text Clustering (2015): Syllable Vector; Term-Frequency Matrix ( LSI, NMF )
3. Learning of Text Relationship (2017): Heterogeneous Word2Vec ( Law2Vec ); Artificial Neural Network ( Word2Vec )
1. Text Summarization | Motivation
< Raw News Article > < Summarized News >
March, 2013
How about Korean?
News Article
Summarized
Sentences
1. Text Summarization | Process
Document Preprocessing Feature Selection Scoring by Model Refinement & Sorting by score
[Figure: morphological analysis output for a sample sentence, with part-of-speech tags NNP, JKB, NNG, JC, XSN, JKS, MAG, VV, EC, VX, EF, SF]
• Content word(Keyword) feature
• Title word feature
• Sentence location feature
• Sentence Length feature
• Proper Noun feature
• Upper-case word feature
• Cue-Phrase feature
• Biased Word feature
• Font based feature
• Pronouns
• Sentence-to-Sentence Cohesion
• Sentence-to-Centroid Cohesion
• Occurrence of non-essential information
• Discourse analysis
Only for English features
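Three of the listed features (title word, sentence location, sentence length) can be sketched as follows. This is a minimal illustration under stated assumptions, not the scoring used in the talk: whitespace tokenization, the three feature formulas, and the plain mean used to combine them are all hypothetical stand-ins (the deck combines features with a fuzzy system, which is not reproduced here).

```python
def sentence_features(sentences, title):
    """Compute three illustrative features per sentence (all in [0, 1])."""
    title_words = set(title.split())
    longest = max(len(s.split()) for s in sentences)
    features = []
    for idx, sentence in enumerate(sentences):
        words = sentence.split()
        features.append({
            # Share of title words that also occur in the sentence.
            "title_word": len(title_words & set(words)) / max(len(title_words), 1),
            # Earlier sentences get a higher location score.
            "location": 1.0 - idx / len(sentences),
            # Length relative to the longest sentence.
            "length": len(words) / longest,
        })
    return features


def rank_sentences(sentences, title, k=2):
    """Keep the k best sentences (assumed scoring: plain mean of the features)."""
    feats = sentence_features(sentences, title)
    scores = [sum(f.values()) / len(f) for f in feats]
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]  # restore document order


sentences = [
    "The fuzzy summarization system ranks sentences.",
    "It rained.",
    "A summarization system keeps key sentences.",
]
summary = rank_sentences(sentences, "fuzzy summarization system", k=2)
```

The summary is returned in document order, so the selected sentences still read naturally.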
1. Text Summarization | Feature measurements
Feature based on English
Feature based on Korean
1. Text Summarization | Fuzzy Set
1. Text Summarization | Calculating the score of sentences
1. Text Summarization | Korean Text Summarization
http://summ-dev.ap-northeast-2.elasticbeanstalk.com/
1. Text Summarization | Patent, 2013
https://goo.gl/blkjwf
Korean News Summarization System And Method
2. Text Clustering
Documents → (Convert) → Matrix
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$
1. Select Matrix
A. Basic (using raw matrix)
B. LSI (Latent Semantic Indexing)
C. NMF (Non-negative Matrix Factorization)
2. Calculating similarity between each column of matrix
3. Clustering by the degree of similarity
2. Text Clustering | Term-Frequency Matrix
d1 = { apple, banana, kiwi }
d2 = { apple, banana, store }
d3 = { store }
Term-Frequency Matrix: $A = \begin{bmatrix} d_1 & \cdots & d_n \end{bmatrix}$
Each column $d_j$ is a document vector; each entry is a term frequency.
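The term-frequency matrix for the three example documents can be built as below. This is a small sketch: the alphabetical ordering of the vocabulary and the function name `term_frequency_matrix` are assumptions made for illustration.

```python
from collections import Counter

import numpy as np

# The three example documents from the slide.
docs = {
    "d1": ["apple", "banana", "kiwi"],
    "d2": ["apple", "banana", "store"],
    "d3": ["store"],
}


def term_frequency_matrix(docs):
    """Return (vocab, A) where A[i, j] counts term i in document j."""
    vocab = sorted({term for terms in docs.values() for term in terms})
    A = np.zeros((len(vocab), len(docs)), dtype=int)
    for j, terms in enumerate(docs.values()):
        counts = Counter(terms)
        for i, term in enumerate(vocab):
            A[i, j] = counts[term]
    return vocab, A


vocab, A = term_frequency_matrix(docs)
```

Rows follow `vocab` (apple, banana, kiwi, store) and columns follow the documents d1, d2, d3.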
2. Text Clustering | Singular Value Decomposition (SVD)
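Rows are words w1 to w5, columns are documents d1 to d3:
$$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = U \Sigma V^{\mathsf T}$$
$$U = \begin{pmatrix} -0.27 & 0.21 & 0.70 & -0.53 & 0.30 \\ -0.27 & 0.21 & -0.70 & -0.53 & 0.30 \\ -0.71 & -0.33 & 0 & -0.10 & -0.60 \\ -0.55 & 0.43 & 0 & 0.64 & 0.29 \\ -0.15 & -0.77 & 0 & 0.10 & 0.60 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} 2.35 & 0 & 0 \\ 0 & 1.19 & 0 \\ 0 & 0 & 1.00 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad V = \begin{pmatrix} -0.65 & 0.26 & 0.70 \\ -0.65 & 0.26 & -0.70 \\ -0.36 & -0.92 & 0 \end{pmatrix}$$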
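The decomposition of this 5 × 3 example can be checked numerically. The sketch below uses NumPy's `numpy.linalg.svd`; the rank-2 truncation at the end is an illustrative choice of `k` for LSI, not a value taken from the slides, and the signs of individual singular vectors may differ from the slide.

```python
import numpy as np

# Word-by-document matrix from the slide (rows w1..w5, columns d1..d3).
A = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
], dtype=float)

# Full SVD: U is 5x5, s holds the 3 singular values, Vt is V transposed (3x3).
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild A from the factors; Sigma is the 5x3 matrix with s on its diagonal.
Sigma = np.zeros_like(A)
Sigma[:3, :3] = np.diag(s)
A_rebuilt = U @ Sigma @ Vt

# LSI keeps only the k largest singular values (k = 2 is an assumed example).
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```

`A_k` is the best rank-`k` approximation of `A` in the least-squares sense, which is what LSI uses in place of the raw matrix.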
2. Text Clustering | Latent Semantic Indexing (LSI)
2. Text Clustering | Non-negative Matrix Factorization (NMF)
[Figure: NMF of a term-document matrix (Term 1 to Term 5, Doc1 to Doc3) into two factors expressed through Feature 1 and Feature 2]
2. Text Clustering | Term-Frequency Matrix
Large Dimension Matrix for large-scale set
Proposed method: Syllable Vector
2. Text Clustering | Syllable-n Vector
about 1,200 dimensions
2. Text Clustering | Dimension reduction using Syllable-n vector
Dimension Reduction
by Syllable Vector
Syllable-1 Syllable-2 Syllable-3
2. Text Clustering | Syllable-n-All Vector
Syllable-1-All, Syllable-2-All
Take all combinations of n syllables from each word: $\binom{l_j}{n}$ combinations, where $l_j$ is the length of word $w_j$.
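The "all combinations" rule can be sketched with `itertools.combinations`. The assumptions here: each Hangul character is treated as one syllable, combinations keep the original syllable order, and words shorter than n contribute nothing. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def syllable_n_all(word, n):
    """All order-preserving combinations of n syllables from one word."""
    return ["".join(combo) for combo in combinations(word, n)]


def syllable_n_all_counts(words, n):
    """Count syllable-n combinations over a list of words (one document)."""
    counts = Counter()
    for word in words:
        counts.update(syllable_n_all(word, n))
    return counts


# A 4-syllable word yields C(4, 2) = 6 syllable-2 combinations.
pairs = syllable_n_all("맥도날드", 2)
counts = syllable_n_all_counts(["맥도날드", "맛있다"], 1)
```

The resulting counts can fill a syllable-by-document matrix in the same way term frequencies fill the word-based one.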
2. Text Clustering | Benchmark Dataset HKIB-20000
Dimension reduction
How about information loss?
2. Text Clustering | Similarity
[Figure: angle θ between vectors a and b]
$$\mathrm{sim}(d_1, d_2) = \sqrt{2\left(1 - \frac{2/9}{\sqrt{3/9}\,\sqrt{3/9}}\right)} \approx 0.8165$$
$$\mathrm{sim}(d_2, d_3) \approx 0.919$$
$$\mathrm{sim}(d_1, d_3) \approx 1.414$$
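The three values follow from the measure sqrt(2(1 - cos θ)), which is the Euclidean distance between the two vectors after normalizing them to unit length. A short check, assuming binary document vectors over the vocabulary (apple, banana, kiwi, store) and a hypothetical function name `chord_distance`:

```python
import math


def chord_distance(a, b):
    """sqrt(2 * (1 - cos(theta))) for the angle theta between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos_theta = dot / (norm_a * norm_b)
    # Clamp at 0 to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 2.0 * (1.0 - cos_theta)))


# Vocabulary order: apple, banana, kiwi, store.
d1 = [1, 1, 1, 0]
d2 = [1, 1, 0, 1]
d3 = [0, 0, 0, 1]

s12 = chord_distance(d1, d2)
s23 = chord_distance(d2, d3)
s13 = chord_distance(d1, d3)
```

Smaller values mean more similar documents: identical directions give 0 and orthogonal vectors give sqrt(2).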
2. Text Clustering | Similarity
Source: document number 5222
Target: all other documents
2. Text Clustering | Correlation
Basic
LSI
NMF
2. Text Clustering | Evaluation of Text Clustering
2. Text Clustering | Precision
[Figure: sets labeled Real and Answer, with TP and FP regions]
$$\text{Precision} = \frac{TP}{TP + FP} = \frac{5}{7} \approx 0.71$$
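The 5/7 figure corresponds to seven returned documents of which five are correct. A minimal sketch with made-up document ids (the ids and the function name are hypothetical):

```python
def precision(retrieved, relevant):
    """TP / (TP + FP): share of retrieved documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    true_positives = len(retrieved & set(relevant))
    return true_positives / len(retrieved)


# Hypothetical ids: 7 documents returned, 5 of them in the real answer set.
retrieved = [1, 2, 3, 4, 5, 6, 7]
relevant = [1, 2, 3, 4, 5, 8, 9]
p = precision(retrieved, relevant)
```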
2. Text Clustering | Evaluation Set
Doc 1
Doc 2
Doc 3
Doc 4
Doc 5
Doc 6
…
2. Text Clustering | Standard for Evaluation
[Figure: two ways to select documents around a source document: nearest neighbors (ranked 1 to 5) and limited radius]
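The two selection standards can be sketched as follows, assuming a vector of distances from the source document to every candidate (smaller means closer). The function names and the sample distances are hypothetical.

```python
import numpy as np


def select_by_count(distances, n):
    """Nearest neighbors: indices of the n closest documents."""
    order = np.argsort(distances, kind="stable")
    return order[:n].tolist()


def select_by_radius(distances, radius):
    """Limited radius: indices of all documents within the radius, closest first."""
    order = np.argsort(distances, kind="stable").tolist()
    return [i for i in order if distances[i] <= radius]


# Hypothetical distances from one source document to five candidates.
distances = np.array([0.82, 0.92, 1.41, 0.30, 1.10])
nearest_three = select_by_count(distances, 3)
within_radius = select_by_radius(distances, 0.5)
```

A count threshold always returns the same number of documents, while a radius threshold can return none or many depending on how dense the neighborhood is.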
2. Text Clustering | Evaluation of text clustering
Radius Threshold
Syl-2
Syl-3
Word
Count Threshold
[Figure: precision and speed with LSI, n = 5, count threshold]
Syl-2 for LSI is BEST!
2. Text Clustering | Patent
https://goo.gl/fskHxT
Korean Text Clustering System and Method
2. Text Clustering | Limitation of word-based method
These words are NOT important to understand the given text!
3. Learning of Text Relationship | Limitation of word-based method
Word-based
Citation Relation
Find similar documents using citation information
3. Learning of Text Relationship | Natural Language Processing (NLP)
https://www.upwork.com/hiring/for-clients/artificial-intelligence-and-natural-language-processing-in-big-data/
3. Learning of Text Relationship | Word2Vec
2013, Hot Model in NLP: “Word2Vec” (Google)
http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/
Source Text (Red: target keyword, Blue: context keyword)
Training Set:
(맥도날드가, 햄버거는)
(맥도날드가, 맛있다.)
(맛있다., 맥도날드가)
(맛있다., 감자튀김도)
(감자튀김도, 맛있다.)
(감자튀김도, 맛있었는데..)
(맘스터치도, 햄버거는)
(맘스터치도, 맛있다.)
(맛있다., 맘스터치도)
(맛있다., 패티가)
Gloss: 맥도날드가 "McDonald's (subject)", 햄버거는 "hamburgers (topic)", 맛있다. "is tasty.", 감자튀김도 "French fries too", 맛있었는데.. "was tasty, but..", 맘스터치도 "Mom's Touch too", 패티가 "the patty (subject)"
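Pairs like these come from sliding a context window over the tokenized text. A minimal sketch, assuming whitespace tokens and a window of two tokens on each side; the three-token sample is a hypothetical fragment, since the original source text is not in the transcript.

```python
def skipgram_pairs(tokens, window=2):
    """Return (target, context) pairs for every token and its neighbors."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a token is not its own context
                pairs.append((target, tokens[j]))
    return pairs


# Hypothetical fragment; postpositions stay attached to each word.
tokens = ["맥도날드가", "햄버거는", "맛있다."]
pairs = skipgram_pairs(tokens, window=2)
```

Because tokens keep their postpositions, "맥도날드가" and a bare "맥도날드" would be different vocabulary entries, which is one reason purely word-based training is awkward for Korean.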
3. Learning of Text Relationship | Word2Vec
(맥도날드가, 햄버거는)
(맥도날드가, 맛있다.)
Input, Output
Shortcomings of Word2Vec
• Word-based only
=> Meaningless words are also counted.
• Same vocabulary set for input and output
=> The dimensions of input and output are fixed.
• Uses only the context information of the target word
=> Depends entirely on the context within window size N.
3. Learning of Text Relationship | Heterogeneous Word2Vec
Heterogeneous Word2Vec
Input Output
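One way to picture an input and output drawn from different sets is a one-hidden-layer softmax network whose two weight matrices have different outer dimensions. The sketch below is an assumption-laden illustration, not the architecture from the talk: the layer sizes, learning rate, plain SGD with full softmax, and the names `train_hetero` and `predict` are all hypothetical.

```python
import numpy as np


def train_hetero(pairs, n_in, n_out, dim=8, lr=0.1, epochs=300, seed=0):
    """Train W1 (n_in x dim) and W2 (dim x n_out) on (input_id, output_id) pairs."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(n_in, dim))
    W2 = rng.normal(scale=0.1, size=(dim, n_out))
    for _ in range(epochs):
        for i, o in pairs:
            h = W1[i].copy()            # one-hot input selects row i of W1
            z = h @ W2                  # scores over the output set
            p = np.exp(z - z.max())
            p /= p.sum()                # softmax probabilities
            grad_z = p
            grad_z[o] -= 1.0            # gradient of cross-entropy w.r.t. z
            grad_h = W2 @ grad_z        # backpropagate before updating W2
            W2 -= lr * np.outer(h, grad_z)
            W1[i] -= lr * grad_h
    return W1, W2


def predict(W1, W2, i):
    """Most likely output id for input id i."""
    return int(np.argmax(W1[i] @ W2))


# Hypothetical data: 4 input items, 5 output items from a different set.
pairs = [(0, 1), (1, 1), (2, 3), (3, 0)]
W1, W2 = train_hetero(pairs, n_in=4, n_out=5)
```

Rows of `W1` act as vectors for the input items, and inputs that point to the same outputs (items 0 and 1 here) are pushed toward similar vectors.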
[Figure: diagram with node labels 1 to 5, 1 to 5, 0, and 6]
[Figures: two training examples, each shown as a pair of one-hot column vectors of different dimensions]
Ch 4. Learning for number relationship
[Figure: the node diagram (labels 1 to 5, 1 to 5, 0, and 6), shown on two further slides]
[Figure: items 1 to 5 against items 1 to 5]
Similarity ( 0 is best )
Matrix (Vectors)
3. Learning of Text Relationship | Law2Vec
Legal information consists mainly of legislation and cases.
• CL ( Case - Legislation )
• CC ( Case - Case )
• CLC ( Case - Legislation, Case )
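The three variants differ only in which citations are paired with a case. A sketch of building the training pairs from citation metadata, assuming the case is the input and each cited item is an output; the record layout, identifiers, and function name are hypothetical.

```python
# Hypothetical citation metadata: each case lists what it cites.
cases = {
    "case_A": {"legislations": ["law_1", "law_2"], "cases": ["case_B"]},
    "case_B": {"legislations": ["law_2"], "cases": []},
}

# Which citation fields each model variant uses.
MODEL_FIELDS = {
    "CL": ["legislations"],
    "CC": ["cases"],
    "CLC": ["legislations", "cases"],
}


def build_pairs(cases, model):
    """Return (case, cited item) pairs for the CL, CC, or CLC variant."""
    fields = MODEL_FIELDS[model]
    return [
        (case_id, cited)
        for case_id, record in cases.items()
        for field in fields
        for cited in record[field]
    ]


cl_pairs = build_pairs(cases, "CL")
cc_pairs = build_pairs(cases, "CC")
clc_pairs = build_pairs(cases, "CLC")
```

No words from the documents are used here; only the citation links, which is the point of learning from meta-data.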
3. Learning of Text Relationship | Law2Vec CL Model, CC Model
[Figure: CL model (Case and its cited legislations); CC model (Case and its cited cases)]
3. Learning of Text Relationship | Law2Vec CLC Model
[Figure: CLC model (Case and both its cited legislations and cited cases)]
3. Learning of Text Relationship | Evaluation of Law2Vec
3. Learning of Text Relationship | Evaluation of CL Model
[Figure: Case and cited legislations]
3. Learning of Text Relationship | Evaluation of Law2Vec : W_1
3. Learning of Text Relationship | Evaluation of Law2Vec : W_2
3. Learning of Text Relationship | Evaluation of CC Model
[Figure: Case and cited cases]
3. Learning of Text Relationship | Evaluation of CLC Model
[Figure: Case, cited legislations, and cited cases]
3. Learning of Text Relationship | Result of Law2Vec
3. Learning of Text Relationship | Expansion of Data set
< Lawyer Oh’s Answer Sheet >
3. Learning of Text Relationship | Law2vec for Sample Data
[Figure panels: CL Model, CC Model, and CLC Model, each at iteration 10000 and iteration 60000]
3. Learning of Text Relationship | Link Prediction
Conclusion
1. Main Contribution
Korean Language Feature V2 (JKB, JX)
Syllable Vector
Heterogeneous Word2Vec ( Law2Vec )
2. Advantage
Chapter 2. Text Summarization: summarizes using linguistic features of Korean.
Chapter 3. Text Clustering: reduces dimension with a small amount of information loss using the Syllable vector, making computation efficient for large-scale document sets.
Chapter 4. Text Relational Learning: learns from heterogeneous data by using the relationships between them, without text (word) data.
3. Interest to reader
Chapter 2. Text Summarization: apply the fuzzy concept to text mining, taking language features into account
=> Define your idea and apply it to a system easily
Chapter 3. Text Clustering: Korean is more efficient for large-scale document sets
=> Korean is well suited to compressing text data
Chapter 4. Text Relational Learning: design the NN system to fit the structure of your data
=> Meta-data is good enough material for learning the relationships between them.

More Related Content

What's hot

Hybridization of Bat and Genetic Algorithm to Solve N-Queens Problem
Hybridization of Bat and Genetic Algorithm to Solve N-Queens ProblemHybridization of Bat and Genetic Algorithm to Solve N-Queens Problem
Hybridization of Bat and Genetic Algorithm to Solve N-Queens ProblemjournalBEEI
 
Introduction to Reinforcement Learning for Molecular Design
Introduction to Reinforcement Learning for Molecular Design Introduction to Reinforcement Learning for Molecular Design
Introduction to Reinforcement Learning for Molecular Design Dan Elton
 
Icitam2019 2020 book_chapter
Icitam2019 2020 book_chapterIcitam2019 2020 book_chapter
Icitam2019 2020 book_chapterBan Bang
 
Parallel Guided Local Search and Some Preliminary Experimental Results for Co...
Parallel Guided Local Search and Some Preliminary Experimental Results for Co...Parallel Guided Local Search and Some Preliminary Experimental Results for Co...
Parallel Guided Local Search and Some Preliminary Experimental Results for Co...csandit
 
A Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalA Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalBhaskar Mitra
 
Algorithm of Dynamic Programming for Paper-Reviewer Assignment Problem
Algorithm of Dynamic Programming for Paper-Reviewer Assignment ProblemAlgorithm of Dynamic Programming for Paper-Reviewer Assignment Problem
Algorithm of Dynamic Programming for Paper-Reviewer Assignment ProblemIRJET Journal
 
A NEW ALGORITHM FOR SOLVING FULLY FUZZY BI-LEVEL QUADRATIC PROGRAMMING PROBLEMS
A NEW ALGORITHM FOR SOLVING FULLY FUZZY BI-LEVEL QUADRATIC PROGRAMMING PROBLEMSA NEW ALGORITHM FOR SOLVING FULLY FUZZY BI-LEVEL QUADRATIC PROGRAMMING PROBLEMS
A NEW ALGORITHM FOR SOLVING FULLY FUZZY BI-LEVEL QUADRATIC PROGRAMMING PROBLEMSorajjournal
 
GA Based Multi-Objective Time-Cost Optimization in a Project with Resources C...
GA Based Multi-Objective Time-Cost Optimization in a Project with Resources C...GA Based Multi-Objective Time-Cost Optimization in a Project with Resources C...
GA Based Multi-Objective Time-Cost Optimization in a Project with Resources C...IJMER
 

What's hot (8)

Hybridization of Bat and Genetic Algorithm to Solve N-Queens Problem
Hybridization of Bat and Genetic Algorithm to Solve N-Queens ProblemHybridization of Bat and Genetic Algorithm to Solve N-Queens Problem
Hybridization of Bat and Genetic Algorithm to Solve N-Queens Problem
 
Introduction to Reinforcement Learning for Molecular Design
Introduction to Reinforcement Learning for Molecular Design Introduction to Reinforcement Learning for Molecular Design
Introduction to Reinforcement Learning for Molecular Design
 
Icitam2019 2020 book_chapter
Icitam2019 2020 book_chapterIcitam2019 2020 book_chapter
Icitam2019 2020 book_chapter
 
Parallel Guided Local Search and Some Preliminary Experimental Results for Co...
Parallel Guided Local Search and Some Preliminary Experimental Results for Co...Parallel Guided Local Search and Some Preliminary Experimental Results for Co...
Parallel Guided Local Search and Some Preliminary Experimental Results for Co...
 
A Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information RetrievalA Simple Introduction to Neural Information Retrieval
A Simple Introduction to Neural Information Retrieval
 
Algorithm of Dynamic Programming for Paper-Reviewer Assignment Problem
Algorithm of Dynamic Programming for Paper-Reviewer Assignment ProblemAlgorithm of Dynamic Programming for Paper-Reviewer Assignment Problem
Algorithm of Dynamic Programming for Paper-Reviewer Assignment Problem
 
A NEW ALGORITHM FOR SOLVING FULLY FUZZY BI-LEVEL QUADRATIC PROGRAMMING PROBLEMS
A NEW ALGORITHM FOR SOLVING FULLY FUZZY BI-LEVEL QUADRATIC PROGRAMMING PROBLEMSA NEW ALGORITHM FOR SOLVING FULLY FUZZY BI-LEVEL QUADRATIC PROGRAMMING PROBLEMS
A NEW ALGORITHM FOR SOLVING FULLY FUZZY BI-LEVEL QUADRATIC PROGRAMMING PROBLEMS
 
GA Based Multi-Objective Time-Cost Optimization in a Project with Resources C...
GA Based Multi-Objective Time-Cost Optimization in a Project with Resources C...GA Based Multi-Objective Time-Cost Optimization in a Project with Resources C...
GA Based Multi-Objective Time-Cost Optimization in a Project with Resources C...
 

Similar to Korean Text mining

Word Embedding Models & Support Vector Machines for Text Classification
Word Embedding Models & Support Vector Machines for Text ClassificationWord Embedding Models & Support Vector Machines for Text Classification
Word Embedding Models & Support Vector Machines for Text ClassificationNa'im Tyson
 
Dimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A SurveyDimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A SurveyIJTET Journal
 
The Essay Scoring Tool (TEST) for Hindi
The Essay Scoring Tool (TEST) for HindiThe Essay Scoring Tool (TEST) for Hindi
The Essay Scoring Tool (TEST) for Hindisinghg77
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on ArraysArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on ArraysGoon83
 
[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics
[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics
[SIGIR17] Learning to Rank Using Localized Geometric Mean MetricsYuxin Su
 
Construction of Keyword Extraction using Statistical Approaches and Document ...
Construction of Keyword Extraction using Statistical Approaches and Document ...Construction of Keyword Extraction using Statistical Approaches and Document ...
Construction of Keyword Extraction using Statistical Approaches and Document ...IJERA Editor
 
Construction of Keyword Extraction using Statistical Approaches and Document ...
Construction of Keyword Extraction using Statistical Approaches and Document ...Construction of Keyword Extraction using Statistical Approaches and Document ...
Construction of Keyword Extraction using Statistical Approaches and Document ...IJERA Editor
 
Introduction to neural networks and Keras
Introduction to neural networks and KerasIntroduction to neural networks and Keras
Introduction to neural networks and KerasJie He
 
A Review on Text Mining in Data Mining
A Review on Text Mining in Data Mining  A Review on Text Mining in Data Mining
A Review on Text Mining in Data Mining ijsc
 
A Review on Text Mining in Data Mining
A Review on Text Mining in Data MiningA Review on Text Mining in Data Mining
A Review on Text Mining in Data Miningijsc
 
Graph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to DialogueGraph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to DialogueJinho Choi
 
Optimization for-power-sy-8631549
Optimization for-power-sy-8631549Optimization for-power-sy-8631549
Optimization for-power-sy-8631549Kannan Kathiravan
 
Topic model an introduction
Topic model an introductionTopic model an introduction
Topic model an introductionYueshen Xu
 
Evaluation of subjective answers using glsa enhanced with contextual synonymy
Evaluation of subjective answers using glsa enhanced with contextual synonymyEvaluation of subjective answers using glsa enhanced with contextual synonymy
Evaluation of subjective answers using glsa enhanced with contextual synonymyijnlc
 
Tensor Networks and Their Applications on Machine Learning
Tensor Networks and Their Applications on Machine LearningTensor Networks and Their Applications on Machine Learning
Tensor Networks and Their Applications on Machine LearningKwan-yuet Ho
 
H03302058066
H03302058066H03302058066
H03302058066theijes
 

Similar to Korean Text mining (20)

Word Embedding Models & Support Vector Machines for Text Classification
Word Embedding Models & Support Vector Machines for Text ClassificationWord Embedding Models & Support Vector Machines for Text Classification
Word Embedding Models & Support Vector Machines for Text Classification
 
Dimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A SurveyDimensionality Reduction Techniques for Document Clustering- A Survey
Dimensionality Reduction Techniques for Document Clustering- A Survey
 
The Essay Scoring Tool (TEST) for Hindi
The Essay Scoring Tool (TEST) for HindiThe Essay Scoring Tool (TEST) for Hindi
The Essay Scoring Tool (TEST) for Hindi
 
Bl24409420
Bl24409420Bl24409420
Bl24409420
 
Cmpe 255 Short Story Assignment
Cmpe 255 Short Story AssignmentCmpe 255 Short Story Assignment
Cmpe 255 Short Story Assignment
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on ArraysArrayUDF: User-Defined Scientific Data Analysis on Arrays
ArrayUDF: User-Defined Scientific Data Analysis on Arrays
 
[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics
[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics
[SIGIR17] Learning to Rank Using Localized Geometric Mean Metrics
 
Construction of Keyword Extraction using Statistical Approaches and Document ...
Construction of Keyword Extraction using Statistical Approaches and Document ...Construction of Keyword Extraction using Statistical Approaches and Document ...
Construction of Keyword Extraction using Statistical Approaches and Document ...
 
Construction of Keyword Extraction using Statistical Approaches and Document ...
Construction of Keyword Extraction using Statistical Approaches and Document ...Construction of Keyword Extraction using Statistical Approaches and Document ...
Construction of Keyword Extraction using Statistical Approaches and Document ...
 
Introduction to neural networks and Keras
Introduction to neural networks and KerasIntroduction to neural networks and Keras
Introduction to neural networks and Keras
 
A Review on Text Mining in Data Mining
A Review on Text Mining in Data Mining  A Review on Text Mining in Data Mining
A Review on Text Mining in Data Mining
 
Audio Water Marking Using DCT & EMD
Audio Water Marking Using DCT & EMDAudio Water Marking Using DCT & EMD
Audio Water Marking Using DCT & EMD
 
A Review on Text Mining in Data Mining
A Review on Text Mining in Data MiningA Review on Text Mining in Data Mining
A Review on Text Mining in Data Mining
 
Graph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to DialogueGraph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to Dialogue
 
Optimization for-power-sy-8631549
Optimization for-power-sy-8631549Optimization for-power-sy-8631549
Optimization for-power-sy-8631549
 
Topic model an introduction
Topic model an introductionTopic model an introduction
Topic model an introduction
 
Evaluation of subjective answers using glsa enhanced with contextual synonymy
Evaluation of subjective answers using glsa enhanced with contextual synonymyEvaluation of subjective answers using glsa enhanced with contextual synonymy
Evaluation of subjective answers using glsa enhanced with contextual synonymy
 
Tensor Networks and Their Applications on Machine Learning
Tensor Networks and Their Applications on Machine LearningTensor Networks and Their Applications on Machine Learning
Tensor Networks and Their Applications on Machine Learning
 
H03302058066
H03302058066H03302058066
H03302058066
 

Korean Text mining

  • 1. Kyunghoon Kim UNIST Department of Mathematical Sciences December 12, 2017 kyunghoon@unist.ac.kr A Mathematical Measurement for Korean Text mining and its applications
  • 2. Difficulty of Korean Language
 New Concepts:
 - Korean Alphabet
 - End of a word
 - Postposition
 - Word order (SOV, …)
 - …
 Language Destruction
  • 3. Outline: Korean Text Mining
 - 1. Text Summarization (2013): Korean Language Feature V2, Fuzzy System
 - 2. Text Clustering (2015): Syllable Vector, Term-Frequency Matrix (LSI, NMF)
 - 3. Learning of Text Relationship (2017): Heterogeneous Word2Vec (Law2Vec), Artificial Neural Network (Word2Vec)
  • 4. 1. Text Summarization | Motivation: < Raw News Article > → < Summarized News > (March, 2013). How about Korean? News Article → Summarized Sentences
  • 5. 1. Text Summarization | Process: Document → Preprocessing → Feature Selection → Scoring by Model → Refinement & Sorting by score
 [Figure: morphological analysis of an example sentence, with part-of-speech tags NNP, JKB, NNG, JC, XSN, JKS, MAG, VV, EC, VX, EF, SF]
 Candidate features:
 - Content word (Keyword) feature
 - Title word feature
 - Sentence location feature
 - Sentence Length feature
 - Proper Noun feature
 - Upper-case word feature
 - Cue-Phrase feature
 - Biased Word feature
 - Font based feature
 - Pronouns
 - Sentence-to-Sentence Cohesion
 - Sentence-to-Centroid Cohesion
 - Occurrence of non-essential information
 - Discourse analysis
 Slide annotation: "Only for English features"
  • 6. 1. Text Summarization | Feature measurements: Feature based on English, Feature based on Korean
  • 7. 1. Text Summarization | Fuzzy Set
  • 8. 1. Text Summarization | Fuzzy Set
  • 9. 1. Text Summarization | Fuzzy Set
  • 10. 1. Text Summarization | Fuzzy Set
  • 11. 1. Text Summarization | Calculating the score of sentences
  • 12. 1. Text Summarization | Korean Text Summarization: http://summ-dev.ap-northeast-2.elasticbeanstalk.com/
  • 13. 1. Text Summarization | Patent, 2013: Korean News Summarization System And Method, https://goo.gl/blkjwf
  • 14. 2. Text Clustering
  • 15. 2. Text Clustering: Documents → (Convert) → Matrix
 1. Select Matrix
    A. Basic (using raw matrix)
    B. LSI (Latent Semantic Indexing)
    C. NMF (Non-negative Matrix Factorization)
 2. Calculating similarity between each column of matrix
 3. Clustering by the degree of similarity
 $$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$
  • 16. 2. Text Clustering | Term-Frequency Matrix
 $d_1$ = { apple, banana, kiwi }, $d_2$ = { apple, banana, store }, $d_3$ = { store }
 Term-Frequency Matrix: $A = \begin{bmatrix} d_1 & \cdots & d_n \end{bmatrix}$, where each column $d_i$ is a document vector of term frequencies.
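 The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `term_frequency_matrix`, the alphabetical vocabulary order, and the use of raw counts (rather than normalized frequencies) are assumptions made here.

```python
import numpy as np

# The three toy documents from the slide.
docs = {
    "d1": ["apple", "banana", "kiwi"],
    "d2": ["apple", "banana", "store"],
    "d3": ["store"],
}

def term_frequency_matrix(docs):
    """Return (vocab, A) where A[i, j] counts term i in document j.

    Hypothetical helper: rows are terms in sorted order, columns are
    documents in insertion order.
    """
    vocab = sorted({term for tokens in docs.values() for term in tokens})
    row = {term: i for i, term in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, tokens in enumerate(docs.values()):
        for term in tokens:
            A[row[term], j] += 1
    return vocab, A

vocab, A = term_frequency_matrix(docs)
print(vocab)
print(A)
```

 Each column of `A` is one document vector, so column-wise similarity (step 2 of the clustering pipeline) operates directly on this matrix.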
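  • 17. 2. Text Clustering | Singular Value Decomposition (SVD)
 Term-document matrix (rows $w_1, \dots, w_5$; columns $d_1, d_2, d_3$) and its decomposition $A = U \Sigma V^{\top}$:
 $$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -0.27 & 0.21 & 0.70 & -0.53 & 0.30 \\ -0.27 & 0.21 & -0.70 & -0.53 & 0.30 \\ -0.71 & -0.33 & 0 & -0.10 & -0.60 \\ -0.55 & 0.43 & 0 & 0.64 & 0.29 \\ -0.15 & -0.77 & 0 & 0.10 & 0.60 \end{pmatrix} \begin{pmatrix} 2.35 & 0 & 0 \\ 0 & 1.19 & 0 \\ 0 & 0 & 1.00 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} -0.65 & 0.26 & 0.70 \\ -0.65 & 0.26 & -0.70 \\ -0.36 & -0.92 & 0 \end{pmatrix}^{\top}$$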
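 The decomposition can be checked numerically. The sketch below uses `numpy.linalg.svd` on the same 5 x 3 matrix; the rank `k = 2` truncation at the end is an illustrative choice made here (it is the kind of reduction LSI relies on), not a value taken from the slides. Singular vectors are only defined up to sign, so the signs printed may differ from those on the slide.

```python
import numpy as np

# Term-document matrix from the slide (rows w1..w5, columns d1..d3).
A = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
], dtype=float)

# Full SVD: U is 5x5, s holds the 3 singular values, Vt is 3x3.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the 5x3 Sigma matrix and confirm A = U @ Sigma @ Vt.
Sigma = np.zeros_like(A)
Sigma[:3, :3] = np.diag(s)
A_rebuilt = U @ Sigma @ Vt

# Rank-k document coordinates (k = 2 is an assumption for illustration).
k = 2
doc_coords = np.diag(s[:k]) @ Vt[:k, :]

print(np.round(s, 2))
print(np.round(doc_coords, 2))
```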
  • 18. 2. Text Clustering | Latent Semantic Indexing (LSI)
  • 19. 2. Text Clustering | Non-negative Matrix Factorization (NMF)
  • 20. 2. Text Clustering | Non-negative Matrix Factorization (NMF)
 [Figure: a matrix over Term 1 to Term 5 and Doc1 to Doc3, shown with factor matrices labeled Feature 1 and Feature 2]
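 As a rough illustration of what a two-feature NMF of a 5-term, 3-document matrix looks like, the sketch below uses the standard multiplicative update rules in plain NumPy. The slides do not say which NMF algorithm or matrix was used, so the update rule, the function name `nmf`, the iteration count, and the reuse of the SVD example matrix are all assumptions.

```python
import numpy as np

def nmf(A, k, n_iter=500, eps=1e-9, seed=0):
    """Factor A (m x n, non-negative) as W (m x k) @ H (k x n).

    Hypothetical helper using multiplicative updates for the
    Frobenius-norm objective; eps guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Assumed input: the 5-term x 3-document matrix from the SVD slide.
A = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
], dtype=float)

W, H = nmf(A, k=2)            # W: terms x features, H: features x docs
error = np.linalg.norm(A - W @ H)
print(np.round(W, 2))
print(np.round(H, 2))
```

 Unlike the SVD factors, every entry of `W` and `H` is non-negative, which is what lets each feature be read as an additive group of terms.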
  • 21. 2. Text Clustering | Term-Frequency Matrix: Large Dimension Matrix for large-scale set. Proposed method: Syllable Vector
  • 22. 2. Text Clustering | Syllable-n Vector: about 1,200 dimensions
  • 23. 2. Text Clustering | Dimension reduction using Syllable-n vector: Dimension Reduction by Syllable Vector (Syllable-1, Syllable-2, Syllable-3)
  • 24. 2. Text Clustering | Syllable-n-All Vector: Syllable-1-All, Syllable-2-All. Take all combinations of syllable-n: a word $w_j$ of length $l_j$ yields $\binom{l_j}{n}$ combinations.
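 One possible reading of "take all combinations" is sketched below with `itertools.combinations`, which yields exactly $\binom{l_j}{n}$ order-preserving selections of n syllables from a word. This interpretation, the function name `syllable_n_all`, and the example word 한국어 are assumptions; the slide's own Korean examples did not survive extraction. Each precomposed Hangul character is one syllable, so iterating over the string iterates over syllables.

```python
from itertools import combinations
from math import comb

def syllable_n_all(word, n):
    """Return every order-preserving choice of n syllables from word.

    Hypothetical helper: assumes one character per syllable, which holds
    for precomposed Hangul syllable blocks.
    """
    return ["".join(chosen) for chosen in combinations(word, n)]

word = "한국어"  # illustrative 3-syllable word, not taken from the slides
features = syllable_n_all(word, 2)
print(features)
print(len(features) == comb(len(word), 2))
```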
  • 25. 2. Text Clustering | Benchmark Dataset HKIB-20000: Dimension reduction. How about information loss?
  • 26. 2. Text Clustering | Similarity
 [Figure: angle $\theta$ between vectors $a$ and $b$]
 $$\mathrm{sim}(d_1, d_2) = \sqrt{2\left(1 - \frac{2/9}{\sqrt{3/9}\,\sqrt{3/9}}\right)} = 0.8164$$
 $$\mathrm{sim}(d_2, d_3) = 0.919, \qquad \mathrm{sim}(d_1, d_3) = 1.414$$
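 The three values can be reproduced with a short script. The formula has the form $\sqrt{2(1 - \cos\theta)}$, which is the Euclidean distance between the two vectors after scaling each to unit length, so smaller values mean more similar documents. The term order (apple, banana, kiwi, store) and the division by 3 that yields the slide's 2/9 and 3/9 are assumptions; the cosine is unaffected by that scaling.

```python
import numpy as np

def sim(u, v):
    """sqrt(2 * (1 - cos(theta))) between vectors u and v."""
    cos = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # max(...) guards against tiny negative values from rounding.
    return float(np.sqrt(max(0.0, 2.0 * (1.0 - cos))))

# Assumed term order: apple, banana, kiwi, store.
d1 = np.array([1, 1, 1, 0]) / 3   # { apple, banana, kiwi }
d2 = np.array([1, 1, 0, 1]) / 3   # { apple, banana, store }
d3 = np.array([0, 0, 0, 1]) / 1   # { store }

print(round(sim(d1, d2), 4))
print(round(sim(d2, d3), 4))
print(round(sim(d1, d3), 4))
```

 Documents with no shared terms, such as $d_1$ and $d_3$, land at the maximum value $\sqrt{2}$ for non-negative vectors.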
  • 27. 2. Text Clustering | Similarity: Source: Doc Number 5222. Target: all other documents
  • 28. 2. Text Clustering | Correlation: Basic, LSI, NMF
  • 29. 2. Text Clustering | Evaluation of Text Clustering
  • 30. 2. Text Clustering | Precision (figure labels: Real Answer, TP, FP)
 $$\text{Precision} = \frac{5}{7} = 0.71$$
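 The arithmetic follows the usual definition Precision = TP / (TP + FP). In the sketch below, reading the slide's 5/7 as 5 true positives and 2 false positives is an assumption inferred from that definition.

```python
def precision(tp, fp):
    """Fraction of retrieved documents that are correct."""
    return tp / (tp + fp)

# Assumed counts: 5 of the 7 retrieved documents match the real answer.
value = precision(tp=5, fp=2)
print(round(value, 2))
```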
  • 31. 2. Text Clustering | Evaluation Set: Doc 1, Doc 2, Doc 3, Doc 4, Doc 5, Doc 6, …
  • 32. 2. Text Clustering | Standard for Evaluation: Nearest neighbors (1, 2, 3, 4, 5); Limited Radius
  • 33. 2. Text Clustering | Evaluation of text clustering: Radius Threshold (Syl-2, Syl-3, Word)
  • 34. 2. Text Clustering | Evaluation of text clustering: Count Threshold
  • 35. 2. Text Clustering | Evaluation of text clustering: Precision and Speed for LSI (n = 5, count threshold)
  • 36. 2. Text Clustering | Evaluation of text clustering: Syl-2 for LSI is BEST!
  • 37. 2. Text Clustering | Patent: Korean Text Clustering System and Method, https://goo.gl/fskHxT
  • 38. 2. Text Clustering | Limitation of word-based method: These words are NOT important to understand the given text!
  • 39. 3. Learning of Text Relationship: Word-based, Citation Relation. Find similar documents using citation information
  • 40. 3. Learning of Text Relationship | Natural Language Processing (NLP): https://www.upwork.com/hiring/for-clients/artificial-intelligence-and-natural-language-processing-in-big-data/
  • 41. 3. Learning of Text Relationship | Word2Vec: 2013, Hot Model in NLP “Word2Vec” (Google). http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/
 Source Text (Red: Target keyword, Blue: Context keyword)
 Training Set: (맥도날드가, 햄버거는), (맥도날드가, 맛있다.), (맛있다., 맥도날드가), (맛있다., 감자튀김도), (감자튀김도, 맛있다.), (감자튀김도, 맛있었는데..), (맘스터치도, 햄버거는), (맘스터치도, 맛있다.), (맛있다., 맘스터치도), (맛있다., 패티가)
 Glosses: 맥도날드가 = "McDonald's" (subject), 햄버거는 = "hamburgers" (topic), 맛있다. = "is tasty.", 감자튀김도 = "French fries too", 맛있었는데.. = "was tasty, but..", 맘스터치도 = "Mom's Touch too", 패티가 = "the patty" (subject)
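 A training set of (target, context) pairs like the one above is typically generated by sliding a window over the tokenized text. The sketch below shows the general skip-gram procedure; the function name `skipgram_pairs`, the three-token sentence, and the window size of 2 are illustrative assumptions, and the output is not claimed to match the slide's list, whose source text is not recoverable here.

```python
def skipgram_pairs(tokens, window):
    """Pair each target token with every neighbor within `window` positions."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

# Hypothetical whitespace-tokenized sentence built from tokens on the slide.
tokens = ["맥도날드가", "햄버거는", "맛있다."]
pairs = skipgram_pairs(tokens, window=2)
for pair in pairs:
    print(pair)
```

 Note that whitespace tokens keep their postpositions (가, 는, 도), so 맥도날드가 and 맥도날드는 would be distinct vocabulary entries, which is part of what makes word-based methods awkward for Korean.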
  • 42. 3. Learning of Text Relationship | Word2Vec: Input, Output pairs such as (맥도날드가, 햄버거는) and (맥도날드가, 맛있다.)
  • 43. 3. Learning of Text Relationship | Word2Vec
  • 44. 3. Learning of Text Relationship | Word2Vec: Shortcomings of Word2Vec
 • Word-based method only
 => Meaningless words are also counted.
 • Same vocabulary set for input and output
 => Dimensions of input and output are fixed.
 • Uses only the context information of the target word
 => Depends entirely on the context within window size N.
  • 45. 3. Learning of Text Relationship | Heterogeneous Word2Vec: Input, Output
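 To make the idea of separate input and output vocabularies concrete, here is a minimal NumPy sketch of a Word2Vec-style network whose input and output layers have different sizes. It is not the author's model: the layer sizes, the training pairs, the learning rate, and the plain softmax with cross-entropy loss are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train_step(W1, W2, i, o, lr):
    """One SGD step on pair (input id i, output id o); returns the loss."""
    h = W1[i].copy()                 # hidden vector = row i of W1 (one-hot input)
    p = softmax(h @ W2)              # distribution over the OUTPUT vocabulary
    grad_z = p.copy()
    grad_z[o] -= 1.0                 # gradient of cross-entropy w.r.t. logits
    grad_h = W2 @ grad_z             # computed before W2 is updated
    W2 -= lr * np.outer(h, grad_z)   # in-place update of the caller's array
    W1[i] -= lr * grad_h
    return float(-np.log(p[o]))

# Hypothetical sizes: 5 input items, 3 hidden units, 7 output items.
n_in, n_hidden, n_out = 5, 3, 7
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(n_in, n_hidden))
W2 = rng.normal(scale=0.5, size=(n_hidden, n_out))

# Hypothetical (input id, output id) relations between the two item sets.
pairs = [(0, 6), (1, 2), (4, 0)]
losses = []
for epoch in range(300):
    losses.append(sum(train_step(W1, W2, i, o, lr=0.5) for i, o in pairs))

probs = softmax(W1[0] @ W2)
print(round(losses[0], 3), round(losses[-1], 3))
```

 Because `W1` has `n_in` rows and `W2` has `n_out` columns, the two sides need not share a vocabulary; the rows of `W1` serve as learned vectors for the input items.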
  • 46. 3. Learning of Text Relationship | Heterogeneous Word2Vec
  • 47. 3. Learning of Text Relationship | Heterogeneous Word2Vec [Figure: network diagram with numbered nodes]
  • 48. 3. Learning of Text Relationship | Heterogeneous Word2Vec [Figure: one-hot vectors and weight matrices]
  • 49. 3. Learning of Text Relationship | Heterogeneous Word2Vec [Figure: one-hot vectors and weight matrices]
  • 50. 3. Learning of Text Relationship | Heterogeneous Word2Vec (Ch 4. Learning for number relationship) [Figure: network diagram with numbered nodes]
  • 51. 3. Learning of Text Relationship | Heterogeneous Word2Vec (Ch 4. Learning for number relationship) [Figure: network diagram with numbered nodes]
  • 52. 3. Learning of Text Relationship | Heterogeneous Word2Vec: Similarity (0 is best), Matrix (Vectors)
  • 53. 3. Learning of Text Relationship | Heterogeneous Word2Vec
  • 54. 3. Learning of Text Relationship | Heterogeneous Word2Vec: Input, Output
  • 55. 3. Learning of Text Relationship | Law2Vec: Legal information consists mainly of legislation and cases.
 • CL (Case - Legislation)
 • CC (Case - Case)
 • CLC (Case - Legislation, Case)
  • 56. 3. Learning of Text Relationship | Law2Vec CL Model, CC Model. CL Model: Case and Cited legislations. CC Model: Case and Cited cases
  • 57. 3. Learning of Text Relationship | Law2Vec CLC Model: Case, Cited legislations, and Cited cases
  • 58. 3. Learning of Text Relationship | Evaluation of Law2Vec
  • 59. 3. Learning of Text Relationship | Evaluation of CL Model: Cited legislations, Case
  • 60. 3. Learning of Text Relationship | Evaluation of Law2Vec : W_1
  • 61. 3. Learning of Text Relationship | Evaluation of Law2Vec : W_2
  • 62. 3. Learning of Text Relationship | Evaluation of CC Model: Cited cases, Case
  • 63. 3. Learning of Text Relationship | Evaluation of CC Model
  • 64. 3. Learning of Text Relationship | Evaluation of CLC Model: Cited legislations, Cited cases, Case
  • 65. 3. Learning of Text Relationship | Evaluation of CLC Model
  • 66. 3. Learning of Text Relationship | Result of Law2Vec
  • 67. 3. Learning of Text Relationship | Result of Law2Vec
  • 68. 3. Learning of Text Relationship | Expansion of Data set: < Lawyer Oh’s Answer Sheet >
  • 69. 3. Learning of Text Relationship | Law2vec for Sample Data: CL Model (Iteration 10000, Iteration 60000), CC Model (Iteration 10000, Iteration 60000), CLC Model (Iteration 10000, Iteration 60000)
  • 70. 3. Learning of Text Relationship | Link Prediction
  • 71. Conclusion
 1. Main Contribution
 - Chapter 2. Text Summarization: Korean Language Feature V2 (JKB, JX)
 - Chapter 3. Text Clustering: Syllable Vector
 - Chapter 4. Text Relational Learning: Heterogeneous Word2Vec (Law2Vec)
 2. Advantage
 - Chapter 2. Text Summarization: To summarize using linguistic features of Korean.
 - Chapter 3. Text Clustering: To achieve dimension reduction with a small amount of information loss using the Syllable vector, and to compute efficiently on large-scale document sets.
 - Chapter 4. Text Relational Learning: To learn from heterogeneous data by using the relationships between them, without text (word) data.
  • 72. Conclusion
 3. Interest to reader
 - Chapter 2. Text Summarization: Apply the fuzzy concept to text mining while considering language features => Define your idea and apply it to a system easily.
 - Chapter 3. Text Clustering: The Korean language is more efficient for large-scale document sets => Korean is well suited to compressing text data.
 - Chapter 4. Text Relational Learning: Design the NN system to fit the structure of your data => Metadata is good enough material for learning the relationships between them.