SlideShare a Scribd company logo
Evaluation Dataset and System
for Japanese Lexical Simplification
Tomoyuki Kajiwara and Kazuhide Yamamoto
Nagaoka University of Technology, Japan
(Now I am studying at the Tokyo Metropolitan University)
Extensive / various forms of texts
Hitler committed terrible atrocities
during the second World War.
Hitler committed terrible cruelties
during the second World War.
Children Language Learners Elderlies
Motivation
Easily
accessible
Easily
readable and
understandable
too!
Problems in Japanese
Ø Unpublished system
•  It is difficult for people who need reading
assistance to obtain simple Japanese sentences
Ø Unpublished dataset
•  It is difficult for researchers and developers to
evaluate the performance of different systems
Our works
ü Built and published
Japanese lexical simplification system
http://www.jnlp.org/SNOW/S3
ü Built and published dataset for
evaluation of Japanese lexical simplification
http://www.jnlp.org/SNOW/E4
Lexical Simplification System
Substitution Generation
担う: 支える,引継ぐ,受け継ぐ,伝承する
bear: hold, wear, carry, expect
Identification of Complex Words
担う
bear
Word Sense Disambiguation
担う: 支える, 受け継ぐ
bear: hold, carry
Synonym Ranking
1: 支える, 2: 受け継ぐ, 3: 担う
1: hold, 2: carry, 3: bear
Input
未来は若者が担う
Young people bear the future
Output
未来は若者が支える
Young people hold the future
Lexical Simplification Dataset
1. Constructing Japanese Lexical Substitution Dataset
 ・Collecting Substitutions (crowdsourcing)
 ・Evaluating Substitutions (crowdsourcing)
2. Transforming into Lexical Simplification Dataset
 ・Ranking Substitutions (crowdsourcing)
 ・Merging All Rankings
 Sample: Young people bear the future.(未来は若者が担う)
   Lexical Substitutions: carry, hold (受け継ぐ, 支える)
   Rank of Simple Level: 1.hold, 2.carry, 3.bear
                (1. 支える, 2. 受け継ぐ, 3. 担う)
Evaluation
Dataset Sentence Noun Verb Adjective Adverb
SemEval [1]
2012 Task1
2,010
580
(28.9%)
520
(25.9%)
560
(27.9%)
350
(17.4%)
Ours 2,330
630
(27.0%)
720
(30.9%)
500
(21.5%)
480
(20.6%)
System Precision Recall F-measure
Our Original 0.89 0.08 0.15
w/o WSD 0.84 0.71 0.77
[1] Lucia Specia, Sujay Kumar Jauhar, and Rada Mihalcea. 2012.
[1] Semeval-2012 task 1: English lexical simplification.
[1] In Proceedings of the 6th International Workshop on Semantic Evaluation, pages 347‒355.
We built and published system and evaluation
dataset for Japanese lexical simplification
http://www.jnlp.org/SNOW
Lexical Simplification:
•  Substitutes a complex word or phrase
in a sentence with a simpler synonym
•  Supports the reading comprehension
of a wide range of readers (e.g. language learners)
Evaluation Dataset and System
for Japanese Lexical Simplification
Tomoyuki Kajiwara and Kazuhide Yamamoto
Nagaoka University of Technology, Japan

More Related Content

What's hot

Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Saurabh Kaushik
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
Marina Santini
 
Learning to understand phrases by embedding the dictionary
Learning to understand phrases by embedding the dictionaryLearning to understand phrases by embedding the dictionary
Learning to understand phrases by embedding the dictionary
Roelof Pieters
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)
Kuppusamy P
 
Natural Language Processing: L02 words
Natural Language Processing: L02 wordsNatural Language Processing: L02 words
Natural Language Processing: L02 words
ananth
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vec
ananth
 
A deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine TranslationA deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine Translation
Lifeng (Aaron) Han
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
National Institute of Technology Durgapur
 
Introduction to natural language processing
Introduction to natural language processingIntroduction to natural language processing
Introduction to natural language processing
Minh Pham
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing
Mustafa Jarrar
 
NLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPNLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLP
Anuj Gupta
 
Semantics and Computational Semantics
Semantics and Computational SemanticsSemantics and Computational Semantics
Semantics and Computational Semantics
Marina Santini
 
Deep learning for nlp
Deep learning for nlpDeep learning for nlp
Deep learning for nlp
Viet-Trung TRAN
 
Moore_slides.ppt
Moore_slides.pptMoore_slides.ppt
Moore_slides.ppt
butest
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
Yuriy Guts
 
Natural Language Processing for Games Research
Natural Language Processing for Games ResearchNatural Language Processing for Games Research
Natural Language Processing for Games Research
Jose Zagal
 
Improvement in Quality of Speech associated with Braille codes - A Review
Improvement in Quality of Speech associated with Braille codes - A ReviewImprovement in Quality of Speech associated with Braille codes - A Review
Improvement in Quality of Speech associated with Braille codes - A Review
inscit2006
 
NLP new words
NLP new wordsNLP new words
NLP new words
guest9fc47a
 
Jarrar: Introduction to Natural Language Processing
Jarrar: Introduction to Natural Language ProcessingJarrar: Introduction to Natural Language Processing
Jarrar: Introduction to Natural Language Processing
Mustafa Jarrar
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
Yogendra Tamang
 

What's hot (20)

Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Learning to understand phrases by embedding the dictionary
Learning to understand phrases by embedding the dictionaryLearning to understand phrases by embedding the dictionary
Learning to understand phrases by embedding the dictionary
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)
 
Natural Language Processing: L02 words
Natural Language Processing: L02 wordsNatural Language Processing: L02 words
Natural Language Processing: L02 words
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vec
 
A deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine TranslationA deep analysis of Multi-word Expression and Machine Translation
A deep analysis of Multi-word Expression and Machine Translation
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Introduction to natural language processing
Introduction to natural language processingIntroduction to natural language processing
Introduction to natural language processing
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing
 
NLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPNLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLP
 
Semantics and Computational Semantics
Semantics and Computational SemanticsSemantics and Computational Semantics
Semantics and Computational Semantics
 
Deep learning for nlp
Deep learning for nlpDeep learning for nlp
Deep learning for nlp
 
Moore_slides.ppt
Moore_slides.pptMoore_slides.ppt
Moore_slides.ppt
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
Natural Language Processing for Games Research
Natural Language Processing for Games ResearchNatural Language Processing for Games Research
Natural Language Processing for Games Research
 
Improvement in Quality of Speech associated with Braille codes - A Review
Improvement in Quality of Speech associated with Braille codes - A ReviewImprovement in Quality of Speech associated with Braille codes - A Review
Improvement in Quality of Speech associated with Braille codes - A Review
 
NLP new words
NLP new wordsNLP new words
NLP new words
 
Jarrar: Introduction to Natural Language Processing
Jarrar: Introduction to Natural Language ProcessingJarrar: Introduction to Natural Language Processing
Jarrar: Introduction to Natural Language Processing
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 

Viewers also liked

高頻度語は平易なのか?
高頻度語は平易なのか?高頻度語は平易なのか?
高頻度語は平易なのか?
Tomoyuki Kajiwara
 
Incorporating word reordering knowledge into attention-based neural machine t...
Incorporating word reordering knowledge into attention-based neural machine t...Incorporating word reordering knowledge into attention-based neural machine t...
Incorporating word reordering knowledge into attention-based neural machine t...
sekizawayuuki
 
文章読解支援のための語彙平易化@第1回NLP東京Dの会
文章読解支援のための語彙平易化@第1回NLP東京Dの会文章読解支援のための語彙平易化@第1回NLP東京Dの会
文章読解支援のための語彙平易化@第1回NLP東京Dの会
Tomoyuki Kajiwara
 
tmu_science_cafe02
tmu_science_cafe02tmu_science_cafe02
tmu_science_cafe02
Tomoyuki Kajiwara
 
joint_seminar
joint_seminarjoint_seminar
joint_seminar
Tomoyuki Kajiwara
 
単語分散表現のアライメントに基づく文間類似度を用いたテキスト平易化のための単言語パラレルコーパスの構築
単語分散表現のアライメントに基づく文間類似度を用いたテキスト平易化のための単言語パラレルコーパスの構築単語分散表現のアライメントに基づく文間類似度を用いたテキスト平易化のための単言語パラレルコーパスの構築
単語分散表現のアライメントに基づく文間類似度を用いたテキスト平易化のための単言語パラレルコーパスの構築
Tomoyuki Kajiwara
 
Noun Paraphrasing Based on a Variety of Contexts
Noun Paraphrasing Based on a Variety of ContextsNoun Paraphrasing Based on a Variety of Contexts
Noun Paraphrasing Based on a Variety of Contexts
Tomoyuki Kajiwara
 
文献紹介:Simple English Wikipedia: A New Text Simplification Task
文献紹介:Simple English Wikipedia: A New Text Simplification Task文献紹介:Simple English Wikipedia: A New Text Simplification Task
文献紹介:Simple English Wikipedia: A New Text Simplification Task
Tomoyuki Kajiwara
 
文章読解支援のための語彙平易化
文章読解支援のための語彙平易化文章読解支援のための語彙平易化
文章読解支援のための語彙平易化
Tomoyuki Kajiwara
 

Viewers also liked (9)

高頻度語は平易なのか?
高頻度語は平易なのか?高頻度語は平易なのか?
高頻度語は平易なのか?
 
Incorporating word reordering knowledge into attention-based neural machine t...
Incorporating word reordering knowledge into attention-based neural machine t...Incorporating word reordering knowledge into attention-based neural machine t...
Incorporating word reordering knowledge into attention-based neural machine t...
 
文章読解支援のための語彙平易化@第1回NLP東京Dの会
文章読解支援のための語彙平易化@第1回NLP東京Dの会文章読解支援のための語彙平易化@第1回NLP東京Dの会
文章読解支援のための語彙平易化@第1回NLP東京Dの会
 
tmu_science_cafe02
tmu_science_cafe02tmu_science_cafe02
tmu_science_cafe02
 
joint_seminar
joint_seminarjoint_seminar
joint_seminar
 
単語分散表現のアライメントに基づく文間類似度を用いたテキスト平易化のための単言語パラレルコーパスの構築
単語分散表現のアライメントに基づく文間類似度を用いたテキスト平易化のための単言語パラレルコーパスの構築単語分散表現のアライメントに基づく文間類似度を用いたテキスト平易化のための単言語パラレルコーパスの構築
単語分散表現のアライメントに基づく文間類似度を用いたテキスト平易化のための単言語パラレルコーパスの構築
 
Noun Paraphrasing Based on a Variety of Contexts
Noun Paraphrasing Based on a Variety of ContextsNoun Paraphrasing Based on a Variety of Contexts
Noun Paraphrasing Based on a Variety of Contexts
 
文献紹介:Simple English Wikipedia: A New Text Simplification Task
文献紹介:Simple English Wikipedia: A New Text Simplification Task文献紹介:Simple English Wikipedia: A New Text Simplification Task
文献紹介:Simple English Wikipedia: A New Text Simplification Task
 
文章読解支援のための語彙平易化
文章読解支援のための語彙平易化文章読解支援のための語彙平易化
文章読解支援のための語彙平易化
 

Similar to Evaluation Dataset and System for Japanese Lexical Simplification

Icsm07.ppt
Icsm07.pptIcsm07.ppt
IRJET - Response Analysis of Educational Videos
IRJET - Response Analysis of Educational VideosIRJET - Response Analysis of Educational Videos
IRJET - Response Analysis of Educational Videos
IRJET Journal
 
13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation
RIILP
 
Poster: Controlled and Balanced Dataset for Japanese Lexical Simplification
Poster: Controlled and Balanced Dataset for Japanese Lexical SimplificationPoster: Controlled and Balanced Dataset for Japanese Lexical Simplification
Poster: Controlled and Balanced Dataset for Japanese Lexical Simplification
Kodaira Tomonori
 
Examining the Impact of Individual Differences of Informatijon Processing St...
Examining the Impact of Individual Differences of Informatijon Processing St...Examining the Impact of Individual Differences of Informatijon Processing St...
Examining the Impact of Individual Differences of Informatijon Processing St...
Takeshi Sato
 
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
Jinho Choi
 
columbia-gwu
columbia-gwucolumbia-gwu
columbia-gwu
Tianrui Peng
 
Doc format.
Doc format.Doc format.
Doc format.
butest
 
Word Segmentation and Lexical Normalization for Unsegmented Languages
Word Segmentation and Lexical Normalization for Unsegmented LanguagesWord Segmentation and Lexical Normalization for Unsegmented Languages
Word Segmentation and Lexical Normalization for Unsegmented Languages
hs0041
 
Natural Language Processing: Lecture 255
Natural Language Processing: Lecture 255Natural Language Processing: Lecture 255
Natural Language Processing: Lecture 255
deffa5
 
VOC real world enterprise needs
VOC real world enterprise needsVOC real world enterprise needs
VOC real world enterprise needs
Ivan Berlocher
 
The Ins and Outs of Preposition Semantics:
 Challenges in Comprehensive Corpu...
The Ins and Outs of Preposition Semantics:
 Challenges in Comprehensive Corpu...The Ins and Outs of Preposition Semantics:
 Challenges in Comprehensive Corpu...
The Ins and Outs of Preposition Semantics:
 Challenges in Comprehensive Corpu...
Seth Grimes
 
Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...
Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...
Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...
Seth Grimes
 
Lexical Analysis to Effectively Detect User's Opinion
Lexical Analysis to Effectively Detect User's Opinion   Lexical Analysis to Effectively Detect User's Opinion
Lexical Analysis to Effectively Detect User's Opinion
dannyijwest
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Matthew Lease
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
punedevscom
 
How can text-mining leverage developments in Deep Learning? Presentation at ...
How can text-mining leverage developments in Deep Learning?  Presentation at ...How can text-mining leverage developments in Deep Learning?  Presentation at ...
How can text-mining leverage developments in Deep Learning? Presentation at ...
jcscholtes
 
Deep learning for natural language embeddings
Deep learning for natural language embeddingsDeep learning for natural language embeddings
Deep learning for natural language embeddings
Roelof Pieters
 
EUROCALL 2012 conference presentation
EUROCALL 2012 conference presentationEUROCALL 2012 conference presentation
EUROCALL 2012 conference presentation
Takeshi Sato
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Bhaskar Mitra
 

Similar to Evaluation Dataset and System for Japanese Lexical Simplification (20)

Icsm07.ppt
Icsm07.pptIcsm07.ppt
Icsm07.ppt
 
IRJET - Response Analysis of Educational Videos
IRJET - Response Analysis of Educational VideosIRJET - Response Analysis of Educational Videos
IRJET - Response Analysis of Educational Videos
 
13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation
 
Poster: Controlled and Balanced Dataset for Japanese Lexical Simplification
Poster: Controlled and Balanced Dataset for Japanese Lexical SimplificationPoster: Controlled and Balanced Dataset for Japanese Lexical Simplification
Poster: Controlled and Balanced Dataset for Japanese Lexical Simplification
 
Examining the Impact of Individual Differences of Informatijon Processing St...
Examining the Impact of Individual Differences of Informatijon Processing St...Examining the Impact of Individual Differences of Informatijon Processing St...
Examining the Impact of Individual Differences of Informatijon Processing St...
 
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
 
columbia-gwu
columbia-gwucolumbia-gwu
columbia-gwu
 
Doc format.
Doc format.Doc format.
Doc format.
 
Word Segmentation and Lexical Normalization for Unsegmented Languages
Word Segmentation and Lexical Normalization for Unsegmented LanguagesWord Segmentation and Lexical Normalization for Unsegmented Languages
Word Segmentation and Lexical Normalization for Unsegmented Languages
 
Natural Language Processing: Lecture 255
Natural Language Processing: Lecture 255Natural Language Processing: Lecture 255
Natural Language Processing: Lecture 255
 
VOC real world enterprise needs
VOC real world enterprise needsVOC real world enterprise needs
VOC real world enterprise needs
 
The Ins and Outs of Preposition Semantics:
 Challenges in Comprehensive Corpu...
The Ins and Outs of Preposition Semantics:
 Challenges in Comprehensive Corpu...The Ins and Outs of Preposition Semantics:
 Challenges in Comprehensive Corpu...
The Ins and Outs of Preposition Semantics:
 Challenges in Comprehensive Corpu...
 
Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...
Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...
Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto...
 
Lexical Analysis to Effectively Detect User's Opinion
Lexical Analysis to Effectively Detect User's Opinion   Lexical Analysis to Effectively Detect User's Opinion
Lexical Analysis to Effectively Detect User's Opinion
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
How can text-mining leverage developments in Deep Learning? Presentation at ...
How can text-mining leverage developments in Deep Learning?  Presentation at ...How can text-mining leverage developments in Deep Learning?  Presentation at ...
How can text-mining leverage developments in Deep Learning? Presentation at ...
 
Deep learning for natural language embeddings
Deep learning for natural language embeddingsDeep learning for natural language embeddings
Deep learning for natural language embeddings
 
EUROCALL 2012 conference presentation
EUROCALL 2012 conference presentationEUROCALL 2012 conference presentation
EUROCALL 2012 conference presentation
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)
 

More from Tomoyuki Kajiwara

20190315 nlp
20190315 nlp20190315 nlp
20190315 nlp
Tomoyuki Kajiwara
 
20180208公聴会
20180208公聴会20180208公聴会
20180208公聴会
Tomoyuki Kajiwara
 
20150702文章読解支援のための日本語の語彙平易化システム
20150702文章読解支援のための日本語の語彙平易化システム20150702文章読解支援のための日本語の語彙平易化システム
20150702文章読解支援のための日本語の語彙平易化システムTomoyuki Kajiwara
 
文献紹介:SemEval-2012 Task 1: English Lexical Simplification
文献紹介:SemEval-2012 Task 1: English Lexical Simplification文献紹介:SemEval-2012 Task 1: English Lexical Simplification
文献紹介:SemEval-2012 Task 1: English Lexical Simplification
Tomoyuki Kajiwara
 
文献紹介:新聞記事中の難解語を平易な表現へ変換する手法の提案
文献紹介:新聞記事中の難解語を平易な表現へ変換する手法の提案文献紹介:新聞記事中の難解語を平易な表現へ変換する手法の提案
文献紹介:新聞記事中の難解語を平易な表現へ変換する手法の提案
Tomoyuki Kajiwara
 
日本語の語彙平易化システムおよび評価セットの構築
日本語の語彙平易化システムおよび評価セットの構築日本語の語彙平易化システムおよび評価セットの構築
日本語の語彙平易化システムおよび評価セットの構築
Tomoyuki Kajiwara
 
文献紹介:格フレームの対応付けに基づく用言の言い換え
文献紹介:格フレームの対応付けに基づく用言の言い換え文献紹介:格フレームの対応付けに基づく用言の言い換え
文献紹介:格フレームの対応付けに基づく用言の言い換え
Tomoyuki Kajiwara
 
文献紹介:言い換え技術に関する研究動向
文献紹介:言い換え技術に関する研究動向文献紹介:言い換え技術に関する研究動向
文献紹介:言い換え技術に関する研究動向
Tomoyuki Kajiwara
 
日本語の語彙平易化評価セットの構築
日本語の語彙平易化評価セットの構築日本語の語彙平易化評価セットの構築
日本語の語彙平易化評価セットの構築
Tomoyuki Kajiwara
 
日本語の語彙平易化システムの構築
日本語の語彙平易化システムの構築日本語の語彙平易化システムの構築
日本語の語彙平易化システムの構築
Tomoyuki Kajiwara
 
日本語の語彙的換言知識の質的評価
日本語の語彙的換言知識の質的評価日本語の語彙的換言知識の質的評価
日本語の語彙的換言知識の質的評価
Tomoyuki Kajiwara
 
文脈の多様性に基づく名詞換言の評価
文脈の多様性に基づく名詞換言の評価文脈の多様性に基づく名詞換言の評価
文脈の多様性に基づく名詞換言の評価
Tomoyuki Kajiwara
 
文脈の多様性に基づく名詞換言の提案
文脈の多様性に基づく名詞換言の提案文脈の多様性に基づく名詞換言の提案
文脈の多様性に基づく名詞換言の提案
Tomoyuki Kajiwara
 
機械学習を用いたニ格深層格の自動付与の検討
機械学習を用いたニ格深層格の自動付与の検討機械学習を用いたニ格深層格の自動付与の検討
機械学習を用いたニ格深層格の自動付与の検討
Tomoyuki Kajiwara
 
Selecting Proper Lexical Paraphrase for Children
Selecting Proper Lexical Paraphrase for ChildrenSelecting Proper Lexical Paraphrase for Children
Selecting Proper Lexical Paraphrase for Children
Tomoyuki Kajiwara
 
小学生の読解支援に向けた語釈文から語彙的換言を選択する手法
小学生の読解支援に向けた語釈文から語彙的換言を選択する手法小学生の読解支援に向けた語釈文から語彙的換言を選択する手法
小学生の読解支援に向けた語釈文から語彙的換言を選択する手法
Tomoyuki Kajiwara
 
小学生の読解支援に向けた複数の換言知識を併用した語彙平易化と評価
小学生の読解支援に向けた複数の換言知識を併用した語彙平易化と評価小学生の読解支援に向けた複数の換言知識を併用した語彙平易化と評価
小学生の読解支援に向けた複数の換言知識を併用した語彙平易化と評価
Tomoyuki Kajiwara
 
小学生の読解支援に向けた語釈文による換言
小学生の読解支援に向けた語釈文による換言小学生の読解支援に向けた語釈文による換言
小学生の読解支援に向けた語釈文による換言
Tomoyuki Kajiwara
 
対話型自動作曲システムに関する研究 -Aメロ, Bメロ, サビで異なる印象を感じさせる楽曲生成-
対話型自動作曲システムに関する研究 -Aメロ, Bメロ, サビで異なる印象を感じさせる楽曲生成-対話型自動作曲システムに関する研究 -Aメロ, Bメロ, サビで異なる印象を感じさせる楽曲生成-
対話型自動作曲システムに関する研究 -Aメロ, Bメロ, サビで異なる印象を感じさせる楽曲生成-
Tomoyuki Kajiwara
 
IGAを用いた個人の感性を反映した楽曲作成に関する研究 -Aメロ, Bメロ, サビに異なる感性的印象を感じさせる楽曲生成手法-
IGAを用いた個人の感性を反映した楽曲作成に関する研究 -Aメロ, Bメロ, サビに異なる感性的印象を感じさせる楽曲生成手法-IGAを用いた個人の感性を反映した楽曲作成に関する研究 -Aメロ, Bメロ, サビに異なる感性的印象を感じさせる楽曲生成手法-
IGAを用いた個人の感性を反映した楽曲作成に関する研究 -Aメロ, Bメロ, サビに異なる感性的印象を感じさせる楽曲生成手法-
Tomoyuki Kajiwara
 

More from Tomoyuki Kajiwara (20)

20190315 nlp
20190315 nlp20190315 nlp
20190315 nlp
 
20180208公聴会
20180208公聴会20180208公聴会
20180208公聴会
 
20150702文章読解支援のための日本語の語彙平易化システム
20150702文章読解支援のための日本語の語彙平易化システム20150702文章読解支援のための日本語の語彙平易化システム
20150702文章読解支援のための日本語の語彙平易化システム
 
文献紹介:SemEval-2012 Task 1: English Lexical Simplification
文献紹介:SemEval-2012 Task 1: English Lexical Simplification文献紹介:SemEval-2012 Task 1: English Lexical Simplification
文献紹介:SemEval-2012 Task 1: English Lexical Simplification
 
文献紹介:新聞記事中の難解語を平易な表現へ変換する手法の提案
文献紹介:新聞記事中の難解語を平易な表現へ変換する手法の提案文献紹介:新聞記事中の難解語を平易な表現へ変換する手法の提案
文献紹介:新聞記事中の難解語を平易な表現へ変換する手法の提案
 
日本語の語彙平易化システムおよび評価セットの構築
日本語の語彙平易化システムおよび評価セットの構築日本語の語彙平易化システムおよび評価セットの構築
日本語の語彙平易化システムおよび評価セットの構築
 
文献紹介:格フレームの対応付けに基づく用言の言い換え
文献紹介:格フレームの対応付けに基づく用言の言い換え文献紹介:格フレームの対応付けに基づく用言の言い換え
文献紹介:格フレームの対応付けに基づく用言の言い換え
 
文献紹介:言い換え技術に関する研究動向
文献紹介:言い換え技術に関する研究動向文献紹介:言い換え技術に関する研究動向
文献紹介:言い換え技術に関する研究動向
 
日本語の語彙平易化評価セットの構築
日本語の語彙平易化評価セットの構築日本語の語彙平易化評価セットの構築
日本語の語彙平易化評価セットの構築
 
日本語の語彙平易化システムの構築
日本語の語彙平易化システムの構築日本語の語彙平易化システムの構築
日本語の語彙平易化システムの構築
 
日本語の語彙的換言知識の質的評価
日本語の語彙的換言知識の質的評価日本語の語彙的換言知識の質的評価
日本語の語彙的換言知識の質的評価
 
文脈の多様性に基づく名詞換言の評価
文脈の多様性に基づく名詞換言の評価文脈の多様性に基づく名詞換言の評価
文脈の多様性に基づく名詞換言の評価
 
文脈の多様性に基づく名詞換言の提案
文脈の多様性に基づく名詞換言の提案文脈の多様性に基づく名詞換言の提案
文脈の多様性に基づく名詞換言の提案
 
機械学習を用いたニ格深層格の自動付与の検討
機械学習を用いたニ格深層格の自動付与の検討機械学習を用いたニ格深層格の自動付与の検討
機械学習を用いたニ格深層格の自動付与の検討
 
Selecting Proper Lexical Paraphrase for Children
Selecting Proper Lexical Paraphrase for ChildrenSelecting Proper Lexical Paraphrase for Children
Selecting Proper Lexical Paraphrase for Children
 
小学生の読解支援に向けた語釈文から語彙的換言を選択する手法
小学生の読解支援に向けた語釈文から語彙的換言を選択する手法小学生の読解支援に向けた語釈文から語彙的換言を選択する手法
小学生の読解支援に向けた語釈文から語彙的換言を選択する手法
 
小学生の読解支援に向けた複数の換言知識を併用した語彙平易化と評価
小学生の読解支援に向けた複数の換言知識を併用した語彙平易化と評価小学生の読解支援に向けた複数の換言知識を併用した語彙平易化と評価
小学生の読解支援に向けた複数の換言知識を併用した語彙平易化と評価
 
小学生の読解支援に向けた語釈文による換言
小学生の読解支援に向けた語釈文による換言小学生の読解支援に向けた語釈文による換言
小学生の読解支援に向けた語釈文による換言
 
対話型自動作曲システムに関する研究 -Aメロ, Bメロ, サビで異なる印象を感じさせる楽曲生成-
対話型自動作曲システムに関する研究 -Aメロ, Bメロ, サビで異なる印象を感じさせる楽曲生成-対話型自動作曲システムに関する研究 -Aメロ, Bメロ, サビで異なる印象を感じさせる楽曲生成-
対話型自動作曲システムに関する研究 -Aメロ, Bメロ, サビで異なる印象を感じさせる楽曲生成-
 
IGAを用いた個人の感性を反映した楽曲作成に関する研究 -Aメロ, Bメロ, サビに異なる感性的印象を感じさせる楽曲生成手法-
IGAを用いた個人の感性を反映した楽曲作成に関する研究 -Aメロ, Bメロ, サビに異なる感性的印象を感じさせる楽曲生成手法-IGAを用いた個人の感性を反映した楽曲作成に関する研究 -Aメロ, Bメロ, サビに異なる感性的印象を感じさせる楽曲生成手法-
IGAを用いた個人の感性を反映した楽曲作成に関する研究 -Aメロ, Bメロ, サビに異なる感性的印象を感じさせる楽曲生成手法-
 

Recently uploaded

8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
by6843629
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
Sérgio Sacani
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
KrushnaDarade1
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
moosaasad1975
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
David Osipyan
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills MN
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
European Sustainable Phosphorus Platform
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
Sérgio Sacani
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
Nistarini College, Purulia (W.B) India
 
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
AbdullaAlAsif1
 
BREEDING METHODS FOR DISEASE RESISTANCE.pptx
BREEDING METHODS FOR DISEASE RESISTANCE.pptxBREEDING METHODS FOR DISEASE RESISTANCE.pptx
BREEDING METHODS FOR DISEASE RESISTANCE.pptx
RASHMI M G
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
pablovgd
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Texas Alliance of Groundwater Districts
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
İsa Badur
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
yqqaatn0
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
TinyAnderson
 

Recently uploaded (20)

8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf8.Isolation of pure cultures and preservation of cultures.pdf
8.Isolation of pure cultures and preservation of cultures.pdf
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
 
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
 
BREEDING METHODS FOR DISEASE RESISTANCE.pptx
BREEDING METHODS FOR DISEASE RESISTANCE.pptxBREEDING METHODS FOR DISEASE RESISTANCE.pptx
BREEDING METHODS FOR DISEASE RESISTANCE.pptx
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
原版制作(carleton毕业证书)卡尔顿大学毕业证硕士文凭原版一模一样
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
 

Evaluation Dataset and System for Japanese Lexical Simplification

  • 1. Evaluation Dataset and System for Japanese Lexical Simplification Tomoyuki Kajiwara and Kazuhide Yamamoto Nagaoka University of Technology, Japan (Now I am studying at the Tokyo Metropolitan University)
  • 2. Extensive / various forms of texts Hitler committed terrible atrocities during the second World War. Hitler committed terrible cruelties during the second World War. Children Language Learners Elderlies Motivation Easily accessible Easily readable and understandable too!
  • 3. Problems in Japanese Ø Unpublished system •  It is difficult for people who need reading assistance to obtain simple Japanese sentences Ø Unpublished dataset •  It is difficult for researchers and developers to evaluate the performance of different systems
  • 4. Our works ü Built and published Japanese lexical simplification system http://www.jnlp.org/SNOW/S3 ü Built and published dataset for evaluation of Japanese lexical simplification http://www.jnlp.org/SNOW/E4
  • 5. Lexical Simplification System Substitution Generation 担う: 支える,引継ぐ,受け継ぐ,伝承する bear: hold, wear, carry, expect Identification of Complex Words 担う bear Word Sense Disambiguation 担う: 支える, 受け継ぐ bear: hold, carry Synonym Ranking 1: 支える, 2: 受け継ぐ, 3: 担う 1: hold, 2: carry, 3: bear Input 未来は若者が担う Young people bear the future Output 未来は若者が支える Young people hold the future
  • 6. Lexical Simplification Dataset 1. Constructing Japanese Lexical Substitution Dataset  ・Collecting Substitutions (crowdsourcing)  ・Evaluating Substitutions (crowdsourcing) 2. Transforming into Lexical Simplification Dataset  ・Ranking Substitutions (crowdsourcing)  ・Merging All Rankings  Sample: Young people bear the future.(未来は若者が担う)    Lexical Substitutions: carry, hold (受け継ぐ, 支える)    Rank of Simple Level: 1.hold, 2.carry, 3.bear                 (1. 支える, 2. 受け継ぐ, 3. 担う)
  • 7. Evaluation Dataset Sentence Noun Verb Adjective Adverb SemEval [1] 2012 Task1 2,010 580 (28.9%) 520 (25.9%) 560 (27.9%) 350 (17.4%) Ours 2,330 630 (27.0%) 720 (30.9%) 500 (21.5%) 480 (20.6%) System Precision Recall F-measure Our Original 0.89 0.08 0.15 w/o WSD 0.84 0.71 0.77 [1] Lucia Specia, Sujay Kumar Jauhar, and Rada Mihalcea. 2012. [1] Semeval-2012 task 1: English lexical simplification. [1] In Proceedings of the 6th International Workshop on Semantic Evaluation, pages 347‒355.
  • 8. We built and published system and evaluation dataset for Japanese lexical simplification http://www.jnlp.org/SNOW Lexical Simplification: •  Substitutes a complex word or phrase in a sentence with a simpler synonym •  Supports the reading comprehension of a wide range of readers (e.g. language learners) Evaluation Dataset and System for Japanese Lexical Simplification Tomoyuki Kajiwara and Kazuhide Yamamoto Nagaoka University of Technology, Japan