Source: https://www.researchgate.net/publication/280561259 (uploaded by Aaron L.-F. Han on 30 July 2015)
Chinese Named Entity Recognition with Graph-based
Semi-supervised Learning Model
Aaron Li-Feng Han* Xiaodong Zeng+ Derek F. Wong+ Lidia S. Chao+
* ILLC, University of Amsterdam, The Netherlands
+ NLP2CT Laboratory, University of Macau, Macau S.A.R., China
SIGHAN 2015 @ Beijing, July 30-31
• NER Tasks
• Traditional methods
• Motivation
• Designed model
• Experiments
• Conclusion
NER Tasks
NER tasks:
The annotations in the MUC-7 Named Entity task (Marsh and
Perzanowski, 1998)
- Entities (organization, person, and location), times, and quantities
such as monetary values and percentages
- Languages: English, Chinese, and Japanese.
The entity categories in the CoNLL-02 (Tjong Kim Sang, 2002) and
CoNLL-03 (Tjong Kim Sang and De Meulder, 2003) NER shared tasks
- Persons, locations, organizations, and names of miscellaneous entities
- Languages: Spanish, Dutch, English, and German.
The SIGHAN bakeoff-3 (Levow, 2006) and bakeoff-4 (Jin and Chen,
2008) tasks offer standard Chinese NER (CNER) corpora
- Three commonly used entity types: personal names, location names,
and organization names.
Traditional methods
Traditional entity recognition methods tend to employ external
annotated corpora to enhance the machine learning stage, and improve
the test scores using the enhanced models (Zhang et al., 2006; Mao et
al., 2008; Yu et al., 2008).
Conditional random field (CRF) models have shown advantages and good
performance in CNER tasks compared with other machine learning
algorithms (Zhou et al., 2006; Zhao and Kit, 2008).
However, annotated corpora are generally very expensive and
time-consuming to produce.
On the other hand, a large amount of unlabeled data is freely
available on the internet and can be used for research.
For this reason, some researchers have begun to explore the use of
unlabeled data, and semi-supervised learning methods based on labeled
training data and unlabeled external data have shown their advantages
(Blum and Chawla, 2001; Shin et al., 2006; Zha et al., 2008; Zhang et
al., 2013).
Motivation
Named entity recognition (NER) plays an important role in NLP.
Traditional methods tend to rely on large annotated corpora to achieve high performance.
- However, annotated corpora are usually expensive to obtain.
Many semi-supervised learning models for NER have been proposed to utilize freely available
unlabeled data.
Graph-based semi-supervised learning (GBSSL) methods have been employed in many NLP tasks,
e.g.
- sentiment categorization (Goldberg and Zhu, 2006)
- question answering (Celikyilmaz et al., 2009)
- class-instance acquisition (Talukdar and Pereira, 2010)
- structured tagging models (Subramanya et al., 2010)
- joint Chinese word segmentation and part-of-speech (POS) tagging (Zeng et al., 2013)
Why not:
GBSSL for NER
Extend the labeled data
Enhance the learning model, e.g. conditional random field (CRF)
Designed model
• Enhanced learning with unlabelled data
• GBSSL to extend labeled data
• CRF learning
• Graph-based semi-supervised learning
- Graph construction & label propagation
• Graph construction
- Following Subramanya et al. (2010), the vertices of the graph are
the character trigrams found in the labeled and unlabeled sentences.
- A symmetric k-NN graph is used, with edge weights computed by the
symmetric similarity function designed by Zeng et al. (2013).
$$
w_{i,j} =
\begin{cases}
\mathrm{sim}(x_i, x_j) & \text{if } j \in k(i) \text{ or } i \in k(j) \\
0 & \text{otherwise}
\end{cases}
$$
where $k(i)$ is the set of the $k$ nearest neighbours of $x_i$, and $\mathrm{sim}()$ is a similarity measure between two vertices, computed from co-occurrence statistics (Zeng et al., 2013).
The feature set used to measure the similarity of two vertices from the co-occurrence
statistics is the one optimized by Han et al. (2013) for CNER tasks, as listed in Table 1.
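The graph construction step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the features used here (left and right neighbouring characters of each trigram) and the cosine similarity are stand-ins for the Table 1 feature set and the similarity function of Zeng et al. (2013), and the names `char_trigrams`, `cooccurrence_features`, `cosine` and `build_knn_graph` are hypothetical.

```python
import math
from collections import Counter


def char_trigrams(sentence):
    """Return the character trigrams of a sentence, in order."""
    return [sentence[i:i + 3] for i in range(len(sentence) - 2)]


def cooccurrence_features(sentences):
    """Map each trigram to counts of its left/right neighbouring characters.

    Assumption: this tiny context feature set is for illustration only.
    """
    feats = {}
    for s in sentences:
        for i, tri in enumerate(char_trigrams(s)):
            counter = feats.setdefault(tri, Counter())
            if i > 0:
                counter["L=" + s[i - 1]] += 1
            if i + 3 < len(s):
                counter["R=" + s[i + 3]] += 1
    return feats


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (stand-in for sim())."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_knn_graph(features, k):
    """Return {(i, j): w} with w = sim(x_i, x_j) if j in k(i) or i in k(j)."""
    vertices = list(features)
    nearest = {}
    for v in vertices:
        scored = [(cosine(features[v], features[u]), u)
                  for u in vertices if u != v]
        scored.sort(key=lambda t: (-t[0], t[1]))
        # Keep only the k most similar vertices with non-zero similarity.
        nearest[v] = {u for s, u in scored[:k] if s > 0}
    weights = {}
    for i in vertices:
        for j in nearest[i]:
            if (i, j) not in weights:
                w = cosine(features[i], features[j])
                weights[(i, j)] = w  # edge exists because j is in k(i)
                weights[(j, i)] = w  # the same weight makes the graph symmetric
    return weights


if __name__ == "__main__":
    feats = cooccurrence_features(["我在北京大学", "他在北京大学"])
    graph = build_knn_graph(feats, k=1)
    for edge, w in sorted(graph.items()):
        print(edge, round(w, 3))
```

Pairs that are absent from the returned dictionary correspond to the "0 otherwise" case of the weight definition.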
• Label propagation
After the graph is constructed over both labeled and unlabeled data:
- Use the sparsity-inducing penalty label propagation algorithm (Das
and Smith, 2012) to induce trigram-level label distributions from
the constructed graph
- Based on the Junto toolkit (Talukdar and Pereira, 2010).
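To make the idea of label propagation concrete, here is a deliberately simplified Python sketch. It is not the sparsity-inducing penalty algorithm of Das and Smith (2012) and it does not use the Junto toolkit: it is a plain iterative scheme in which each unlabeled vertex takes the weighted average of its neighbours' label distributions while labeled (seed) vertices stay fixed. The function name `propagate_labels` and its parameters are hypothetical.

```python
def propagate_labels(weights, seeds, labels, iterations=10):
    """Spread label distributions over a weighted graph.

    weights: {(i, j): w} edge weights (both directions listed explicitly).
    seeds:   {vertex: {label: prob}} distributions taken from labeled data.
    labels:  list of all label names.
    Returns {vertex: {label: prob}} for every vertex in the graph.
    """
    vertices = {v for edge in weights for v in edge} | set(seeds)
    uniform = {lab: 1.0 / len(labels) for lab in labels}
    dist = {v: dict(seeds.get(v, uniform)) for v in vertices}

    neighbours = {v: [] for v in vertices}
    for (i, j), w in weights.items():
        neighbours[i].append((j, w))

    for _ in range(iterations):
        updated = {}
        for v in vertices:
            if v in seeds:
                # Seed vertices keep their distribution from the labeled data.
                updated[v] = dict(seeds[v])
                continue
            total = sum(w for _u, w in neighbours[v])
            if total == 0:
                # Isolated vertex: nothing to propagate, keep current value.
                updated[v] = dict(dist[v])
                continue
            updated[v] = {
                lab: sum(w * dist[u].get(lab, 0.0)
                         for u, w in neighbours[v]) / total
                for lab in labels
            }
        dist = updated
    return dist


if __name__ == "__main__":
    toy_weights = {("a", "b"): 1.0, ("b", "a"): 1.0,
                   ("b", "c"): 0.5, ("c", "b"): 0.5}
    toy_seeds = {"a": {"PER": 1.0, "O": 0.0}}
    result = propagate_labels(toy_weights, toy_seeds, ["PER", "O"])
    for vertex in sorted(result):
        print(vertex, result[vertex])
```

In the setting of these slides, the vertices would be character trigrams and the seed distributions would come from the labeled training sentences.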
Enhance CRF learning
- Add the data labeled through propagation to the training data
- Based on the CRF++ toolkit
- Use the same feature set as in Table 1.
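As an illustration of how the extended training data might be prepared, the sketch below merges gold-labeled and automatically labeled sentences and writes them in the column layout that CRF++ reads (one token per line, label in the last column, an empty line between sentences). The helper name `to_crfpp_format`, the per-character B-/I-/O tag scheme, and the way automatic tags are obtained are assumptions made for the example; the slides do not specify them.

```python
def to_crfpp_format(sentences):
    """Serialize sentences as CRF++-style columns.

    sentences: list of sentences, each a list of (character, tag) pairs.
    Returns a string with one 'character<TAB>tag' line per character and
    an empty line between consecutive sentences.
    """
    blocks = []
    for sentence in sentences:
        blocks.append("\n".join(f"{ch}\t{tag}" for ch, tag in sentence))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Gold-labeled sentence (from the annotated training corpus).
    gold = [[("北", "B-LOC"), ("京", "I-LOC")]]
    # Sentence labeled automatically (hypothetical propagation output).
    propagated = [[("他", "O"), ("来", "O")]]
    print(to_crfpp_format(gold + propagated), end="")
```

The resulting text can be saved to a file and passed to the CRF trainer together with a feature template.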
Experiments
Dataset
- We employ the SIGHAN bakeoff-3 (Levow, 2006) MSRA
(Microsoft Research Asia) training and testing data as the standard
setting.
- To test the effectiveness of the GBSSL method for the CRF model in
CNER tasks, we use plain (unannotated) text from
SIGHAN bakeoff-2 (Emerson, 2005) and bakeoff-4 (Jin and Chen,
2008) as external unlabeled data.
- The data sets are described in Table 2 in terms of the number of
sentences.
We set two baselines for the evaluation.
- The first baseline is a simple left-to-right maximum matching model
(MaxMatch) based on the training data.
- The second baseline is the closed CRF model (Closed-CRF), which does
not use unlabeled data.
- The CRF model trained in a semi-supervised way with GBSSL is denoted
as GBSSL-CRF.
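A left-to-right maximum matching baseline can be sketched as below. This is one plausible reading of MaxMatch, assuming a lexicon of entity strings (with their types) collected from the training data; the function name `max_match` and the lexicon layout are hypothetical, and the slides do not give the exact procedure.

```python
def max_match(sentence, lexicon, max_len=None):
    """Greedy left-to-right longest-match entity tagging.

    lexicon: {entity_string: entity_type} collected from training data.
    Returns a list of (start, end, entity_type) spans, end exclusive.
    """
    if max_len is None:
        max_len = max((len(entry) for entry in lexicon), default=0)
    spans = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + n]
            if candidate in lexicon:
                spans.append((i, i + n, lexicon[candidate]))
                i += n
                break
        else:
            # No lexicon entry starts here: move one character forward.
            i += 1
    return spans


if __name__ == "__main__":
    lexicon = {"北京": "LOC", "北京大学": "ORG"}
    print(max_match("我在北京大学", lexicon))  # [(2, 6, 'ORG')]
```

Because the longest candidate is tried first, "北京大学" is tagged as ORG rather than its prefix "北京" as LOC.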
• Training costs
- With the external dataset, the number of extracted features grows from
8,729,098 to 11,336,486 (+29.87%), and the number of iterations and
the training hours grow by 12.86% and 77.04% respectively.
- The training costs of the CRF learning stage are detailed in Table 3.
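The 29.87% figure is the relative growth in the number of features, which can be checked with a short calculation. The helper name `relative_growth` is hypothetical; the two counts are the ones reported above.

```python
def relative_growth(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    growth = relative_growth(8_729_098, 11_336_486)
    print(f"{growth:.2f}%")  # prints 29.87%
```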
• Evaluation results
- The evaluation results are shown in Table 4, in terms of recall,
precision, and their harmonic mean (F1-score).
- Both the Closed-CRF and GBSSL-CRF models largely outperform the
baseline-1 model (MaxMatch).
- Compared with the Closed-CRF model, the GBSSL-CRF model yields
higher precision and lower recall, which results in a slight
improvement in F1-score.
- For both the GBSSL-CRF and Closed-CRF models, precision is higher
than recall.
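For reference, precision, recall and F1-score over entity spans can be computed as in the following sketch. It assumes exact-match scoring on (start, end, type) spans, which is a common convention but is not stated in the slides; the function name `precision_recall_f1` is hypothetical.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match precision, recall and F1 over sets of entity spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    gold = {(0, 2, "PER"), (3, 5, "LOC"), (6, 8, "ORG"), (9, 11, "LOC")}
    predicted = {(0, 2, "PER"), (3, 5, "LOC")}
    p, r, f = precision_recall_f1(gold, predicted)
    print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")
```

The toy example shows the pattern described above: a system that predicts few but correct entities has high precision and lower recall, and F1 sits between the two.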
To look into the GBSSL performance on each kind of entity:
- The detailed evaluation results, in terms of F1-score, are given in
Table 5.
- Both the GBSSL-CRF and Closed-CRF models perform better on the LOC
entity type and worse on the PER and ORG entity types.
- The GBSSL model enhances CRF learning on the two difficult entity
types, PER and ORG, with F1 improvements of 0.28% and 0.58%
respectively. However, it decreases the F1-score on LOC entities by 0.19%.
The lower performance of the GBSSL model on LOC entities may be because
the unlabeled data amounts to only 62.75% of the training corpus, which
is not large enough to cover the Out-of-Vocabulary (OOV) LOC words in
the test data; in addition, the unlabeled data bring some noise into
the model.
Conclusion and future works
This paper examines the effectiveness of the GBSSL model for the
traditional CNER task.
Extend data:
- The experiments show that GBSSL can enhance state-of-the-art CRF
learning models. The improvement is small because the unlabeled data
is not large enough. In future work, we plan to use a larger
unlabeled dataset to enhance the CRF learning model.
Feature selection:
The feature set optimized for CRF learning may not be the best one for
the similarity calculation in the graph construction stage.
We will therefore work on selecting the best feature set for measuring
vertex similarity in graph construction on CNER documents.
Corpus type:
In this paper, we used the Microsoft Research Asia (MSRA) corpus for
the experiments.
We will test on more kinds of Chinese corpora, such as the CITYU and
LDC corpora.
Performance on OOV words:
The GBSSL model generally improves the tagging accuracy of the
Out-of-Vocabulary (OOV) words in the test data, i.e. words that are
unseen in the training corpora.
In future work, we plan to give a detailed analysis of the GBSSL
model's performance on OOV words for CNER tasks.
• Thanks for your attention!
View publication statsView publication stats

More Related Content

What's hot

Study on Relavance Feature Selection Methods
Study on Relavance Feature Selection MethodsStudy on Relavance Feature Selection Methods
Study on Relavance Feature Selection MethodsIRJET Journal
 
Doctoral Thesis Dissertation 2014-03-20 @PoliMi
Doctoral Thesis Dissertation 2014-03-20 @PoliMiDoctoral Thesis Dissertation 2014-03-20 @PoliMi
Doctoral Thesis Dissertation 2014-03-20 @PoliMiDavide Chicco
 
IRJET- Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...
IRJET-  	  Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...IRJET-  	  Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...
IRJET- Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...IRJET Journal
 
An unsupervised feature selection algorithm with feature ranking for maximizi...
An unsupervised feature selection algorithm with feature ranking for maximizi...An unsupervised feature selection algorithm with feature ranking for maximizi...
An unsupervised feature selection algorithm with feature ranking for maximizi...Asir Singh
 
Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...Seunghyun Hwang
 
Strategies for Metabolomics Data Analysis
Strategies for Metabolomics Data AnalysisStrategies for Metabolomics Data Analysis
Strategies for Metabolomics Data AnalysisDmitry Grapov
 
Improved Teaching Leaning Based Optimization Algorithm
Improved Teaching Leaning Based Optimization AlgorithmImproved Teaching Leaning Based Optimization Algorithm
Improved Teaching Leaning Based Optimization Algorithmrajani51
 
Reference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural NetworkReference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural NetworkSaurav Jha
 
Concatenated decision paths classification for time series shapelets
Concatenated decision paths classification for time series shapeletsConcatenated decision paths classification for time series shapelets
Concatenated decision paths classification for time series shapeletsijics
 
Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...IAEME Publication
 
Question Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresQuestion Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresIJwest
 
Feature selection
Feature selectionFeature selection
Feature selectionDong Guo
 
Optimal rule set generation using pso algorithm
Optimal rule set generation using pso algorithmOptimal rule set generation using pso algorithm
Optimal rule set generation using pso algorithmcsandit
 
Migration strategies for object oriented system to component based system
Migration strategies for object oriented system to component based systemMigration strategies for object oriented system to component based system
Migration strategies for object oriented system to component based systemijfcstjournal
 
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature Survey
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature SurveyPareto-Optimal Search-Based Software Engineering (POSBSE): A Literature Survey
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature SurveyAbdel Salam Sayyad
 
A Collaborative Document Ranking Model for a Multi-Faceted Search
A Collaborative Document Ranking Model for a Multi-Faceted SearchA Collaborative Document Ranking Model for a Multi-Faceted Search
A Collaborative Document Ranking Model for a Multi-Faceted SearchUPMC - Sorbonne Universities
 
ES-Rank: Evolutionary Strategy Learning to Rank Approach (Presentation)
ES-Rank: Evolutionary Strategy Learning to Rank Approach (Presentation)ES-Rank: Evolutionary Strategy Learning to Rank Approach (Presentation)
ES-Rank: Evolutionary Strategy Learning to Rank Approach (Presentation)Minia University, Egypt
 

What's hot (17)

Study on Relavance Feature Selection Methods
Study on Relavance Feature Selection MethodsStudy on Relavance Feature Selection Methods
Study on Relavance Feature Selection Methods
 
Doctoral Thesis Dissertation 2014-03-20 @PoliMi
Doctoral Thesis Dissertation 2014-03-20 @PoliMiDoctoral Thesis Dissertation 2014-03-20 @PoliMi
Doctoral Thesis Dissertation 2014-03-20 @PoliMi
 
IRJET- Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...
IRJET-  	  Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...IRJET-  	  Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...
IRJET- Comparative Study of PCA, KPCA, KFA and LDA Algorithms for Face Re...
 
An unsupervised feature selection algorithm with feature ranking for maximizi...
An unsupervised feature selection algorithm with feature ranking for maximizi...An unsupervised feature selection algorithm with feature ranking for maximizi...
An unsupervised feature selection algorithm with feature ranking for maximizi...
 
Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...Your Classifier is Secretly an Energy based model and you should treat it lik...
Your Classifier is Secretly an Energy based model and you should treat it lik...
 
Strategies for Metabolomics Data Analysis
Strategies for Metabolomics Data AnalysisStrategies for Metabolomics Data Analysis
Strategies for Metabolomics Data Analysis
 
Improved Teaching Leaning Based Optimization Algorithm
Improved Teaching Leaning Based Optimization AlgorithmImproved Teaching Leaning Based Optimization Algorithm
Improved Teaching Leaning Based Optimization Algorithm
 
Reference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural NetworkReference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural Network
 
Concatenated decision paths classification for time series shapelets
Concatenated decision paths classification for time series shapeletsConcatenated decision paths classification for time series shapelets
Concatenated decision paths classification for time series shapelets
 
Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...
 
Question Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical featuresQuestion Classification using Semantic, Syntactic and Lexical features
Question Classification using Semantic, Syntactic and Lexical features
 
Feature selection
Feature selectionFeature selection
Feature selection
 
Optimal rule set generation using pso algorithm
Optimal rule set generation using pso algorithmOptimal rule set generation using pso algorithm
Optimal rule set generation using pso algorithm
 
Migration strategies for object oriented system to component based system
Migration strategies for object oriented system to component based systemMigration strategies for object oriented system to component based system
Migration strategies for object oriented system to component based system
 
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature Survey
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature SurveyPareto-Optimal Search-Based Software Engineering (POSBSE): A Literature Survey
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature Survey
 
A Collaborative Document Ranking Model for a Multi-Faceted Search
A Collaborative Document Ranking Model for a Multi-Faceted SearchA Collaborative Document Ranking Model for a Multi-Faceted Search
A Collaborative Document Ranking Model for a Multi-Faceted Search
 
ES-Rank: Evolutionary Strategy Learning to Rank Approach (Presentation)
ES-Rank: Evolutionary Strategy Learning to Rank Approach (Presentation)ES-Rank: Evolutionary Strategy Learning to Rank Approach (Presentation)
ES-Rank: Evolutionary Strategy Learning to Rank Approach (Presentation)
 

Similar to Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model

The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...Jinho Choi
 
TEACHING AND LEARNING BASED OPTIMISATION
TEACHING AND LEARNING BASED OPTIMISATIONTEACHING AND LEARNING BASED OPTIMISATION
TEACHING AND LEARNING BASED OPTIMISATIONUday Wankar
 
A study on the efficiency of a test analysis method utilizing test-categories...
A study on the efficiency of a test analysis method utilizing test-categories...A study on the efficiency of a test analysis method utilizing test-categories...
A study on the efficiency of a test analysis method utilizing test-categories...Tsuyoshi Yumoto
 
Active learning for ranking through expected loss optimization
Active learning for ranking through expected loss optimizationActive learning for ranking through expected loss optimization
Active learning for ranking through expected loss optimizationPvrtechnologies Nellore
 
Machine learning project_promotion
Machine learning project_promotionMachine learning project_promotion
Machine learning project_promotionkahhuey
 
Hydraulics Team Full-Technical Lab Report
Hydraulics Team Full-Technical Lab ReportHydraulics Team Full-Technical Lab Report
Hydraulics Team Full-Technical Lab ReportAlfonso Figueroa
 
Technical research writing
Technical research writing   Technical research writing
Technical research writing AJAL A J
 
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...ijcsa
 
IRJET- Missing Value Evaluation in SQL Queries: A Survey
IRJET- 	  Missing Value Evaluation in SQL Queries: A SurveyIRJET- 	  Missing Value Evaluation in SQL Queries: A Survey
IRJET- Missing Value Evaluation in SQL Queries: A SurveyIRJET Journal
 
Missing Value Evaluation in SQL Queries: A Survey
Missing Value Evaluation in SQL Queries: A SurveyMissing Value Evaluation in SQL Queries: A Survey
Missing Value Evaluation in SQL Queries: A SurveyIRJET Journal
 
AUTOMATIC QUESTION GENERATION USING NATURAL LANGUAGE PROCESSING
AUTOMATIC QUESTION GENERATION USING NATURAL LANGUAGE PROCESSINGAUTOMATIC QUESTION GENERATION USING NATURAL LANGUAGE PROCESSING
AUTOMATIC QUESTION GENERATION USING NATURAL LANGUAGE PROCESSINGIRJET Journal
 
IRJET- Analysis of PV Fed Vector Controlled Induction Motor Drive
IRJET- Analysis of PV Fed Vector Controlled Induction Motor DriveIRJET- Analysis of PV Fed Vector Controlled Induction Motor Drive
IRJET- Analysis of PV Fed Vector Controlled Induction Motor DriveIRJET Journal
 
IRJET- Deep Learning Model to Predict Hardware Performance
IRJET- Deep Learning Model to Predict Hardware PerformanceIRJET- Deep Learning Model to Predict Hardware Performance
IRJET- Deep Learning Model to Predict Hardware PerformanceIRJET Journal
 
Machine learning testing survey, landscapes and horizons, the Cliff Notes
Machine learning testing  survey, landscapes and horizons, the Cliff NotesMachine learning testing  survey, landscapes and horizons, the Cliff Notes
Machine learning testing survey, landscapes and horizons, the Cliff NotesHeemeng Foo
 
A Hybrid method of face detection based on Feature Extraction using PIFR and ...
A Hybrid method of face detection based on Feature Extraction using PIFR and ...A Hybrid method of face detection based on Feature Extraction using PIFR and ...
A Hybrid method of face detection based on Feature Extraction using PIFR and ...IJERA Editor
 
A Hybrid method of face detection based on Feature Extraction using PIFR and ...
A Hybrid method of face detection based on Feature Extraction using PIFR and ...A Hybrid method of face detection based on Feature Extraction using PIFR and ...
A Hybrid method of face detection based on Feature Extraction using PIFR and ...IJERA Editor
 

Similar to Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model (20)

The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
The Pupil Has Become the Master: Teacher-Student Model-Based Word Embedding D...
 
TEACHING AND LEARNING BASED OPTIMISATION
TEACHING AND LEARNING BASED OPTIMISATIONTEACHING AND LEARNING BASED OPTIMISATION
TEACHING AND LEARNING BASED OPTIMISATION
 
A study on the efficiency of a test analysis method utilizing test-categories...
A study on the efficiency of a test analysis method utilizing test-categories...A study on the efficiency of a test analysis method utilizing test-categories...
A study on the efficiency of a test analysis method utilizing test-categories...
 
Mangai
MangaiMangai
Mangai
 
Active learning for ranking through expected loss optimization
Active learning for ranking through expected loss optimizationActive learning for ranking through expected loss optimization
Active learning for ranking through expected loss optimization
 
Machine learning project_promotion
Machine learning project_promotionMachine learning project_promotion
Machine learning project_promotion
 
50120130406007
5012013040600750120130406007
50120130406007
 
Research proposal
Research proposalResearch proposal
Research proposal
 
Hydraulics Team Full-Technical Lab Report
Hydraulics Team Full-Technical Lab ReportHydraulics Team Full-Technical Lab Report
Hydraulics Team Full-Technical Lab Report
 
Technical research writing
Technical research writing   Technical research writing
Technical research writing
 
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
 
IRJET- Missing Value Evaluation in SQL Queries: A Survey
IRJET- 	  Missing Value Evaluation in SQL Queries: A SurveyIRJET- 	  Missing Value Evaluation in SQL Queries: A Survey
IRJET- Missing Value Evaluation in SQL Queries: A Survey
 
Missing Value Evaluation in SQL Queries: A Survey
Missing Value Evaluation in SQL Queries: A SurveyMissing Value Evaluation in SQL Queries: A Survey
Missing Value Evaluation in SQL Queries: A Survey
 
AUTOMATIC QUESTION GENERATION USING NATURAL LANGUAGE PROCESSING
AUTOMATIC QUESTION GENERATION USING NATURAL LANGUAGE PROCESSINGAUTOMATIC QUESTION GENERATION USING NATURAL LANGUAGE PROCESSING
AUTOMATIC QUESTION GENERATION USING NATURAL LANGUAGE PROCESSING
 
IRJET- Analysis of PV Fed Vector Controlled Induction Motor Drive
IRJET- Analysis of PV Fed Vector Controlled Induction Motor DriveIRJET- Analysis of PV Fed Vector Controlled Induction Motor Drive
IRJET- Analysis of PV Fed Vector Controlled Induction Motor Drive
 
IRJET- Deep Learning Model to Predict Hardware Performance
IRJET- Deep Learning Model to Predict Hardware PerformanceIRJET- Deep Learning Model to Predict Hardware Performance
IRJET- Deep Learning Model to Predict Hardware Performance
 
T0 numtq0n tk=
T0 numtq0n tk=T0 numtq0n tk=
T0 numtq0n tk=
 
Machine learning testing survey, landscapes and horizons, the Cliff Notes
Machine learning testing  survey, landscapes and horizons, the Cliff NotesMachine learning testing  survey, landscapes and horizons, the Cliff Notes
Machine learning testing survey, landscapes and horizons, the Cliff Notes
 
A Hybrid method of face detection based on Feature Extraction using PIFR and ...
A Hybrid method of face detection based on Feature Extraction using PIFR and ...A Hybrid method of face detection based on Feature Extraction using PIFR and ...
A Hybrid method of face detection based on Feature Extraction using PIFR and ...
 
A Hybrid method of face detection based on Feature Extraction using PIFR and ...
A Hybrid method of face detection based on Feature Extraction using PIFR and ...A Hybrid method of face detection based on Feature Extraction using PIFR and ...
A Hybrid method of face detection based on Feature Extraction using PIFR and ...
 

More from Lifeng (Aaron) Han

WMT2022 Biomedical MT PPT: Logrus Global and Uni Manchester
WMT2022 Biomedical MT PPT: Logrus Global and Uni ManchesterWMT2022 Biomedical MT PPT: Logrus Global and Uni Manchester
WMT2022 Biomedical MT PPT: Logrus Global and Uni ManchesterLifeng (Aaron) Han
 
Measuring Uncertainty in Translation Quality Evaluation (TQE)
Measuring Uncertainty in Translation Quality Evaluation (TQE)Measuring Uncertainty in Translation Quality Evaluation (TQE)
Measuring Uncertainty in Translation Quality Evaluation (TQE)Lifeng (Aaron) Han
 
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...Lifeng (Aaron) Han
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Profession...Lifeng (Aaron) Han
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...Lifeng (Aaron) Han
 
Meta-evaluation of machine translation evaluation methods
Meta-evaluation of machine translation evaluation methodsMeta-evaluation of machine translation evaluation methods
Meta-evaluation of machine translation evaluation methodsLifeng (Aaron) Han
 
Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...
Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...
Monte Carlo Modelling of Confidence Intervals in Translation Quality Evaluati...Lifeng (Aaron) Han
 
Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...Lifeng (Aaron) Han
 
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...Lifeng (Aaron) Han
 
Chinese Character Decomposition for Neural MT with Multi-Word Expressions
Chinese Character Decomposition for  Neural MT with Multi-Word ExpressionsChinese Character Decomposition for  Neural MT with Multi-Word Expressions
Chinese Character Decomposition for Neural MT with Multi-Word ExpressionsLifeng (Aaron) Han
 
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longer
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longerBuild moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longer
Build moses on ubuntu (64 bit) system in virtubox recorded by aaron _v2longerLifeng (Aaron) Han
 
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with...
Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model

  • 2. Chinese Named Entity Recognition with Graph-based Semi-supervised Learning Model Aaron Li-Feng Han* Xiaodong Zeng+ Derek F. Wong+ Lidia S. Chao+ * ILLC, University of Amsterdam, The Netherlands + NLP2CT Laboratory, University of Macau, Macau S.A.R., China SIGHAN 2015 @ Beijing, July 30-31
  • 3. • NER Tasks • Traditional methods • Motivation • Designed model • Experiments • Conclusion
  • 4. NER Tasks NER tasks: The annotations in MUC-7 Named Entity tasks (Marsh and Perzanowski, 1998) - Entities (organization, person, and location), times and quantities such as monetary values and percentages - Languages of English, Chinese and Japanese.
  • 5. The entity categories in CONLL-02 (Tjong Kim Sang, 2002) and CONLL-03 (Tjong Kim Sang and De Meulder, 2003) NER shared tasks - Persons, locations, organizations and names of miscellaneous entities - Languages span from Spanish, Dutch, English, to German. The SIGHAN bakeoff-3 (Levow, 2006) and bakeoff-4 (Jin and Chen, 2008) tasks offer standard Chinese NER (CNER) corpora - Three commonly used entities, i.e., personal names, location names, and organization names.
  • 6. Traditional methods Traditional methods for entity recognition tend to employ external annotated corpora to enhance the machine learning stage, and improve the testing scores using the enhanced models (Zhang et al., 2006; Mao et al., 2008; Yu et al., 2008). Conditional random field (CRF) models have shown advantages and good performance in CNER tasks compared with other machine learning algorithms (Zhou et al., 2006; Zhao and Kit, 2008). However, annotated corpora are generally very expensive and time-consuming to produce.
  • 7. On the other hand, a lot of unlabeled data is freely available on the internet and can be used for research. For this reason, some researchers have begun to explore the use of unlabeled data, and semi-supervised learning methods based on labeled training data and unlabeled external data have shown their advantages (Blum and Chawla, 2001; Shin et al., 2006; Zha et al., 2008; Zhang et al., 2013).
  • 8. Motivation Named entity recognition (NER) plays an important role in the NLP literature. The traditional methods tend to employ large annotated corpora to achieve high performance. - However, annotated corpora are usually expensive to obtain. Many semi-supervised learning models for the NER task have been proposed to utilize the freely available unlabeled data. Graph-based semi-supervised learning (GBSSL) methods have been employed in many NLP tasks, e.g. - sentiment categorization (Goldberg and Zhu, 2006) - question answering (Celikyilmaz et al., 2009) - class-instance acquisition (Talukdar and Pereira, 2010) - structured tagging models (Subramanya et al., 2010) - joint Chinese word segmentation and part-of-speech (POS) tagging (Zeng et al., 2013)
  • 9. Why not: GBSSL for NER - Extend the labeled data - Enhance the learning model, e.g. conditional random field (CRF)
  • 10. Designed model • Enhanced learning with unlabelled data • GBSSL to extend labeled data • CRF learning
  • 11. • Graph based semi-supervised learning • - Graph Construction & Label Propagation
  • 12. • Graph Construction • - Follow the research of Subramanya et al. (2010), represent the vertices using character trigrams in labeled and unlabeled sentences for graph construction • - A symmetric k-NN graph is utilized with the edge weights calculated by a symmetric similarity function designed by Zeng et al. (2013).
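As a rough illustration of how character trigrams can serve as graph vertices, the sketch below collects the trigram centred on each character of a sentence. The padding symbol `#`, the function name `char_trigrams`, and the two toy sentences are assumptions made for this example; the slides do not say how sentence boundaries are handled.

```python
def char_trigrams(sentence, pad="#"):
    """Return the character trigram centred on each character of `sentence`.

    The boundary padding symbol is an assumption of this sketch.
    """
    padded = pad + sentence + pad
    return [padded[i:i + 3] for i in range(len(sentence))]


# Hypothetical toy data: one labeled and one unlabeled sentence.
labeled = ["北京大学"]
unlabeled = ["我在北京"]

# The vertex set is the union of trigram types over both kinds of sentences.
vertices = sorted({t for s in labeled + unlabeled for t in char_trigrams(s)})
```

Each distinct trigram type becomes one vertex, so trigrams shared between labeled and unlabeled sentences are what let label information flow from one to the other.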
  • 13. •
$$
w_{i,j} =
\begin{cases}
\mathrm{sim}(x_i, x_j) & \text{if } j \in k(i) \text{ or } i \in k(j) \\
0 & \text{otherwise}
\end{cases}
$$
where $k(i)$ is the set of the k nearest neighbours of $x_i$, and sim() is a similarity measure of two vertices. The similarity is computed based on the co-occurrence statistics (Zeng et al., 2013).
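The edge-weight rule can be sketched in a few lines of Python. This is only an illustration: cosine similarity over sparse feature-count vectors stands in for the similarity function of Zeng et al. (2013), whose exact form is not given in the slides, and the names `cosine`, `knn_graph` and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts (stand-in for sim())."""
    dot = sum(u[f] * v.get(f, 0.0) for f in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def knn_graph(vectors, k, sim=cosine):
    """Return w with w[i][j] = sim(x_i, x_j) if j in k(i) or i in k(j), else 0."""
    n = len(vectors)
    s = [[sim(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    # k(i): the k vertices most similar to vertex i, excluding i itself.
    nearest = [
        set(sorted((j for j in range(n) if j != i), key=lambda j: -s[i][j])[:k])
        for i in range(n)
    ]
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and (j in nearest[i] or i in nearest[j]):
                w[i][j] = s[i][j]
    return w


# Hypothetical co-occurrence feature vectors for four trigram vertices.
vectors = [
    {"a": 1.0, "b": 1.0},
    {"a": 1.0, "b": 0.9},
    {"c": 1.0},
    {"c": 1.0, "a": 0.1},
]
w = knn_graph(vectors, k=1)
```

The "or" in the condition is what makes the graph symmetric: an edge is kept whenever either endpoint counts the other among its nearest neighbours. Setting self-loops to zero is a choice of this sketch.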
  • 14. The feature set we employed to measure the similarity of two vertices based on the co-occurrence statistics is the optimized one by Han et al. (2013) for CNER tasks, as denoted in Table 1.
  • 15. • Label propagation After the graph construction on both labeled and unlabeled data - Use the sparsity inducing penalty (Das and Smith, 2012) label propagation algorithm to induce trigram level label distributions from the constructed graph - Based on the Junto toolkit (Talukdar and Pereira, 2010).
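To give a feel for what label propagation does on such a graph, here is a deliberately simplified sketch. It is plain iterative neighbour averaging with clamped seed vertices, not the sparsity-inducing penalty algorithm of Das and Smith (2012), and it does not use the Junto toolkit; the function name `propagate`, the toy weights and the two-label set are assumptions.

```python
def propagate(w, seeds, labels, iterations=20):
    """Spread label distributions from seed vertices over a weighted graph.

    w      : symmetric weight matrix (list of lists)
    seeds  : dict vertex index -> label distribution taken from labeled data
    labels : list of label names
    """
    n = len(w)
    uniform = {y: 1.0 / len(labels) for y in labels}
    q = [dict(seeds[i]) if i in seeds else dict(uniform) for i in range(n)]
    for _ in range(iterations):
        new_q = []
        for i in range(n):
            if i in seeds:
                # Seed vertices keep their empirical distribution.
                new_q.append(dict(seeds[i]))
                continue
            total = sum(w[i])
            if total == 0.0:
                # Isolated vertex: nothing to propagate from.
                new_q.append(q[i])
                continue
            # Weighted average of the neighbours' current distributions.
            new_q.append(
                {y: sum(w[i][j] * q[j][y] for j in range(n)) / total for y in labels}
            )
        q = new_q
    return q


labels = ["PER", "O"]
w = [
    [0.0, 0.9, 0.0],
    [0.9, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]
# Vertices 0 and 2 come from labeled data; vertex 1 is seen only in unlabeled text.
seeds = {0: {"PER": 1.0, "O": 0.0}, 2: {"PER": 0.0, "O": 1.0}}
q = propagate(w, seeds, labels)
```

In this toy graph the unlabeled vertex 1 ends up mostly `PER`, because its edge to the `PER` seed is much heavier than its edge to the `O` seed.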
  • 16. Enhance CRF learning - put the propagated labeled data into the training data - based on the CRF++ toolkit - Use the same feature set as in Table 1.
  • 17. Experiments Dataset - We employ the SIGHAN bakeoff-3 (Levow, 2006) MSRA (Microsoft Research Asia) training and testing data as the standard setting. - To test the effectiveness of the GBSSL method for the CRF model in CNER tasks, we utilize some plain (unannotated) text from SIGHAN bakeoff-2 (Emerson, 2005) and bakeoff-4 (Jin and Chen, 2008) as external unlabeled data. - The data set is described in Table 2 in terms of the number of sentences.
  • 18.
  • 19. We set two baseline scores for the evaluation. - One baseline is the simple left-to-right maximum matching model (MaxMatch) based on the training data - Another baseline is the closed CRF model (Closed-CRF) without using unlabeled data. - The employment of GBSSL model into semi-supervised CRF learning is denoted as GBSSL-CRF.
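A left-to-right maximum matching baseline can be sketched as follows. The slides do not describe how MaxMatch was implemented, so this assumes a lexicon of entity strings harvested from the training data and a greedy longest-match scan; `max_match` and the toy lexicon are hypothetical.

```python
def max_match(sentence, lexicon, max_len=None):
    """Greedy left-to-right longest match of `sentence` against `lexicon`.

    lexicon maps an entity string to its type; returns (start, end, type) spans.
    """
    if max_len is None:
        max_len = max((len(entry) for entry in lexicon), default=0)
    spans, i = [], 0
    while i < len(sentence):
        # Try the longest candidate starting at position i first.
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + length]
            if candidate in lexicon:
                spans.append((i, i + length, lexicon[candidate]))
                i += length
                break
        else:
            # No lexicon entry starts here; move one character to the right.
            i += 1
    return spans


# Hypothetical lexicon extracted from annotated training sentences.
lexicon = {"北京": "LOC", "北京大学": "ORG", "李明": "PER"}
spans = max_match("李明在北京大学", lexicon)
```

Because the longest candidate is tried first, "北京大学" is tagged as one ORG span rather than as the shorter LOC "北京". Such a baseline cannot recognise any entity that is absent from the training lexicon.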
  • 20. • Training costs - The comparison shows that the extracted features grow from 8,729,098 to 11,336,486 (29.87%) due to the external dataset, and the corresponding iterations and training hours also grow by 12.86% and 77.04% respectively. - The training costs of the CRF learning stage are detailed in Table 3.
  • 21.
  • 22. • Evaluation results - The evaluation results are shown in Table 4, in terms of recall, precision and the harmonic mean of recall and precision (F1-score).
  • 23. - The evaluation shows that both the Closed-CRF and GBSSL-CRF models largely outperform the baseline-1 model (MaxMatch). - Compared with the Closed-CRF model, the GBSSL-CRF model yields higher precision and lower recall, resulting in a slight improvement in F1 score. - Both the GBSSL-CRF and Closed-CRF models achieve higher precision than recall.
  • 24. To look inside the GBSSL performance on each kind of entity, we report the detailed evaluation results in terms of F1-score in Table 5.
  • 25. - The detailed evaluation on the three kinds of entities shows that both the GBSSL-CRF and Closed-CRF models perform better on the LOC entity type and worse on the PER and ORG entities. - Fortunately, the GBSSL model can enhance the CRF learning on the two difficult entity types, PER and ORG, with improvements of 0.28% and 0.58% respectively. However, the GBSSL model decreases the F1 score on the LOC entity by 0.19%.
  • 26. The lower performance of the GBSSL model on the LOC entity may be because the unlabeled data amounts to only 62.75% of the training corpus, which is not large enough to cover the out-of-vocabulary (OOV) testing words of the LOC entity; on the other hand, the unlabeled data also brings some noise into the model.
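For reference, precision, recall and F1 over entity spans can be computed as in the sketch below. It assumes exact-match scoring, where a predicted entity counts as correct only if its boundaries and type both match the gold annotation; the function name `prf1` and the toy spans are made up for illustration.

```python
def prf1(gold, predicted):
    """Precision, recall and F1 over sets of (start, end, type) entity spans."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold and system spans for one sentence.
gold = [(0, 2, "PER"), (3, 7, "ORG"), (9, 11, "LOC")]
predicted = [(0, 2, "PER"), (3, 5, "LOC")]
p, r, f = prf1(gold, predicted)
```

Here one of two predictions is correct (precision 0.5) and one of three gold entities is found (recall 1/3), giving an F1 of 0.4. A system that tags cautiously tends to show this pattern of precision above recall.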
  • 27. Conclusion and future work This paper makes an effort to examine the effectiveness of the GBSSL model for the traditional CNER task. Extend data - The experiments verify that GBSSL can enhance the state-of-the-art CRF learning models. The improvement is small because the unlabeled data is not large enough. In future work, we plan to use a larger unlabeled dataset to enhance the CRF learning model.
  • 28. Feature selection: The feature set optimized for CRF learning may not be the best one for the similarity calculation in the graph construction stage. So we will make efforts to select the best feature set for measuring vertex similarity in graph construction on CNER documents.
  • 29. Corpus type: In this paper, we utilized the Microsoft Research Asia corpus for experiments. We will use more kinds of Chinese corpora for testing, such as the CITYU and LDC corpora.
  • 30. Performance on OOV words: The GBSSL model generally improves the tagging accuracy of the out-of-vocabulary (OOV) words in the test data, which are unseen in the training corpora. In future work, we plan to give a detailed analysis of the GBSSL model performance on the OOV words for CNER tasks.
  • 31. • Thanks for your attention!