Corpus Effects
on the Evaluation
of Automated
Transliteration
Systems
Sarvnaz Karimi
Andrew Turpin
Falk Scholer
Introduction
Corpus
Experiments
Conclusion
Corpus Effects on the Evaluation of
Automated Transliteration Systems
Sarvnaz Karimi
Andrew Turpin
Falk Scholer
School of Computer Science and Information Technology
RMIT University, Melbourne, Australia
26 June 2007
Transliteration
Machine Transliteration:
Automatically transforming a word written in a source
language into a word in a target language.
Example:
Prague (source) to üÉà (target)
Evaluation:
Machine-generated words are compared with human-generated
ones. Human judgment is the gold standard.
Transliteration is Subjective
How to define correct transliteration?
Prague: üÉà or üÉÃ?
[Figure: Source Word → Automatic Transliterator → Target Word, compared against Source Word → Human Transliterator → Target Word (STANDARD). Example variants: Praha, Prague, Prago, Prag, Praag.]
Evaluating Algorithms or Corpus?
When evaluating a transliteration algorithm, can
a testing corpus mislead us in our judgments?
[Figure: Algorithm + Corpus; Algorithm + Corpus Specification]
Experimental Scheme
◮ Transliteration systems: Two grapheme-based
algorithms previously examined for the English-Persian
language pair. We refer to them as system A and
system B.
◮ Corpus: We constructed a controlled corpus
(language origin, number of transliterators,
transliterators’ language knowledge).
◮ Evaluation measures: Word accuracy and its
variants, human agreement, entropy of transliteration
rules (transliterator’s consistency).
Controlled corpus
We made a corpus with the following specifications:
◮ Three datasets (English, Arabic and Dutch)
containing 500 word-pairs each.
◮ Seven transliterators (Persian native speakers).
◮ All of the transliterators knew English and Arabic and
had no knowledge of Dutch.
◮ All of the transliterators had at least a Bachelor’s
degree.
◮ The origin of the words was not given to them.
Word Accuracy
\[ \mathrm{WA} = \frac{\text{number of correct transliterations}}{\text{total number of test words}} \]
If more than one judgment is available we can define:
1. Uniform Word Accuracy (UWA):
All the variations suggested by transliterators are equally valid.
2. Weighted Word Accuracy (WWA):
A weight is assigned to the transliterations based on the number
of people who suggested that variant.
3. Majority Word Accuracy (MWA):
Only the transliteration suggested by the majority of the
transliterators is accepted as the correct one.
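The three variants can be sketched in Python as follows. The slides give no formulas for them, so this rests on assumptions: WWA credits a matching output with the fraction of transliterators who suggested it, MWA accepts only the single most frequent variant (ties go to the variant seen first), and every test word has at least one human judgment. The function name, the data layout and the Latin-script example strings are hypothetical.

```python
from collections import Counter

def word_accuracies(system_output, judgments):
    """Return UWA, WWA and MWA for one system.

    system_output: dict mapping a source word to the system's target word.
    judgments: dict mapping a source word to a non-empty list of human
               transliterations, one per transliterator.
    """
    n = len(judgments)
    uwa = wwa = mwa = 0.0
    for source, variants in judgments.items():
        predicted = system_output.get(source)
        counts = Counter(variants)
        if predicted in counts:
            # UWA: every human variant is equally valid.
            uwa += 1
            # WWA (assumed weighting): share of transliterators
            # who suggested this variant.
            wwa += counts[predicted] / len(variants)
        # MWA: only the most frequent human variant counts.
        majority, _ = counts.most_common(1)[0]
        if predicted == majority:
            mwa += 1
    return {"UWA": uwa / n, "WWA": wwa / n, "MWA": mwa / n}

# Hypothetical judgments from three transliterators for two words.
judgments = {
    "prague": ["prag", "prag", "praha"],
    "paris": ["paris", "pari", "pari"],
}
system_output = {"prague": "prag", "paris": "paris"}
scores = word_accuracies(system_output, judgments)
```

Here the system matches some human variant for both words, so UWA is 1.0, while MWA is 0.5 because only "prag" is a majority variant.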
Language Origin and the Ranking of the
Systems
[Figure: Corpora with different language origins. Word accuracy (%) per corpus (E7, D7, A7, EDA7); series: UWA (SYS-B), UWA (SYS-A), MWA (SYS-B), MWA (SYS-A).]
[Figure: Randomly selected EDA sub-corpora. Word accuracy (%) per corpus; series: UWA (SYS-B), UWA (SYS-A), MWA (SYS-B), MWA (SYS-A).]
The ranking of the systems remains constant, but the accuracy values do not.
Accuracy and Single Transliterators
[Figure, System A: word accuracy (%) per corpus (E7, D7, A7, EDA7); series: T1 to T7; annotated values: 17.2 and 39.0.]
[Figure, System B: word accuracy (%) per corpus (E7, D7, A7, EDA7); series: T1 to T7; annotated values: 23.2 and 56.2.]
Evaluation can be heavily biased by the judgments of a single transliterator.
Accuracy and Number of Transliterators
Transliteration using a combination of transliterators
(EDA corpus)
A corpus for training and testing a transliteration
system should be built using more than one transliterator.
Human Agreement
How far do humans themselves agree on
transliteration?
Raw agreement adapted to calculate human agreement:
\[ \mathrm{PA} = \frac{\text{total number of actual agreements}}{\text{total number of possible agreements}} \]
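A minimal Python sketch of this ratio, under an assumption the slide does not state: agreements are counted over pairs of transliterators, so a word judged by n people has n(n-1)/2 possible agreements, and each pair that wrote the same transliteration is one actual agreement. The function name and the example data are hypothetical.

```python
from collections import Counter
from math import comb

def raw_agreement(judgments):
    """Proportion of transliterator pairs that agree, pooled over all words.

    judgments: dict mapping a source word to the list of transliterations
               given by the human transliterators.
    """
    actual = 0
    possible = 0
    for variants in judgments.values():
        # Every pair of transliterators could have agreed on this word.
        possible += comb(len(variants), 2)
        # A variant chosen by c people contributes c-choose-2 agreeing pairs.
        actual += sum(comb(c, 2) for c in Counter(variants).values())
    return actual / possible if possible else 0.0

# Hypothetical judgments from three transliterators for two words.
example = {"w1": ["a", "a", "b"], "w2": ["c", "c", "c"]}
pa = raw_agreement(example)
```

For the example, one of three pairs agrees on the first word and all three pairs agree on the second, so PA is 4/6.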
Inter-Transliterator Agreement and Perceived
Difficulty
Transliterators’ perception of the task
(H: hard, M: medium, E: easy)
Transliterator  English  Dutch  Arabic
1               H        H      M
2               M        M      E
3               M        H      M
4               M        M      E
5               M        H      E
6               M        H      E
7               M        H      M
Measured agreement:
English: 33.6%
Dutch: 15.5%
Arabic: 33.3%
Transliterators’ agreement is directly related to their knowledge
of the language from which the source words come.
Transliterator Consistency
A transliterator’s transliteration habits define the rules
by which words are transformed.
Rules: C → ( , 0.6)
C → (ô, 0.3)
C → ( , 0.1)
[Figure: entropy per corpus (E7, D7, A7, EDA7); series: T1 to T7.]
[Figure: word accuracy (%) per corpus (E7, D7, A7, EDA7); series: T1 to T7.]
The consistency with which transliterators employ their own
rules has a direct effect on the system’s accuracy.
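The entropy of a rule set such as the one for C can be sketched in Python as below. This assumes plain Shannon entropy with a base-2 logarithm and an unweighted average over source segments; the slides do not state the base or any normalization, so the absolute values need not match the plotted scale. The target labels t1 to t3 are placeholders for the Persian characters, and the function names are hypothetical.

```python
import math

def rule_entropy(probabilities, base=2.0):
    """Shannon entropy of the rule probabilities for one source segment."""
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)

def mean_rule_entropy(rules, base=2.0):
    """Unweighted mean entropy over all source segments (an assumption)."""
    total = sum(rule_entropy(dist.values(), base) for dist in rules.values())
    return total / len(rules)

# The rule probabilities from the slide; targets are placeholder labels.
rules = {"C": {"t1": 0.6, "t2": 0.3, "t3": 0.1}}
h = mean_rule_entropy(rules)
```

A transliterator who always maps C to the same target has entropy 0; the more evenly the probability is spread over targets, the higher the entropy and the lower the consistency.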
Conclusions
Main findings of our experiments:
1. Although different transliteration systems may have
different accuracy levels on different corpora, their
ranking holds across these corpora.
2. One transliteration system can achieve different
accuracy with corpora constructed by different
transliterators. The variation can be up to 30% in
terms of word accuracy.
3. The origin of source words has a direct effect on
system performance. Words of English origin are
generally transliterated more accurately than words of
Arabic or Dutch origin.
Suggestions
◮ When building a collection for transliteration, we
should construct it with the assistance of multiple
transliterators (at least 4), or make sure that the
transliterations come from different sources that are
more likely to reflect different people’s knowledge.
◮ When we report our results, we should report:
1. The origin of the source words.
2. The number of transliterators who constructed the corpus,
or the exact process of corpus construction.
Corpus specifications are as important as
algorithms, and must be stated clearly in our
experiments.
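One way to keep these details attached to reported results is a small corpus record, sketched here in Python. The class, its field names and the threshold default are hypothetical; the example values are one reading of the EDA7 corpus described earlier (English, Dutch and Arabic source words, seven transliterators).

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class CorpusSpec:
    """Corpus details to report alongside accuracy figures (hypothetical)."""
    name: str
    source_word_origins: tuple
    num_transliterators: int
    construction_process: str

    def meets_suggested_minimum(self, minimum=4):
        # The default follows the number suggested on the slide.
        return self.num_transliterators >= minimum

eda7 = CorpusSpec(
    name="EDA7",
    source_word_origins=("English", "Dutch", "Arabic"),
    num_transliterators=7,
    construction_process="Persian native speakers; word origin not disclosed",
)
report = asdict(eda7)  # plain dict, ready to print next to the results
```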
Thank You!
