HTET MYET LYNN
20147728
Department of Computer Engineering
Chosun University
Intelligent Computing Laboratory
First-Character Filtering Method in Syllable Segmentation using Data Dictionary for Myanmar Language
Contents
• Nature of Myanmar Words
• First-Character Filtering (FCF) Method
• Implementation & Result
• Future Work
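Nature of Myanmar words (1/3)
• Writing system similar to other South-East Asian languages
• No delimiter (whitespace) between words, only between phrases
• No standard rules for whitespace
• 33 consonants, 10 digits
Nature of Myanmar words (2/3)
Consonant
Vowel
[Figure: a syllable meaning "good" written in five steps (1 to 5), starting from the consonant and adding vowels; Korean phonetic glosses for the steps: ㄲ, 께이, 꺼, 까웅, 까웅이]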
Nature of Myanmar words (3/3)
Writing Methods
• Kinzi
• Stacked Consonants
[Figure: example words glossed as "English" and "University"]
First-Character Filtering (FCF) Method (1/8)
Input Sentence → Get First Character → Syllable Collections → Output Syllable
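The flow on this slide can be sketched end to end as follows. This is a minimal illustration in Python rather than the author's implementation: the names `build_collections` and `segment` are hypothetical, the in-memory dictionary stands in for the Syllable Collections database, and the single-character fallback for unknown text is an assumption, since the slides do not describe that case.

```python
from collections import defaultdict


def build_collections(syllables):
    """Group known syllables by first character (stand-in for the per-consonant tables)."""
    collections = defaultdict(set)
    for syllable in syllables:
        collections[syllable[0]].add(syllable)
    return collections


def segment(sentence, collections):
    """Repeatedly take the longest known syllable at the front of the sentence."""
    output = []
    while sentence:
        first = sentence[0]                       # Get First Character
        known = collections.get(first, ())        # Syllable Collections
        matches = [s for s in known if sentence.startswith(s)]
        # Assumption: fall back to one character when nothing matches;
        # the slides do not say how unknown text is handled.
        syllable = max(matches, key=len) if matches else first
        output.append(syllable)                   # Output Syllable
        sentence = sentence[len(syllable):]
    return output


# Hypothetical sample entries, not the author's data.
collections = build_collections(["ကောင်း", "ကော", "သည်", "သား"])
print(segment("ကောင်း" + "သည်", collections))
```

Python slices strings by code point, so the lengths in this sketch may not match the length values shown on the later slides.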
First-Character Filtering (FCF) Method (2/8)
Syllable Collections
First-Character Filtering (FCF) Method (3/8)
Syllable Collections
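The speaker notes describe Syllable Collections as one database table per consonant, each holding the syllables that begin with that consonant. A rough sketch of that layout follows, under loud assumptions: SQLite (Python's built-in `sqlite3`) stands in for the MySQL database mentioned in the notes, and the table naming scheme and the helpers `table_name`, `build_syllable_db`, and `syllables_for` are hypothetical.

```python
import sqlite3


def table_name(consonant):
    # Hypothetical naming scheme: one table per consonant, named by its code point.
    return f"syl_{ord(consonant):04x}"


def build_syllable_db(syllables):
    """Create one table per first character and store each syllable in its table."""
    conn = sqlite3.connect(":memory:")
    for syllable in syllables:
        name = table_name(syllable[0])
        conn.execute(f"CREATE TABLE IF NOT EXISTS {name} (syllable TEXT PRIMARY KEY)")
        conn.execute(f"INSERT OR IGNORE INTO {name} VALUES (?)", (syllable,))
    conn.commit()
    return conn


def syllables_for(conn, consonant):
    """Return every stored syllable that starts with the given consonant."""
    name = table_name(consonant)
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    if not exists:
        return []
    return [row[0] for row in conn.execute(f"SELECT syllable FROM {name}")]


# Hypothetical sample entries, not the author's data.
conn = build_syllable_db(["က", "ကော", "ကောင်း", "သည်"])
print(syllables_for(conn, "က"))
```

Splitting by first character keeps each lookup small: only the syllables that share the sentence's first consonant need to be compared.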
First-Character Filtering (FCF) Method (4/8)
Sentence Pre-processing
• Whitespace
• Punctuation Marks
• Number of Input Sentences
• Length of Each Sentence
Example: "The student goes to school."
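The pre-processing step might look like the sketch below. It assumes sentences end with the Myanmar section sign (U+104B) and that the punctuation to strip is that sign, the little section sign (U+104A), and a few ASCII marks; the function name `preprocess` and the sample sentence are illustrative, since the slide's original Myanmar text is not preserved here.

```python
import re

SENTENCE_END = "\u104b"        # Myanmar sign section, assumed sentence terminator
PUNCTUATION = "\u104a\u104b"   # Myanmar little section and section signs


def preprocess(text):
    """Split text into sentences, strip whitespace and punctuation, record lengths."""
    cleaned = []
    for raw in text.split(SENTENCE_END):
        sentence = re.sub(rf"[\s{PUNCTUATION}.,!?]", "", raw)
        if sentence:
            cleaned.append(sentence)
    lengths = [len(s) for s in cleaned]   # lengths in code points
    return cleaned, len(cleaned), lengths


# Assumed Myanmar rendering of "The student goes to school."
sentences, count, lengths = preprocess("ကျောင်းသားသည် ကျောင်းသို့ သွားသည်။")
print(count, lengths)
```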
First-Character Filtering (FCF) Method (5/8)
Get First Character of the sentence
Input_txt_length = 160
First-Character Filtering (FCF) Method (6/8)
Extract Candidates from Syllable Collections
First-Character Filtering (FCF) Method (7/8)
Extract Candidates from Syllable Collections
Input_txt_length = 160
Length_of_syl = 1, 4, 8, 12, ...
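// Cut a candidate of the syllable's length from the start of the input
$candidate = substr($input_txt, 0, $length_of_syl);
// Store the candidate only if it equals the dictionary syllable
if ($candidate === $syllable) {
    Store_Candidate($candidate);
}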
Candidates.txt
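The candidate extraction on this slide can be written as a small Python function. This is a sketch under assumptions: `extract_candidates` is a hypothetical name, the candidates are collected in a list instead of Candidates.txt, and the sample table entries are illustrative rather than the author's data.

```python
def extract_candidates(input_txt, syllables):
    """Return the table syllables that match the start of input_txt."""
    candidates = []
    for syllable in syllables:
        length_of_syl = len(syllable)
        candidate = input_txt[:length_of_syl]   # substr(Input_txt, 0, length_of_syl)
        if candidate == syllable:
            candidates.append(candidate)        # Store_Candidate
    return candidates


# Hypothetical table for the first consonant of the input.
table = ["က", "ကော", "ကောင်း", "ကန်"]
print(extract_candidates("ကောင်း" + "သည်", table))
```

A syllable longer than the remaining input can never match, because the slice is then shorter than the syllable.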
First-Character Filtering (FCF) Method (8/8)
Store Final Syllable
Input_txt_length = 160
Final_syllable_length = 20 (final syllable stored in Results.txt)
Input_txt_length = 160 - 20 = 140
destroy candidates.txt
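The final step, as described in the speaker notes, picks the longest candidate, stores it, and trims it from the front of the input. The sketch below reproduces the slide's arithmetic (160 - 20 = 140) with ASCII placeholder text; `store_final_syllable` is a hypothetical name, and lists replace the Results.txt and candidates.txt files.

```python
def store_final_syllable(input_txt, candidates, results):
    """Keep the longest candidate and return the input with that syllable removed."""
    # Raises ValueError if candidates is empty; the slides do not cover that case.
    final_syllable = max(candidates, key=len)
    results.append(final_syllable)                # stands in for Results.txt
    remaining = input_txt[len(final_syllable):]   # cut the syllable off the front
    candidates.clear()                            # mirrors "destroy candidates.txt"
    return remaining


# Placeholder text with the slide's lengths: 160 in total, final syllable of 20.
input_txt = "A" * 20 + "B" * 140
candidates = ["A", "A" * 4, "A" * 20]
results = []
input_txt = store_final_syllable(input_txt, candidates, results)
print(len(input_txt), len(results[0]))
```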
Implementation & Result
Implementation & Result
Future Work
Algorithm for:
• Loan Words
• Kinzi syllables
• Stacked Consonants syllables
• Word Segmentation
[Figure: example words glossed as "English", "University", and "Bus"]
Editor's Notes

  1. Hello. I am Htet Myet Lynn, class of 2014. Today I will present my research on syllable segmentation for the Myanmar language. The title of the presentation is "First-Character Filtering Method in Syllable Segmentation using Data Dictionary for Myanmar Language", and the paper is in preparation.
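  2. As you can see in the contents, I will first go over the Myanmar writing system and the characteristics of Myanmar words. Next, I will explain the First-Character Filtering (FCF) method that I am working on and report the implementation and results. Finally, I will cover future work.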
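  3. Let me explain the nature of Myanmar words. The Myanmar writing system is similar to those of other Southeast Asian languages. In English and Korean sentences, whitespace separates words, so word segmentation is easy. In Myanmar sentences, whitespace appears between phrases rather than between words, so word segmentation is difficult. In addition, Myanmar has no standard rules for whitespace, so word segmentation remains difficult even where phrases are separated by whitespace. As shown, Myanmar has 33 consonants and 10 digits.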
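  4. As for the nature of Myanmar words, a syllable is formed by combining a consonant with vowels, as you can see. Consonants are shown in black and vowels in blue. When writing Myanmar, the consonant comes first and the vowels are then attached to form one syllable. For example, the syllable at the bottom means "good". To write it, you combine the consonant and vowels step by step, as shown on the left. This is the structure of an ordinary Myanmar syllable, that is, the writing rule for Myanmar syllables.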
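  5. There are also two other writing methods, called Kinzi and stacked consonants. In Kinzi, for certain words, the vowel of the preceding consonant is placed on top of the following consonant. In the stacked-consonant method, the following consonant is written underneath the vowel of the preceding consonant. However, Kinzi and stacked consonants are used only for particular words. So far we have looked at the nature of Myanmar words and the syllable writing rules. As I mentioned earlier, Myanmar sentences have no whitespace between words, so words cannot be segmented directly, and syllable segmentation has to be performed first. Next I will explain the algorithm I developed for syllable segmentation.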
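  6. As you can see, the first figure shows the algorithm in detail and the second shows it in condensed form. First, the first consonant of the input sentence is extracted, and the syllables for that consonant are looked up in the syllable database. The extracted syllable is then stored and output. Before explaining the algorithm in detail, I will first describe the syllable database used by this method, called Syllable Collections.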
  7. To build the syllable database, I first use MySQL to create one table for each of the 33 Myanmar consonants.
  8. Then the syllables for each consonant are stored in that consonant's table. This builds the Syllable Collections data. Next, I will explain the algorithm.
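  9. First, to pre-process the input, whitespace and punctuation marks are extracted and the number of input sentences is counted. The length of each sentence is then stored. As an example, let us segment a sentence that means "The student goes to school." First, as you can see, all whitespace is removed to produce a new sentence with no whitespace or punctuation.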
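  10. As I mentioned earlier, a Myanmar word is written consonant first, so the first consonant is extracted from the sentence. Once the first consonant has been found, the method is ready to look up syllables in the syllable database.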
  11. The syllables in the table of the extracted consonant are retrieved in preparation for finding candidate syllables.
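  12. The length of each syllable in the table is extracted first, and for each syllable length a candidate syllable is cut from the very beginning of the input sentence. Each candidate is then compared with the syllable in the table. If they are identical, the candidate is saved to a text file called candidates; otherwise it is skipped.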
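  13. Next, the longest syllable in the candidates text file is selected and saved to a text file called result as the final syllable. After the final syllable has been saved, the candidates file is deleted. The length of the final syllable is then computed, and that many characters are removed from the beginning of the input sentence. This yields a new input sentence, and the method is ready to find the next syllable. That concludes the First-Character Filtering algorithm. Next, I will present the implementation and results.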
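  14. We ran syllable segmentation on the sentence we entered earlier. As you can see, the final syllables are extracted from the sentence and saved to the result file, and the result is displayed.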
  15. This is the result of an experiment on one or more sentences taken from a Myanmar news website.
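  16. Finally, I will describe the future work I plan to pursue. So far, syllable segmentation works without errors when applied to text in the ordinary writing method; I am currently studying loan words, Kinzi, and stacked consonants. I also plan to study how to segment syllables according to syllable rules even when the syllable database has no entry for a syllable. Thank you.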
  17. If you have any questions, please ask.