SlideShare a Scribd company logo

Spell checker using Natural language processing

NLP techniques used for Spell checking to recommend find error in the written word and also suggest a relevant word. Algorithm: Jaccard Coefficient, The Levenshtein Distance

1 of 14
Download to read offline
Presentation
On
Spell Checking Techniques In NLP
Presented By: Sandeep Wakchaure
Outline
 Abstract
Introduction
Error Category
 Error Detection Technique
 Algorithms
Project Screenshot
Conclusion
Abstract
In this project, I am proposing a simple, flexible, and efficient spell checker
editor application based upon edit distance score, Supervised learning. I am
integrating Levenshtein distance (LD), Jaccard coefficient Algorithms to
achieve the require target, these all algorithms are measure the similarity
between two strings, which we will refer to as the source string (s) and the target
string (t). My approach is to design text editor based upon NLP having
auto spell checker which suggest the user mistake. I am using a novel scoring
scheme to integrate the retrieved words from each spelling approach and
calculate an overall score for each matched word. From the overall scores, we
can rank the possible matches. This algorithms required a training data set
which nothing but a data as dictionary, while writing a content in editor, the
backend proceed will happen like tokenization, distance calculation and
further filtering the results using mentioned algorithms and finally suggest
appropriate results.
 Spell Check is a process of detecting and sometimes providing
suggestions for incorrectly spelled words in a text.
 In computing, Spell Checker is an application program that flags words
in a document that may not be spelled correctly.
 Spell Checker may be stand- alone capable of operating on a block a
text such as word-processor, electronic dictionary.
A basic spell checker carries out the following processes:
 It scans the text and extracts the words contained in it.
 It then compares each word with a known list of correctly spelled words (i.e.
a dictionary).
 An additional step is a language-dependent algorithm for handling
morphology.
Introduction:
Spelling errors can be divided into two categories:
Real-word errors
Non-word errors.
Real-word errors : are those error words that are acceptable words in
the dictionary.
Non-word errors : are those error words that cannot be found in the
dictionary.
This words are complex to provide the suggestion, so this might not be
suggested.
2. ERROR DETECTION TECHNIQUES
A. Dictionary Lookup Technique :
- In this, Dictionary lookup technique is used which checks every word of input
text for its presence in dictionary.
If that word present in the dictionary, then it is a correct
word.
Otherwise it is put into the list of error words.
Ad

Recommended

More Related Content

What's hot

Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)Kuppusamy P
 
Syntax Analysis in Compiler Design
Syntax Analysis in Compiler Design Syntax Analysis in Compiler Design
Syntax Analysis in Compiler Design MAHASREEM
 
Knowledge representation In Artificial Intelligence
Knowledge representation In Artificial IntelligenceKnowledge representation In Artificial Intelligence
Knowledge representation In Artificial IntelligenceRamla Sheikh
 
Finite Automata in compiler design
Finite Automata in compiler designFinite Automata in compiler design
Finite Automata in compiler designRiazul Islam
 
States, state graphs and transition testing
States, state graphs and transition testingStates, state graphs and transition testing
States, state graphs and transition testingABHISHEK KUMAR
 
Unit1 principle of programming language
Unit1 principle of programming languageUnit1 principle of programming language
Unit1 principle of programming languageVasavi College of Engg
 
Type checking
Type checkingType checking
Type checkingrawan_z
 
Knowledge Representation & Reasoning
Knowledge Representation & ReasoningKnowledge Representation & Reasoning
Knowledge Representation & ReasoningSajid Marwat
 
Issues in knowledge representation
Issues in knowledge representationIssues in knowledge representation
Issues in knowledge representationSravanthi Emani
 
compiler ppt on symbol table
 compiler ppt on symbol table compiler ppt on symbol table
compiler ppt on symbol tablenadarmispapaulraj
 
Knowledge representation and Predicate logic
Knowledge representation and Predicate logicKnowledge representation and Predicate logic
Knowledge representation and Predicate logicAmey Kerkar
 

What's hot (20)

Phases of Compiler
Phases of CompilerPhases of Compiler
Phases of Compiler
 
Natural language processing (nlp)
Natural language processing (nlp)Natural language processing (nlp)
Natural language processing (nlp)
 
Role-of-lexical-analysis
Role-of-lexical-analysisRole-of-lexical-analysis
Role-of-lexical-analysis
 
Syntax Analysis in Compiler Design
Syntax Analysis in Compiler Design Syntax Analysis in Compiler Design
Syntax Analysis in Compiler Design
 
Knowledge representation In Artificial Intelligence
Knowledge representation In Artificial IntelligenceKnowledge representation In Artificial Intelligence
Knowledge representation In Artificial Intelligence
 
First pass of assembler
First pass of assemblerFirst pass of assembler
First pass of assembler
 
Chapter 5 Syntax Directed Translation
Chapter 5   Syntax Directed TranslationChapter 5   Syntax Directed Translation
Chapter 5 Syntax Directed Translation
 
Lexical analysis - Compiler Design
Lexical analysis - Compiler DesignLexical analysis - Compiler Design
Lexical analysis - Compiler Design
 
Recognition-of-tokens
Recognition-of-tokensRecognition-of-tokens
Recognition-of-tokens
 
Finite Automata in compiler design
Finite Automata in compiler designFinite Automata in compiler design
Finite Automata in compiler design
 
Input-Buffering
Input-BufferingInput-Buffering
Input-Buffering
 
States, state graphs and transition testing
States, state graphs and transition testingStates, state graphs and transition testing
States, state graphs and transition testing
 
Types of Compilers
Types of CompilersTypes of Compilers
Types of Compilers
 
Unit1 principle of programming language
Unit1 principle of programming languageUnit1 principle of programming language
Unit1 principle of programming language
 
Hidden markov model ppt
Hidden markov model pptHidden markov model ppt
Hidden markov model ppt
 
Type checking
Type checkingType checking
Type checking
 
Knowledge Representation & Reasoning
Knowledge Representation & ReasoningKnowledge Representation & Reasoning
Knowledge Representation & Reasoning
 
Issues in knowledge representation
Issues in knowledge representationIssues in knowledge representation
Issues in knowledge representation
 
compiler ppt on symbol table
 compiler ppt on symbol table compiler ppt on symbol table
compiler ppt on symbol table
 
Knowledge representation and Predicate logic
Knowledge representation and Predicate logicKnowledge representation and Predicate logic
Knowledge representation and Predicate logic
 

Similar to Spell checker using Natural language processing

EasyChair-Preprint-7375.pdf
EasyChair-Preprint-7375.pdfEasyChair-Preprint-7375.pdf
EasyChair-Preprint-7375.pdfNohaGhoweil
 
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSA COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSgerogepatton
 
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSA COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSgerogepatton
 
Bangla spell checker & suggestion generator
Bangla spell checker & suggestion generatorBangla spell checker & suggestion generator
Bangla spell checker & suggestion generatorMdAlAmin187
 
Sentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic RelationsSentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic RelationsEditor IJCATR
 
An Approach To Automatic Text Summarization Using Simplified Lesk Algorithm A...
An Approach To Automatic Text Summarization Using Simplified Lesk Algorithm A...An Approach To Automatic Text Summarization Using Simplified Lesk Algorithm A...
An Approach To Automatic Text Summarization Using Simplified Lesk Algorithm A...ijctcm
 
role of lexical anaysis
role of lexical anaysisrole of lexical anaysis
role of lexical anaysisSudhaa Ravi
 
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING A SURVEY ON SIMILARITY MEASURES IN TEXT MINING
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING mlaij
 
Machine Learning Techniques with Ontology for Subjective Answer Evaluation
Machine Learning Techniques with Ontology for Subjective Answer EvaluationMachine Learning Techniques with Ontology for Subjective Answer Evaluation
Machine Learning Techniques with Ontology for Subjective Answer Evaluationijnlc
 
A Film Synopsis Genre Classifier Based On
A Film Synopsis Genre Classifier Based OnA Film Synopsis Genre Classifier Based On
A Film Synopsis Genre Classifier Based OnAndrew Molina
 
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEA FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEkevig
 
Cohesive Software Design
Cohesive Software DesignCohesive Software Design
Cohesive Software Designijtsrd
 
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEA FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEijnlc
 
Automated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic AnalysisAutomated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic AnalysisGina Rizzo
 
Understanding Natural Languange with Corpora-based Generation of Dependency G...
Understanding Natural Languange with Corpora-based Generation of Dependency G...Understanding Natural Languange with Corpora-based Generation of Dependency G...
Understanding Natural Languange with Corpora-based Generation of Dependency G...Edmond Lepedus
 
An Application of Pattern matching for Motif Identification
An Application of Pattern matching for Motif IdentificationAn Application of Pattern matching for Motif Identification
An Application of Pattern matching for Motif IdentificationCSCJournals
 
COLING 2012 - LEPOR: A Robust Evaluation Metric for Machine Translation with ...
COLING 2012 - LEPOR: A Robust Evaluation Metric for Machine Translation with ...COLING 2012 - LEPOR: A Robust Evaluation Metric for Machine Translation with ...
COLING 2012 - LEPOR: A Robust Evaluation Metric for Machine Translation with ...Lifeng (Aaron) Han
 
A survey of Stemming Algorithms for Information Retrieval
A survey of Stemming Algorithms for Information RetrievalA survey of Stemming Algorithms for Information Retrieval
A survey of Stemming Algorithms for Information Retrievaliosrjce
 

Similar to Spell checker using Natural language processing (20)

EasyChair-Preprint-7375.pdf
EasyChair-Preprint-7375.pdfEasyChair-Preprint-7375.pdf
EasyChair-Preprint-7375.pdf
 
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSA COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
 
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMSA COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
A COMPARISON OF DOCUMENT SIMILARITY ALGORITHMS
 
Bangla spell checker & suggestion generator
Bangla spell checker & suggestion generatorBangla spell checker & suggestion generator
Bangla spell checker & suggestion generator
 
Sentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic RelationsSentence Validation by Statistical Language Modeling and Semantic Relations
Sentence Validation by Statistical Language Modeling and Semantic Relations
 
An Approach To Automatic Text Summarization Using Simplified Lesk Algorithm A...
An Approach To Automatic Text Summarization Using Simplified Lesk Algorithm A...An Approach To Automatic Text Summarization Using Simplified Lesk Algorithm A...
An Approach To Automatic Text Summarization Using Simplified Lesk Algorithm A...
 
role of lexical anaysis
role of lexical anaysisrole of lexical anaysis
role of lexical anaysis
 
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING A SURVEY ON SIMILARITY MEASURES IN TEXT MINING
A SURVEY ON SIMILARITY MEASURES IN TEXT MINING
 
Machine Learning Techniques with Ontology for Subjective Answer Evaluation
Machine Learning Techniques with Ontology for Subjective Answer EvaluationMachine Learning Techniques with Ontology for Subjective Answer Evaluation
Machine Learning Techniques with Ontology for Subjective Answer Evaluation
 
A Film Synopsis Genre Classifier Based On
A Film Synopsis Genre Classifier Based OnA Film Synopsis Genre Classifier Based On
A Film Synopsis Genre Classifier Based On
 
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEA FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
 
Cohesive Software Design
Cohesive Software DesignCohesive Software Design
Cohesive Software Design
 
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTEA FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
A FILM SYNOPSIS GENRE CLASSIFIER BASED ON MAJORITY VOTE
 
Automated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic AnalysisAutomated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic Analysis
 
Understanding Natural Languange with Corpora-based Generation of Dependency G...
Understanding Natural Languange with Corpora-based Generation of Dependency G...Understanding Natural Languange with Corpora-based Generation of Dependency G...
Understanding Natural Languange with Corpora-based Generation of Dependency G...
 
An Application of Pattern matching for Motif Identification
An Application of Pattern matching for Motif IdentificationAn Application of Pattern matching for Motif Identification
An Application of Pattern matching for Motif Identification
 
Searching.pptx
Searching.pptxSearching.pptx
Searching.pptx
 
COLING 2012 - LEPOR: A Robust Evaluation Metric for Machine Translation with ...
COLING 2012 - LEPOR: A Robust Evaluation Metric for Machine Translation with ...COLING 2012 - LEPOR: A Robust Evaluation Metric for Machine Translation with ...
COLING 2012 - LEPOR: A Robust Evaluation Metric for Machine Translation with ...
 
K017367680
K017367680K017367680
K017367680
 
A survey of Stemming Algorithms for Information Retrieval
A survey of Stemming Algorithms for Information RetrievalA survey of Stemming Algorithms for Information Retrieval
A survey of Stemming Algorithms for Information Retrieval
 

Recently uploaded

killingcamp longest common subsequence.pdf
killingcamp longest common subsequence.pdfkillingcamp longest common subsequence.pdf
killingcamp longest common subsequence.pdfssuser82c38d
 
killingcamp 광고삽입문제 풀이, killingcamp 광고삽입문제 풀이
killingcamp 광고삽입문제 풀이, killingcamp 광고삽입문제 풀이killingcamp 광고삽입문제 풀이, killingcamp 광고삽입문제 풀이
killingcamp 광고삽입문제 풀이, killingcamp 광고삽입문제 풀이ssuser82c38d
 
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A..."Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...ISPMAIndia
 
SPM 2024 – Overview of and benefits of AI in Product Management
SPM 2024 – Overview of and benefits of AI in Product ManagementSPM 2024 – Overview of and benefits of AI in Product Management
SPM 2024 – Overview of and benefits of AI in Product ManagementISPMAIndia
 
Automation for Bonterra Impact Management (fka Apricot)
Automation for Bonterra Impact Management (fka Apricot)Automation for Bonterra Impact Management (fka Apricot)
Automation for Bonterra Impact Management (fka Apricot)Jeffrey Haguewood
 
Role of DevOps in SaaS product Development.pdf.pptx
Role of DevOps in SaaS product Development.pdf.pptxRole of DevOps in SaaS product Development.pdf.pptx
Role of DevOps in SaaS product Development.pdf.pptxMindInventory
 
Essence of Requirements Engineering: Pragmatic Insights for 2024
Essence of Requirements Engineering: Pragmatic Insights for 2024Essence of Requirements Engineering: Pragmatic Insights for 2024
Essence of Requirements Engineering: Pragmatic Insights for 2024Asher Sterkin
 
Joseph Yoder : Being Agile about Architecture
Joseph Yoder : Being Agile about ArchitectureJoseph Yoder : Being Agile about Architecture
Joseph Yoder : Being Agile about ArchitectureHironori Washizaki
 
killing camp week 6 problem - maximal matrix.pdf
killing camp week 6 problem - maximal matrix.pdfkilling camp week 6 problem - maximal matrix.pdf
killing camp week 6 problem - maximal matrix.pdfssuser82c38d
 
Implementing Docker Containers with Windows Server 2019
Implementing Docker Containers with Windows Server 2019Implementing Docker Containers with Windows Server 2019
Implementing Docker Containers with Windows Server 2019VICTOR MAESTRE RAMIREZ
 
Agile & Scrum, Certified Scrum Master! Crash Course
Agile & Scrum,  Certified Scrum Master! Crash CourseAgile & Scrum,  Certified Scrum Master! Crash Course
Agile & Scrum, Certified Scrum Master! Crash CourseRohan Chandane
 
No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!Anthony Dahanne
 
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ..."Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...ISPMAIndia
 
OpenChain AI Study Group - North America and Europe - 2024-02-20
OpenChain AI Study Group - North America and Europe - 2024-02-20OpenChain AI Study Group - North America and Europe - 2024-02-20
OpenChain AI Study Group - North America and Europe - 2024-02-20Shane Coughlan
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkTimothy Spann
 
The Age of AI: Elevating Experiences & Delivering Customer Value!
The Age of AI: Elevating Experiences & Delivering Customer Value!The Age of AI: Elevating Experiences & Delivering Customer Value!
The Age of AI: Elevating Experiences & Delivering Customer Value!ISPMAIndia
 
Machine Learning Basics for Dummies (no math!)
Machine Learning Basics for Dummies (no math!)Machine Learning Basics for Dummies (no math!)
Machine Learning Basics for Dummies (no math!)Dmitry Zinoviev
 
P1 Inspection Types in Municity 5 Smartsheet
P1 Inspection Types in Municity 5 SmartsheetP1 Inspection Types in Municity 5 Smartsheet
P1 Inspection Types in Municity 5 SmartsheetMatthewTHawley
 
Welcome to AltTask - the nexus where innovation converges with empowerment!
Welcome to AltTask - the nexus where innovation converges with empowerment!Welcome to AltTask - the nexus where innovation converges with empowerment!
Welcome to AltTask - the nexus where innovation converges with empowerment!alttaskcom
 

Recently uploaded (20)

killingcamp longest common subsequence.pdf
killingcamp longest common subsequence.pdfkillingcamp longest common subsequence.pdf
killingcamp longest common subsequence.pdf
 
killingcamp 광고삽입문제 풀이, killingcamp 광고삽입문제 풀이
killingcamp 광고삽입문제 풀이, killingcamp 광고삽입문제 풀이killingcamp 광고삽입문제 풀이, killingcamp 광고삽입문제 풀이
killingcamp 광고삽입문제 풀이, killingcamp 광고삽입문제 풀이
 
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A..."Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...
"Discovery and Delivery through Product IntelliGenAI framework" by Ramkumar A...
 
SPM 2024 – Overview of and benefits of AI in Product Management
SPM 2024 – Overview of and benefits of AI in Product ManagementSPM 2024 – Overview of and benefits of AI in Product Management
SPM 2024 – Overview of and benefits of AI in Product Management
 
Automation for Bonterra Impact Management (fka Apricot)
Automation for Bonterra Impact Management (fka Apricot)Automation for Bonterra Impact Management (fka Apricot)
Automation for Bonterra Impact Management (fka Apricot)
 
Role of DevOps in SaaS product Development.pdf.pptx
Role of DevOps in SaaS product Development.pdf.pptxRole of DevOps in SaaS product Development.pdf.pptx
Role of DevOps in SaaS product Development.pdf.pptx
 
Essence of Requirements Engineering: Pragmatic Insights for 2024
Essence of Requirements Engineering: Pragmatic Insights for 2024Essence of Requirements Engineering: Pragmatic Insights for 2024
Essence of Requirements Engineering: Pragmatic Insights for 2024
 
eLearning Content Development Company Code and Pixels.pdf
eLearning Content Development Company Code and Pixels.pdfeLearning Content Development Company Code and Pixels.pdf
eLearning Content Development Company Code and Pixels.pdf
 
Joseph Yoder : Being Agile about Architecture
Joseph Yoder : Being Agile about ArchitectureJoseph Yoder : Being Agile about Architecture
Joseph Yoder : Being Agile about Architecture
 
killing camp week 6 problem - maximal matrix.pdf
killing camp week 6 problem - maximal matrix.pdfkilling camp week 6 problem - maximal matrix.pdf
killing camp week 6 problem - maximal matrix.pdf
 
Implementing Docker Containers with Windows Server 2019
Implementing Docker Containers with Windows Server 2019Implementing Docker Containers with Windows Server 2019
Implementing Docker Containers with Windows Server 2019
 
Agile & Scrum, Certified Scrum Master! Crash Course
Agile & Scrum,  Certified Scrum Master! Crash CourseAgile & Scrum,  Certified Scrum Master! Crash Course
Agile & Scrum, Certified Scrum Master! Crash Course
 
No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!No more Dockerfiles? Buildpacks to help you ship your image!
No more Dockerfiles? Buildpacks to help you ship your image!
 
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ..."Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...
"Taking an idea to a Product in Health diagnostics" by Dr. Geetha Manjunath, ...
 
OpenChain AI Study Group - North America and Europe - 2024-02-20
OpenChain AI Study Group - North America and Europe - 2024-02-20OpenChain AI Study Group - North America and Europe - 2024-02-20
OpenChain AI Study Group - North America and Europe - 2024-02-20
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
 
The Age of AI: Elevating Experiences & Delivering Customer Value!
The Age of AI: Elevating Experiences & Delivering Customer Value!The Age of AI: Elevating Experiences & Delivering Customer Value!
The Age of AI: Elevating Experiences & Delivering Customer Value!
 
Machine Learning Basics for Dummies (no math!)
Machine Learning Basics for Dummies (no math!)Machine Learning Basics for Dummies (no math!)
Machine Learning Basics for Dummies (no math!)
 
P1 Inspection Types in Municity 5 Smartsheet
P1 Inspection Types in Municity 5 SmartsheetP1 Inspection Types in Municity 5 Smartsheet
P1 Inspection Types in Municity 5 Smartsheet
 
Welcome to AltTask - the nexus where innovation converges with empowerment!
Welcome to AltTask - the nexus where innovation converges with empowerment!Welcome to AltTask - the nexus where innovation converges with empowerment!
Welcome to AltTask - the nexus where innovation converges with empowerment!
 

Spell checker using Natural language processing

  • 1. Presentation On Spell Checking Techniques In NLP Presented By: Sandeep Wakchaure
  • 2. Outline  Abstract Introduction Error Category  Error Detection Technique  Algorithms Project Screenshot Conclusion
  • 3. Abstract In this project, I am proposing a simple, flexible, and efficient spell checker editor application based upon edit distance score, Supervised learning. I am integrating Levenshtein distance (LD), Jaccard coefficient Algorithms to achieve the require target, these all algorithms are measure the similarity between two strings, which we will refer to as the source string (s) and the target string (t). My approach is to design text editor based upon NLP having auto spell checker which suggest the user mistake. I am using a novel scoring scheme to integrate the retrieved words from each spelling approach and calculate an overall score for each matched word. From the overall scores, we can rank the possible matches. This algorithms required a training data set which nothing but a data as dictionary, while writing a content in editor, the backend proceed will happen like tokenization, distance calculation and further filtering the results using mentioned algorithms and finally suggest appropriate results.
  • 4.  Spell Check is a process of detecting and sometimes providing suggestions for incorrectly spelled words in a text.  In computing, Spell Checker is an application program that flags words in a document that may not be spelled correctly.  Spell Checker may be stand- alone capable of operating on a block a text such as word-processor, electronic dictionary. A basic spell checker carries out the following processes:  It scans the text and extracts the words contained in it.  It then compares each word with a known list of correctly spelled words (i.e. a dictionary).  An additional step is a language-dependent algorithm for handling morphology. Introduction:
  • 5. Spelling errors can be divided into two categories: Real-word errors Non-word errors. Real-word errors : are those error words that are acceptable words in the dictionary. Non-word errors : are those error words that cannot be found in the dictionary. This words are complex to provide the suggestion, so this might not be suggested.
  • 6. 2. ERROR DETECTION TECHNIQUES A. Dictionary Lookup Technique : - In this, Dictionary lookup technique is used which checks every word of input text for its presence in dictionary. If that word present in the dictionary, then it is a correct word. Otherwise it is put into the list of error words.
  • 7. String 1 – “statistics” String 2 – “statistical” If n is set to 2 (Bigrams are being extracted), then the similarity of the two strings is calculated as follows. Initially, the two strings are split into n-grams: Statistics - st ta at ti is st ti ic cs 9 Bigrams Statistical - st ta at ti is st ti ic ca al 10 Bigrams Coefficient = A ⊓ B A ⊔ B 1. N- grams Based Technique using Jaccard coefficient B. ALGORITHMS FOR ERROR WORDS
  • 8. Levenshtein distance (LD) is a measure of the similarity between two strings, which we will refer to as the source string (s) and the target string (t). The distance is the number of deletions, insertions, or substitutions required to transform s into t. For example, If s is "test" and t is "test", then LD(s,t) = 0, because no transformations are needed. The strings are already identical. If s is "test" and t is "tent", then LD(s,t) = 1, because one substitution (change "s" to "n") is sufficient to transform s into t. 2. The Levenshtein Algorithm (LD)
  • 9. The Levenshtein distance algorithm has been used in: Spell checking Speech recognition DNA analysis Plagiarism detection Operation : Insertion, Deletion, Substitution In this algorithm cost calculated by comparing the source and targeted words, and according to low cost word is suggested to user. e.g Cat Cut here the one substitution by 1 latter.
  • 13. Conclusion In this presentation we seen error detection , correction techniques, the word are suggested to end user is based on two algorithms one is Jaccard coefficient and second is Levenshteien distance. This algorithms filter out the dictionary words and provide the exact suggestion to user, so that user enter text in editor should be a error free and it does not content any spelling mistakes.