ISSN(Online): 2320-9801
ISSN (Print): 2320-9798
International Journal of Innovative Research in Computer
and Communication Engineering
(An ISO 3297: 2007 Certified Organization)
Vol. 2, Issue 12, December 2014
10.15680/ijircce.2014.0212083
Copyright to IJIRCCE www.ijircce.com 7632
Empirical Comparative Study of Some Supervised Approaches
Boshra F. Zopon AL_Bayaty1, Dr. Shashank Joshi2
Department of Computer Science, Yashwantrao Mohite College, Bharati Vidyapeeth University; AL-Mustansiriya University, Baghdad, Iraq1
Department of Computer Engineering, Engineering College, Bharati Vidyapeeth University, Pune, Maharashtra, India2
ABSTRACT: Word sense disambiguation is solved with the help of various data mining approaches such as Naïve Bayes, Decision List, Decision Tree, and SVM (Support Vector Machine). These approaches help to find the correct meaning of a word by referring to WordNet 2.1. This paper discusses the experiment performed, along with a comparison of the SVM algorithm against the other approaches. In this study, the Decision List approach achieved the best result among all the approaches.
KEYWORDS: Support Vector Machine, Naive Bayes, Decision List, Decision Tree, Supervised learning approaches,
Senseval-3, WSD, WordNet.
I. INTRODUCTION
Natural language processing is the study of words and the role their meanings play in meaningful language. For almost every system, a word acts as an input; while inferring meaning from it, if the system misinterprets the word, the entire system is affected. That is why WSD is extremely important: it infers the correct meaning of a word as per the perception of the user or machine that supplied it.
Word sense disambiguation is the task of identifying the correct meaning of a word by using some algorithm with the help of one approach or another [1]. To accomplish this, the system is trained to identify the correct meaning of a word that has multiple senses, like "map": a map is a geographical representation of a particular place, or it is an association between two terms (a mapping). So the problem statement is to identify the meaning of a given word as per the requirement of the user [2].
Fig. 1: Screenshot showing the multiple senses of the word "anger"
II. BACKGROUND AND RELATED WORK
Many researchers have contributed to this field of disambiguation. There are various approaches to accomplish the task:
 A support vector machine generates a hyperplane that separates the hyperspace into regions, one per category or group. The distance between the closest data instances and the plane is known as the support vector [3].
 The Naïve Bayes approach calculates a posterior probability by using conditional probabilities. The "naïve" part of the classifier concerns feature dependency: it is assumed that there is no dependency among the extracted features [4]:
p(C | F1, F2) = p(F1, F2 | C) · p(C) / p(F1, F2)
and, under the independence assumption, p(F1, F2 | C) = p(F1 | C) · p(F2 | C).
Where:
F1, F2 are features
C is the category.
 A decision tree deals with the information gained during the experiment. Processing in a decision tree runs from top to bottom, that is, from root to leaf; so the longer the tree, the higher the probability of data storage and the information gain. The error rate is calculated in terms of entropy: the higher the entropy, the lower the accuracy, and vice versa [5].
 A decision list works on conditions in an if-else structure: if the condition is satisfied, visit the node and deal with the data; otherwise leave it. The process is repeated until the desired data or conditions are met [6].
These approaches and their comparison are discussed in this paper, based on an experiment performed to meet the goal of word sense disambiguation using an effective approach for empirical information retrieval.
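The if-else ladder behind a decision list can be sketched as below. The rules, senses, and trigger words here are invented purely for illustration (using the "map" example from the introduction); they are not the rules learned in this paper's experiment:

```python
# Minimal decision-list sketch: ordered rules, first matching rule wins.
# Senses and trigger words are hypothetical examples, not learned rules.

def disambiguate_map(context_words):
    """Return a sense of 'map' chosen by an ordered if/else ladder."""
    words = {w.lower() for w in context_words}
    if {"terrain", "geography", "place"} & words:
        return "map/geographical-representation"
    elif {"function", "association", "pair"} & words:
        return "map/mathematical-mapping"
    else:
        # no rule fired: fall back to the most frequent sense
        return "map/geographical-representation"

print(disambiguate_map(["a", "map", "of", "the", "terrain"]))
print(disambiguate_map(["map", "each", "key", "to", "a", "function"]))
```

Each rule tests a single contextual feature, which is what makes the structure easy to read and to rank by confidence.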
1. Motivation
To address the challenge discussed earlier, serious efforts are needed, because every approach faces some drawback or other. The figure below represents the support vector machine approach implemented in this paper:
Fig 2. Support Vector Machine with Hyperplane
Here x and y are the categories into which the data instances are separated. The motivation for conducting this experiment is to increase the overall accuracy and to address word sense disambiguation by considering classifiers that are trained on the database and identify the correct meaning of a word out of the total list of meanings provided. This task is carried out by referring to the context to resolve the ambiguity.
III. EXPERIMENTAL SETUP
1.1 Data
The experiment is conducted using the WordNet repository, 10 nouns, and 5 verbs [7]. To evaluate the accuracy of a sense, the context is designed following Senseval norms and is represented in XML. With the help of the algorithm and the context stored in a database, the meaning of the word is calculated. Both semi-structured and unstructured representations are used because of the latency, that is, the time required to store and retrieve the data to and from the database [8].
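A context instance of this kind might be read as sketched below. The tag and attribute names are illustrative, not the exact Senseval-3 schema used in the experiment:

```python
# Hypothetical sketch of reading a senseval-style XML context instance.
# Element and attribute names here are assumptions for illustration only.
import xml.etree.ElementTree as ET

xml_text = """
<instance id="anger.n.1">
  <context>He could not hide his <head>anger</head> at the decision.</context>
</instance>
"""

root = ET.fromstring(xml_text)
target = root.find("context/head").text             # the ambiguous word
context = "".join(root.find("context").itertext())  # the full sentence
print(target)
print(context.strip())
```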
1.2 Implementation of Supervised Machine Learning Techniques
Two types of techniques are used to identify the meaning of a word: supervised and unsupervised. If a sense is identified on the basis of frequency of occurrence alone, the approach is unsupervised; but we cannot always rely completely on an unsupervised approach, because the meaning can vary with the context used and with perception. A supervised technique is used here, because the system is trained with a defined context to predict the meaning based on the surrounding words. The predictions are made with suitable data mining algorithms: Naïve Bayes, the decision tree algorithm, the decision list algorithm, and the support vector machine. These algorithms are mentioned and empirically implemented in this paper, and the comparative analysis is based on the accuracy with which each algorithm predicts the meaning.
4.4.1 Naïve Bayes
The Naïve Bayes approach works on conditional probability. For some words it gives better results, while for others it does not deliver appropriate results.
There are a few scores where Naïve Bayes provides better results; the top 3 results according to accuracy are: {Name: 1000, Worlds: 1000, Day: 1000}.
In some cases the performance of the Naïve Bayes algorithm is not satisfactory; the lowest three such cases are: {Worship: 414, Trust: 167, Help: 414}.
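The naive-Bayes scoring can be sketched with toy counts as below. The senses, words, counts, and priors are invented for illustration; they are not the paper's training data, and the add-one smoothing shown is a common convenience, not necessarily what the experiment used:

```python
# Toy naive-Bayes sense scoring: score(s) = p(s) * prod_i p(w_i | s).
# All counts and priors below are hypothetical illustration values.
from collections import Counter

sense_counts = {
    "anger/emotion": Counter({"furious": 8, "rage": 6, "calm": 1}),
    "anger/provoke": Counter({"provoke": 7, "annoy": 5, "rage": 2}),
}
sense_prior = {"anger/emotion": 0.6, "anger/provoke": 0.4}
vocab_size = len({w for c in sense_counts.values() for w in c})

def posterior(sense, context):
    """Unnormalized posterior with add-one smoothing over the vocabulary."""
    total = sum(sense_counts[sense].values())
    p = sense_prior[sense]
    for w in context:
        p *= (sense_counts[sense][w] + 1) / (total + vocab_size)
    return p

context = ["furious", "rage"]
best = max(sense_prior, key=lambda s: posterior(s, context))
print(best)
```

Because the denominator p(F1, F2) is the same for every sense, the argmax over unnormalized scores already selects the sense with the highest posterior.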
Box (1): Naïve Bayes algorithm implemented on our data set
1. Initialize context c, sense s, and ambiguous word w.
2. As per the training context, calculate:
   p(s | w, c) = p(w | s, c) · p(s | c) / p(w | c)
3. Calculate the maximized p(s | w, c).
4. Select the sense with the highest value.
5. Map the sense according to the highest accuracy.
The overall accuracy of the Naïve Bayes algorithm is 58.32%, which needs to be improved to find the desired word correctly [9].
4.4.2 Decision Tree
The decision tree is based on storing the result or meaning at each node. As far as WSD is concerned, for the data set we are referring to, the overall accuracy of the decision tree is not satisfactory.
The overall accuracy is 45.14%.
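The entropy and information-gain computations the tree relies on can be sketched as below. The counts of positive and negative contexts used here are invented for illustration, not taken from the experiment:

```python
# Entropy and information gain for a binary split, as used by C4.5-style trees.
# The (9, 5) class counts and the (6,2)/(3,3) split are hypothetical examples.
import math

def entropy(pos, neg):
    """Entropy(S) = -p+ log2 p+ - p- log2 p-, with 0·log 0 treated as 0."""
    total = pos + neg
    e = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            e -= p * math.log2(p)
    return e

s = entropy(9, 5)                                   # entropy of the whole set
# Gain(S, A) = Entropy(S) - sum_v (|Sv| / |S|) * Entropy(Sv)
gain = s - (8 / 14) * entropy(6, 2) - (6 / 14) * entropy(3, 3)
print(round(s, 3), round(gain, 3))
```

The attribute with the highest gain is chosen at each node, which is why high-entropy (noisy) features contribute little and drag accuracy down.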
Box (2): C4.5 algorithm implemented on our data set
1. Read the data set and calculate POS (e.g., recompense).
2. Prepare a context containing the various senses of the word (e.g., recompense: reward).
3. Calculate the frequency in the context (P+ positive, P- negative).
4. Calculate the entropy for information gain: Entropy(S) = -P+ log2 P+ - P- log2 P-
5. Gain(S, A) = Entropy(S) - Σ_{v ∈ Values(A)} (|Sv| / |S|) · Entropy(Sv)
6. Select the highest (entropy, attribute) ratio.
Though the overall accuracy of the decision tree is not up to the mark, for a few cases it gives better results; the top 2 such cases are: {Name: 1000, Worlds: 1000}. On the contrary, there are some results where the performance is not satisfactory; the lowest three cases are: {Trust: 167, Day: 109, Help: 125} [10].
4.4.3 Decision List
Among the approaches discussed so far, the decision list provides the most accurate results by forming an if-else ladder. Its efficiency and accuracy are noted in the few cases where the results are better, mentioned below: {Praise: 1000, Name: 1000, Worlds: 1000, Lord: 1000, Recompense: 1000, Day: 1000}.
Box (3): Decision List algorithm implemented on our data set
1. Identify and calculate the feature (f).
2. Calculate the value of the sense (Si).
3. Identify collocations, one value per collocation.
4. Repeat this process for multiple senses.
5. Calculate the absolute log of P(Si | f) for all senses:
   Max | log( P(S1 | f) / P(S2 | f) ) |
6. Select the maximum value.
Though the overall accuracy is better in the case of the decision list, there are some cases where the performance does not meet expectations; these are: {Trust: 167, Help: 125, Favored: 250, Path: 333} [11].
4.4.4 Support Vector Machine
The support vector machine is a technique to separate data in a hyperspace with the help of a hyperplane. The separation is done by creating a hyperplane that maximizes the distance between the data instances located at the edges. If we observe the working of the SVM carefully, it is practically difficult to separate the data instances cleanly; this gap is known as slack. The slack is to be minimized to separate the data instances and categorize them under one heading. The support vector machine is an ideal example of a binary classifier, but when it comes to word sense disambiguation its performance is not up to the mark.
The top four cases where the results are at their peak are mentioned below: {Name: 1000, Worlds: 1000, Guide: 1000, Day: 1000}.
In some cases the performance of the support vector machine is not satisfactory; the lowest such cases are: {Worship: 414, Lord: 431, Trust: 167, Path: 318, Favored: 250, Help: 125} [12].
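The linear decision rule F(x) = wT·x + b with the sign convention above can be sketched as below. The weight vector and bias are fixed by hand for illustration; they are not learned by an SVM solver here:

```python
# Sketch of a linear SVM-style decision function f(x) = w.x + b.
# The weights below are hand-picked illustration values, not trained ones.

def decision(w, b, x):
    """Signed distance proxy: w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    # sign rule: f(x) >= 0 -> +1, f(x) < 0 -> -1
    return 1 if decision(w, b, x) >= 0 else -1

w, b = [1.0, -1.0], 0.0              # hyperplane x1 = x2
print(classify(w, b, [2.0, 1.0]))    # above the plane
print(classify(w, b, [1.0, 3.0]))    # below the plane
```

In a real SVM, w and b are chosen to maximize the margin between the two classes while penalizing slack; only the decision rule is shown here.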
Box (4): SVM algorithm implemented on our data set
1. Data set: five verbs, ten nouns [7], and the WordNet repository; each training instance (xi, yi) with yi ∈ {−1, 1}.
2. Training data: part of speech, user, context, and words are used to train the system.
   f(x) ≥ 0 ⇒ y = +1; f(x) < 0 ⇒ y = −1
3. For the linear approach:
   F(x) = wT x + b
   where w is the normal vector, wT the weight vector, and b the bias.
4. Processing of the algorithm [8]: for all senses of a word, decide the meaning m with the highest score.
5. Testing: delivering the final result with accuracy and score helps to decide the performance of the algorithm [9].
IV. RESULTS
Disambiguation is performed in this paper via four supervised approaches, using WordNet and Senseval-3. Table (1) shows the results of the four approaches, Naïve Bayes, Decision Tree, Decision List, and Support Vector Machine, comparing them on the basis of their score and accuracy.
V. CONCLUSION
We have presented a comparative study of four supervised machine learning algorithms, using WordNet and Senseval-3; Table (2) below shows the final results and accuracy for each approach. In conclusion, the Decision List algorithm obtained the highest accuracy.
TABLE 1.
DATA SET OF WORDS AND RESULTS OF SUPERVISED LEARNING MACHINE CLASSIFIERS

Word       | POS | #Senses | Naïve Bayes       | Decision Tree     | Decision List     | SVM
           |     |         | Score | Accuracy  | Score | Accuracy  | Score | Accuracy  | Score | Accuracy
Praise     | n   | 2       | 0.408 | 0.592     | 405   | 593       | 668   | 1000      | 592   | 594
Name       | n   | 6       | 0.189 | 1.0       | 184   | 1000      | 1000  | 1000      | 189   | 1000
Worship    | v   | 3       | 0.172 | 0.414     | 308   | 425       | 387   | 500       | 352   | 414
Worlds     | n   | 8       | 0.137 | 1.0       | 1000  | 1000      | 142   | 1000      | 1000  | 1000
Lord       | n   | 3       | 0.341 | 0.681     | 187   | 426       | 489   | 1000      | 418   | 431
Owner      | n   | 2       | 0.406 | 0.594     | 405   | 595       | 755   | 999       | 592   | 594
Recompense | n   | 2       | 0.48  | 0.594     | 405   | 595       | 791   | 1000      | 592   | 594
Trust      | v   | 6       | 0.167 | 0.167     | 167   | 167       | 167   | 167       | 167   | 167
Guide      | v   | 5       | 0.352 | 0.648     | 199   | 247       | 387   | 995       | 244   | 1000
Straight   | n   | 3       | 0.496 | 0.504     | 462   | 462       | 500   | 500       | 69    | 465
Path       | n   | 4       | 0.415 | 0.585     | 316   | 316       | 333   | 333       | 47    | 318
Anger      | n   | 3       | 0.412 | 0.588     | 462   | 462       | 500   | 500       | 69    | 465
Day        | n   | 10      | 0.109 | 1.0       | 109   | 109       | 111   | 1000      | 109   | 1000
Favored    | v   | 4       | 0.587 | 0.648     | 250   | 250       | 250   | 250       | 250   | 250
Help       | v   | 8       | 0.352 | 0.414     | 125   | 125       | 125   | 125       | 125   | 125
TABLE.2
THE FINAL RESULTS OF SUPERVISED LEARNING MACHINE CLASSIFIERS
Approaches Accuracy (%)
Naïve Bayes 58.32
Decision Tree 45.14
Decision List 69.12
SVM 56.11
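The comparison in Table 2 amounts to an argmax over the four overall accuracies, which can be expressed directly:

```python
# Overall accuracies from Table 2; selecting the best-performing approach.
accuracy = {
    "Naive Bayes": 58.32,
    "Decision Tree": 45.14,
    "Decision List": 69.12,
    "SVM": 56.11,
}
best = max(accuracy, key=accuracy.get)
print(best, accuracy[best])  # Decision List 69.12
```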
ACKNOWLEDGMENT
I am grateful to my research guide, the respected Dr. Shashank Joshi (Professor at Bharati Vidyapeeth University, College of Engineering), for his constant support and cooperation.
REFERENCES
1. Boshra F. Zopon AL_Bayaty, Dr. Shashank Joshi, "Conceptualisation of Knowledge Discovery from Web Search", Bharati Vidyapeeth University, International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February 2014, pp. 1246-1248.
2. Miller, G. et al., 1993, "Introduction to WordNet: An On-line Lexical Database", ftp://ftp.cogsci.princeton.edu/pub/wordnet/5papers.pdf, Princeton University.
3. Yoong Keok Lee, Hwee Tou Ng, and Tee Kiah Chia, "Supervised Word Sense Disambiguation with Support Vector Machines and Multiple Knowledge Sources", 2007.
4. Ruben Izquierdo, Armando Suarez, and German Rigau, "An Empirical Study on Class-based Word Sense Disambiguation". Supported by the European Union under the projects QALL-ME (FP6 IST-033860) and KYOTO (FP7 ICT-211423), and by the Spanish Government under the projects Text-Mess (TIN2006-15265-C06-01) and KNOWS (TIN2006-15049-C03-01).
5. Mahesh Joshi, Ted Pedersen, and Richard Maclin, "A Comparative Study of Support Vector Machines Applied to the Supervised Word Sense Disambiguation Problem in the Medical Domain", Bhanu Prasad (Editor): IICAI-05, pp. 3449-3468, 2005.
6. Vicky Panagiotopoulou, Iraklis Varlamis, Ion Androutsopoulos, and George Tsatsaronis, "Word Sense Disambiguation as an Integer Linear Programming Problem", SETN 2012, 7th Hellenic Conference on Artificial Intelligence, May 28-31, 2012, Lamia.
7. http://www.e-quran.com/language/english.
8. http://www.senseval.org/senseval3.
9. Boshra F. Zopon AL_Bayaty, Shashank Joshi, "Empirical Implementation of the Naive Bayes Classifier for WSD Using WordNet", Bharati Vidyapeeth University, International Journal of Computer Engineering & Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 5, Issue 8, August 2014, pp. 25-31.
10. Boshra F. Zopon AL_Bayaty, Shashank Joshi, "Empirical Implementation of a Decision Tree Classifier for the WSD Problem", International Conference on Emerging Trends in Science and Cutting Edge Technology (ICETSCET), YMCA, 28 September 2014; International Journal of Advanced Technology in Engineering and Science, www.ijates.com, Volume No. 02, Special Issue No. 01, September 2014, ISSN (online): 2348-7550.
11. Boshra F. Zopon AL_Bayaty, Shashank Joshi, "Sense Identification for Ambiguous Word Using Decision List", International Journal of Advance Research in Science and Engineering, http://www.ijarse.com, Vol. No. 3, Issue No. 10, October 2014, ISSN 2319-8354(E).
12. Boshra F. Zopon AL_Bayaty, Shashank Joshi, "Implementation of SVM to Solve the Multiple Meaning of Word Problem", IJISET - International Journal of Innovative Science, Engineering & Technology, Vol. 1, Issue 9, November 2014, www.ijiset.com, ISSN 2348-7968.
The pertinent single-attribute-based classifier  for small datasets classific...The pertinent single-attribute-based classifier  for small datasets classific...
The pertinent single-attribute-based classifier for small datasets classific...
 
A Software Measurement Using Artificial Neural Network and Support Vector Mac...
A Software Measurement Using Artificial Neural Network and Support Vector Mac...A Software Measurement Using Artificial Neural Network and Support Vector Mac...
A Software Measurement Using Artificial Neural Network and Support Vector Mac...
 
A study on rough set theory based
A study on rough set theory basedA study on rough set theory based
A study on rough set theory based
 
Density Based Clustering Approach for Solving the Software Component Restruct...
Density Based Clustering Approach for Solving the Software Component Restruct...Density Based Clustering Approach for Solving the Software Component Restruct...
Density Based Clustering Approach for Solving the Software Component Restruct...
 
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
TUPLE VALUE BASED MULTIPLICATIVE DATA PERTURBATION APPROACH TO PRESERVE PRIVA...
 
MAXIMUM CORRENTROPY BASED DICTIONARY LEARNING FOR PHYSICAL ACTIVITY RECOGNITI...
MAXIMUM CORRENTROPY BASED DICTIONARY LEARNING FOR PHYSICAL ACTIVITY RECOGNITI...MAXIMUM CORRENTROPY BASED DICTIONARY LEARNING FOR PHYSICAL ACTIVITY RECOGNITI...
MAXIMUM CORRENTROPY BASED DICTIONARY LEARNING FOR PHYSICAL ACTIVITY RECOGNITI...
 
Maximum Correntropy Based Dictionary Learning Framework for Physical Activity...
Maximum Correntropy Based Dictionary Learning Framework for Physical Activity...Maximum Correntropy Based Dictionary Learning Framework for Physical Activity...
Maximum Correntropy Based Dictionary Learning Framework for Physical Activity...
 
Ijarcet vol-2-issue-7-2363-2368
Ijarcet vol-2-issue-7-2363-2368Ijarcet vol-2-issue-7-2363-2368
Ijarcet vol-2-issue-7-2363-2368
 

61_Empirical

association between two terms (a mapping). So the problem statement is to identify the meaning of a given word according to the requirement of the user [2].

Fig. 1: The screenshot shows the multiple senses of the word "anger".
II. BACKGROUND AND RELATED WORK

Many researchers have contributed to this field of disambiguation, and there are various approaches to accomplishing the task.

• Support vector machine: generates a hyperplane that separates the hyperspace into regions, one per category or group. The data instances closest to the hyperplane are the support vectors, and the distance between them and the hyperplane is the margin [3].

• Naïve Bayes: calculates the posterior probability of a category from conditional probabilities. The "naïve" part of the classifier lies in how it treats feature dependency: the extracted features are assumed to be conditionally independent of one another [4]:

  p(C | F1, ..., Fn) = p(C) · ∏(i=1..n) p(Fi | C) / p(F1, ..., Fn)

  where F1, ..., Fn are the features and C is the category.

• Decision tree: works with the information gained during the experiment. Processing runs top-down, from root to leaf, so the deeper the tree, the more information it can store and the higher the information gain. The error rate is measured in terms of entropy: the higher the entropy, the lower the accuracy, and vice versa [5].

• Decision list: works on conditions, like an if-else structure. If the condition is satisfied, visit the node and use its data; otherwise move on. The process is repeated until the desired data or condition is met [6].

These approaches and their comparison are discussed in this paper on the basis of an experiment performed to meet the goal of word sense disambiguation using an effective approach for empirical retrieval of information.

1. Motivation

To address the challenge discussed earlier, serious effort is needed, because every approach has some drawback or other.
The figure below represents the support vector machine approach implemented in this paper:

Fig. 2: Support vector machine with a hyperplane. f(x) and f(y) are the categories into which the data instances are separated; the instances nearest the hyperplane are the support vectors.

So the motivation for conducting this experiment is to increase the overall accuracy and to address word sense disambiguation with a classifier that trains on the database and identifies the correct meaning of a word out of the total list of meanings provided. This task is carried out by referring to the context to resolve the ambiguity.
III. EXPERIMENTAL SETUP

3.1 Data

The experiment is conducted using the WordNet repository, 10 nouns, and 5 verbs [7]. To measure the accuracy of each sense, the context is designed following the Senseval norms and represented in XML. With the help of the algorithm and the context recorded in the database, the meaning of the word is calculated. Both semi-structured and unstructured representations are used, because of the latency, that is, the time required to store the data in the database and retrieve it again [8].

3.2 Implementation: Supervised Machine Learning Techniques

Two types of techniques are used to identify the meaning of a word: supervised and unsupervised. If a sense is identified on the basis of frequency of occurrence, the approach is unsupervised; but we cannot always rely completely on an unsupervised approach, because the meaning can vary with the context of use and with perception. In the supervised technique the system is trained with a defined context, so that it predicts the meaning based on the surrounding words. The predictions are made with suitable data mining algorithms: Naïve Bayes, the decision tree algorithm, the decision list algorithm, and the support vector machine. These algorithms are empirically implemented in this paper, and the comparative analysis is based on the accuracy with which each algorithm predicts the meaning.

3.2.1 Naïve Bayes

The Naïve Bayes approach works on conditional probability. For some words it gives better results, while for others it does not deliver appropriate results. The top three results according to accuracy are: {Name: 1000, Worlds: 1000, Day: 1000}.
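To make the Naïve Bayes sense scoring concrete, here is a minimal sketch in Python, not the paper's implementation: the two-sense inventory for the ambiguous word "trust", the prior, and the per-sense word probabilities are all hypothetical placeholders, and a small smoothing constant stands in for unseen context words.

```python
import math

# Minimal sketch of Naive Bayes sense scoring: choose the sense s
# maximizing p(s) * prod_i p(w_i | s) over the context words w_i,
# computed in log space to avoid underflow. All probability tables
# below are hypothetical placeholders, not values from the paper.

def best_sense(context_words, senses, prior, word_given_sense):
    best, best_score = None, float("-inf")
    for s in senses:
        score = math.log(prior[s])
        for w in context_words:
            # 1e-6 is a small smoothing value for unseen words
            score += math.log(word_given_sense[s].get(w, 1e-6))
        if score > best_score:
            best, best_score = s, score
    return best

# Hypothetical two-sense inventory for the ambiguous word "trust"
senses = ["rely", "organization"]
prior = {"rely": 0.6, "organization": 0.4}
word_given_sense = {
    "rely": {"god": 0.3, "believe": 0.2},
    "organization": {"fund": 0.3, "charity": 0.2},
}
print(best_sense(["god", "believe"], senses, prior, word_given_sense))  # rely
```

The context words surrounding the ambiguous word push the score toward one sense; the sense with the highest log posterior wins.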
In some cases the performance of the Naïve Bayes algorithm is not satisfactory; the lowest three such cases are: {Worship: 414, Trust: 167, Help: 414}.

Box (1): Naïve Bayes algorithm implemented on our data set.
1. Initialize the context c, sense s, and ambiguous word w.
2. As per the training context,
3. calculate p(s | w, c) = p(w | s, c) · p(s | c) / p(w | c), maximized over the senses.
4. Select the sense with the highest value.
5. Map the sense according to the highest accuracy.

The overall accuracy of the Naïve Bayes algorithm is 58.32%, which needs to be improved in order to find the desired word correctly [9].

3.2.2 Decision Tree

The decision tree is based on storing a result, or meaning, at each node. As far as WSD on the data set we are referring to is concerned, the overall accuracy of the decision tree is not satisfactory: 45.14%.
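The entropy measure that drives attribute selection in the decision tree (C4.5) can be sketched as follows; the positive/negative counts in the example are hypothetical, not drawn from the paper's data set.

```python
import math

# Sketch of the entropy and information-gain computations used by
# decision tree learning (C4.5):
#   Entropy(S)  = -P+ log2 P+  -  P- log2 P-
#   Gain(S, A)  = Entropy(S) - sum_v (|Sv| / |S|) * Entropy(Sv)
# The example counts below are hypothetical.

def entropy(pos, neg):
    total = pos + neg
    e = 0.0
    for count in (pos, neg):
        if count:                      # 0 * log2(0) is taken as 0
            p = count / total
            e -= p * math.log2(p)
    return e

def gain(parent, splits):
    # parent: (pos, neg); splits: one (pos, neg) pair per attribute value
    total = sum(p + n for p, n in splits)
    remainder = sum((p + n) / total * entropy(p, n) for p, n in splits)
    return entropy(*parent) - remainder

print(entropy(5, 5))                   # maximum uncertainty: 1.0
print(gain((5, 5), [(5, 0), (0, 5)]))  # a perfect split: gain 1.0
```

High entropy at a node means the senses are still mixed there, which is why the text above links higher entropy to lower accuracy.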
Box (2): C4.5 algorithm implemented on our data set.

Though the overall accuracy of the decision tree is not up to the mark, it gives better results in a few cases; the top two such cases are: {Name: 1000, Worlds: 1000}. On the contrary, there are results where the performance is not satisfactory; the lowest three such cases are: {Trust: 167, Day: 109, Help: 125} [10].

3.2.3 Decision List

Among the approaches discussed so far, the decision list provides the most accurate results, by forming an if-else ladder. A few cases where the efficiency and accuracy are notably better are: {Praise: 1000, Name: 1000, Worlds: 1000, Lord: 1000, Recompense: 1000, Day: 1000}.

Box (3): Decision list algorithm implemented.

Though the overall accuracy of the decision list is better, there are some cases where its performance does not meet expectations: {Trust: 167, Help: 125, Favored: 250, Path: 333} [11].

3.2.4 Support Vector Machine

The support vector machine separates data in a hyperspace with the help of a hyperplane, which is created by maximizing the distance to the data instances located at the edges of each class. If we observe the working of the SVM carefully, it is practically difficult to separate the data instances perfectly; the allowance made for instances that fall on the wrong side of the margin is known as slack, and it is kept as small as possible while the margin is maximized. The support vector machine is an ideal example of a binary classifier, but when it comes to word sense disambiguation its results are not up to the mark.
The top four cases where the results peak are: {Name: 1000, Worlds: 1000, Guide: 1000, Day: 1000}. In some cases the performance of the support vector machine is not satisfactory; such cases are: {Worship: 414, Lord: 431, Trust: 167, Path: 318, Favored: 250, Help: 125} [12].

Steps of Box (2), the C4.5 algorithm:
1. Read the data set and calculate the part of speech (e.g. recompense).
2. Prepare a context containing the various senses of the word (e.g. recompense: reward).
3. Calculate the frequencies in the context, P+ (positive) and P- (negative).
4. Calculate the entropy: Entropy(S) = -P+ log2 P+ - P- log2 P-.
5. Calculate the information gain: Gain(S, A) = Entropy(S) - Σ(v ∈ A) (|Sv| / |S|) · Entropy(Sv).
6. Select the highest (entropy, attribute) ratio.

Steps of Box (3), the decision list algorithm:
1. Identify and calculate the features f.
2. Calculate the value of each sense si.
3. Identify collocations, one value per collocation.
4. Repeat this process for multiple senses.
5. Calculate the absolute log of P(si | f) for every sense.
6. Select the maximum value: max over i of | log( P(si | f) / P(s2 | f) ) |.
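The decision-list procedure of Box (3) ranks features by the magnitude of the log ratio between sense probabilities and lets the strongest matching feature decide. A minimal two-sense sketch follows; the feature counts and the add-one smoothing are illustrative assumptions, not values from the paper.

```python
import math

# Sketch of a two-sense decision list: rank each feature f by
# |log(P(s1 | f) / P(s2 | f))| and let the highest-ranked feature
# present in the context decide the sense. Counts are hypothetical.

def build_decision_list(counts):
    # counts: {feature: (count_with_sense1, count_with_sense2)}
    rules = []
    for f, (c1, c2) in counts.items():
        # add-one smoothing so neither probability is zero
        p1 = (c1 + 1) / (c1 + c2 + 2)
        p2 = (c2 + 1) / (c1 + c2 + 2)
        rules.append((abs(math.log(p1 / p2)), f, "s1" if p1 > p2 else "s2"))
    rules.sort(reverse=True)          # strongest evidence first
    return rules

def classify(rules, context, default="s1"):
    for _, f, sense in rules:
        if f in context:              # first matching rule wins (if-else ladder)
            return sense
    return default

rules = build_decision_list({"reward": (8, 1), "payment": (1, 6)})
print(classify(rules, {"payment", "day"}))  # s2
```

The sorted rule list is exactly the "if-else ladder" described above: each rule is tried in order of evidence strength until one fires.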
IV. RESULTS

Disambiguation is performed in this paper via four supervised approaches, using WordNet and Senseval-3. Table 1 shows the results of the four approaches, Naïve Bayes, decision tree, decision list, and support vector machine, compared on the basis of their score and accuracy.

Box (4): SVM algorithm implemented on our data set.
1. Data set: five verbs, ten nouns [7], and the WordNet repository, with training examples (xi, yi) where yi ∈ {-1, 1}.
2. Training data: part of speech, user, context, and words are used to train the system. The class is decided by the sign of f(x): y = +1 if f(x) ≥ 0, y = -1 if f(x) < 0.
3. For the linear approach, f(x) = wT·x + b, where w is the normal (weight) vector and b is the base line (bias).
4. Processing of the algorithm [8]: for all senses of the word, decide the meaning m with the highest score.
5. Testing: delivering the final result with its accuracy and score helps to decide the performance of the algorithm [9].
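The linear decision rule in Box (4) can be sketched directly. Training, that is, finding w and b by margin maximization, is omitted here, and the weight vector and bias below are hypothetical.

```python
# Sketch of the linear SVM decision rule from Box (4):
#   f(x) = w^T x + b, classify as +1 if f(x) >= 0, else -1.
# The weight vector and bias below are hypothetical; finding them
# by maximizing the margin is the (omitted) training step.

def svm_predict(w, b, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

w, b = [0.5, -1.0], 0.25
print(svm_predict(w, b, [2.0, 0.5]))   # f(x) = 1.0 - 0.5 + 0.25 = 0.75 -> 1
print(svm_predict(w, b, [0.0, 1.0]))   # f(x) = 0.0 - 1.0 + 0.25 = -0.75 -> -1
```

In the WSD setting, x would be a feature vector built from the context of the ambiguous word, and one such binary classifier is trained per sense distinction.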
V. CONCLUSION

We have presented a comparative study of four supervised machine learning algorithms, using WordNet and Senseval-3. Table 2 below shows the final result and accuracy for each approach. In conclusion, the decision list algorithm obtained the highest accuracy.

TABLE 1. DATA SET OF WORDS AND RESULTS OF THE SUPERVISED LEARNING CLASSIFIERS
(Naïve Bayes scores and accuracies are reported on a 0-1 scale; the other classifiers use a 0-1000 scale.)

Word | POS | # Senses | NB Score | NB Acc. | DT Score | DT Acc. | DL Score | DL Acc. | SVM Score | SVM Acc.
Praise | n | 2 | 0.408 | 0.592 | 405 | 593 | 668 | 1000 | 592 | 594
Name | n | 6 | 0.189 | 1.0 | 184 | 1000 | 1000 | 1000 | 189 | 1000
Worship | v | 3 | 0.172 | 0.414 | 308 | 425 | 387 | 500 | 352 | 414
Worlds | n | 8 | 0.137 | 1.0 | 1000 | 1000 | 142 | 1000 | 1000 | 1000
Lord | n | 3 | 0.341 | 0.681 | 187 | 426 | 489 | 1000 | 418 | 431
Owner | n | 2 | 0.406 | 0.594 | 405 | 595 | 755 | 999 | 592 | 594
Recompense | n | 2 | 0.48 | 0.594 | 405 | 595 | 791 | 1000 | 592 | 594
Trust | v | 6 | 0.167 | 0.167 | 167 | 167 | 167 | 167 | 167 | 167
Guide | v | 5 | 0.352 | 0.648 | 199 | 247 | 387 | 995 | 244 | 1000
Straight | n | 3 | 0.496 | 0.504 | 462 | 462 | 500 | 500 | 69 | 465
Path | n | 4 | 0.415 | 0.585 | 316 | 316 | 333 | 333 | 47 | 318
Anger | n | 3 | 0.412 | 0.588 | 462 | 462 | 500 | 500 | 69 | 465
Day | n | 10 | 0.109 | 1.0 | 109 | 109 | 111 | 1000 | 109 | 1000
Favored | v | 4 | 0.587 | 0.648 | 250 | 250 | 250 | 250 | 250 | 250
Help | v | 8 | 0.352 | 0.414 | 125 | 125 | 125 | 125 | 125 | 125

TABLE 2. THE FINAL RESULTS OF THE SUPERVISED LEARNING CLASSIFIERS

Approach | Accuracy (%)
Naïve Bayes | 58.32
Decision Tree | 45.14
Decision List | 69.12
SVM | 56.11
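The overall figures in Table 2 are consistent with taking the simple mean of the per-word accuracy columns of Table 1 (on the 0-1000 scale). For the SVM column this reproduces the reported 56.11% exactly; the other columns agree to within small rounding differences. A quick check:

```python
# SVM per-word accuracies from Table 1 (0-1000 scale)
svm_acc = [594, 1000, 414, 1000, 431, 594, 594, 167,
           1000, 465, 318, 465, 1000, 250, 125]

# mean accuracy, converted from the 0-1000 scale to percent
overall = round(sum(svm_acc) / len(svm_acc) / 10, 2)
print(overall)  # 56.11, matching Table 2
```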
ACKNOWLEDGMENT

I am grateful to my research guide, the respected Dr. Shashank Joshi (Professor at Bharati Vidyapeeth University, College of Engineering), for his support and cooperation at all times.

REFERENCES

1. Boshra F. Zopon AL_Bayaty, Shashank Joshi, "Conceptualisation of Knowledge Discovery from Web Search", International Journal of Scientific & Engineering Research, Vol. 5, Issue 2, February 2014, pp. 1246-1248.
2. Miller, G. et al., "Introduction to WordNet: An On-line Lexical Database", Princeton University, 1993, ftp://ftp.cogsci.princeton.edu/pub/wordnet/5papers.pdf.
3. Yoong Keok Lee, Hwee Tou Ng, Tee Kiah Chia, "Supervised Word Sense Disambiguation with Support Vector Machines and Multiple Knowledge Sources", 2007.
4. Ruben Izquierdo, Armando Suarez, German Rigau, "An Empirical Study on Class-based Word Sense Disambiguation". Supported by the European Union under the projects QALL-ME (FP6 IST-033860) and KYOTO (FP7 ICT-211423), and by the Spanish Government under the projects Text-Mess (TIN2006-15265-C06-01) and KNOWS (TIN2006-15049-C03-01).
5. Mahesh Joshi, Ted Pedersen, Richard Maclin, "A Comparative Study of Support Vector Machines Applied to the Supervised Word Sense Disambiguation Problem in the Medical Domain", in Bhanu Prasad (Ed.): IICAI-05, pp. 3449-3468, 2005.
6. Vicky Panagiotopoulou, Iraklis Varlamis, Ion Androutsopoulos, George Tsatsaronis, "Word Sense Disambiguation as an Integer Linear Programming Problem", SETN 2012, 7th Hellenic Conference on Artificial Intelligence, May 28-31, 2012, Lamia.
7. http://www.e-quran.com/language/english.
8. http://www.senseval.org/senseval3.
9. Boshra F. Zopon AL_Bayaty, Shashank Joshi, "Empirical Implementation of a Naive Bayes Classifier for WSD Using WordNet", International Journal of Computer Engineering & Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Vol. 5, Issue 8, August 2014, pp. 25-31.
10. Boshra F. Zopon AL_Bayaty, Shashank Joshi, "Empirical Implementation of a Decision Tree Classifier for the WSD Problem", International Conference on Emerging Trends in Science and Cutting Edge Technology (ICETSCET), YMCA, 28 September 2014; International Journal of Advanced Technology in Engineering and Science, Vol. 02, Special Issue 01, September 2014, ISSN (Online): 2348-7550.
11. Boshra F. Zopon AL_Bayaty, Shashank Joshi, "Sense Identification for Ambiguous Word Using Decision List", International Journal of Advance Research in Science and Engineering, Vol. 3, Issue 10, October 2014, ISSN 2319-8354(E).
12. Boshra F. Zopon AL_Bayaty, Shashank Joshi, "Implementation of SVM to Solve the Multiple Meaning of Word Problem", IJISET - International Journal of Innovative Science, Engineering & Technology, Vol. 1, Issue 9, November 2014, ISSN 2348-7968.