My experiment
1. WELCOME PhD Journey In India
By: Boshra F. Zopon Al_Bayaty
Prof. Dr. Shashank D. Joshi (Guide)
Knowledge Discovery from Web Search
2. OUTLINE
PhD Course Work
Knowledge Discovery from Web Search
National and International Conferences
The Research Contribution
Conclusion and Suggestion for Future Work
3. Knowledge Discovery From Web Search, PhD Journey
PhD Course Work
• Students play an important part in college development
5. INTRODUCTION
Knowledge discovery is the process of extracting useful information from a data source by combining machine learning,
statistical analysis, search engines, modeling techniques and natural language processing.
Knowledge discovery is an extension of information retrieval (IR), and information retrieval is an extension of data mining.
Therefore, IR and data mining support knowledge discovery directly or indirectly.
Because of the popularity of computers and networks, the Internet has become the most important information source.
Traditionally, people use keywords and simple Boolean algebra to search for related articles.
The best example of a knowledge discovery tool is a search engine, which helps to extract information. Evaluating a web
search engine is the key to ensuring the effectiveness, efficiency, scalability and usability of these browsing methods.
Because keyword search on the Internet gives imprecise results, studies of web mining methods try to improve the accuracy
or value of the information obtained from web pages.
Although keyword search is the most efficient and popular method of finding related information on the Internet, it has
two problems:
1. Some search results do not match the user's requirement.
2. There are too many similar articles in the search results.
Because of these two problems, users spend a lot of time organizing the search results and finding what they really want.
6. Knowledge discovery of a word's sense from its context can be done by Word Sense
Disambiguation (WSD), which is an open problem in Natural Language Processing.
Word Sense Disambiguation is the ability to computationally determine which sense of a word is
being used.
The main methods for combining WSD classifiers are stacking and voting; voting can be weighted or non-weighted.
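The difference between weighted and non-weighted voting can be illustrated with a minimal Python sketch. The classifier names, predicted senses and weights below are hypothetical values chosen only for illustration; they are not results from this work.

```python
from collections import defaultdict

def vote(predictions, weights=None):
    """Return the sense with the largest (optionally weighted) vote total.

    predictions: dict mapping classifier name -> predicted sense
    weights: optional dict mapping classifier name -> vote weight;
             when omitted, every classifier counts as 1 (non-weighted voting)
    """
    totals = defaultdict(float)
    for name, sense in predictions.items():
        totals[sense] += 1.0 if weights is None else weights[name]
    # On a tie, the sense that was seen first wins.
    return max(totals, key=totals.get)

# Hypothetical predictions of three classifiers for one ambiguous word.
predictions = {"NB": "direct", "DL": "honest", "AB": "direct"}
# Hypothetical weights, e.g. reflecting how much each classifier is trusted.
weights = {"NB": 0.3, "DL": 0.9, "AB": 0.4}

unweighted = vote(predictions)        # majority of classifiers decides
weighted = vote(predictions, weights) # the heavily weighted classifier can win
```

In this toy case the two schemes disagree: two of three classifiers choose one sense, but the single classifier with the largest weight outvotes them when weights are applied.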
Problem Definition
Fig. 2. Screenshot from WordNet showing the multiple meanings of the word "straight"
7. The goals and objectives set for the research work are as follows:
1. To analyze the influence of context on determining the sense of a given word
by creating a separate context for every sense of every word.
2. To study the different types of techniques used for knowledge discovery, apply them
to the process of disambiguation, and improve the accuracy.
3. To design and implement a new model called the “Master-Slave” model.
4. To evaluate the performance of the proposed model with the help of different
parameters such as precision, recall and F-measure.
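For reference, the three evaluation parameters can be computed as in the short Python sketch below. It uses the standard definitions, with F-measure taken as the balanced harmonic mean of precision and recall (F1); the slides do not state which formula was used, so this is an assumption, and the counts are hypothetical.

```python
def precision_recall_f(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F-measure) as percentages.

    F-measure here is the balanced F1 score (an assumption; the source
    does not give its formula).
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = 100.0 * true_positives / predicted if predicted else 0.0
    recall = 100.0 * true_positives / actual if actual else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# Hypothetical counts for one word: 6 senses assigned correctly,
# 2 assigned wrongly, 4 occurrences left without the correct sense.
p, r, f = precision_recall_f(6, 2, 4)
```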
Goals and Objectives
8. Supervised Algorithms Suggested in Research Work
Naive Bayes (NB)
Decision Tree (DT)
Decision List (DL)
AdaBoost (AB)
Support Vector Machine (SVM)
System Requirements and Analysis
Fig. 5. Five selected supervised algorithms
9. MASTER – SLAVE MODEL
[Diagram labels: Input Data Set; Master Classifier; Slave Classifiers C1 ... Cn; O/P (of each classifier); Output; The Reputation]
10. THE REFERENCE OF THE CONTEXT
http://www.e-quran.com/language
Fig. 9. The resource of data set
11. The Source of Context: The input words on which word sense disambiguation is executed are
selected from one paragraph of the holy book “Al_Quran” [E-QURAN.COM], as shown in Fig. 8.
System Requirements and Analysis
Fig. 8. The resource of data set
12.
• At this stage, the accuracy of every algorithm is still not up to the mark.
• Decision List is selected as the Master approach for two reasons:
1. It achieved the highest accuracy.
2. Its reputation: Decision List is one of the robust approaches to sense disambiguation in the WSD field, with a long
history (e.g. Kelly and Stone, 1975; Block, 1988). Historical performance is a very important parameter that plays a vital
role in deciding whether an algorithm acts as Master or Slave in our suggested model. Decision List has a good reputation
in the WSD field, based on the results reported in previous work.
No.  Approach        Accuracy (%)
1.   Decision List   69.12
2.   AdaBoost        65.27
3.   Naïve Bayes     62.86
4.   SVM             56.11
5.   Decision Tree   45.14
TABLE 3. The final results of five supervised approaches
System Design: Select Master approach (The First Part of System)
Fig 22: Final accuracy graph of the algorithms (bar chart of Accuracy (%): Decision List 69.12, AdaBoost 65.27, Naïve Bayes 62.86, SVM 56.11, Decision Tree 45.14)
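The accuracy-based part of the Master selection can be sketched in a few lines of Python. The accuracy values are copied from Table 3; the reputation criterion is not modeled here, and treating the next two approaches as Slaves is an assumption made only for this illustration.

```python
# Accuracy (%) of each approach, copied from Table 3.
accuracy = {
    "Decision List": 69.12,
    "AdaBoost": 65.27,
    "Naive Bayes": 62.86,
    "SVM": 56.11,
    "Decision Tree": 45.14,
}

# Rank the approaches from most to least accurate.
ranking = sorted(accuracy, key=accuracy.get, reverse=True)

master = ranking[0]    # the highest-accuracy approach becomes the Master
slaves = ranking[1:3]  # assumption: the next two approaches act as Slaves
```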
13. System Development and Implementation Algorithm
Input: Data set, context, choice of algorithm
Output: Correct sense according to context
Process: Word Sense Disambiguation
For Loop
For Loop
Step 1: Select the data set, data source, context and algorithm.
Step 2: For all words (W) in the data set, for all senses (S):
Step 3: Find the features (POS) from the data source (d).
Step 4: Use the Master-Slave algorithms.
Step 5: Calculate sense-wise P, R and F.
Step 6: Select the sense with the highest value.
Step 7: Sum all accuracies to calculate the overall accuracy.
Step 8: Add the boosting factor.
Step 9: Display the sense accuracy.
End Loop
End Loop

Step 1. The accuracy of the Master, X %, is collected.
Step 2. The accuracy of the Slave, y %, is collected.
Step 3. Collect voting to improve X by using the factor F = (X - f)/100.
Step 4. Accuracy of word = old accuracy + F.
Step 5. Apply this factor for all words X1, X2, X3, ..., X15.
Step 6. Calculate precision, recall and F-measure.
System Design: The Second Part of System
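One possible reading of the boosting steps is sketched below in Python. The slide does not define the symbol f in F = (X - f)/100; the sketch assumes f is the Slave accuracy y for the same word, which may not be what the author intended. The function names and the per-word accuracies are hypothetical.

```python
def boosting_factor(master_accuracy, slave_accuracy):
    """F = (X - f) / 100, reading f as the Slave accuracy (an assumption)."""
    return (master_accuracy - slave_accuracy) / 100.0

def boosted_accuracies(master_accuracies, slave_accuracies):
    """Apply 'accuracy of word = old accuracy + F' to every word."""
    boosted = {}
    for word, x in master_accuracies.items():
        boosted[word] = x + boosting_factor(x, slave_accuracies[word])
    return boosted

# Hypothetical per-word accuracies (%) for two of the fifteen words.
master = {"straight": 70.0, "path": 60.0}
slave = {"straight": 50.0, "path": 65.0}

result = boosted_accuracies(master, slave)
```

Under this reading the factor is small and can be negative when the Slave is more accurate than the Master for a word; if f stands for something else, such as an error rate, only `boosting_factor` would need to change.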
14. Discussion on Results (Before Combination)
No.  Approach   Recall   Precision   F-measure
1    N.Bayes    30.573   62.86       188.58
2    D. List    44.033   69.126      207.38
3    Adaboost   45.92    65.273      195.82
[Bar chart: COMPARATIVE ANALYSIS OF PRECISION. Series: 1st Experiment Precision, 2nd Experiment Precision. Words: Praise, Name, Worship, Worlds, Lord, Owner, Recompense, Trust, Guide, Straight, Path, Anger, Day, Favored, Help]
The Master–Slave model deals with three experiments. In the first experiment, Decision List acts
as Master and Naïve Bayes acts as Slave. Individually, each algorithm gives good values of precision
and F-measure.
Fig 27: Comparative analysis graph
15.
After Combination
Approach                               Recall     Precision   F-measure
1st Experiment (N.Bayes + D.L)         68.46667   51.06       1531.8
2nd Experiment (D.L + Ada)             52.61333   69.23333    2077
3rd Experiment (N.Bayes + Ada + D.L)   47.37333   70.14667    2104.4
[Bar chart: COMPARATIVE ANALYSIS OF RECALL. Series: 1st Experiment Recall, 2nd Experiment Recall. Words: Praise, Name, Worship, Worlds, Lord, Owner, Recompense, Trust, Guide, Straight, Path, Anger, Day, Favored, Help]
First experiment: Overall precision is 51.06% and recall is 68.46%, which gives a greater rise in recall than in precision.
Second experiment: In this combination Decision List acts as Master and AdaBoost acts as Slave.
Overall precision is 69.23% and recall is 52.61%, so the results of the experiment are satisfactory;
the overall rise in recall and precision is 85.80 and 1.0733 respectively.
Third experiment: Overall precision is 70.14% and recall is 47.37%, which gives a rise of 48.73 and 14.53 respectively.
Fig 28: Comparative analysis Graph
Discussion on Results (After Combination)
16.
Enhancement
Approach                               Recall     Precision   F-measure
1st Experiment (N.Bayes + D.L)         378.9367   -118        -354
2nd Experiment (D.L + Ada)             85.8033    1.0733      3.2
3rd Experiment (N.Bayes + Ada + D.L)   14.5333    48.7367     146.2
[Bar chart: COMPARATIVE ANALYSIS OF F-MEASURE. Series: 1st Experiment F-Measure, 2nd Experiment F-Measure. Words: Praise, Name, Worship, Worlds, Lord, Owner, Recompe…, Trust, Guide, Straight, Path, Anger, Day, Favored, Help]
First experiment: When the two algorithms are combined, recall is enhanced, which might be useful in
applications like search engines that require more coverage of the sample space, but it is less useful
for word sense disambiguation.
Second experiment: Precision increases by 1.0733 and F-measure by 3.2; unlike the first
experiment, recall is decreased. This is an enhancement in precision for resolving the word sense
disambiguation problem.
Third experiment: Precision and F-measure increase by 48.7367 and 146.2 respectively;
this combination gives all-round performance for precision.
Fig 29: Comparative analysis Graph
Discussion on Results (Enhancement)
17. Empower WSD with social networks
There are a number of applications where Master-Slave modeling is needed. For example, when a user enters a query, the query
could be refined with the help of information or tags received from that individual's social networking profile, or from things
liked by that individual. This process will not only ensure the correct sense of a word but also increase the accuracy of the
displayed results.
Empower online translation
A web-based interface for WSD would run online between the user and the system to support applications like
Google or Bing translation, enabling the user to easily comprehend the output.
M-S model for other languages
We would like Master-Slave to support more languages, such as Arabic, Hindi, German and so on.
Conclusion and Suggestion for future Work
The advantages of this work are improved accuracy, word disambiguation, and analysis of the relationship among
data set, algorithm and context.
Our proposed solution to this problem provides a good level of accuracy. The results of the experiments in this research
are as anticipated, delivering an accuracy of more than 70.14%.
WSD is still one of the central challenges in NLP, and researchers continue to work on it.
18. The Research Contribution
• Model
A proposed model that combines supervised algorithms in a Master-Slave combination.
• Algorithm
The experiments use a novel algorithm, the Master-Slave algorithm with a boosting factor.
This Master-Slave algorithm is formed by selecting the best set of algorithms to improve
the accuracy of disambiguation.
• Design
The Master-Slave algorithm performs efficiently with the help of the boosting factor;
this factor depends on the error rate and varies the accuracy.
• Performance Optimization
The results of the experiments, presented with the help of graphs, show that the selected
algorithms and design improve the accuracy to 70.14%, which helps to disambiguate
senses efficiently.
• The novel approach has been compared with all the other approaches to demonstrate
its advantage.
19. National Conferences
Attended and published a paper, National Conference in Computer Science and Information Technology, organized by
Y M College, Pune, held on 27-28 Sept. 2013.
Attended and published a paper, National Conference on Modeling, Optimization and Control (NCMOC),
4th to 6th March, 2015.
Attended National Conference on Advance Technologies for Secured Communication Using 4G & LTE
(ATSC-2014), B. V. U, College of Engineering, Pune, 5-6 February, 2014.
Attended National Conference FOSSsumMIT’14, in association with Pune Linux Group, Department of
Computer Engineering, MITCOE, Pune, 1st to 2nd August 2014.
International Conferences
International conference IEEE Canada, IHTC, Ottawa, http://www.ihtc2015.ieee.ca/, 31 May - 4th June, 2015.
International Conference on Knowledge and Software Engineering, December 6-7 2014, Paris, France.
ickse@iacsit.com.
International Conference on Emerging Trends in Science and Cutting Edge Technology (ICETSCET),
YMCA, New Delhi, 28 September, 2014. www.icetscet.com.
International Conference on current advances in Engineering and Technology (ICET-14), Knowledge and
Software Engineering, Trivandrum, Kerala, IFERP Connecting engineers..Developing research (Unit of
VVERT), 14th December, 2014. www.icet.com.
National and International Conferences
22. SOME SUGGESTIONS
Advantages of workshops.
Progress reports and scientific research.
The three main stages of the PhD degree.
Very positive result.
25. REVIEW AND COMMENTS FROM FIRST PRESENTATION
Introduction
Literature Review
Problem Definition (Word Sense Disambiguation)
Objective of Study
Methodology
Research Plan
Select Research Approaches (Five Supervised Approaches)
System Modeling (Master – Slave Techniques)
System Requirements
Publication (2 papers)
Conclusion
Source of Bibliography
References
Sr. No.  Comment                                                 Status
1.       Data normalizing is required                            Done
2.       Refer more papers based on supervised neural network    Done
Table 1. The status of first presentation comments
The Three Stages For PhD degree
26. Review for Second Presentation
Introduction
Literature Review (Revised)
Problem Definition
Objective of Study
Motivation
Methodology
The Work Done So Far
Jump to Master – Slave Technique
The Reference of Context and Data Set selected (Sys. Requirements and Data Normalization)
Modeling – designing- Compilation
Supervised Approaches under Study Implemented
The Comparative Analysis of the Results
The Limitation and Suggestion for future work
Conclusion
System Development Life – Cycle Phases (SDLC)
The Research Contribution in Knowledge and Scientific Research.
Bibliography
Activities and Publications
REVIEW AND COMMENTS FROM SECOND PRESENTATION
Sr. No. 1
Comment: The candidate presented the program of work, which was in line with the approved objectives. The use of decision
tree and supervised learning is suggested.
Status: Done, by clarifying decision tree with an implementation-related example.
Sr. No. 2
Comment: The thesis hypothesis could be revisited.
Status: The hypotheses or assumptions made are mentioned below:
1. To perform the combination, the algorithms selected should be based on individual
performance and reputation.
2. To disambiguate the sense, the context has to be selected.
3. To know the POS and senses, there must be trust in the word source referred to.
4. Improvement in the accuracy of disambiguation.
5. Increase in the performance of the algorithm using the Master-Slave system.
6. Improvement in word sense disambiguation irrespective of the amount of data set,
data source and context.
7. To improve the algorithm with all combinations.
Table 2. The status of second presentation comments
The Three Stages For PhD degree