GLOBALSOFT TECHNOLOGIES
IEEE PROJECTS & SOFTWARE DEVELOPMENTS
IEEE FINAL YEAR PROJECTS|IEEE ENGINEERING PROJECTS|IEEE STUDENTS PROJECTS|IEEE
BULK PROJECTS|BE/BTECH/ME/MTECH/MS/MCA PROJECTS|CSE/IT/ECE/EEE PROJECTS
CELL: +91 98495 39085, +91 99662 35788, +91 98495 57908, +91 97014 40401
Visit: www.finalyearprojects.org Mail to: ieeefinalsemprojects@gmail.com
Similarity Preserving Snippet-Based Visualization of Web Search Results
Abstract:
Measuring the similarity between documents is an important operation in the text
processing field. In this paper, a new similarity measure is proposed. To compute
the similarity between two documents with respect to a feature, the proposed
measure takes the following three cases into account: a) The feature appears in
both documents, b) the feature appears in only one document, and c) the feature
appears in none of the documents. For the first case, the similarity increases as the
difference between the two involved feature values decreases. Furthermore, the
contribution of the difference is normally scaled. For the second case, a fixed value
is contributed to the similarity. For the last case, the feature has no contribution to
the similarity. The proposed measure is extended to gauge the similarity between
two sets of documents. The effectiveness of our measure is evaluated on several
real-world data sets for text classification and clustering problems. The results
show that the performance obtained by the proposed measure is better than that
achieved by other measures.
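The three cases in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the measure as published: the Gaussian form of the scaling, the parameter names `sigma` and `penalty`, the final rescaling to [0, 1], and the convention for two empty documents are all assumptions made for the example.

```python
import math

def feature_contribution(x, y, sigma=1.0, penalty=1.0):
    """Return (contribution, counted) for one feature of two documents.

    sigma and penalty are assumed parameters, not values from the text.
    """
    if x > 0 and y > 0:
        # Case a: the feature appears in both documents. The contribution
        # grows as the difference shrinks, scaled by a Gaussian (assumed).
        return 0.5 * (1.0 + math.exp(-((x - y) / sigma) ** 2)), 1
    if x == 0 and y == 0:
        # Case c: the feature appears in neither document; it is ignored.
        return 0.0, 0
    # Case b: the feature appears in only one document; fixed contribution.
    return -penalty, 1

def document_similarity(d1, d2, sigma=1.0, penalty=1.0):
    """Average the per-feature contributions and rescale to [0, 1]."""
    total, counted = 0.0, 0
    for x, y in zip(d1, d2):
        contribution, n = feature_contribution(x, y, sigma, penalty)
        total += contribution
        counted += n
    if counted == 0:
        return 0.0  # assumed convention: two empty documents share nothing
    raw = total / counted  # lies in [-penalty, 1]
    return (raw + penalty) / (1.0 + penalty)

print(document_similarity([1, 0, 2], [1, 0, 2]))  # identical documents
print(document_similarity([1, 0], [0, 1]))        # no shared features
```

Identical documents score 1.0 and documents with no shared features score 0.0; everything else falls in between.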
Existing System:
• Clustering is one of the most interesting and important topics in data mining.
The aim of clustering is to find intrinsic structures in data and organize
them into meaningful subgroups for further study and analysis.
• The existing system greedily picks the next frequent item set, which represents
the next cluster, to minimize the overlap between the documents that
contain both that item set and some of the remaining item sets.
• In other words, the clustering result depends on the order in which the
item sets are picked, which in turn depends on the greedy heuristic. This method does
not follow a sequential order of selecting clusters.
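The greedy selection described above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: overlap is measured as a plain count of shared documents, ties are broken alphabetically, and the function name and the item-set labels are hypothetical.

```python
def greedy_itemset_clustering(covers):
    """Greedily turn frequent item sets into clusters.

    covers maps an item-set label to the set of document ids containing it.
    Returns a list of (label, documents) pairs in the order they were picked.
    """
    remaining = {label: set(docs) for label, docs in covers.items()}
    clusters = []
    while remaining:
        def overlap(label):
            # Documents of this item set that also fall under another one.
            others = set().union(
                *(docs for other, docs in remaining.items() if other != label)
            )
            return len(remaining[label] & others)

        # Pick the item set with the least overlap (alphabetical tie-break).
        chosen = min(sorted(remaining), key=overlap)
        docs = remaining.pop(chosen)
        clusters.append((chosen, docs))
        # Documents already clustered are removed from the other item sets.
        for other in remaining:
            remaining[other] -= docs
        remaining = {label: d for label, d in remaining.items() if d}
    return clusters

covers = {"apple": {1, 2, 3}, "banana": {3, 4}, "cherry": {5}}
for label, docs in greedy_itemset_clustering(covers):
    print(label, sorted(docs))
```

Document 3 ends up in whichever of its two item sets is picked first, which is the order dependence the last bullet refers to.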
DISADVANTAGES:
• The result is not the same from run to run, since
the resulting clusters depend on the initial random assignments.
• It minimizes intra-cluster variance but does not ensure that the result is a
global minimum of variance.
• The approach has the same problems as k-means: the minimum found is a local minimum,
and the results depend on the initial choice of weights.
• The expectation-maximization algorithm is a more statistically formalized
method which includes some of these ideas: partial membership in classes.
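The dependence on initialization can be seen with plain k-means, which the list uses as its point of comparison. The sketch below is a minimal one-dimensional version written for illustration; the data and the two sets of starting centers are made up for the example.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Run Lloyd's k-means on 1-D points from the given starting centers.

    Returns the final centers and the sum of squared errors (SSE).
    """
    centers = list(centers)
    assignment = None
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        new_assignment = [
            min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Move each center to the mean of its members (empty clusters stay put).
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    sse = sum((p - centers[a]) ** 2 for p, a in zip(points, assignment))
    return centers, sse

points = [0, 1, 2, 10, 11, 12, 20, 21, 22]
good_centers, good_sse = kmeans_1d(points, [1, 11, 21])
bad_centers, bad_sse = kmeans_1d(points, [0, 1, 2])
print(good_centers, good_sse)
print(bad_centers, bad_sse)
```

Both runs converge, but the second start settles in a local minimum with a much larger SSE than the first, so the clusters reported depend on where the algorithm began.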
Proposed System:
• The main work is to develop a novel hierarchical algorithm for document
clustering that provides maximum efficiency and performance. It proposes a
novel way to evaluate similarity between documents and, consequently,
formulates new criterion functions for document clustering.
• Assume that the majority. The purpose of this test is to check how much a
similarity measure coincides with the true class labels.
• It focuses in particular on studying and exploiting the cluster-overlapping
phenomenon to design cluster-merging criteria.
• Experiments on both public data and document clustering data show that this
approach can improve the efficiency of clustering and save computing time.
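One way to picture a merging criterion based on cluster overlap is the Python sketch below. It is an assumption-laden illustration rather than the proposed algorithm: overlap is measured with the Jaccard coefficient, and the `threshold` parameter and function names are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Share of documents common to both clusters, relative to their union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def merge_overlapping(clusters, threshold=0.5):
    """Repeatedly merge the most-overlapping pair of clusters.

    Stops when no pair overlaps by at least `threshold` (assumed parameter).
    """
    clusters = [set(c) for c in clusters]
    while len(clusters) > 1:
        # Find the pair of clusters with the highest overlap.
        best_score, (i, j) = max(
            (jaccard(clusters[i], clusters[j]), (i, j))
            for i, j in combinations(range(len(clusters)), 2)
        )
        if best_score < threshold:
            break
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

result = merge_overlapping([{1, 2, 3}, {2, 3, 4}, {7, 8}])
print(sorted(sorted(c) for c in result))
```

The first two clusters share half of their combined documents and are merged; the third shares nothing with the result and is left alone.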
Hardware Requirements:
Processor Speed : Pentium 4 (above 2 GHz)
RAM : 256 MB
Hard Disk Drive : 40 GB
Software Requirements:
Application Type : Web application
IDE : Microsoft Visual Studio 2010
Database : SQL Server 2008
Coding Language : C#.NET