GLOBALSOFT TECHNOLOGIES 
IEEE PROJECTS & SOFTWARE DEVELOPMENTS 
IEEE FINAL YEAR PROJECTS|IEEE ENGINEERING PROJECTS|IEEE STUDENTS PROJECTS|IEEE 
BULK PROJECTS|BE/BTECH/ME/MTECH/MS/MCA PROJECTS|CSE/IT/ECE/EEE PROJECTS 
CELL: +91 98495 39085, +91 99662 35788, +91 98495 57908, +91 97014 40401 
Visit: www.finalyearprojects.org Mail to: ieeefinalsemprojects@gmail.com 
A Novel Model for Mining Association Rules from Semantic 
Web Data 
Abstract 
The number of ontologies and semantic annotations for data across a 
broad range of applications is constantly growing. This complex and 
heterogeneous semantic data has created new challenges for data 
mining research. Association rule mining is one of the most common 
data mining techniques; it can be defined as extracting interesting 
relations from a large number of transactions. Because this technique 
depends heavily on how the data is represented, it is arguably the 
most challenging data mining technique to apply to Semantic Web 
data. At the same time, Semantic Web technologies offer ways to 
capture and efficiently use domain knowledge. In this paper, we 
propose a novel method that addresses these challenges: it enables 
processing of huge volumes of semantic data, performs association 
rule discovery, stores the newly discovered semantic rules using the 
semantic richness of the concepts in the ontology, and applies 
semantic technologies during all phases of the mining process. 
Existing System 
• The topic coverage of TREC profiles was limited. The TREC user 
profiles had good precision but relatively poor recall. 
• Using web documents as training sets has one severe drawback: web 
information contains much noise and uncertainty. 
• As a result, the web user profiles were satisfactory in terms of 
recall but weak in terms of precision. This model generated no 
negative training set. 
• Semantically annotated data does not have a rigid structure. As a 
result, structural heterogeneity problems arise. Moreover, 
traditional data mining algorithms work with homogeneous datasets 
made up of transactions, each of which is a subset of items. 
• The problem of mining association rules is to discover all rules 
whose support and confidence are greater than the user-specified 
minimum support and minimum confidence, respectively.
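
The support and confidence thresholds mentioned in the last point can be illustrated with a minimal Python sketch. It is not the method proposed in the paper: the function names, the toy transactions, and the restriction to single-item rules are assumptions made only for illustration, and a rule is kept here when it meets or exceeds both thresholds.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / base


def mine_pair_rules(transactions, min_support, min_confidence):
    """Return rules {lhs} -> {rhs} that meet both thresholds.

    Only single-item antecedents and consequents are considered,
    which keeps the sketch short (an illustrative simplification).
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support({lhs, rhs}, transactions)
            c = confidence({lhs}, {rhs}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules


# Hypothetical toy data: each transaction is a set of items.
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

if __name__ == "__main__":
    for lhs, rhs, s, c in mine_pair_rules(TRANSACTIONS, 0.5, 0.9):
        print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

With these toy transactions, only the rule butter -> bread passes a minimum support of 0.5 and a minimum confidence of 0.9.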
Disadvantages 
• Even though using the IIS can significantly alleviate both the local 
interface schema inadequacy problem and the inconsistent label 
problem, it cannot solve them completely. 
• For the first problem, it is still possible that some attributes of 
the underlying entities do not appear in any local interface; as a 
result, such attributes will not appear in the IIS. 
• For the second problem, if one or more of the annotations are not 
local attribute names in the attribute mapping table for this domain, 
then using the IIS cannot solve the problem and new techniques are 
needed.
Proposed System 
• Each annotator exploits one type of feature for annotation. Our 
experimental results show that each annotator is useful and that 
together they are capable of generating high-quality annotations. 
• A large portion of the deep web is database-based; that is, for many 
search engines, the data encoded in the returned result pages comes 
from underlying structured databases. Such search engines are often 
referred to as Web databases (WDBs). A typical result page returned 
from a WDB contains multiple search result records (SRRs). 
• Temporal constraints, specifically nonsequenced semantics, are 
specified in the temporal data dictionary as metadata. 
• Our proposed approach provides a mechanism to represent telic/atelic 
temporal semantics using temporal annotations.
• Using the IIS has two major advantages. First, it has the potential 
to increase annotation recall: because the IIS contains the 
attributes of all the local interface schemas (LISs), there is a 
better chance that an attribute discovered from the returned results 
has a matching attribute in the IIS even though it has no matching 
attribute in the LIS. 
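
The recall argument in the last point can be sketched in Python as follows. The sketch assumes, purely for illustration, that the IIS is a plain union of the attribute names of all LISs in a domain; the site names, attribute names, and helper functions are hypothetical, and a real integration step would also have to reconcile differently named attributes.

```python
def build_iis(local_schemas):
    """Union the attribute names of every local interface schema (LIS)."""
    iis = set()
    for attributes in local_schemas.values():
        iis |= set(attributes)
    return iis


def can_match(attribute, site, local_schemas, iis):
    """Return (matches the site's own LIS, matches the IIS)."""
    return attribute in local_schemas[site], attribute in iis


# Hypothetical LISs for two book-search sites in the same domain.
LOCAL_SCHEMAS = {
    "site_a": {"title", "author"},
    "site_b": {"title", "publisher", "price"},
}
IIS = build_iis(LOCAL_SCHEMAS)

if __name__ == "__main__":
    # "price" is missing from site_a's LIS but is present in the IIS.
    print(can_match("price", "site_a", LOCAL_SCHEMAS, IIS))
    # "isbn" appears in no local interface, so the IIS cannot help.
    print(can_match("isbn", "site_a", LOCAL_SCHEMAS, IIS))
```

The second call shows the limitation noted under the disadvantages: an attribute that appears in no local interface is also absent from the IIS.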
Advantages 
• One advantage of this model is its high flexibility: when an 
existing annotator is modified or a new annotator is added, all 
we need is to obtain the applicability and success rate of the 
new or revised annotator, while all remaining annotators stay 
unchanged. 
• We propose a clustering-based shifting technique that aligns 
data units into groups so that the data units in the same group 
have the same semantics, instead of using only the DOM tree or 
other HTML tag tree structures of the SRRs to align the data 
units.
• We also employ a probabilistic model to combine the results 
from different annotators into a single label. 
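
One way such a combination could work is sketched below in Python. The text above does not give the formula, so the sketch makes an assumption: annotators err independently, and the probability that a label is correct is one minus the product of the failure rates of the annotators that proposed it (a noisy-OR combination). The function name and the sample success rates are hypothetical.

```python
def combine_labels(candidates):
    """Pick a single label from several annotators' proposals.

    `candidates` is a list of (label, success_rate) pairs, one per
    annotator that was applicable and produced a label. Assuming
    independent annotators, the score of a label is
    1 - product(1 - success_rate) over the annotators proposing it.
    Returns (best_label, score), or (None, 0.0) if there are no proposals.
    """
    if not candidates:
        return None, 0.0
    miss = {}  # label -> probability that every proposing annotator is wrong
    for label, rate in candidates:
        miss[label] = miss.get(label, 1.0) * (1.0 - rate)
    scores = {label: 1.0 - m for label, m in miss.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # Two annotators agree on "Title"; a third proposes "Author".
    proposals = [("Title", 0.8), ("Title", 0.5), ("Author", 0.85)]
    label, score = combine_labels(proposals)
    print(label, round(score, 2))
```

This also reflects the flexibility noted in the first point: adding or revising an annotator only changes its own (label, success rate) entry, and the other annotators' entries stay as they are.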
Hardware Requirements 
SYSTEM : Pentium IV 2.4 GHz 
HARD DISK : 40 GB 
RAM : 256 MB 
Software Requirements 
Operating system : Windows XP Professional 
IDE : Microsoft Visual Studio .Net 2008 
Database : SQL Server 2005 
Coding Language : C#.NET
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association rules from semantic web data
