M. Saritha, T. Venkata Ramana, Kareemunnisa, Y. Sowjanya, M. Nishantha / International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, www.ijera.com, Vol. 2, Issue 5, September-October 2012, pp. 1342-1351

Attribute Ontological Relational Weights (AORW) for Coherent Association Rules: A Post-Mining Process for Association Rules

M. Saritha (1), T. Venkata Ramana (2), Kareemunnisa (3), Y. Sowjanya (4), M. Nishantha (5)
1 Asst. Professor, Department of Computer Science, Guru Nanak Institutions of Technical Campus, India
2 Professor, Head of the Department of Computer Science, SLC's IET, India
3, 5 Asst. Professors, Department of Information Technology, Guru Nanak Institutions of Technical Campus, India
4 M.Tech student, Department of Computer Science and Engg., SLC's IET, India

ABSTRACT
Association rules are one of the most impressive techniques for analysing attribute associations in a given dataset, with applications in retail, bioinformatics and sociology. In data mining, the importance of rule management in association rule mining is growing rapidly. When datasets are large, the induced rule sets are usually large as well, and this density makes the obtained knowledge hard to understand and analyse. One effective way of minimising the rule set size is to eliminate redundant rules from the rule base. Many efforts have been made and various competent algorithms have been proposed, but all of these models rely either on closed itemset mining or on expert evaluation, and none of them has been proven best in all dataset contexts. The closed itemset model lacks adaptability, and the expert evaluation process assigns different significance to the same rule under different experts' views. To overcome these limits, we propose a post-mining process called AORW as an extension to closed itemset mining algorithms.

Keywords: post mining, association rule mining, closed itemset, PEPP, inference analysis, rule pruning.

1. INTRODUCTION
In general, association rules deliver an efficient method of analysing binary or discretised datasets that are large in volume. One common practice is to determine relationships between binary variables in transaction databases, which is known as market basket analysis. In the case of non-binary data, the data is first coded as binary and association rules are then used for the analysis. Association rules have had a strong impact on the analysis of large binary datasets and are considered a versatile approach for modern applications, from the detection of bio-terrorist attacks [1] and the analysis of gene expression data [2] to the analysis of Irish third-level education applications [4]. The steps involved in a typical association rule analysis are: coding of the data as binary if it is not binary, rule generation, and post-mining. This survey focuses on post-mining. Although associations were first discussed around 1902, a century before today's association rule mining, the absence of items from transactions is still often ignored in analyses.

A. Association rule mining
Given a non-empty set I, an association rule is a statement of the form X ⇒ Y, where X, Y ⊂ I such that X ≠ ∅, Y ≠ ∅, and X ∩ Y = ∅. The set X is called the antecedent of the rule, the set Y is called the consequent of the rule, and we call I the master itemset. Association rules are generated over a large set of transactions, denoted by T. An association rule can be deemed interesting if the items involved occur together often and there are suggestions that one of the sets might in some cases lead to the presence of the other set. Association rules are characterised as interesting, or not, based on mathematical notions called support, confidence and lift. Although a multitude of interestingness measures is now available to the analyst, many of them are still based on these three functions.
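To make the three measures concrete, the following sketch computes support, confidence and lift for a toy rule X ⇒ Y over a handful of transactions; the class, the items and the method names are illustrative only and are not part of the paper.

```java
import java.util.*;

// Illustrative sketch (not from the paper): support, confidence and lift
// of an association rule X => Y over a transaction set T.
public class RuleMeasures {

    // support(Z) = fraction of transactions that contain every item of Z
    static double support(Set<String> z, List<Set<String>> transactions) {
        long hits = transactions.stream().filter(t -> t.containsAll(z)).count();
        return (double) hits / transactions.size();
    }

    public static void main(String[] args) {
        List<Set<String>> t = List.of(
            Set.of("bread", "butter", "milk"),
            Set.of("bread", "butter"),
            Set.of("bread", "jam"),
            Set.of("milk", "jam"));

        Set<String> x = Set.of("bread");            // antecedent X
        Set<String> y = Set.of("butter");           // consequent Y

        Set<String> xy = new HashSet<>(x);
        xy.addAll(y);                               // X union Y

        double suppXY = support(xy, t);             // support(X => Y)
        double conf   = suppXY / support(x, t);     // confidence = supp(X u Y) / supp(X)
        double lift   = conf / support(y, t);       // lift = confidence / supp(Y)

        System.out.printf("support=%.2f confidence=%.2f lift=%.2f%n",
                          suppXY, conf, lift);
    }
}
```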
In many applications, it is not only the presence of items in the antecedent and consequent parts of an association rule that may be of interest. Consideration should, in many cases, also be given to the relationship between the absence of items from the antecedent part and the presence or absence of items from the consequent part. Further, the presence of items in the antecedent part can be related to the absence of items from the consequent part; for example, a rule such as {margarine} ⇒ {not butter} might be referred to as a "replacement rule". One way to incorporate the absence of items into the association rule mining paradigm is to consider rules of the form X ⇒/ Y [20]. Another is to think in terms of negations. Suppose X ⊂ I; then write ¬X to denote the absence, or negation, of the item or items of X from a transaction. Considering X as a binary {0, 1} variable, the presence of the items in X is equivalent to X = 1, while the negation ¬X is equivalent to X = 0.
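A minimal sketch of how such a negated consequent could be counted, assuming ¬Y is read as "the transaction does not contain all items of Y"; the names and the toy data are invented for illustration.

```java
import java.util.*;

// Illustrative sketch: support of a replacement-style rule X => !Y,
// where !Y is assumed to mean "not all items of Y are present".
public class NegationSupport {
    public static void main(String[] args) {
        List<Set<String>> t = List.of(
            Set.of("margarine", "bread"),
            Set.of("margarine", "butter"),
            Set.of("butter", "bread"));

        Set<String> x = Set.of("margarine");
        Set<String> y = Set.of("butter");

        // transactions supporting X => !Y: contain all of X but not all of Y
        long hits = t.stream()
                     .filter(tr -> tr.containsAll(x) && !tr.containsAll(y))
                     .count();
        System.out.println("support(X => !Y) = " + (double) hits / t.size());
    }
}
```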
The concept of considering association rules involving negations, or "negative implications", is due to Silverstein [20].

B. Post mining
Rule pruning and the detection of rule interestingness are employed in the post-mining stage of the association rule mining paradigm. However, there is a host of other techniques used in the post-mining stage that do not naturally fall under either of these headings; some of them are in the area of redundancy removal. There is often a huge number of association rules to contend with in the post-mining stage, and it can be very difficult for the user to identify which ones are interesting. Therefore, it is important to remove insignificant rules, prune redundancy, and do further post-mining on the discovered rules [18, 11]. Liu [11] proposed a technique for pruning and summarising discovered association rules by first removing insignificant associations and then forming direction-setting rules, which provide a kind of summary of the discovered association rules. Lent [19] proposed clustering the discovered association rules.

C. Influences of input formats
The input formats that influence post-mining methodologies are binary data, text data and streaming data. In association rule mining, much recent research has focused on the difficult problem of mining data streams, for example in click-stream analysis, intrusion detection, and web-purchase recommendation systems. In the case of streaming data, it is not possible to perform mining on cached and fixed data records.
Caching the data leads to memory-usage issues, while mining static snapshots leads to poor time complexity, since continuous dataset updates require continuous passes through the dataset. From the data mining point of view, texts are complex data that give rise to interesting challenges. First, texts may be considered weakly structured compared with databases that rely on a predefined schema. Moreover, texts are written in natural language, carrying implicit knowledge and ambiguities; hence the representation of the content of a text is often only partial and possibly noisy. One solution for handling a text, or a collection of texts, in a satisfying way is to take advantage of a knowledge model of the domain of the texts to guide the extraction of knowledge units from them. One of the obvious hot topics of data mining research in the last few years has been rule discovery from binary data: the discovery of sets of attributes from large binary records such that these attributes are true within the same record often enough. It is then easy to derive rules that describe the data, the popular association rules, though the interest of frequent sets goes further.

Based on the proposals [14, 12, 10, 3] recently cited in the literature and their motivations, it is observable that the process of rule pruning opts for one of two models:
(1) rule pruning under a post-mining process that demands the domain expert's observation;
(2) rule pruning under a post-mining process that aims to avoid the domain expert's role in the pruning process.
In the first case, rule pruning accuracy depends on the domain expert's awareness of attribute relations, so it is always advisable to prune the rules under a reliable domain expert's observation. In the second case, the models prune the rules based on dynamically determined attribute relations, which limits the solution to specific data models and hence is not adaptable to all contexts.

The rest of the paper is organised as follows. Section 2 discusses the most frequently cited post-mining models for improving rule accuracy. Section 3 briefs the post-mining process that we adopt. Section 4 briefs the approach of closed itemset mining and the inference approach for itemset pruning. Section 5 explores the process of attribute relational weights analysis for rule pruning. The results discussion and comparative study follow, together with the conclusion and references.

2. RELATED WORK
Huawen Liu [14] proposed a post-processing approach for rule reduction using closed sets, filtering superfluous rules from the knowledge base in a post-processing manner so that the remaining rules can be well discovered by a closed set mining technique. Most of these methods are based on rule structure analysis, where the relations between the rules are analysed using the corresponding problem. This procedure claims to eliminate noise and redundant rules in order to provide users with compact and precise knowledge derived from databases by data mining methods. Further, a fast rule reduction algorithm using closed sets is introduced. Other endeavours have attempted to prune the rule bases directly; the typical cases have been elaborated and illustrated widely in the literature.

A modest number of proposals address pre-pruning and post-pruning. In brief, the pre-pruning operation occurs at the phase of generation of significant rules, while the post-pruning techniques emphasise that the pruning operation occurs after rule generation, among which the rule cover is a representative case. To extract interesting rules, a priori knowledge has also been taken into account in the literature, where a template denotes which attributes should occur in the antecedent or the consequent part of the rule.
An association rule is an implication expression that illustrates a kind of association relation. A rule is said to be interesting, or valid, if its support and confidence meet user-specified minimum support and confidence thresholds respectively. Association rule mining primarily comprises two phases, based on the identification of frequent itemsets from the given data mining context. The problem of massive real-world transaction data can, however, be rounded off by adopting other alternatives, which in turn benefit from a lossless representation of the data. Theoretically, a transaction database and a relational database are two different, inter-transformable representations of the data.

The product of association mining is a rule base with hundreds to millions of rules. In order to highlight the important and key ones, certain other rules have been proposed, such as the second-order rule, which states that if the cover of the itemset is known, then the corresponding relation can easily be derived. All the technical definitions given henceforth deal with the transactions of the data through itemsets in association mining. The equivalence property states that rules and classes of the same hierarchical database carry the same power in their content. Traditional data mining techniques are applied in order to justify the property and its corresponding definition in the specified context. The second-order rules identified in this way can be used to filter useless rules out of the priority rule set.

The effectiveness and efficiency of the classical methods in pruning the rules were thoroughly verified on a 2.8 GHz Pentium PC. Two groups of experiments were conducted: one to prune insignificant association rules and one to remove useless associative classification rules. Removal of non-predictive rules by virtue of an information-gain metric is much like CHARM and CBA, which also work along the same track. To generate association and classification rules with the pruning method of the Apriori software, some external tools are essentially required. The effectiveness of a pruning algorithm can be inversely related to the number of rules; this, along with the computational time consumed, determines an efficient criterion for pruning.

Efficient post-processing methods are hence proposed to remove pointless rules from rule bases by eliminating redundancy among rules. The exploitation of the dependent relation makes this method self-manageable knowledge. The pruning procedure is sliced into three stages, starting from derivation through to the pruning operation on the rule set by the use of closed rule sets. It is cost-effective and consumes very little time per transaction; henceforth, it can be applied to exploit sampling techniques and data structures, thereby increasing the efficiency.

Huawen Liu [14] presented a technique on post-processing for rule reduction using closed sets that targets filtering the otiose rules in the post-processing phase of rule mining. The empirical study proved that the discovery of dependent relations from closed sets helps to eliminate redundant rules.

Hetal Thakkar [12] discussed the case of stream data, where the post-mining of associations is more challenging and continuous post-mining of association rules is an unavoidable requirement. He presented a technique for continuous post-mining of association rules in a data stream management system, and described the architecture and techniques used to achieve this advanced functionality in the Stream Mill Miner (SMM) prototype, an SQL-based DSMS designed to support continuous mining queries, which is impressive.

Hacene Cherfi [10] discussed a post-association-rule-mining approach for text mining that combines data mining, semantic techniques for post-mining, and the selection of association rules. To focus on the result analysis and to find new knowledge units, a classification of association rules according to qualitative criteria, using a domain model as background knowledge, has been introduced. The authors carried out an empirical study on a molecular biology dataset that proved the benefits of taking a knowledge domain model of the data into account.

Ronaldo Cristiano Prati [3] observed that the Receiver Operating Characteristics (ROC) graph is a popular way of assessing the performance of classification rules, but that it is inappropriate for evaluating the quality of association rules, as there is no class in association rule mining and the consequents of different association rules might not have any correlation at all.
Chapter VIII presents QROC, a variation of ROC space, as a novel technique to analyse itemset costs/benefits in association rules; it can be used to help analysts evaluate the relative interestingness of different association rules under different cost scenarios.

3. ATTRIBUTE ONTOLOGICAL RELATIONAL WEIGHTS FOR COHERENT ASSOCIATION RULES
The approach Attribute Ontological Relational Weights, in short AORW, is a post-mining process that prunes rules based on attribute relational relevancy. The AORW framework can be divided into:
- closed itemset mining;
- describing the item class descriptor.
The input to the AORW framework is:
1. a set of rules;
2. an XML descriptor describing attributes, classes, class properties and class relations (a hypothetical example is sketched after the step list below).
In this proposal we use our earlier work to find the closed itemsets.
The process steps involved in the AORW framework are:
1. Initially, AORW measures the property support degree for each attribute involved in a given rule.
2. Using the property support degrees of the attributes, the attribute relation support of each attribute pair of the given rule is measured.
3. With the help of the attribute relation supports of all attribute pairs of an itemset belonging to the given rule, the attribute relation support degree of that itemset is measured.
4. Using the attribute relation support degrees of the left-hand-side and right-hand-side itemsets of the given rule, the relation confidence of the rule is determined.
5. Rules are pruned based on their attribute relation support degrees.
A detailed explanation of each step can be found in Section 5.
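The paper does not show a concrete class descriptor, so the fragment below is only a hypothetical illustration of what such an XML input might contain: attributes grouped under classes, class properties, and class-to-class relations. Every element and attribute name here is invented for illustration.

```xml
<!-- Hypothetical class descriptor; element names are invented, not from the paper -->
<classDescriptor>
  <class name="Dairy">
    <property name="refrigerated"/>
    <property name="perishable"/>
    <attribute name="milk">
      <property name="refrigerated"/>
      <property name="perishable"/>
    </attribute>
    <attribute name="butter">
      <property name="refrigerated"/>
    </attribute>
  </class>
  <class name="Bakery">
    <property name="perishable"/>
    <attribute name="bread">
      <property name="perishable"/>
    </attribute>
  </class>
  <!-- a class-to-class relation that is "true" for the ARS computation -->
  <relation from="Dairy" to="Bakery" value="true"/>
</classDescriptor>
```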
4. CLOSED ITEMSET MINING [34]
A. Dataset adoption and formulation
Item set I: the set of distinct elements from which the sequences are generated, I = {i_1, i_2, ..., i_n}.

Sequence set S: a set of sequences, where each sequence contains elements e that belong to I and for which a predicate p(e) is true. A sequence can be formulated as s = <e_1, e_2, ..., e_m | p(e_i), e_i ∈ I>, representing a sequence s of m ordered items from the set of distinct items I, where p(e_i) denotes a transaction in which the usage of e_i is true. The sequence set is S = {s_1, s_2, ..., s_t}, where t is the total number of sequences (its value is volatile) and s_j is a sequence that belongs to S.

Subsequence: a sequence s_p of sequence set S is considered a subsequence of another sequence s_q of S if all items of s_p belong to s_q as an ordered list; that is, if every item of s_p occurs, in order, in s_q, then s_p ⊆ s_q, where s_p ∈ S and s_q ∈ S.

Total support ts: the occurrence count of a sequence, as an ordered list, over all sequences in the sequence database is adopted as the total support of that sequence. The total support of a sequence s_t can be determined as
f_ts(s_t) = |{ s_p ∈ DBS : s_t ⊆ s_p }|,
where DBS is the set of sequences; f_ts(s_t) represents the total support ts of sequence s_t, i.e. the number of super-sequences of s_t.

Qualified support qs: the coefficient obtained by dividing the total support by the size of the sequence database is adopted as the qualified support,
f_qs(s_t) = f_ts(s_t) / |DBS|.

Sub-sequence and super-sequence: a sequence is a sub-sequence of its next projected sequence if both sequences have the same total support; conversely, a sequence is a super-sequence of the sequence from which it was projected if both have the same total support. This can be formulated as: f_ts(s_t) ≥ rs, where rs is the required support threshold given by the user, and s_t ⊆ s_p for any p for which f_ts(s_t) = f_ts(s_p).
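A small sketch, under the definitions above, of how the total support and qualified support of a sequence could be computed over a sequence database DBS; the class and method names are illustrative, not from [34].

```java
import java.util.*;

// Sketch of the total-support and qualified-support measures of Section 4.A.
// A sequence s_t is counted once for every sequence in DBS that contains it
// as an ordered (not necessarily contiguous) subsequence.
public class SequenceSupport {

    // true if 'sub' occurs in 'seq' as an ordered subsequence
    static boolean isSubsequence(List<String> sub, List<String> seq) {
        int i = 0;
        for (String item : seq) {
            if (i < sub.size() && sub.get(i).equals(item)) i++;
        }
        return i == sub.size();
    }

    // f_ts(s_t): number of super-sequences of s_t in the sequence database DBS
    static int totalSupport(List<String> st, List<List<String>> dbs) {
        int ts = 0;
        for (List<String> sp : dbs) {
            if (isSubsequence(st, sp)) ts++;
        }
        return ts;
    }

    // f_qs(s_t) = f_ts(s_t) / |DBS|
    static double qualifiedSupport(List<String> st, List<List<String>> dbs) {
        return (double) totalSupport(st, dbs) / dbs.size();
    }

    public static void main(String[] args) {
        List<List<String>> dbs = List.of(
            List.of("a", "b", "c"),
            List.of("a", "c"),
            List.of("b", "c"));
        List<String> st = List.of("a", "c");
        System.out.println("ts=" + totalSupport(st, dbs)
                + " qs=" + qualifiedSupport(st, dbs));
    }
}
```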
B. Closed Itemset Discovery [34]
As a first stage of the proposal we perform dataset pre-processing and initialise the itemset database. We find the itemsets with a single element and, in parallel, prune single-element itemsets whose total support is less than the required support.

Forward edge projection: in this phase, we select all itemsets from the given itemset database as input, in parallel, and start projecting edges from each selected itemset to all possible elements. The first iteration includes the pruning process in parallel; from the second iteration onwards this pruning is not required, which we claim is an efficient process compared to similar techniques such as BIDE. In the first iteration, we project an itemset s_p spawned from a selected itemset s_i of DBS and an element e_i of I. If f_ts(s_p) is greater than or equal to rs, then an edge is defined between s_i and e_i. If f_ts(s_i) = f_ts(s_p), then we prune s_i from DBS; this pruning is required in, and limited to, the first iteration only.

From the second iteration onwards we project the itemset s'_p spawned from s_p to each element e_i of I. An edge can be defined between s'_p and e_i if f_ts(s'_p) is greater than or equal to rs. In this description, s_p is the itemset projected in the previous iteration and is eligible as a sequence.
We then apply the following validation to find closed sequences.

Edge pruning: if f_ts(s'_p) = f_ts(s_p) for any projected edge, that edge is pruned; all disjoint graphs except s'_p are considered closed sequences, moved into DBS, and the disjoint graphs are removed from memory. The above process continues for the elements available in memory that are connected through direct or transitive edges and for the projected itemsets, i.e. until the graph becomes empty.

C. Inference Analysis [35]
Inferences: the pattern positive score is the sum of the number of transactions in which all items of the pattern exist and the number of transactions in which none of the items of the pattern exists. The pattern negative score is the number of transactions in which only a few items of the pattern exist. The pattern's actual coverage is the pattern positive score minus the pattern negative score.
Interest gain: the actual coverage of the pattern involved in an association rule.
Coherent rule: the actual coverage of the rule's left-side pattern must be greater than or equal to the actual coverage of its right-side pattern.
Inference support ias: refers to the required actual coverage of a pattern; f_ia(s_t) represents the inference support of the sequence s_t.
Approach: for each pattern s_p of the pattern dataset, if f_ia(s_p) < ias then we prune that pattern.
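The following sketch illustrates the inference-analysis scores described above, assuming the positive score counts the transactions that contain either all or none of the pattern's items; the class name and the toy data are illustrative only.

```java
import java.util.*;

// Sketch of the inference-analysis scores of Section 4.C.
// Positive score: transactions containing ALL items of the pattern plus
// transactions containing NONE of them; negative score: transactions that
// contain only some of the pattern's items; actual coverage = positive - negative.
public class InferenceAnalysis {

    static int actualCoverage(Set<String> pattern, List<Set<String>> transactions) {
        int positive = 0, negative = 0;
        for (Set<String> t : transactions) {
            long present = pattern.stream().filter(t::contains).count();
            if (present == pattern.size() || present == 0) positive++;  // all or none
            else negative++;                                            // only a few
        }
        return positive - negative;
    }

    public static void main(String[] args) {
        List<Set<String>> t = List.of(
            Set.of("a", "b", "c"),
            Set.of("c"),
            Set.of("a", "c"));
        Set<String> pattern = Set.of("a", "b");
        int ias = 1;                                  // inference-support threshold
        int coverage = actualCoverage(pattern, t);
        System.out.println("coverage=" + coverage
                + (coverage < ias ? " -> prune pattern" : " -> keep pattern"));
    }
}
```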
5. ATTRIBUTE ONTOLOGICAL RELATIONAL WEIGHTS FRAMEWORK
The proposed post-mining process, Attribute Ontological Relational Weights, is described in detail here. Table 1 lists the notations used in the AORW framework.

Table 1: Notations used in Attribute Ontological Relational Weights
1. r_lhs: left-side itemset of the rule r.
2. r_rhs: right-side itemset of the rule r.
3. RS: rule set.
4. cpc_c: class property count of class c.
5. apc_a: attribute property count of attribute a.
6. psd: property support degree.
7. tpc_c: total properties of class c.
8. ps_a: property support of attribute a of class c, obtained from apc_a and cpc_c (the number of properties of a that match the properties of c), where a ∈ c.
9. psd_a = ps_a / tpc_c.
10. rs_c: relation support of class c; its maximum threshold value is 1.
11. ARS: attribute relation support.
12. ARSD_i: attribute relation support degree of itemset i.
13. If attributes a_i and a_j belong to the same class c, then ARS(a_i, a_j) = 1, where {a_i, a_j} ⊆ c.
14. If attribute a_i belongs to class c_i, attribute a_j belongs to class c_j, and the relation between c_i and c_j is true, then ARS(a_i, a_j) is obtained from psd_ai and psd_aj, normalised by rs_ci and rs_cj.
15. If the relation between c_i and c_j is false, then ARS(a_i, a_j) = 0.
16. ARS(a_i, a_j) = ARS(a_j, a_i); this holds in all cases, i.e. when both attributes belong to the same class, when they belong to different classes that have no relation, and when they belong to different classes that have a relation.
17. pc: number of attribute pairs in an itemset.
18. pc is the total number of pairs in a given itemset; with n the total number of attributes in the itemset, pc = Σ_{i=1}^{n-1} i, or equivalently pc = n(n-1)/2.
19. pc_rlhs: pair count of the itemset that is the lhs of rule r.
20. pc_rrhs: pair count of the itemset that is the rhs of rule r.
21. pc_{rlhs ∪ rrhs}: pair count of the itemset generated from rlhs ∪ rrhs.
22. PS_i = {p_1, p_2, ..., p_m}: the pair set generated from itemset i.
23. ARS(p_i): attribute relation support of pair p_i.
24. ARSD_i = ( Σ_{k=1}^{|PS_i|} ARS(p_k) ) / |PS_i|: attribute relation support degree of itemset i, where p_k is the kth pair of the pair set PS_i of itemset i.
25. RC_r: relation confidence of rule r.
26. ARSD_rlhs = ( Σ_{k=1}^{|PS_rlhs|} ARS(p_k) ) / |PS_rlhs|: attribute relation support degree of the itemset that is the lhs of rule r, where p_k is the kth pair of the pair set PS_rlhs.
27. ARSD_rrhs = ( Σ_{k=1}^{|PS_rrhs|} ARS(p_k) ) / |PS_rrhs|: attribute relation support degree of the itemset that is the rhs of rule r, where p_k is the kth pair of the pair set PS_rrhs.
28. ARSD_{rlhs ∪ rrhs} = ( Σ_{k=1}^{|PS_{rlhs ∪ rrhs}|} ARS(p_k) ) / |PS_{rlhs ∪ rrhs}|: attribute relation support degree of the itemset generated from rlhs ∪ rrhs, where p_k is the kth pair of its pair set.
29. rc_r = ARSD_{rlhs ∪ rrhs} / ARSD_rlhs: the relation confidence of r is the coefficient that results when the attribute relation support degree of all attributes involved in rule r is divided by the attribute relation support degree of rule r's lhs.

Property support: the number of an attribute's properties that match the properties of the class to which the attribute belongs [Table 1, row 8].
Property support degree: the ratio of matched properties to class-level properties [Table 1, row 9]; for example, psd_a = ps_a / tpc_c, where a is an attribute of class c.
Attribute relation support: indicates the strength of the relation between two attributes of an itemset that are considered as a pair [Table 1, rows 11, 13].
Pair count: the total number of unique two-attribute sets [Table 1, rows 17, 18].
Attribute relation support degree: an itemset-level measurement representing the average relation strength of the attributes belonging to an itemset [Table 1, row 24].
Relation confidence: a rule-level measurement that captures the relation strength between the left-hand-side and right-hand-side itemsets of a given rule [Table 1, row 29].
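A sketch of the Table 1 measures. The in-memory maps stand in for the XML class descriptor, and the cross-class ARS formula assumes the reading (psd_ai + psd_aj) / (rs_ci + rs_cj), since the operators of row 14 are not legible in the source; all identifiers are illustrative.

```java
import java.util.*;

// Sketch of the Table 1 measures (names illustrative). The cross-class ARS
// formula below ASSUMES (psd_ai + psd_aj) / (rs_ci + rs_cj); treat it as one
// plausible reading of row 14, not as the paper's exact formula.
public class AorwMeasures {
    static Map<String, Set<String>> classProps = new HashMap<>();   // class     -> properties
    static Map<String, String> attrClass = new HashMap<>();         // attribute -> class
    static Map<String, Set<String>> attrProps = new HashMap<>();    // attribute -> properties
    static Set<String> relatedClasses = new HashSet<>();            // "c1|c2" pairs with a true relation
    static final double RS_C = 1.0;                                 // relation support of a class (row 10)

    // Row 9: psd_a = matched properties of a / total properties of its class
    static double psd(String a) {
        Set<String> cp = classProps.get(attrClass.get(a));
        long matched = attrProps.get(a).stream().filter(cp::contains).count();
        return (double) matched / cp.size();
    }

    // Rows 13-16: attribute relation support of a pair
    static double ars(String ai, String aj) {
        String ci = attrClass.get(ai), cj = attrClass.get(aj);
        if (ci.equals(cj)) return 1.0;                               // same class
        if (!relatedClasses.contains(ci + "|" + cj) && !relatedClasses.contains(cj + "|" + ci))
            return 0.0;                                              // unrelated classes
        return (psd(ai) + psd(aj)) / (RS_C + RS_C);                  // assumed reading of row 14
    }

    // Row 24: ARSD of an itemset = average ARS over its unique attribute pairs
    static double arsd(List<String> itemset) {
        double sum = 0; int pairs = 0;
        for (int i = 0; i < itemset.size(); i++)
            for (int j = i + 1; j < itemset.size(); j++) { sum += ars(itemset.get(i), itemset.get(j)); pairs++; }
        return pairs == 0 ? 0.0 : sum / pairs;
    }

    // Row 29: relation confidence = ARSD(lhs u rhs) / ARSD(lhs); lhs needs at least two attributes
    static double relationConfidence(List<String> lhs, List<String> rhs) {
        List<String> all = new ArrayList<>(lhs); all.addAll(rhs);
        return arsd(all) / arsd(lhs);
    }

    public static void main(String[] args) {
        classProps.put("Dairy", Set.of("refrigerated", "perishable"));
        classProps.put("Bakery", Set.of("perishable"));
        attrClass.putAll(Map.of("milk", "Dairy", "butter", "Dairy", "bread", "Bakery"));
        attrProps.putAll(Map.of("milk", Set.of("refrigerated", "perishable"),
                                "butter", Set.of("refrigerated"),
                                "bread", Set.of("perishable")));
        relatedClasses.add("Dairy|Bakery");
        System.out.println(relationConfidence(List.of("milk", "butter"), List.of("bread")));
    }
}
```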
AORW algorithm:
Input: rule set RS, class descriptor CD and relation confidence threshold rct.
Output: significant rule set RS', a subset of RS.
Begin
  RS' ← ∅
  While RS is not empty
  Begin
    Read a rule r from RS.
    Find the property support ps_a and property support degree psd_a of each attribute a of r_lhs [Table 1, rows 8, 9].
    Find the property support ps_a and property support degree psd_a of each attribute a of r_rhs [Table 1, rows 8, 9].
    Find the unique two-attribute pair set PS_rlhs from r_lhs [Table 1, row 22].
    Find the attribute relation support ARS_p for each pair p ∈ PS_rlhs [Table 1, rows 13, 14, 15, 16].
    Find the attribute relation support degree ARSD_rlhs of r_lhs [Table 1, rows 24, 26].
    Find the unique two-attribute pair set PS_rrhs from r_rhs [Table 1, row 22].
    Find the attribute relation support ARS_p for each pair p ∈ PS_rrhs [Table 1, rows 13, 14, 15, 16].
    Find the attribute relation support degree ARSD_rrhs of r_rhs [Table 1, rows 24, 27].
    Find the unique two-attribute pair set PS_{rlhs ∪ rrhs} from r_lhs ∪ r_rhs [Table 1, row 22].
    Find the attribute relation support ARS_p for each pair p ∈ PS_{rlhs ∪ rrhs} [Table 1, rows 13, 14, 15, 16].
    Find the attribute relation support degree ARSD_{rlhs ∪ rrhs} of r_lhs ∪ r_rhs [Table 1, rows 24, 28].
    Find the relation confidence rc_r of rule r [Table 1, row 29].
    If rc_r ≥ rct then add rule r to the resultant rule set RS'.
  End
End
Fig 2: Attribute Ontological Relational Weights algorithm.
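A compact sketch of the pruning loop of Fig. 2: every rule whose relation confidence reaches the threshold rct is kept. The confidence function is injected as a parameter so the sketch compiles on its own; in practice it would be the Table 1 computation sketched above, and all names are illustrative.

```java
import java.util.*;
import java.util.function.BiFunction;

// Sketch of the Fig. 2 pruning loop: keep a rule only if its relation
// confidence reaches the threshold rct.
public class AorwPruning {
    record Rule(List<String> lhs, List<String> rhs) {}

    static List<Rule> prune(List<Rule> rs,
                            BiFunction<List<String>, List<String>, Double> relationConfidence,
                            double rct) {
        List<Rule> significant = new ArrayList<>();
        for (Rule r : rs) {                                   // "while RS is not empty"
            double rc = relationConfidence.apply(r.lhs(), r.rhs());
            if (rc >= rct) significant.add(r);                // add r to the resultant rule set
        }
        return significant;
    }

    public static void main(String[] args) {
        List<Rule> rs = List.of(
            new Rule(List.of("milk", "butter"), List.of("bread")),
            new Rule(List.of("milk"), List.of("laptop")));
        // toy confidence function: 1.0 unless the consequent contains "laptop"
        List<Rule> kept = prune(rs, (l, r) -> r.contains("laptop") ? 0.0 : 1.0, 0.5);
        System.out.println(kept.size() + " rule(s) kept");
    }
}
```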
This segment focuses on providing evidence for the claimed assertions: 1) the post-mining framework AORW is competent enough to significantly outperform other post-mining techniques [14, 10, 12]; 2) its memory utilisation and computational complexity are lower than those of other post-mining techniques; and 3) memory exploitation is likely reduced and the hit rate improved with the aid of the edge projection and edge pruning of PEPP together with inference analysis and AORW. On the basis of the observations made, we conclude that the AORW implementation is considerably more noteworthy than the other notable models [10, 12, 14].

Java 1.6 (build 20) was employed for the implementation of AORW along with PEPP under inference analysis. A workstation equipped with a Core 2 Duo processor, 2 GB RAM and Windows XP was used for the investigation of the algorithms. The parallel replica was realised using the thread concept in Java.

Dataset characteristics: Pi is a very dense dataset, which supports mining an enormous number of frequent closed sequences even at a high support threshold close to 90%. It contains 190 protein sequences and 21 distinct items, and it has been used for reviewing the consistency of functional heritage. Fig. 5 depicts the sequence length and number of sequences at different thresholds in the Pi dataset.

In comparison with the other regularly cited models [14, 12, 10], Post-Processing for Rule Reduction Using Closed Set (RRUC) [14] stands out as the strongest post-mining baseline when the main factors of expert involvement, memory consumption and runtime are considered.

Fig 3: A comparison report for runtime.
Fig 4: A comparison report for memory usage.
Fig 5: Sequence length and number of sequences at different thresholds in the Pi dataset.
Fig 6: Rules pruned (in %) by AORW and RRUC [14].

To contrast AORW and RRUC [14], the very dense dataset Pi is used, which has short frequent closed sequences whose end-to-end length is less than 10, even at a high support of around 90%. The comparison shown in Fig. 3 indicates that the two algorithms perform similarly for rules generated at support 90% and above, but when the support is 88% or less, AORW surpasses RRUC [14]. The disparity in memory exploitation between AORW and RRUC [14] can be clearly observed, the consumption of AORW being lower than that of RRUC. Fig. 6 indicates that contextually irrelevant rules are pruned by AORW at a high and stable rate. Apart from the benefits observed, the rules identified by AORW are more relevant to the transaction consequences. This becomes possible because there is no expert evaluation task in the post-mining process; due to the concept of the attribute class relation descriptor, the relation between the attributes involved in a rule is stable.

6. CONCLUSION
We proposed a post-mining process called the attribute relation analysis framework (AORW) for pruning association rules that are contextually irrelevant. In earlier works [10, 12, 14], contextual irrelevancy was identified in various ways, such as (1) rule evaluation by a domain expert and (2) rule evaluation by itemset closeness. We argued that neither of these two models is adequate in all contexts: in the second case, adaptability to various data contexts is missing, and in the first case rule selection is highly influenced by the expert's view, so that when the expert changes, the significance of a rule might be rated differently. To overcome these limits, we proposed a post-mining process as an extension to our earlier closed itemset mining algorithm PEPP with inference analysis [34, 35]. In the proposed post-mining process AORW, one expert's view does not contradict another's; rather, it extends or refines it. This becomes possible in AORW because of the proposed concept of the attribute class relation descriptor. In this work we consider relation confidence as the benchmark for rule pruning; in future, this work can be extended to prune rules by choosing the relation confidence threshold under inference analysis.

REFERENCES
[1] Fienberg, S. E., & Shmueli, G., "Statistical issues and challenges associated with rapid detection of bio-terrorist attacks", Statistics in Medicine, 2005.
[2] Carmona-Saez, P., Chagoyen, M., Rodriguez, A., Trelles, O., Carazo, J. M., & Pascual-Montano, A., "Integrated analysis of gene expression by association rules discovery", BMC Bioinformatics, 2006.
[3] Ronaldo Cristiano Prati, "QROC: A Variation of ROC Space to Analyze Item Set Costs/Benefits in Association Rules", in Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction, IGI Global, 2009, pp. 133-148.
[4] McNicholas, P. D., "Association rule analysis of CAO data (with discussion)", Journal of the Statistical and Social Inquiry Society of Ireland, 2007.
[5] Prati, R., & Flach, P., "ROCCER: An algorithm for rule learning based on ROC analysis", Proceedings of the 19th International Joint Conference on Artificial Intelligence, 2005.
[6] Fawcett, T., "PRIE: A system to generate rulelists to maximize ROC performance", Data Mining and Knowledge Discovery Journal, 17(2), pp. 207-224, 2008.
[7] Kawano, H., & Kawahara, M., "Extended Association Algorithm Based on ROC Analysis for Visual Information Navigator", Lecture Notes in Computer Science, 2281, pp. 640-649, 2002.
[8] Piatetsky-Shapiro, G., Frawley, W. J., Brin, S., Motwani, R., Ullman, J. D., et al., "Discovery, Analysis, and Presentation of Strong Rules", Proceedings of the 11th International Symposium on Applied Stochastic Models and Data Analysis (ASMDA), 16, pp. 191-200, 2005.
[9] Liu, B., Hsu, W., & Ma, Y., "Pruning and summarizing the discovered associations", Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 125-134, 1999.
[10] Hacene Cherfi, Amedeo Napoli, & Yannick Toussaint, "A Conformity Measure Using Background Knowledge for Association Rules: Application to Text Mining", in Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction, IGI Global, 2009, pp. 100-115.
[11] Liu, B., Hsu, W., & Ma, Y., "Pruning and summarizing the discovered associations", Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1999.
[12] Hetal Thakkar, Barzan Mozafari, & Carlo Zaniolo, "Continuous Post-Mining of Association Rules in a Data Stream Management System", in Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction, IGI Global, 2009, pp. 116-132.
[13] Kuntz, P., Guillet, F., Lehn, R., & Briand, H., "A User-Driven Process for Mining Association Rules", in D. Zighed, H. Komorowski, & J. Zytkow (Eds.), 4th European Conference on Principles of Data Mining and Knowledge Discovery, 2000.
[14] Huawen Liu, Jigui Sun, & Huijie Zhang, "Post-Processing for Rule Reduction Using Closed Set", in Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction, IGI Global, 2009, pp. 81-99.
[15] Jaroszewicz, S., & Simovici, D. A., "Interestingness of Frequent Itemsets Using Bayesian Networks as Background Knowledge", ACM SIGKDD Conference on Knowledge Discovery in Databases, 2004.
[16] Jaroszewicz, S., & Scheffer, T., "Fast Discovery of Unexpected Patterns in Data, Relative to a Bayesian Network", ACM SIGKDD Conference on Knowledge Discovery in Databases, 2005.
[17] Faure, C., Delprat, D., Boulicaut, J. F., & Mille, A., "Iterative Bayesian Network Implementation by Using Annotated Association Rules", 15th International Conference on Knowledge Engineering and Knowledge Management – Managing Knowledge in a World of Networks, 2006.
[18] Liu, B., & Hsu, W., "Post-Analysis of Learned Rules", AAAI/IAAI, 1996.
[19] Lent, B., Swami, A. N., & Widom, J., "Clustering Association Rules", Proceedings of the Thirteenth International Conference on Data Engineering, Birmingham, UK, IEEE Computer Society, 1997.
[20] Savasere, A., Omiecinski, E., & Navathe, S. B., "Mining for Strong Negative Associations in a Large Database of Customer Transactions", Proceedings of the 14th International Conference on Data Engineering, Washington DC, USA, 1998.
[21] Basu, S., Mooney, R. J., Pasupuleti, K. V., & Ghosh, J., "Evaluating the Novelty of Text-Mined Rules Using Lexical Knowledge", 7th ACM SIGKDD International Conference on Knowledge Discovery in Databases, 2001.
[22] Claudia Marinica and Fabrice Guillet, "Knowledge-Based Interactive Postmining of Association Rules Using Ontologies", IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 6, June 2010.
[23] M. J. Zaki and C. J. Hsiao, "CHARM: An Efficient Algorithm for Closed Itemset Mining", Proc. Second SIAM International Conference on Data Mining, pp. 34-43, 2002.
[24] J. Pei, J. Han, and R. Mao, "CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets", Proc. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 21-30, 2000.
[25] M. J. Zaki and M. Ogihara, "Theoretical Foundations of Association Rules", Proc. Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD '98), pp. 1-8, June 1998.
[26] D. Burdick, M. Calimlim, J. Flannick, J. Gehrke, and T. Yiu, "MAFIA: A Maximal Frequent Itemset Algorithm", IEEE Trans. Knowledge and Data Engineering, vol. 17, no. 11, pp. 1490-1504, Nov. 2005.
[27] J. Li, "On Optimal Rule Discovery", IEEE Trans. Knowledge and Data Engineering, vol. 18, no. 4, pp. 460-471, Apr. 2006.
[28] M. Hahsler, C. Buchta, and K. Hornik, "Selective Association Rule Generation", Computational Statistics, vol. 23, no. 2, pp. 303-315, Kluwer Academic Publishers, 2008.
[29] H. Toivonen, M. Klemettinen, P. Ronkainen, K. Hatonen, and H. Mannila, "Pruning and Grouping of Discovered Association Rules", Proc. ECML-95 Workshop on Statistics, Machine Learning, and Knowledge Discovery in Databases, pp. 47-52, 1995.
[30] R. J. Bayardo and R. Agrawal, "Mining the Most Interesting Rules", Proc. ACM SIGKDD, pp. 145-154, 1999.
[31] M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivonen, and A. I. Verkamo, "Finding Interesting Rules from Large Sets of Discovered Association Rules", Proc. International Conference on Information and Knowledge Management (CIKM), pp. 401-407, 1994.
[32] R. Srikant and R. Agrawal, "Mining Generalized Association Rules", Proc. 21st International Conference on Very Large Databases, pp. 407-419, http://citeseer.ist.psu.edu/srikant95mining.html, 1995.
[33] Barzan Mozafari, Hetal Thakkar, and Carlo Zaniolo, "Verifying and Mining Frequent Patterns from Large Windows over Data Streams", Proceedings of the 24th International Conference on Data Engineering (ICDE 2008), Cancún, México, April 7-12, 2008.
[34] Kalli Srinivasa Nageswara Prasad and S. Ramakrishna, "Parallel Edge Projection and Pruning (PEPP) Based Sequence Graph Protrude Approach for Closed Itemset Mining", International Journal of Computer Science and Information Security, vol. 9, no. 9, September 2011.
[35] Kalli Srinivasa Nageswara Prasad and S. Ramakrishna, "Mining Closed Itemsets for Coherent Rules: An Inference Analysis Approach", Global Journal of Computer Science and Technology, vol. 11, issue 19, version 1.0, November 2011, Global Journals Inc. (USA), Online ISSN: 0975-4172, Print ISSN: 0975-4350.

SHORT BIO DATA FOR THE AUTHORS
Ms. M. Saritha is working as an Assistant Professor in the Department of Computer Science and Engineering, Guru Nanak Institutions of Technical Campus, with a teaching experience of 1.5 years. She holds a B.Tech (ECE) and an M.Tech in Computer Science, and her areas of interest include mobile computing, data mining and information security.
Professor Thaduri Venkata Ramana received the Master's degree in Computer Science and Engineering from Osmania University, Hyderabad, and is pursuing a Ph.D. in Computer Science and Engineering at Jawaharlal Nehru Technological University, Hyderabad (JNTUH). His research interests are information retrieval and data mining. He is presently working as Professor and Head, Department of Computer Science and Engineering, SLC's Institute of Engineering and Technology, Hyderabad.
Ms. Kareemunnisa is working as an Assistant Professor in the Department of Computer Science and Engineering, Guru Nanak Institutions of Technical Campus, with a teaching experience of 6 years. She holds a B.Tech (IT) and an M.Tech in Computer Science and Engineering (CSE), and her areas of interest include mobile computing, data mining and information security.
Ms. Y. Sowjanya received a Bachelor's degree in Information Technology from Jawaharlal Nehru Technological University, Hyderabad (JNTUH) and is pursuing a Master's degree in Computer Science and Engineering (CSE) at JNTUH. Her research interests include computer networks, information retrieval systems and data mining.
Ms. M. Nishantha is working as an Assistant Professor in the Department of Computer Science and Engineering, Guru Nanak Institutions of Technical Campus, with a teaching experience of 6 years. She holds a B.Tech (IT) and an M.Tech in Computer Science and Engineering (IT), and her areas of interest include mobile computing, data mining and information security.
