Data Mining: Mining Associations and Correlations (Presentation Transcript)

  • 1. Mining Associations and Correlations
  • 2. What is Market Basket Analysis?
    Market basket analysis can be performed on the retail data of customer transactions at a store. The results can then be used to plan marketing or advertising strategies, or to design a new catalog. Market basket analysis can also help retailers decide which items to put on sale at reduced prices. For example, if customers tend to purchase computers and printers together, then a sale on printers may encourage the sale of both printers and computers.
  • 3. What is Association rule mining?
    Association rule mining can be viewed as a two-step process:
    1. Find all frequent item-sets: By definition, each of these item-sets will occur at least as frequently as a predetermined minimum support count, min_sup.
    2. Generate strong association rules from the frequent item-sets: By definition, these rules must satisfy minimum support and minimum confidence.
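    The two steps above can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: step 1 uses a plain level-wise (Apriori-style) search, and the function names frequent_itemsets and strong_rules, as well as the sample transactions, are assumptions made for this example. Confidence of a rule A -> B is taken as support_count(A ∪ B) / support_count(A).

```python
from itertools import combinations


def frequent_itemsets(transactions, min_sup):
    """Step 1: find all item-sets whose support count is at least min_sup."""
    transactions = [frozenset(t) for t in transactions]
    # Count the single items first.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_sup}
    frequent = dict(current)
    k = 2
    while current:
        # Join frequent (k-1)-item-sets; keep a candidate only if all of
        # its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in current for s in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Scan the data to count each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_sup}
        frequent.update(current)
        k += 1
    return frequent


def strong_rules(frequent, min_conf):
    """Step 2: generate rules A -> B from frequent item-sets with confidence >= min_conf."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                # Every subset of a frequent item-set is itself frequent,
                # so its support count is already in the dictionary.
                conf = sup / frequent[antecedent]
                if conf >= min_conf:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules


# Hypothetical sample data for illustration only.
baskets = [
    {"computer", "printer"},
    {"computer", "printer", "paper"},
    {"computer"},
    {"printer", "paper"},
]
freq = frequent_itemsets(baskets, min_sup=2)
rules = strong_rules(freq, min_conf=0.9)
```

    Because every rule is built from an item-set that already meets min_sup, the rules returned by step 2 satisfy minimum support automatically; only confidence needs to be checked.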
  • 4. Basis for Pattern Mining
    Frequent pattern mining can be classified according to the following criteria:
    The completeness of patterns to be mined
    The levels of abstraction involved in the rule set
    The number of data dimensions involved in the rule
    The types of values handled in the rule
    The kinds of rules to be mined
    The kinds of patterns to be mined
  • 5. Methods to improve the efficiency of Apriori algorithm for mining
    Hash-based technique
    Hashing item-sets into corresponding buckets
    A hash-based technique can be used to reduce the size of the candidate k-item-sets, C_k, for k > 1.
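    A minimal sketch of the idea for k = 2, assuming items are represented by small integers: while the data is scanned to count single items, every 2-item-set in each transaction is also hashed into a bucket, and a pair whose bucket count falls below min_sup cannot be frequent. The hash function (x * 10 + y) mod n_buckets and the names bucket_of and hash_pruned_c2 are illustrative assumptions.

```python
from itertools import combinations


def bucket_of(pair, n_buckets):
    """Illustrative hash function for a sorted pair of integer item ids."""
    x, y = pair
    return (x * 10 + y) % n_buckets


def hash_pruned_c2(transactions, min_sup, n_buckets=7):
    """Build the candidate 2-item-sets C_2, dropping pairs in low-count buckets."""
    item_counts = {}
    buckets = [0] * n_buckets
    # Single scan: count items and hash every pair into a bucket.
    for t in transactions:
        for item in t:
            item_counts[item] = item_counts.get(item, 0) + 1
        for pair in combinations(sorted(t), 2):
            buckets[bucket_of(pair, n_buckets)] += 1
    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_sup)
    # A pair can only be frequent if its bucket count reaches min_sup.
    return [
        p
        for p in combinations(frequent_items, 2)
        if buckets[bucket_of(p, n_buckets)] >= min_sup
    ]


# Hypothetical sample data for illustration only.
data = [{1, 2}, {1, 2}, {3, 4}, {3, 4}, {1, 3}]
c2 = hash_pruned_c2(data, min_sup=2)
```

    A bucket count is only an upper bound on the support of each pair hashed into it, so a surviving candidate may still turn out to be infrequent when different pairs collide in the same bucket. The candidates must still be counted in the next scan.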
  • 6. Methods to improve the efficiency of Apriori algorithm for mining
    Transaction reduction
    Reducing the number of transactions scanned in future iterations
    A transaction that does not contain any frequent k-item-sets cannot contain any frequent (k + 1)-item-sets.
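    As a minimal sketch, the reduction step can be written as a filter applied after the frequent k-item-sets are known. The function name reduce_transactions and the sample data are assumptions for illustration.

```python
def reduce_transactions(transactions, frequent_k):
    """Keep only transactions that contain at least one frequent k-item-set.

    The dropped transactions cannot contain any frequent (k + 1)-item-set,
    so later scans can skip them.
    """
    return [t for t in transactions if any(s <= t for s in frequent_k)]


# Hypothetical sample data for illustration only.
transactions = [{1, 2, 3}, {1, 4}, {2, 3}, {5}]
frequent_2 = [frozenset({2, 3})]
remaining = reduce_transactions(transactions, frequent_2)
```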
  • 7. Methods to improve the efficiency of Apriori algorithm for mining
    Partitioning
    Partitioning the data to find candidate item-sets
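    A partitioning technique can be used that requires just two database scans to mine the frequent item-sets. It has two phases: in phase I, the data is divided into partitions and the item-sets that are frequent within each partition (the local frequent item-sets) are found; in phase II, a second scan of the full database determines the actual support of these candidates.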
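    The following sketch illustrates the two scans under simplifying assumptions: minimum support is given as a fraction of the transactions, partitions are contiguous slices, and the local mining step is a brute-force enumeration that is only practical for tiny data. The names local_frequent and partition_mine are made up for this example.

```python
import math
from itertools import combinations


def local_frequent(partition, min_count):
    """Brute-force all item-sets with count >= min_count inside one partition."""
    counts = {}
    for t in partition:
        for r in range(1, len(t) + 1):
            for s in combinations(sorted(t), r):
                key = frozenset(s)
                counts[key] = counts.get(key, 0) + 1
    return {s for s, c in counts.items() if c >= min_count}


def partition_mine(transactions, min_sup_ratio, n_parts):
    size = math.ceil(len(transactions) / n_parts)
    parts = [transactions[i:i + size] for i in range(0, len(transactions), size)]
    # Scan 1: the union of local frequent item-sets forms the candidate set.
    # Any globally frequent item-set must be locally frequent in some partition.
    candidates = set()
    for p in parts:
        candidates |= local_frequent(p, math.ceil(min_sup_ratio * len(p)))
    # Scan 2: count the actual support of each candidate over all the data.
    global_min = math.ceil(min_sup_ratio * len(transactions))
    result = {}
    for c in candidates:
        n = sum(1 for t in transactions if c <= set(t))
        if n >= global_min:
            result[c] = n
    return result


# Hypothetical sample data for illustration only.
data = [{1, 2}, {1, 2, 3}, {1, 3}, {2, 3}, {1, 2}, {3}]
found = partition_mine(data, min_sup_ratio=0.5, n_parts=2)
```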
  • 8. Methods to improve the efficiency of Apriori algorithm for mining
    Sampling
    Mining on a subset of the given data
    The basic idea of the sampling approach is to pick a random sample S of the given data D, and then search for frequent item-sets in S instead of D. In this way, we trade off some degree of accuracy against efficiency.
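    A minimal sketch of this trade-off, assuming minimum support is expressed as a fraction and item-sets of at most two items are of interest. The helper names frequent_in and frequent_in_sample, the max_len limit, and the sample data are assumptions for illustration.

```python
import random
from itertools import combinations


def frequent_in(data, min_ratio, max_len=2):
    """Brute-force item-sets (up to max_len items) frequent in the given data."""
    counts = {}
    for t in data:
        for r in range(1, max_len + 1):
            for s in combinations(sorted(t), r):
                key = frozenset(s)
                counts[key] = counts.get(key, 0) + 1
    return {s for s, c in counts.items() if c >= min_ratio * len(data)}


def frequent_in_sample(transactions, min_ratio, sample_size, seed=None):
    """Search a random sample S instead of the full data D."""
    rng = random.Random(seed)
    sample = rng.sample(transactions, sample_size)
    return frequent_in(sample, min_ratio)


# Hypothetical sample data for illustration only.
D = [{1, 2}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}, {1}]
exact = frequent_in(D, 0.5)
approx = frequent_in_sample(D, 0.5, sample_size=3, seed=1)
```

    The result from the sample may miss item-sets that are frequent in D or include item-sets that are not, which is the accuracy cost mentioned above.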
  • 9. Methods to improve the efficiency of Apriori algorithm for mining
    Dynamic item-set counting
    Adding candidate item-sets at different points during a scan
    A dynamic item-set counting technique was proposed in which the database is partitioned into blocks marked by start points.
  • 10. Pruning strategies in data mining
    Item merging: If every transaction containing a frequent item-set X also contains an item-set Y but not any proper superset of Y, then X ∪ Y forms a frequent closed item-set and there is no need to search for any item-set containing X but no Y.
    Sub-item-set pruning: If a frequent item-set X is a proper subset of an already found frequent closed item-set Y and support_count(X) = support_count(Y), then X and all of X’s descendants in the set enumeration tree cannot be frequent closed item-sets and thus can be pruned.
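    The sub-item-set pruning test can be sketched as a simple check against the closed item-sets found so far. The function name can_prune and the dictionary layout (item-set mapped to support count) are assumptions for illustration.

```python
def can_prune(x, x_support, closed_found):
    """Return True if X is a proper subset of an already found closed
    item-set Y with the same support count.

    closed_found maps each found closed item-set (a frozenset) to its
    support count.
    """
    return any(x < y and x_support == sup for y, sup in closed_found.items())


# Hypothetical sample data for illustration only.
closed_found = {frozenset("abc"): 3}
prune_ab = can_prune(frozenset("ab"), 3, closed_found)
```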
  • 11. Pruning strategies in data mining
    Item skipping: In the depth-first mining of closed item-sets, at each level, there will be a prefix item-set X associated with a header table and a projected database. If a local frequent item p has the same support in several header tables at different levels, we can safely prune p from the header tables at higher levels.
  • 12. What is Constraint-Based Association Mining?
    The constraints can include the following:
    Knowledge type constraints: These specify the type of knowledge to be mined, such as association or correlation.
    Data constraints: These specify the set of task-relevant data.
    Dimension/level constraints: These specify the desired dimensions (or attributes) of the data, or levels of the concept hierarchies, to be used in mining.
    Interestingness constraints: These specify thresholds on statistical measures of rule interestingness, such as support, confidence, and correlation.
    Rule constraints: These specify the form of rules to be mined.
  • 13. Metarule-Guided Mining of Association Rules
    Metarules allow users to specify the syntactic form of rules that they are interested in mining. The rule forms can be used as constraints to help improve the efficiency of the mining process.
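    As a rough sketch, a metarule can be treated as a template that candidate rules must match. The representation below is an assumption made for illustration: a rule is a pair (antecedent, consequent), each predicate is a (name, value) tuple, and the template fixes the number of distinct antecedent predicates and the consequent predicate name. The predicate names in the sample data are hypothetical.

```python
def matches_metarule(rule, n_antecedents, consequent_predicate):
    """Check a rule against a template of the form
    P1(X, ...) AND ... AND Pn(X, ...) => consequent_predicate(X, ...)."""
    antecedent, consequent = rule
    names = [name for name, _ in antecedent]
    return (
        len(antecedent) == n_antecedents
        and len(set(names)) == n_antecedents  # predicates must be distinct
        and consequent[0] == consequent_predicate
    )


# Hypothetical rules for illustration only.
rules = [
    ((("age", "30..39"), ("income", "high")), ("buys", "office software")),
    ((("age", "30..39"),), ("buys", "office software")),
    ((("age", "30..39"), ("income", "high")), ("owns", "printer")),
]
selected = [r for r in rules if matches_metarule(r, 2, "buys")]
```

    In a real miner the template would be applied while candidates are generated rather than as a filter afterward, which is where the efficiency gain comes from.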
  • 14. Constraint Pushing or Mining Guided by Rule Constraints
    Rule constraints specify expected set/subset relationships of the variables in the mined rules, constant initiation of variables, and aggregate functions.
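    One way to picture constraint pushing is with an aggregate constraint such as sum(price) <= max_total. Assuming prices are non-negative, an item-set that violates the constraint cannot be repaired by adding items, so the search can stop extending it instead of filtering results afterward. The sketch below shows only this pruning; support counting is left out, and the function name and prices are assumptions for illustration.

```python
def itemsets_within_budget(prices, max_total):
    """Enumerate item-sets whose total price is at most max_total,
    extending only item-sets that already satisfy the constraint."""
    items = sorted(prices)
    results = []

    def extend(prefix, total, start):
        for i in range(start, len(items)):
            new_total = total + prices[items[i]]
            if new_total > max_total:
                # With non-negative prices, no superset of this item-set
                # can satisfy the constraint, so it is never extended.
                continue
            itemset = prefix + [items[i]]
            results.append(tuple(itemset))
            extend(itemset, new_total, i + 1)

    extend([], 0, 0)
    return results


# Hypothetical prices for illustration only.
prices = {"a": 1, "b": 2, "c": 5}
within = itemsets_within_budget(prices, max_total=3)
```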
  • 15. Visit more self-help tutorials
    Pick a tutorial of your choice and browse through it at your own pace.
    The tutorials section is free, self-guiding and will not involve any additional support.
    Visit us at www.dataminingtools.net