Efficient algorithms for mining the concise and lossless representation of high utility itemsets

•

0 likes•546 views

ieeepondy

Efficient Algorithms for Mining the Concise and Lossless Representation
of Closed+ High Utility Itemsets
Abstract:
Mining high utility itemsets (HUIs) from databases is an important data
mining task, which refers to the discovery of itemsets with high utilities
(e.g. high profits). However, it may present too many HUIs to users, which
also degrades the efficiency of the mining process. To achieve high
efficiency for the mining task and provide a concise mining result to users,
we propose a novel framework in this paper for mining closed+ high utility
itemsets (CHUIs), which serves as a compact and lossless representation of
HUIs. We propose three efficient algorithms named AprioriCH (Apriori-
based algorithm for mining High utility Closed+ itemsets), AprioriHC-D
(AprioriHC algorithm with Discarding unpromising and isolated items)
and CHUD (Closed+ High Utility itemset Discovery) to find this
representation. Further, a method called DAHU (Derive All High Utility
itemsets) is proposed to recover all HUIs from the set of CHUIs without
accessing the original database. Results on real and synthetic datasets show
that the proposed algorithms are very efficient and that our approaches

achieve a massive reduction in the number of HUIs. In addition, when all
HUIs can be recovered by DAHU, the combination of CHUD and DAHU
outperforms the state-of-the-art algorithms for mining HUIs.
Existing System:
Frequent itemset mining (FIM) is a fundamental research topic in data
mining. One of its popular applications is market basket analysis, which
refers to the discovery of sets of items (itemsets) that are frequently
purchased together by customers.
However, in this application, the traditional model of FIM may discover a
large amount of frequent but low revenue itemsets and lose the
information on valuable itemsets having low selling frequencies.
These problems are caused by the facts that (1) FIM treats all items as
having the same importance/unit profit/weight and (2) it assumes that
every item in a transaction appears in a binary form, i.e., an item can be
either present or absent in a transaction, which does not indicate its
purchase quantity in the transaction.
Hence, FIM cannot satisfy the requirement of users who desire to discover
itemsets with high utilities such as high profits.
Proposed System:

HUI mining is not an easy task since the downward closure property in
FIM does not hold in utility mining. In other words, the search space for
mining HUIs cannot be directly reduced as it is done in FIM because a
superset of a low utility itemset can be a high utility itemset.
Were proposed for mining HUIs, but they often present a large number of
high utility itemsets to users. A very large number of high utility itemsets
makes it difficult for the users to comprehend the results. It may also cause
the algorithms to become inefficient in terms of time and memory
requirement, or even run out of memory.
It is widely recognized that the more high utility itemsets the algorithms
generate, the more processing they consume. The performance of the
mining task decreases greatly for low minimum utility thresholds or when
dealing with dense databases.
Hardware Requirements:
• System : Pentium IV 2.4 GHz.
• Hard Disk : 40 GB.
• Floppy Drive : 1.44 Mb.
• Monitor : 15 VGA Colour.
• Mouse : Logitech.
• RAM : 256 Mb.
Software Requirements:

• Operating system : - Windows XP.
• Front End : - JSP
• Back End : - SQL Server
Software Requirements:
• Operating system : - Windows XP.
• Front End : - .Net
• Back End : - SQL Server

Similar to Efficient algorithms for mining the concise and lossless representation of high utility itemsets

IRJET- Classification of Pattern Storage System and Analysis of Online Shoppi...IRJET Journal

Discovering High Utility Item Sets to Achieve Lossless Mining using Apriori A...rahulmonikasharma

International Journal of Engineering Research and Development (IJERD)IJERD Editor

A Fuzzy Algorithm for Mining High Utility Rare Itemsets – FHURIidescitation

Efficient algorithms for mining top k high utility item setsShakas Technologies

A New Data Stream Mining Algorithm for Interestingness-rich Association RulesVenu Madhav

Improved Map reduce Framework using High Utility Transactional DatabasesInternational Journal of Engineering Inventions www.ijeijournal.com

Mining High Utility Patterns in Large Databases using Mapreduce FrameworkIRJET Journal

Study on Positive and Negative Rule Based Mining Techniques for E-Commerce Ap...Association of Scientists, Developers and Faculties

Ijcatr04051008Editor IJCATR

REVIEW: Frequent Pattern Mining TechniquesEditor IJMTER

H044063843IJERA Editor

Hardware enhanced association rule miningStudsPlanet.com

Comparison Between High Utility Frequent Item sets Mining Techniquesijsrd.com

Tulipp_H2020_Hipeac'17 Conference_PEPGUM Workshop_January 017Tulipp. Eu

DAIS19: On the Performance of ARM TrustZoneLEGATO project

Mining high utility itemsets in data streams based on the weighted sliding wi...IJDKP

2014 IEEE JAVA DATA MINING PROJECT Secure mining of association rules in hori...IEEEMEMTECHSTUDENTSPROJECTS

IEEE 2014 JAVA DATA MINING PROJECTS Secure mining of association rules in hor...IEEEFINALYEARSTUDENTPROJECTS

How to get access to World Largest AI super Computer to do Advanced AI researchGanesan Narayanasamy

Similar to Efficient algorithms for mining the concise and lossless representation of high utility itemsets (20)

IRJET- Classification of Pattern Storage System and Analysis of Online Shoppi...

Discovering High Utility Item Sets to Achieve Lossless Mining using Apriori A...

International Journal of Engineering Research and Development (IJERD)

A Fuzzy Algorithm for Mining High Utility Rare Itemsets – FHURI

Efficient algorithms for mining top k high utility item sets

A New Data Stream Mining Algorithm for Interestingness-rich Association Rules

Improved Map reduce Framework using High Utility Transactional Databases

Mining High Utility Patterns in Large Databases using Mapreduce Framework

Study on Positive and Negative Rule Based Mining Techniques for E-Commerce Ap...

Ijcatr04051008

REVIEW: Frequent Pattern Mining Techniques

H044063843

Hardware enhanced association rule mining

Comparison Between High Utility Frequent Item sets Mining Techniques

Tulipp_H2020_Hipeac'17 Conference_PEPGUM Workshop_January 017

DAIS19: On the Performance of ARM TrustZone

Mining high utility itemsets in data streams based on the weighted sliding wi...

2014 IEEE JAVA DATA MINING PROJECT Secure mining of association rules in hori...

IEEE 2014 JAVA DATA MINING PROJECTS Secure mining of association rules in hor...

How to get access to World Largest AI super Computer to do Advanced AI research

More from ieeepondy

Demand aware network function placementieeepondy

Service description in the nfv revolution trends, challenges and a way forwardieeepondy

Secure optimization computation outsourcing in cloud computing a case study o...ieeepondy

Spatial related traffic sign inspection for inventory purposes using mobile l...ieeepondy

Standards for hybrid cloudsieeepondy

Rfhoc a random forest approach to auto-tuning hadoop's configurationieeepondy

Resource and instance hour minimization for deadline constrained dag applicat...ieeepondy

Reliable and confidential cloud storage with efficient data forwarding functi...ieeepondy

Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...ieeepondy

Scalable cloud–sensor architecture for the internet of thingsieeepondy

Scalable algorithms for nearest neighbor joins on big trajectory dataieeepondy

Robust workload and energy management for sustainable data centersieeepondy

Privacy preserving deep computation model on cloud for big data feature learningieeepondy

Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...ieeepondy

Protection of big data privacyieeepondy

Power optimization with bler constraint for wireless fronthauls in c ranieeepondy

Performance aware cloud resource allocation via fitness-enabled auctionieeepondy

Performance limitations of a text search application running in cloud instancesieeepondy

Performance analysis and optimal cooperative cluster size for randomly distri...ieeepondy

Predictive control for energy aware consolidation in cloud datacentersieeepondy

More from ieeepondy (20)

Demand aware network function placement

Service description in the nfv revolution trends, challenges and a way forward

Secure optimization computation outsourcing in cloud computing a case study o...

Spatial related traffic sign inspection for inventory purposes using mobile l...

Standards for hybrid clouds

Rfhoc a random forest approach to auto-tuning hadoop's configuration

Resource and instance hour minimization for deadline constrained dag applicat...

Reliable and confidential cloud storage with efficient data forwarding functi...

Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...

Scalable cloud–sensor architecture for the internet of things

Scalable algorithms for nearest neighbor joins on big trajectory data

Robust workload and energy management for sustainable data centers

Privacy preserving deep computation model on cloud for big data feature learning

Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...

Protection of big data privacy

Power optimization with bler constraint for wireless fronthauls in c ran

Performance aware cloud resource allocation via fitness-enabled auction

Performance limitations of a text search application running in cloud instances

Performance analysis and optimal cooperative cluster size for randomly distri...

Predictive control for energy aware consolidation in cloud datacenters

Efficient algorithms for mining the concise and lossless representation of high utility itemsets

1. Efficient Algorithms for Mining the Concise and Lossless Representation of Closed+ High Utility Itemsets Abstract: Mining high utility itemsets (HUIs) from databases is an important data mining task, which refers to the discovery of itemsets with high utilities (e.g. high profits). However, it may present too many HUIs to users, which also degrades the efficiency of the mining process. To achieve high efficiency for the mining task and provide a concise mining result to users, we propose a novel framework in this paper for mining closed+ high utility itemsets (CHUIs), which serves as a compact and lossless representation of HUIs. We propose three efficient algorithms named AprioriCH (Apriori- based algorithm for mining High utility Closed+ itemsets), AprioriHC-D (AprioriHC algorithm with Discarding unpromising and isolated items) and CHUD (Closed+ High Utility itemset Discovery) to find this representation. Further, a method called DAHU (Derive All High Utility itemsets) is proposed to recover all HUIs from the set of CHUIs without accessing the original database. Results on real and synthetic datasets show that the proposed algorithms are very efficient and that our approaches

2. achieve a massive reduction in the number of HUIs. In addition, when all HUIs can be recovered by DAHU, the combination of CHUD and DAHU outperforms the state-of-the-art algorithms for mining HUIs. Existing System: Frequent itemset mining (FIM) is a fundamental research topic in data mining. One of its popular applications is market basket analysis, which refers to the discovery of sets of items (itemsets) that are frequently purchased together by customers. However, in this application, the traditional model of FIM may discover a large amount of frequent but low revenue itemsets and lose the information on valuable itemsets having low selling frequencies. These problems are caused by the facts that (1) FIM treats all items as having the same importance/unit profit/weight and (2) it assumes that every item in a transaction appears in a binary form, i.e., an item can be either present or absent in a transaction, which does not indicate its purchase quantity in the transaction. Hence, FIM cannot satisfy the requirement of users who desire to discover itemsets with high utilities such as high profits. Proposed System:

3. HUI mining is not an easy task since the downward closure property in FIM does not hold in utility mining. In other words, the search space for mining HUIs cannot be directly reduced as it is done in FIM because a superset of a low utility itemset can be a high utility itemset. Were proposed for mining HUIs, but they often present a large number of high utility itemsets to users. A very large number of high utility itemsets makes it difficult for the users to comprehend the results. It may also cause the algorithms to become inefficient in terms of time and memory requirement, or even run out of memory. It is widely recognized that the more high utility itemsets the algorithms generate, the more processing they consume. The performance of the mining task decreases greatly for low minimum utility thresholds or when dealing with dense databases. Hardware Requirements: • System : Pentium IV 2.4 GHz. • Hard Disk : 40 GB. • Floppy Drive : 1.44 Mb. • Monitor : 15 VGA Colour. • Mouse : Logitech. • RAM : 256 Mb. Software Requirements:

4. • Operating system : - Windows XP. • Front End : - JSP • Back End : - SQL Server Software Requirements: • Operating system : - Windows XP. • Front End : - .Net • Back End : - SQL Server

Efficient algorithms for mining the concise and lossless representation of high utility itemsets

Recommended

Recommended

More Related Content

Similar to Efficient algorithms for mining the concise and lossless representation of high utility itemsets

Similar to Efficient algorithms for mining the concise and lossless representation of high utility itemsets (20)

More from ieeepondy

More from ieeepondy (20)

Efficient algorithms for mining the concise and lossless representation of high utility itemsets