The document presents an algorithm for sanitizing a database to hide sensitive patterns while minimizing changes to the original data. It identifies the sensitive transactions that contain the restrictive patterns to be hidden and sorts them by degree and size. It then repeatedly selects the item with the maximum cover across the restrictive patterns and removes it from sensitive transactions, reducing the patterns' support. This process iterates until the support of every restrictive pattern is reduced to 0. The sanitized database combines the modified sensitive transactions with the unmodified non-sensitive transactions. The algorithm is tested on sample databases to evaluate its effectiveness and its minimal impact on the original data.
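A minimal sketch of the sanitization loop just described, assuming transactions are sets of items and restrictive patterns are itemsets; the sorting step is omitted and all names are illustrative, not taken from the paper.

```python
# Hedged sketch of the max-cover sanitization loop described above.
# Data layout (transactions as sets of items) and names are illustrative.

def sanitize(transactions, restrictive_patterns):
    """Remove items from sensitive transactions until no transaction
    fully contains any restrictive pattern (support driven to 0)."""
    patterns = [frozenset(p) for p in restrictive_patterns]
    sanitized = [set(t) for t in transactions]
    for t in sanitized:
        # Patterns this transaction still supports.
        covered = [p for p in patterns if p <= t]
        while covered:
            # Pick the item covering the most restrictive patterns.
            items = {i for p in covered for i in p}
            victim = max(items, key=lambda i: sum(i in p for p in covered))
            t.discard(victim)
            covered = [p for p in patterns if p <= t]
    return sanitized

db = [{"a", "b", "c"}, {"a", "c"}, {"b", "d"}]
print(sanitize(db, [{"a", "c"}]))  # "a" and "c" no longer co-occur anywhere
```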
Output Privacy Protection With Pattern-Based Heuristic Algorithm (ijcsit)
Privacy Preserving Data Mining (PPDM) is an ongoing research area aimed at bridging the gap between collaborative data mining and data confidentiality. Of the many approaches that have been adopted for PPDM, the rule hiding approach is used in this article. This approach ensures output privacy by preventing the mined patterns (itemsets) from being exposed to malicious inference. An efficient algorithm named the Pattern-based Maxcover Algorithm is proposed along with experimental results. The algorithm minimizes the dissimilarity between the source and the released database; moreover, the protected patterns cannot be retrieved from the released database by an adversary or counterpart, even with an arbitrarily low support threshold.
The document proposes an algorithm called MSApriori_VDB for efficiently mining rare association rules from transactional databases. It first converts the transaction database to a vertical data format to reduce the number of scans. It then uses a multiple minimum support framework where each item is assigned a minimum item support based on its frequency. The algorithm generates candidate itemsets, calculates their support, and prunes uninteresting itemsets to identify interesting rare associations with high confidence. Experimental results show the algorithm outperforms previous approaches in memory usage and runtime.
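An illustrative sketch of two ideas from the summary above: converting a transaction database to vertical (item to tid-set) format, and assigning each item a minimum item support (MIS) from its frequency. The names and the MIS rule shown (a fraction of the item's own support with a floor) are assumptions in the spirit of multiple-minimum-support frameworks, not the paper's exact formulation.

```python
from collections import defaultdict

def to_vertical(transactions):
    # Vertical layout: each item maps to the set of transaction ids (tids)
    # containing it, so support counting becomes set intersection.
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return tidsets

def assign_mis(tidsets, n, beta=0.5, floor=0.05):
    # MIS(i) = max(beta * sup(i), floor); beta and floor are illustrative.
    return {i: max(beta * len(t) / n, floor) for i, t in tidsets.items()}

db = [["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
v = to_vertical(db)
print(v["a"])                 # {0, 1, 3}
print(assign_mis(v, len(db)))
```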
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nanotechnology & Science, Power Electronics, Electronics & Communication Engineering, Computational Mathematics, Image Processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
A Survey on Features and Techniques Description for Privacy of Sensitive Info... (IRJET Journal)
This document summarizes techniques for preserving privacy when mining sensitive data. It discusses threats to privacy from data mining like identity disclosure and attribute disclosure. It then describes several techniques for modifying data to prevent privacy leaks, including data perturbation, suppression, swapping, and noise addition. The document reviews related work applying these techniques and analyzes privacy threats. It concludes that further research is needed to develop effective methods for anomaly detection while addressing design issues for privacy-preserving data mining.
Postdiffset Algorithm in Rare Pattern: An Implementation via Benchmark Case S... (IJECEIAES)
Frequent and infrequent itemset mining are active areas in data mining. The Association Rule (AR) patterns generated help decision makers and business policy makers project the next intended items across a wide variety of applications. While frequent itemsets concern the items most often purchased or used, infrequent (rare) items are those that occur only rarely. AR mining remains one of the most prominent areas in data mining, aiming to extract interesting correlations, patterns, associations, or causal structures among sets of items in transaction databases or other data repositories. Association rule mining algorithms are designed around either horizontal or vertical data formats, and both formats have been widely discussed with example algorithms of each. Horizontal-format approaches suffer from huge candidate generation and multiple database scans, resulting in higher memory consumption; vertical approaches were proposed to overcome this. One established vertical-format algorithm is Eclat (Equivalence Class Transformation), notable for its fast tidset intersection. In this paper, we analyze the fundamental Eclat algorithm and its variants such as diffset and sortdiffset. Continuing this line of Eclat extensions, we propose the Postdiffset algorithm, a new Eclat variant that uses the tidset format in the first loop and the diffset format in later loops. We present the performance of the Postdiffset algorithm prior to its use in mining infrequent or rare itemsets. Postdiffset outperforms diffset and sortdiffset by 23% and 84% on the mushroom dataset, and by 94% and 99% on the retail dataset.
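A small sketch of the tidset and diffset bookkeeping this abstract refers to. For a prefix P extended by item X, the diffset is d(PX) = t(P) - t(X) and sup(PX) = sup(P) - |d(PX)|, so short diffsets can replace long tidsets. The switch point between the two representations (tidset in the first loop, diffset later) is the Postdiffset idea; this toy code only shows that the two representations agree.

```python
t_P = {1, 2, 3, 5, 8}        # tidset of prefix P
t_X = {1, 2, 5, 9}           # tidset of item X

t_PX = t_P & t_X             # tidset approach: intersect
d_PX = t_P - t_X             # diffset approach: difference from the prefix

sup_PX_tidset = len(t_PX)
sup_PX_diffset = len(t_P) - len(d_PX)
assert sup_PX_tidset == sup_PX_diffset  # both give support 3
```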
NEW ALGORITHM FOR SENSITIVE RULE HIDING USING DATA DISTORTION TECHNIQUE (cscpconf)
Data mining is the process of extracting hidden patterns from data. Association rule mining is an important data mining task that finds interesting associations among a large set of data items. It may disclose patterns and various kinds of sensitive information, which should be protected against unauthorized access. Association rule hiding is one of the privacy preserving data mining techniques used to protect the rules generated by association rule mining. This paper adopts a data distortion technique for hiding sensitive association rules. Algorithms based on this technique either hide a specific rule using data alteration or hide rules depending on the sensitivity of the items to be hidden. In the proposed technique, the positions of sensitive items are altered while maintaining their support. The technique uses the idea of representative rules to prune the rules first and then hides the sensitive rules.
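One hedged reading of "positions of sensitive items are altered while maintaining the support": remove the sensitive item from transactions that support the sensitive rule and re-insert it into transactions that do not, so the item's overall frequency is unchanged while the rule's support falls. This is a plausible illustration, not the paper's code.

```python
def relocate(transactions, rule_items, sensitive_item):
    supporting = [t for t in transactions if rule_items <= t]
    hosts = [t for t in transactions
             if sensitive_item not in t and not rule_items <= t]
    for src, dst in zip(supporting, hosts):
        src.discard(sensitive_item)   # break the rule here...
        dst.add(sensitive_item)       # ...preserve item support there

db = [{"x", "y"}, {"x", "y"}, {"z"}, {"w"}]
relocate(db, {"x", "y"}, "y")
print(db)  # "y" appears as often as before, but never together with "x"
```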
Analysis of Pattern Transformation Algorithms for Sensitive Knowledge Protect... (IOSR Journals)
The document analyzes pattern transformation algorithms for sensitive knowledge protection in data mining. It discusses:
1) Three main privacy preserving techniques - heuristic, cryptography, and reconstruction-based. The proposed algorithms use heuristic-based techniques.
2) Four proposed heuristic-based algorithms - item-based Maxcover (IMA), pattern-based Maxcover (PMA), transaction-based Maxcover (TMA), and Sensitivity Cost Sanitization (SCS) - that modify sensitive transactions to decrease support of restrictive patterns.
3) Performance improvements including parallel and incremental approaches to handle large, dynamic databases while balancing privacy and utility.
A Survey on Identification of Closed Frequent Item Sets Using Intersecting Al... (IOSR Journals)
This document summarizes research on using an intersection approach to identify closed frequent item sets from transactional data. It discusses how existing intersection algorithms enumerate and intersect candidate transaction sets or use a cumulative scheme with a repository that new transactions are intersected with. The document also reviews research on reducing the size of prefix trees used to store candidate sets as the number of transactions increases. It aims to draw attention to the intersection approach as a less researched area that could be improved to effectively identify closed frequent item sets from large transactional datasets.
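A sketch of the cumulative intersection scheme the survey describes: keep a repository of itemsets and intersect every new transaction with each stored set, adding the resulting intersections (the candidate closed itemsets) back in. Illustrative only; real implementations use prefix trees rather than a flat set of frozensets.

```python
def closed_candidates(transactions):
    repo = set()
    for t in transactions:
        t = frozenset(t)
        repo |= {t & s for s in repo}  # intersect with everything seen
        repo.add(t)
    return repo

db = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}]
for c in sorted(closed_candidates(db), key=len, reverse=True):
    print(set(c))  # every closed itemset is an intersection of transactions
```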
Comparative study of frequent item set in data mining (ijpla)
In this paper, we present an overview of existing frequent item set mining algorithms. Frequent item set mining is very popular these days, but it is a computationally expensive task. We describe the different processes used for item set mining, and we compare the different concepts and algorithms used to generate frequent item sets. From all the frequent item set mining algorithms that have been developed, we compare the important ones and analyze their run time performance.
Classification on multi label dataset using rule mining technique (eSAT Publishing House)
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
One of the most important problems in modern finance is finding efficient ways to summarize and visualize stock market data, giving individuals and institutions useful information about market behavior for investment decisions. Investment can be considered one of the fundamental pillars of a national economy, so at present many investors look for criteria to compare stocks and select the best ones, choosing strategies that maximize the earning value of the investment process. The enormous amount of valuable data generated by the stock market has attracted researchers to explore this problem domain using different methodologies, and research in data mining has gained high attention due to the importance of its applications and the ever-increasing generation of information. Data mining tools such as association rules, rule induction methods, and the Apriori algorithm are used to find associations between different scripts of the stock market, and much research and development has addressed the reasons for fluctuations in the Indian stock exchange. Nowadays two important factors, gold prices and US dollar prices, dominate the Indian stock market; statistical correlation is used to find the relationship between gold prices, dollar prices, and the BSE index, which helps the activities of stock operators, brokers, investors, and jobbers. These activities are based on forecasting the fluctuation of index share prices, gold prices, dollar prices, and customer transactions. Hence the researcher has considered these problems as a topic for research.
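The abstract above leans on plain statistical correlation between gold prices, dollar prices, and a stock index. A minimal sketch with made-up numbers (the series here are illustrative placeholders, not market data):

```python
import numpy as np

gold   = np.array([1850, 1870, 1860, 1905, 1930], dtype=float)
dollar = np.array([74.2, 74.8, 74.5, 75.1, 75.6], dtype=float)
bse    = np.array([52100, 51800, 52000, 51200, 50900], dtype=float)

# Pairwise Pearson correlation matrix: rows/columns are gold, dollar, BSE.
r = np.corrcoef([gold, dollar, bse])
print(r.round(2))
```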
The document summarizes a novel approach for privacy preserving data mining on continuous and discrete data sets. The approach converts original sample data sets into a group of "unreal" data sets from which the original samples cannot be reconstructed. However, an accurate decision tree can still be built directly from the unreal data sets. This protects privacy while maintaining data mining utility. The approach determines information entropy and generates a decision tree using the unreal data sets and a perturbing set, without reconstructing the original samples.
This document proposes a new approach for preserving sensitive data privacy when clustering data. It involves adding noise to numeric attributes in the data using a fuzzy membership function, which distorts the data while maintaining the original clusters. The fuzzy membership function uses an S-shaped curve to map original attribute values to modified values. Clustering is then performed on the distorted data. This approach aims to preserve privacy while reducing processing time compared to other privacy-preserving methods like cryptographic techniques, data swapping, and noise addition.
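A sketch of the standard S-shaped membership function the summary mentions, applied here to distort a numeric attribute. The distortion rule (scaling each value by its membership grade) is an assumption for illustration, not the paper's exact mapping.

```python
import numpy as np

def smf(x, a, b):
    """Standard S-shaped membership function rising from 0 at a to 1 at b."""
    m = (a + b) / 2.0
    return np.where(x <= a, 0.0,
           np.where(x <= m, 2 * ((x - a) / (b - a)) ** 2,
           np.where(x <= b, 1 - 2 * ((x - b) / (b - a)) ** 2, 1.0)))

ages = np.array([21.0, 35.0, 48.0, 62.0])
grades = smf(ages, a=ages.min(), b=ages.max())
distorted = ages * (0.8 + 0.4 * grades)   # membership-driven perturbation
print(distorted)
```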
This document summarizes a research paper that proposes a multidimensional data mining algorithm to determine association rules across different granularities. The algorithm addresses weaknesses in existing techniques, such as having to rescan the entire database when new attributes are added. It uses a concept taxonomy structure to represent the search space and finds association patterns by selecting concepts from individual taxonomies. An experiment on a wholesale business dataset demonstrates that the algorithm is linear and highly scalable to the number of records and can flexibly handle different data types.
In this paper, we present a literature survey of existing frequent item set mining algorithms. The concept of frequent item set mining is also discussed in brief. The working procedure of some modern frequent item set mining techniques is given. Also the merits and demerits of each method are described. It is found that the frequent item set mining is still a burning research topic.
This document describes a proposed modified cluster-based fuzzy-genetic data mining algorithm. The algorithm aims to mine both association rules and membership functions from quantitative transaction data. It uses a genetic algorithm approach that represents each set of membership functions as a chromosome. Chromosomes are clustered using a modified k-means approach to reduce computational costs. The representative chromosome of each cluster is used to calculate fitness values. Offspring are produced through genetic operators and selected through roulette wheel selection. The algorithm iterates until obtaining a set of membership functions with high fitness. These are then used to mine multilevel fuzzy association rules from the transaction data. The algorithm is illustrated through a simple example involving transaction data containing purchases of items like milk, bread, etc.
Enhancement techniques for data warehouse staging area (IJDKP)
This document discusses techniques for enhancing the performance of data warehouse staging areas. It proposes two algorithms: 1) A semantics-based extraction algorithm that reduces extraction time by pruning useless data using semantic information. 2) A semantics-based transformation algorithm that similarly aims to reduce transformation time. It also explores three scheduling techniques (FIFO, minimum cost, round robin) for loading data into the data warehouse and experimentally evaluates their performance. The goal is to enhance each stage of the ETL process to maximize overall performance.
A Quantified Approach for large Dataset Compression in Association Mining (IOSR Journals)
Abstract: With the rapid development of computer and information technology over the last several decades, enormous amounts of data in science and engineering are continuously generated at massive scale; data compression is needed to reduce cost and storage space. Compression, together with discovering association rules by identifying relationships among sets of items in a transaction database, is an important problem in data mining. Finding frequent itemsets is computationally the most expensive step in association rule discovery and has therefore attracted significant research attention. However, existing compression algorithms are not appropriate for large data sets in data mining. In this research a new approach is described in which the original dataset is sorted in lexicographical order and a desired number of groups are formed to generate quantification tables. These quantification tables are used to generate the compressed dataset, yielding a more efficient algorithm for mining complete frequent itemsets from the compressed dataset. The experimental results show that the proposed algorithm performs better than the mining merge algorithm across different supports and execution times.
Keywords: Apriori Algorithm, mining merge Algorithm, quantification table
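A hedged sketch of the pipeline this abstract outlines: sort the dataset lexicographically, split it into a desired number of groups, and build a per-group quantification table (item to occurrence count). How the paper derives the compressed dataset from these tables is not reproduced here.

```python
from collections import Counter

def quantification_tables(transactions, n_groups):
    data = sorted(sorted(t) for t in transactions)   # lexicographic order
    size = -(-len(data) // n_groups)                 # ceiling division
    groups = [data[i:i + size] for i in range(0, len(data), size)]
    return [Counter(i for t in g for i in t) for g in groups]

db = [["b", "c"], ["a", "b"], ["a", "c"], ["a", "b", "c"]]
for table in quantification_tables(db, n_groups=2):
    print(dict(table))   # item counts per group
```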
GeneticMax: An Efficient Approach to Mining Maximal Frequent Itemsets Based o... (ITIIIndustries)
This paper presents a new approach based on genetic algorithms (GAs) to generate maximal frequent itemsets (MFIs) from large datasets. The new algorithm, GeneticMax, is a heuristic that mimics natural selection to find MFIs efficiently. Its search strategy uses a lexicographic tree that avoids level-by-level searching, which reduces the time required to mine the MFIs in a linear way. Our implementation of the search strategy includes a bitmap representation of the nodes in the lexicographic tree and identifies frequent itemsets (FIs) from superset-subset relationships between nodes. The algorithm uses the principles of GAs to perform global searches, and its time complexity is lower than that of many other algorithms since it uses a non-deterministic approach. We separate the effect of each step of the algorithm through experimental analysis on real datasets such as Tic-Tac-Toe, Zoo, and a 10000×8 dataset. Our experimental results show that this approach is efficient and scalable for different sizes of itemsets. It accesses the underlying dataset to calculate support values for a smaller number of nodes when finding the FIs, even when the search space is very large, dramatically reducing the search time. The proposed algorithm shows how an evolutionary method can be used on real datasets to find all the MFIs efficiently.
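The summary mentions a bitmap representation of tree nodes; the core trick it enables is cheap support counting, where each item is a bit vector over transactions and an itemset's support is the popcount of the AND of its items' vectors. A toy version with Python integers (illustrative, not GeneticMax itself):

```python
db = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
items = {i for t in db for i in t}

# Bit k of bitmap[i] is set iff transaction k contains item i.
bitmap = {i: sum(1 << k for k, t in enumerate(db) if i in t) for i in items}

def support(itemset):
    v = ~0
    for i in itemset:
        v &= bitmap[i]                      # AND the item bit vectors
    return bin(v & ((1 << len(db)) - 1)).count("1")

print(support({"a", "b"}))  # 2
```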
The premise of this paper is to discover frequent patterns through the use of data grids in the WEKA 3.8 environment. Workload imbalance occurs due to the dynamic nature of grid computing, hence data grids are used for the creation and validation of data. Association rules are used to extract useful information from the large database. In this paper the researcher generates the best rules using WEKA 3.8 for better performance; WEKA 3.8 is used to obtain the best rules and to implement the various algorithms.
Top Down Approach to find Maximal Frequent Item Sets using Subset Creation (cscpconf)
Association rule mining has been an area of active research in the field of knowledge discovery. Data mining researchers have improved the quality of association rule mining for business development by incorporating influential factors such as value (utility), quantity of items sold (weight), and more into the mining of association patterns. In this paper, we propose an efficient approach that finds maximal frequent item sets first. Most algorithms in the literature find minimal frequent items first and then derive the maximal frequent item sets from them; these methods consume more time. To overcome this problem, we propose a novel approach that finds maximal frequent item sets directly using the concept of subsets. The proposed method is found to be efficient in finding maximal frequent item sets.
A Survey on Fuzzy Association Rule Mining Methodologies (IOSR Journals)
Abstract: Fuzzy association rule mining (Fuzzy ARM) uses fuzzy logic to generate interesting association rules. These association relationships can help in decision making for the solution of a given problem. Fuzzy ARM is a variant of classical association rule mining, which uses the concept of crisp sets and for that reason suffers several drawbacks; the concept of fuzzy association rule mining was introduced to overcome them. Today a huge number of different types of fuzzy association rule mining algorithms appear in research works, and these algorithms keep getting better; but as the problem domain also becomes more complex in nature, continuous research work is still going on. In this paper, we have studied several well-known methodologies and algorithms for fuzzy association rule mining. Four important methodologies are briefly discussed, showing the recent trends and future scope of research in the field of fuzzy association rule mining.
Keywords: Knowledge discovery in databases, Data mining, Fuzzy association rule mining, Classical association rule mining, Very large datasets, Minimum support, Cardinality, Certainty factor, Redundant rule, Equivalence, Equivalent rules
This document presents the results of an XRF analysis of 23 bentonite samples collected from different parts of Jharkhand, India. The analysis was conducted to determine the chemical composition and theoretical molecular formula of the samples. Key findings include:
- The chemical composition of most samples is comparable to bentonite from Rajmahal hills and literature values, indicating good quality bentonite.
- Composition varied across samples but most were high in silica (50.93-59.87%) and had silica to alumina ratios consistent with bentonite.
- One sample had an unusually low alumina content and high silica to alumina ratio, suggesting poorer quality.
- The theoretical molecular formulas derived were consistent with
1. The document discusses new skills required for testers as the field of software testing evolves. It notes changes like shifting from quality assurance to operational assurance and ever-changing requirements.
2. It recommends testers develop skills like understanding multiple languages to communicate effectively with different teams. It also suggests studying epistemology to improve testing strategies and ability to recognize mistakes.
3. Epistemology is the study of evidence and reasoning, and how knowledge is acquired and justified. Studying it can help testers determine when enough testing has been done and construct defensible test reports.
Implementation of FC-TCR for Reactive Power Control (IOSR Journals)
This document discusses the implementation of a Fixed Capacitor Thyristor Controlled Reactor (FC-TCR) system for reactive power control. FC-TCR is a type of Static VAR Compensator (SVC) that can inject or absorb reactive power to control voltage. It consists of a fixed capacitor in parallel with a thyristor controlled reactor. The reactor current is controlled by varying the firing angle of thyristors, allowing both lagging and leading reactive power. MATLAB simulation results show that reactive power output from the FC-TCR increases as the reactor inductance increases while keeping the capacitor constant, demonstrating effective reactive power control.
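For orientation on how firing-angle control works, the standard fundamental-frequency susceptance of a thyristor controlled reactor is the textbook expression below (a general result, not taken from this document; here α is the firing angle measured from the voltage zero crossing and L the reactor inductance):

$$ B_{\mathrm{TCR}}(\alpha) \;=\; \frac{2(\pi - \alpha) + \sin 2\alpha}{\pi\,\omega L}, \qquad \frac{\pi}{2} \le \alpha \le \pi $$

At α = π/2 the reactor conducts fully (B = 1/ωL); as α approaches π the absorbed lagging reactive power falls smoothly to zero, while the fixed capacitor supplies a constant leading component.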
MDSR to Reduce Link Breakage Routing Overhead in MANET Using PRM (IOSR Journals)
This document proposes a modification to the Dynamic Source Routing (DSR) protocol called Modified DSR (MDSR) to reduce routing overhead caused by frequent link breakages in mobile ad hoc networks. MDSR adds a link breakage prediction algorithm that uses signal strength measurements to predict when a link may break. Intermediate nodes monitor signal strength and warn the source node if a link may soon break. This allows the source to proactively rebuild the route or switch to a backup route to avoid disconnection. Simulation results showed MDSR can reduce the number of dropped packets by at least 25% compared to standard DSR. The document also discusses how DSR works and the proposed proactive route maintenance concept in M
A New Filtering Method and a Novel Converter Transformer for HVDC System (IOSR Journals)
This document presents a new filtering method and converter transformer design for HVDC systems. The new design aims to address issues with traditional converter transformers and passive filtering methods, such as additional harmonic losses and difficulties meeting insulation requirements.
The new converter transformer uses a prolonged-delta winding configuration and phase shifts of 15 degrees to provide 12-phase commutation voltages. It also employs an inductive filtering mechanism where a tap connects the prolonged and common windings to an LC resonance circuit. This allows harmonic currents to balance out so no inductive harmonics flow in the primary winding.
Simulation results show the new design greatly reduces harmonic content and transformer losses compared to traditional designs. The primary current waveform has lower distortion and THD with the
Improving security for data migration in cloud computing using randomized enc... (IOSR Journals)
1) The document proposes an encryption technique using randomization to improve security for data migration in cloud computing. It aims to address major security issues in cloud data migration like confidentiality, integrity, reliability and data security.
2) The proposed method uses a random key to encrypt data, and then encrypts the random key with a shared key before transmission, as sketched after this list. This adds an extra layer of security by obscuring the actual encryption key.
3) It is concluded that the randomized encryption technique makes it difficult for attackers to analyze encrypted texts and determine if they correspond to the same plaintext, improving security over existing methods for cloud data migration.
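A sketch of the two-layer scheme described above using the widely available `cryptography` package: a fresh random key encrypts the data, and a pre-shared key wraps that random key for transmission. This illustrates the idea; the paper's exact cipher choices may differ.

```python
from cryptography.fernet import Fernet

shared_key = Fernet.generate_key()          # agreed out of band

# Sender side
data_key = Fernet.generate_key()            # fresh random key per message
ciphertext = Fernet(data_key).encrypt(b"records to migrate")
wrapped_key = Fernet(shared_key).encrypt(data_key)

# Receiver side
recovered_key = Fernet(shared_key).decrypt(wrapped_key)
plaintext = Fernet(recovered_key).decrypt(ciphertext)
assert plaintext == b"records to migrate"
```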
Solar Based Stand Alone High Performance Interleaved Boost Converter with Zvs... (IOSR Journals)
This document summarizes a research paper on a solar-based interleaved boost converter with zero-voltage switching and zero-current switching. The converter uses two boost converters connected in parallel with a phase shift to reduce ripple and improve efficiency. Soft-switching techniques are used to reduce switching losses. Simulation results show the converter maintains a constant output voltage while the induction motor output varies with time, and PWM signals control the switches. The converter achieves a power factor of 0.93 and performs efficiently for power conversion from solar panels.
The document discusses compressive wideband power spectrum analysis for EEG signals using FastICA and neural networks. It first provides background on EEG signals and how they are measured. It then describes using FastICA to extract independent components from EEG signals related to detecting epileptic seizures. The independent components are then used to train a backpropagation neural network for effective detection of epileptic seizures. The proposed method involves preprocessing EEG signals, performing spectral estimation using FastICA, and classifying brain activity patterns using the neural network.
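A hedged sketch of the pipeline this summary describes, using scikit-learn stand-ins: FastICA to unmix multichannel EEG into independent components, then a backpropagation network (MLPClassifier) trained on them. The random arrays and labels are placeholders for real EEG windows and seizure annotations.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 8))        # 200 windows x 8 EEG channels
y = rng.integers(0, 2, 200)              # 1 = seizure window (dummy labels)

ica = FastICA(n_components=4, random_state=0)
components = ica.fit_transform(X)        # independent components per window

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
clf.fit(components, y)
print(clf.score(components, y))
```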
The document summarizes a study on the effects of sowing date and crop spacing on growth, yield attributes, and quality of sesame. The study found that sowing early in the second fortnight of February and using a rectangular spacing of 45 x 15 cm resulted in superior performance of the sesame variety KS 95010. This combination led to taller plants, higher leaf area index, more branches, capsules, and seeds per plant. It also resulted in higher test weight, seed yield of 908 kg/ha, net income of 19,801 rupees, and a benefit-cost ratio of 3.09. The optimal plant density and spacing of 45 x 15 cm allowed for better resource utilization and maxim
Social Networking Websites and Image Privacy (IOSR Journals)
The document discusses privacy issues related to social networking websites. It begins by providing background on social networking sites and how they allow users to construct profiles, connect with other users, and share content. However, it notes that a lack of awareness and proper privacy tools means users' personal data is at risk.
It then proposes several new privacy policies and describes their implementation in a social networking site built with PHP. These include an "Album Privacy Policy" that allows customizing access permissions for specific albums and photos, and an "Image Protection Policy" that prevents other users from copying or downloading protected images without permission. The goal is to provide users more flexible privacy controls over their data.
This document describes the development and validation of a stability-indicating high-performance thin layer chromatography (HPTLC) method for the analysis of modafinil, both as a bulk drug and in tablet formulations. The method utilizes silica gel plates with an ethyl acetate, acetone and methanol mobile phase. Modafinil demonstrates good linearity, precision, accuracy and robustness within the method validation parameters. The method is also shown to distinguish modafinil from its degradation products formed under various stress conditions like acid and base hydrolysis, oxidation, photolysis and heat. The developed HPTLC method can be applied for the quantitative analysis and identification of modafinil in pharmaceutical formulations.
The document provides an overview of steganography, including:
1) Steganography is the technique of hiding secret information within a cover file such that the existence of the secret information is concealed. It aims for invisible communication.
2) The main components of a steganographic system are the secret message, cover file, stego file, key, and the embedding and extracting methods; a minimal concrete sketch follows this list.
3) Steganography differs from cryptography in that it does not alter the structure of the secret message and aims to conceal the very existence of communication, whereas cryptography scrambles messages and is known to transmit encrypted messages.
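The components above in their simplest concrete form: least-significant-bit (LSB) embedding into a byte sequence. LSB is a standard textbook technique chosen here for illustration; the document itself surveys steganography generally.

```python
def embed(cover: bytes, secret: bytes) -> bytes:
    bits = [(byte >> i) & 1 for byte in secret for i in range(8)]
    assert len(bits) <= len(cover), "cover too small"
    stego = bytearray(cover)
    for k, bit in enumerate(bits):
        stego[k] = (stego[k] & 0xFE) | bit   # overwrite the LSB
    return bytes(stego)

def extract(stego: bytes, n_bytes: int) -> bytes:
    bits = [b & 1 for b in stego[: 8 * n_bytes]]
    return bytes(sum(bits[8 * j + i] << i for i in range(8))
                 for j in range(n_bytes))

cover = bytes(range(64))
stego = embed(cover, b"hi")
print(extract(stego, 2))  # b'hi'
```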
An Optimal Approach to derive Disjunctive Positive and Negative Rules from As... (IOSR Journals)
This document discusses an optimal approach to derive disjunctive positive and negative association rules from association rule mining using a genetic algorithm. It aims to address some shortfalls of conventional algorithms like Apriori by supporting disjunctive rules, using multiple minimum support thresholds, and effectively identifying negative rules. The proposed approach uses a modified FP-Growth algorithm and genetic algorithm to generate conjunctive and disjunctive positive and negative rules in an optimized manner by reducing candidate generation time and capturing useful rare item relationships.
Adjustment of Cost 231 Hata Path Model For Cellular Transmission in Rivers StateIOSR Journals
The document presents an adjustment of the COST 231 Hata path loss model for predicting radio signal propagation in Rivers State, Nigeria. Field measurements of received signal strength were taken in urban, suburban, and rural areas and compared to the COST 231 Hata model, Stanford University Interim model, and ECC-33 model. The COST 231 Hata model gave better predictions but with high error values outside acceptable ranges. The COST 231 Hata model was then adjusted using a linear least squares algorithm based on the field measurements. The adjusted COST 231 Hata model provided better predictions with minimum error within acceptable values and can accurately predict radio characteristics in Rivers State.
A Music Visual Interface via Emotion Detection SupervisorIOSR Journals
This document describes a music recommendation system that detects a user's emotion from their facial expressions using a webcam. It uses principal component analysis (PCA) to extract features from facial images to classify emotions like happy, sad, normal, and surprised. The system then recommends songs that match the detected emotion by considering song relevance and potential to influence the user's mood. It aims to automatically generate playlists suited to the user's current emotional state without requiring manual emotion input. The document discusses PCA in detail and evaluates the emotion detection accuracy of the proposed system.
A Novel Rebroadcast Technique for Reducing Routing Overhead In Mobile Ad Hoc ...IOSR Journals
This document presents a novel rebroadcast technique called Neighbor Coverage based Probabilistic Rebroadcast (NCPR) protocol to reduce routing overhead in mobile ad hoc networks. The NCPR protocol calculates a rebroadcast delay based on the number of common neighbors between nodes to prioritize dissemination of neighbor information. It also calculates a rebroadcast probability based on additional neighbor coverage ratio and connectivity factor to reduce unnecessary rebroadcasts while maintaining network connectivity. The protocol is implemented by enhancing the AODV routing protocol in NS-2 to reduce overhead from hello packets and neighbor lists in route requests. Its performance is evaluated under varying network sizes, traffic loads, and packet loss conditions.
Generation and Implementation of Barker and Nested Binary codesIOSR Journals
This document discusses the generation and implementation of Barker and nested binary codes for use in radar applications. It begins with background on Barker codes and nested binary codes, which are types of phase coded waveforms used for pulse compression. Barker codes have the optimal autocorrelation sidelobe properties but are limited in length. Nested binary codes are formed by taking the Kronecker product of two Barker codes, which allows the generation of longer codes while maintaining good autocorrelation. The document then presents the methodology for implementing Barker and nested binary codes using linear feedback shift registers (LFSRs). Finally, it discusses measures for comparing signal performance such as merit factor and proposes an efficient VLSI architecture using LFSRs to generate these codes for implementation
E-Healthcare Billing and Record Management Information System using Android w...IOSR Journals
This document describes a proposed electronic healthcare billing and record management system called MedBook that utilizes cloud computing and mobile technologies. MedBook is designed as a Software-as-a-Service platform built on open source cloud technologies running on an Infrastructure-as-a-Service platform. It aims to provide a secure, scalable platform for patients, healthcare providers, and payers to exchange electronic health records and conduct billing activities via mobile apps and cloud services. The system architecture of MedBook and its implementation using the Jelastic cloud service are discussed.
Comparison of different Ant based techniques for identification of shortest p...IOSR Journals
This document compares different ant colony optimization (ACO) techniques for identifying the shortest path in a distributed network. ACO is based on the behavior of ants finding food sources and uses pheromone trails to probabilistically determine paths. The document reviews several ACO algorithms and techniques, including Max-Min, rank-based, and fuzzy rule-based approaches. It then implements an efficient ACO algorithm that performs better at finding the shortest path compared to other existing ACO techniques.
Power Optimization in MIMO-CN with MANETIOSR Journals
This document summarizes research on power optimization in MIMO cooperative networks (MIMO-CN) with mobile ad hoc networks (MANETs). It discusses using cooperative diversity to improve total network channel capacity by decoding signals from both the direct and relayed paths. The research analyzes the optimal power allocation structure between the source and multiple relays. It also examines different relaying strategies like amplify-forward and decode-forward. The goal is to understand the structural properties and maximize the end-to-end achievable rate of multi-relay MIMO-CN systems while considering per-node power constraints.
An improved Item-based Maxcover Algorithm to protect Sensitive Patterns in La...IOSR Journals
This document presents an improved item-based maxcover algorithm to protect sensitive patterns in large databases. The algorithm aims to minimize information loss when sanitizing databases to hide sensitive patterns. It works by identifying sensitive transactions containing restrictive patterns, sorting these transactions by degree and size, and selecting victim items to remove based on which items have the maximum cover across multiple patterns. This is done with only one scan of the source database. Experimental results on real datasets show the algorithm achieves zero hiding failure and a low misses cost between 0% and 2.43%, while keeping the sanitization rate between 40% and 68% and information loss below 1.1%.
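To make the maxcover selection step concrete, here is a minimal sketch (illustrative names, not the authors' published code) that picks as victim the item covering the largest number of restrictive patterns, so that a single deletion lowers the support of several patterns at once:

```python
# Minimal sketch of maxcover victim selection, assuming patterns are sets
# of items. An illustration of the idea only, not the published algorithm.
from collections import Counter

def pick_victim(restrictive_patterns):
    """Return the item that covers the most restrictive patterns."""
    cover = Counter()
    for pattern in restrictive_patterns:
        for item in pattern:
            cover[item] += 1
    item, _ = cover.most_common(1)[0]
    return item

patterns = [{"a", "b"}, {"b", "c"}, {"b", "d"}, {"c", "d"}]
print(pick_victim(patterns))  # 'b' covers three of the four patterns
```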
The document analyzes pattern transformation algorithms for sensitive knowledge protection in data mining. It discusses:
1) Three main privacy preserving techniques - heuristic, cryptography, and reconstruction-based. The proposed algorithms use heuristic-based techniques.
2) Four proposed heuristic-based algorithms - item-based Maxcover (IMA), pattern-based Maxcover (PMA), transaction-based Maxcover (TMA), and Sensitivity Cost Sanitization (SCS) - that modify sensitive transactions to decrease support of restrictive patterns.
3) Performance improvements including parallel and incremental approaches to handle large, dynamic databases while balancing privacy and utility.
This document discusses privacy-preserving techniques for association rule mining. It introduces the problem of protecting sensitive rules mined from transactional databases before releasing the data. Two data restriction algorithms are described in detail: the Sliding Window Algorithm (SWA) and Item Grouping Algorithm (IGA). SWA sanitizes sensitive transactions by removing items, prioritizing the shortest transactions. IGA groups rules sharing items and sanitizes overlapping transactions together. The algorithms' effectiveness is evaluated using a synthetic dataset based on their ability to prevent discovery of restricted patterns in the sanitized data.
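For intuition, a simplified sketch of SWA's prioritization follows (hypothetical names and a single victim item; the published algorithm additionally processes the database window by window and handles per-rule disclosure thresholds). The victim item is removed from the shortest sensitive transactions first, the intuition being that short transactions support fewer non-sensitive patterns:

```python
# Toy sketch of SWA-style sanitization: remove a victim item from the
# shortest sensitive transactions first. Simplified illustration only.

def sanitize_shortest_first(transactions, sensitive_pattern, victim, count):
    """Remove `victim` from the `count` shortest transactions containing the pattern."""
    sensitive = [t for t in transactions if sensitive_pattern <= t]
    for t in sorted(sensitive, key=len)[:count]:
        t.discard(victim)
    return transactions

db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c", "d"}, {"c", "d"}]
print(sanitize_shortest_first(db, {"a", "b"}, "b", 2))
```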
Distortion Based Algorithms For Privacy Preserving Frequent Item Set Mining IJDKP
Data mining services require accurate input data for their results to be meaningful, but privacy concerns may influence users to provide spurious information. To preserve the privacy of the client in the data mining process, a variety of techniques based on random perturbation of data records have been proposed recently. We focus on an improved distortion process that tries to enhance accuracy by selectively modifying the list of items. The normal distortion procedure does not provide the flexibility of tuning the probability parameters for balancing privacy and accuracy, and each item's presence/absence is modified with an equal probability. In the improved distortion technique, frequent one-itemsets and non-frequent one-itemsets are modified with different probabilities controlled by two probability parameters, fp and nfp respectively. The owner of the data has the flexibility to tune these two probability parameters (fp and nfp) based on his/her requirements for privacy and accuracy. Experiments conducted on real-time datasets confirmed that there is a significant increase in accuracy at a very marginal cost in privacy.
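Read one way (an assumption on our part: fp and nfp are the probabilities of retaining an item's true presence/absence), the improved distortion can be sketched as follows; the parameter values are arbitrary:

```python
# Illustrative sketch of two-parameter distortion: frequent items keep
# their true presence/absence with probability fp, non-frequent items
# with probability nfp, and are flipped otherwise (one plausible reading
# of the scheme summarized above; values here are arbitrary).
import random

def distort(transaction, all_items, frequent_items, fp, nfp):
    """Return a distorted copy of `transaction` over the universe `all_items`."""
    out = set()
    for item in all_items:
        keep = fp if item in frequent_items else nfp
        present = item in transaction
        if random.random() >= keep:
            present = not present  # flip this item's presence/absence
        if present:
            out.add(item)
    return out

items = {"a", "b", "c", "d"}
print(distort({"a", "c"}, items, frequent_items={"a", "b"}, fp=0.9, nfp=0.6))
```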
A literature review of modern association rule mining techniquesijctet
This document discusses association rule mining techniques for extracting useful patterns from large datasets. It provides background on association rule mining and defines key concepts like support, confidence and frequent itemsets. The document then reviews several classic association rule mining algorithms like AIS, Apriori and FP-Growth. It explains that these algorithms aim to improve quality and efficiency by reducing database scans, generating fewer candidate itemsets and using pruning techniques.
This document presents an improved algorithm for hiding sensitive association rules in privacy preserving data mining. The algorithm aims to completely hide any given sensitive rule while minimizing side effects and database modifications. It works by calculating a weight for each transaction based on the number of sensitive rules it supports. The transaction with the highest weight that contains an item from a sensitive rule is then modified by removing that item. This process continues iteratively until all sensitive rules are no longer generated from the modified database. Experimental results show that the proposed algorithm has lower time complexity and requires fewer database modifications than previous rule hiding algorithms. It is also able to hide sensitive rules without generating additional unexpected rules.
This document discusses an improved algorithm for hiding sensitive association rules in privacy preserving data mining. The algorithm aims to completely hide any given sensitive rule while minimizing side effects by modifying the original database. It compares the performance of the proposed algorithm to existing algorithms like ISL, DSR and WSDA in terms of execution time and side effects generated. The algorithm focuses on transactions where an item has the highest weight, defined as the maximum number of rules in the sensitive rule set supported by the transaction item. It hides sensitive rules by increasing or decreasing the support of items in the rule while maintaining database quality.
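A compressed sketch of this weight-driven loop (illustrative only; the helper names and the brute-force support counting are ours, not the paper's): each round, the transaction supporting the most sensitive rules loses one item belonging to a sensitive rule, until no sensitive rule reaches minimum support.

```python
# Sketch of iterative weight-based rule hiding (illustration, not the
# published algorithm): repeatedly modify the transaction that supports
# the most sensitive itemsets until each drops below minSup.

def support(db, itemset):
    return sum(1 for t in db if itemset <= t)

def hide_rules(db, sensitive_itemsets, min_sup):
    while any(support(db, s) >= min_sup for s in sensitive_itemsets):
        # weight = number of sensitive itemsets a transaction supports
        weights = [(sum(1 for s in sensitive_itemsets if s <= t), i)
                   for i, t in enumerate(db)]
        _, idx = max(weights)
        target = next(s for s in sensitive_itemsets if s <= db[idx])
        db[idx].discard(next(iter(target)))  # remove one item of the rule
    return db

db = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "c"}]
print(hide_rules(db, [{"a", "b"}], min_sup=2))
```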
International Journal of Computational Engineering Research (IJCER) is an international online monthly journal published in English. The journal publishes original research work that contributes significantly to furthering scientific knowledge in engineering and technology.
An Efficient Compressed Data Structure Based Method for Frequent Item Set Miningijsrd.com
Frequent pattern mining is very important for business organizations. The major applications of frequent pattern mining include disease prediction and analysis, rain forecasting, profit maximization, etc. In this paper, we are presenting a new method for mining frequent patterns. Our method is based on a new compact data structure. This data structure will help in reducing the execution time.
SECURED FREQUENT ITEMSET DISCOVERY IN MULTI PARTY DATA ENVIRONMENT FREQUENT I...Editor IJMTER
Security and privacy methods are used to protect data values. Private data values are secured with confidentiality and integrity methods, while a privacy model hides individual identities within public data values; sensitive attributes are protected using anonymity methods. Two or more parties may hold their own private data in a distributed environment and collaborate to compute a function on the union of their data. Secure Multiparty Computation (SMC) protocols are used for privacy-preserving data mining in such distributed environments. Association rule mining techniques are used to fetch frequent patterns, and the Apriori algorithm is used to mine association rules in databases. Homogeneous databases share the same schema but hold information on different entities; horizontal partitioning refers to a collection of homogeneous databases maintained by different parties. The Fast Distributed Mining (FDM) algorithm is an unsecured distributed version of the Apriori algorithm. The Kantarcioglu and Clifton protocol is used for secure mining of association rules in horizontally distributed databases, and the Unifying lists of locally Frequent Itemsets Kantarcioglu and Clifton (UniFI-KC) protocol is used for rule mining in a partitioned database environment. UniFI-KC is enhanced in two ways for stronger security: a secure threshold-function computation algorithm computes the union of the private subsets held by the interacting players, and a set-inclusion computation algorithm tests whether an element held by one player belongs to a subset held by another. The system is improved to support secure rule mining under a vertically partitioned database environment, and the subgroup discovery process is adapted to the partitioned setting. The system can further be improved to support generalized association rule mining and enhanced to control security leakage in the rule mining process.
This document discusses sequential pattern mining, which aims to discover patterns or rules in sequential data where events are ordered by time. It provides background on sequential pattern mining and its applications. The document also discusses related work on mining sequential patterns and rules from time-series data and across multiple sequences. It describes algorithms for efficiently mining sequential patterns at scale from large databases.
An apriori based algorithm to mine association rules with inter itemset distanceIJDKP
Association rules discovered from transaction databases can be large in number, and reducing them has been an issue in recent times. Conventionally, the number of rules can be increased or decreased by varying support and confidence. By combining an additional constraint with support, the number of frequent itemsets can be reduced, which leads to the generation of fewer rules. Average inter-itemset distance (IID), or spread, which is the intervening separation of itemsets in the transactions, has been used as a measure of interestingness for association rules with a view to reducing their number. In this paper, a complete Apriori-based algorithm using average inter-itemset distance is designed and implemented with a view to reducing the number of frequent itemsets and association rules, and also to finding the distribution pattern of the association rules in terms of the number of transactions in which the frequent itemsets do not occur. The Apriori algorithm itself is also implemented and the results are compared. The theoretical concepts related to inter-itemset distance are also put forward.
Comparative analysis of association rule generation algorithms in data streamsIJCI JOURNAL
This document summarizes the results of an experiment that compares three algorithms for generating association rules from data streams: Association Outliers, Frequent Item Sets, and Supervised Association Rule. The algorithms were tested on partitioned windows of a connectivity dataset containing 1,000 to 10,000 instances. Association rules and execution time were used as performance metrics. The Frequent Item Set algorithm generated more rules faster than the other two algorithms across all window sizes and data volumes tested.
Privacy Preserving Approaches for High Dimensional Dataijtsrd
This paper proposes a model for hiding sensitive association rules for privacy preservation in high-dimensional data. Privacy preservation is a big challenge in data mining. The protection of sensitive information becomes a critical issue when releasing data to outside parties. Association rule mining could be very useful in such situations: it could be used to identify all the possible ways by which "non-confidential" data can reveal "confidential" data, commonly known as the "inference problem". This issue is solved using Association Rule Hiding (ARH) techniques in Privacy Preserving Data Mining (PPDM). Association rule hiding aims to conceal these association rules so that no sensitive information can be mined from the database. Tata Gayathri | N Durga, "Privacy Preserving Approaches for High Dimensional Data", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-1, Issue-5, August 2017, URL: http://www.ijtsrd.com/papers/ijtsrd2430.pdf http://www.ijtsrd.com/engineering/computer-engineering/2430/privacy-preserving-approaches-for-high-dimensional-data/tata-gayathri
A Novel Filtering based Scheme for Privacy Preserving Data MiningIRJET Journal
This document proposes a novel filtering-based algorithm for privacy-preserving data mining. It summarizes existing techniques like k-anonymity, association rule mining, and feature selection using ReliefF. A two-phase algorithm is presented that first applies k-anonymity, ReliefF, and column filtering, followed by association rule mining and row filtering to generate a sanitized dataset. Experimental results on German credit and Titanic datasets show the sanitized datasets, feature selection, rules mined at different minimum support levels, and time required. The approach aims to preserve privacy while maintaining data utility and no information loss.
Introduction To Multilevel Association Rule And Its MethodsIJSRD
Association rule mining is a popular and well researched method for discovering interesting relations between variables in large databases. In this paper we introduce the concept of Data mining, Association rule and Multilevel association rule with different algorithm, its advantage and concept of Fuzzy logic and Genetic Algorithm. Multilevel association rules can be mined efficiently using concept hierarchies under a support-confidence framework.
IJERA (International Journal of Engineering Research and Applications) is an international online, ... peer-reviewed journal. For more detail or to submit your article, please visit www.ijera.com
The document discusses attribute ontological relational weights (AORW), a post-mining process for pruning association rules. It aims to overcome limitations of prior approaches that rely on closed itemset mining or expert evaluation.
The proposed AORW process begins by measuring the property support degree of each attribute in a given rule based on an XML descriptor of attribute classes/relations. It then measures the attribute relation support of attribute pairs to determine how relationally coherent the attributes are. Rules with incoherent attribute pairs would be pruned.
Prior work on post-mining of association rules is also discussed, including approaches based on closed itemsets, rule clustering, domain knowledge models, and continuous post-mining of data streams.
An Ontological Approach for Mining Association Rules from Transactional DatasetIJERA Editor
This document discusses using an ontology relational weights measure (ORWM) approach to mine interesting infrequent item sets from transactional datasets. It introduces three algorithms: 1) Infrequent Weighted Item Set Miner, which mines infrequent item sets using an FP-growth-like approach; 2) Minimal Infrequent Weighted Item Set Miner, which avoids extracting non-minimal item sets; and 3) ORWM, which integrates user knowledge, prunes rules to minimize the number generated, and uses a weighted support measure to discover interesting patterns. The ORWM represents items as a directed graph and applies the HITS model to rank items and discover high authority/hub infrequent item sets of interest to users.
Similar to An Effective Heuristic Approach for Hiding Sensitive Patterns in Databases (20)
This document provides a technical review of secure banking using RSA and AES encryption methodologies. It discusses how RSA and AES are commonly used encryption standards for secure data transmission between ATMs and bank servers. The document first provides background on ATM security measures and risks of attacks. It then reviews related work analyzing encryption techniques. The document proposes using a one-time password in addition to a PIN for ATM authentication. It concludes that implementing encryption standards like RSA and AES can make transactions more secure and build trust in online banking.
This document analyzes the performance of various modulation schemes for achieving energy efficient communication over fading channels in wireless sensor networks. It finds that for long transmission distances, low-order modulations like BPSK are optimal due to their lower SNR requirements. However, as transmission distance decreases, higher-order modulations like 16-QAM and 64-QAM become more optimal since they can transmit more bits per symbol, outweighing their higher SNR needs. Simulations show lifetime extensions up to 550% are possible in short-range networks by using higher-order modulations instead of just BPSK. The optimal modulation depends on transmission distance and balancing the energy used by electronic components versus power amplifiers.
This document provides a review of mobility management techniques in vehicular ad hoc networks (VANETs). It discusses three modes of communication in VANETs: vehicle-to-infrastructure (V2I), vehicle-to-vehicle (V2V), and hybrid vehicle (HV) communication. For each communication mode, different mobility management schemes are required due to their unique characteristics. The document also discusses mobility management challenges in VANETs and outlines some open research issues in improving mobility management for seamless communication in these dynamic networks.
This document provides a review of different techniques for segmenting brain MRI images to detect tumors. It compares the K-means and Fuzzy C-means clustering algorithms. K-means is an exclusive clustering algorithm that groups data points into distinct clusters, while Fuzzy C-means is an overlapping clustering algorithm that allows data points to belong to multiple clusters. The document finds that Fuzzy C-means requires more time for brain tumor detection compared to other methods like hierarchical clustering or K-means. It also reviews related work applying these clustering algorithms to segment brain MRI images.
1) The document simulates and compares the performance of AODV and DSDV routing protocols in a mobile ad hoc network under three conditions: when users are fixed, when users move towards the base station, and when users move away from the base station.
2) The results show that both protocols have higher packet delivery and lower packet loss when users are either fixed or moving towards the base station, since signal strength is better in those scenarios. Performance degrades when users move away from the base station due to weaker signals.
3) AODV generally has better performance than DSDV, with higher throughput and packet delivery rates observed across the different user mobility conditions.
This document describes the design and implementation of 4-bit QPSK and 256-bit QAM modulation techniques using MATLAB. It compares the two techniques based on SNR, BER, and efficiency. The key steps of implementing each technique in MATLAB are outlined, including generating random bits, modulation, adding noise, and measuring BER. Simulation results show scatter plots and eye diagrams of the modulated signals. A table compares the results, showing that 256-bit QAM provides better performance than 4-bit QPSK. The document concludes that QAM modulation is more effective for digital transmission systems.
The document proposes a hybrid technique using Anisotropic Scale Invariant Feature Transform (A-SIFT) and Robust Ensemble Support Vector Machine (RESVM) to accurately identify faces in images. A-SIFT improves upon traditional SIFT by applying anisotropic scaling to extract richer directional keypoints. Keypoints are processed with RESVM and hypothesis testing to increase accuracy above 95% by repeatedly reprocessing images until the threshold is met. The technique was tested on similar and different facial images and achieved better results than SIFT in retrieval time and reduced keypoints.
This document studies the effects of dielectric superstrate thickness on microstrip patch antenna parameters. Three types of probes-fed patch antennas (rectangular, circular, and square) were designed to operate at 2.4 GHz using Arlondiclad 880 substrate. The antennas were tested with and without an Arlondiclad 880 superstrate of varying thicknesses. It was found that adding a superstrate slightly degraded performance by lowering the resonant frequency and increasing return loss and VSWR, while decreasing bandwidth and gain. Specifically, increasing the superstrate thickness or dielectric constant resulted in greater changes to the antenna parameters.
This document describes a wireless environment monitoring system that utilizes soil energy as a sustainable power source for wireless sensors. The system uses a microbial fuel cell to generate electricity from the microbial activity in soil. Two microbial fuel cells were created using different soil types and various additives to produce different current and voltage outputs. An electronic circuit was designed on a printed circuit board with components like a microcontroller and ZigBee transceiver. Sensors for temperature and humidity were connected to the circuit to monitor the environment wirelessly. The system provides a low-cost way to power remote sensors without needing battery replacement and avoids the high costs of wiring a power source.
1) The document proposes a model for a frequency tunable inverted-F antenna that uses ferrite material.
2) The resonant frequency of the antenna can be significantly shifted from 2.41GHz to 3.15GHz, a 31% shift, by increasing the static magnetic field placed on the ferrite material.
3) Altering the permeability of the ferrite allows tuning of the antenna's resonant frequency without changing the physical dimensions, providing flexibility to operate over a wide frequency range.
This document summarizes a research paper that presents a speech enhancement method using stationary wavelet transform. The method first classifies speech into voiced, unvoiced, and silence regions based on short-time energy. It then applies different thresholding techniques to the wavelet coefficients of each region - modified hard thresholding for voiced speech, semi-soft thresholding for unvoiced speech, and setting coefficients to zero for silence. Experimental results using speech from the TIMIT database corrupted with white Gaussian noise at various SNR levels show improved performance over other popular denoising methods.
This document reviews the design of an energy-optimized wireless sensor node that encrypts data for transmission. It discusses how sensing schemes that group nodes into clusters and transmit aggregated data can reduce energy consumption compared to individual node transmissions. The proposed node design calculates the minimum transmission power needed based on received signal strength and uses a periodic sleep/wake cycle to optimize energy when not sensing or transmitting. It aims to encrypt data at both the node and network level to further optimize energy usage for wireless communication.
This document discusses group consumption modes. It analyzes factors that impact group consumption, including external environmental factors like technological developments enabling new forms of online and offline interactions, as well as internal motivational factors at both the group and individual level. The document then proposes that group consumption modes can be divided into four types based on two dimensions: vertical (group relationship intensity) and horizontal (consumption action period). These four types are instrument-oriented, information-oriented, enjoyment-oriented, and relationship-oriented consumption modes. Finally, the document notes that consumption modes are dynamic and can evolve over time.
The document summarizes a study of different microstrip patch antenna configurations with slotted ground planes. Three antenna designs were proposed and their performance evaluated through simulation: a conventional square patch, an elliptical patch, and a star-shaped patch. All antennas were mounted on an FR4 substrate. The effects of adding different slot patterns to the ground plane on resonance frequency, bandwidth, gain and efficiency were analyzed parametrically. Key findings were that reshaping the patch and adding slots increased bandwidth and shifted resonance frequency. The elliptical and star patches in particular performed better than the conventional design. Three antenna configurations were selected for fabrication and measurement based on the simulations: a conventional patch with a slot under the patch, an elliptical patch with slots
1) The document describes a study conducted to improve call drop rates in a GSM network through RF optimization.
2) Drive testing was performed before and after optimization using TEMS software to record network parameters like RxLevel, RxQuality, and events.
3) Analysis found call drops were occurring due to issues like handover failures between sectors, interference from adjacent channels, and overshooting due to antenna tilt.
4) Corrective actions taken included defining neighbors between sectors, adjusting frequencies to reduce interference, and lowering the mechanical tilt of an antenna.
5) Post-optimization drive testing showed improvements in RxLevel, RxQuality, and a reduction in dropped calls.
This document describes the design of an intelligent autonomous wheeled robot that uses RF transmission for communication. The robot has two modes - automatic mode where it can make its own decisions, and user control mode where a user can control it remotely. It is designed using a microcontroller and can perform tasks like object recognition using computer vision and color detection in MATLAB, as well as wall painting using pneumatic systems. The robot's movement is controlled by DC motors and it uses sensors like ultrasonic sensors and gas sensors to navigate autonomously. RF transmission allows communication between the robot and a remote control unit. The overall aim is to develop a low-cost robotic system for industrial applications like material handling.
This document reviews cryptography techniques to secure the Ad-hoc On-Demand Distance Vector (AODV) routing protocol in mobile ad-hoc networks. It discusses various types of attacks on AODV like impersonation, denial of service, eavesdropping, black hole attacks, wormhole attacks, and Sybil attacks. It then proposes using the RC6 cryptography algorithm to secure AODV by encrypting data packets and detecting and removing malicious nodes launching black hole attacks. Simulation results show that after applying RC6, the packet delivery ratio and throughput of AODV increase while delay decreases, improving the security and performance of the network under attack.
The document describes a proposed modification to the conventional Booth multiplier that aims to increase its speed by applying concepts from Vedic mathematics. Specifically, it utilizes the Urdhva Tiryakbhyam formula to generate all partial products concurrently rather than sequentially. The proposed 8x8 bit multiplier was coded in VHDL, simulated, and found to have a path delay 44.35% lower than a conventional Booth multiplier, demonstrating its potential for higher speed.
This document discusses image deblurring techniques. It begins by introducing image restoration and focusing on image deblurring. It then discusses challenges with image deblurring being an ill-posed problem. It reviews existing approaches to screen image deconvolution including estimating point spread functions and iteratively estimating blur kernels and sharp images. The document also discusses handling spatially variant blur and summarizes the relationship between the proposed method and previous work for different blur types. It proposes using color filters in the aperture to exploit parallax cues for segmentation and blur estimation. Finally, it proposes moving the image sensor circularly during exposure to prevent high frequency attenuation from motion blur.
This document describes modeling an adaptive controller for an aircraft roll control system using PID, fuzzy-PID, and genetic algorithm. It begins by introducing the aircraft roll control system and motivation for developing an adaptive controller to minimize errors from noisy analog sensor signals. It then provides the mathematical model of aircraft roll dynamics and describes modeling the real-time flight control system in MATLAB/Simulink. The document evaluates PID, fuzzy-PID, and PID-GA (genetic algorithm) controllers for aircraft roll control and finds that the PID-GA controller delivers the best performance.
An Effective Heuristic Approach for Hiding Sensitive Patterns in Databases
IOSR Journal of Computer Engineering (IOSRJCE)
ISSN: 2278-0661, Volume 5, Issue 1 (Sep-Oct. 2012), PP 06-11
www.iosrjournals.org
An Effective Heuristic Approach for Hiding Sensitive Patterns in Databases
Mrs. P. Cynthia Selvi¹, Dr. A. R. Mohamed Shanavas²
¹Associate Professor, Dept. of Computer Science, KNGA College(W), Thanjavur 613007, Affiliated to Bharathidasan University, Tiruchirapalli, TamilNadu, India.
²Associate Professor, Dept. of Computer Science, Jamal Mohamed College, Tiruchirapalli 620020, Affiliated to Bharathidasan University, Tiruchirapalli, TamilNadu, India.
Abstract: Privacy has been identified as a vital requirement in designing and implementing Data Mining (DM) systems. This has motivated Privacy Preservation in Data Mining (PPDM) as a rising field of research, and various approaches are being introduced by researchers. One such approach is the sanitization process, which transforms the source database into a modified one from which adversaries cannot extract the sensitive patterns. This study addresses this concept and proposes an effective heuristic-based algorithm aimed at minimizing the number of items removed from the source database, possibly with no hiding failure.
Keywords: Privacy Preserving Data Mining, Restrictive Patterns, Sanitized database, Sensitive Transactions.
I. Introduction
PPDM is a novel research direction in DM, in which DM algorithms are analyzed for the side-effects they incur on data privacy. The main objective of PPDM is to develop algorithms for modifying the original data in some way, so that the private data and private knowledge remain private even after the mining process[1]. In DM, users are provided with the data rather than the association rules and are free to use their own tools; so the restriction for privacy has to be applied to the data itself before the mining phase.
For this reason, we need to develop mechanisms that can lead to new privacy control systems which convert a given database into a new one in such a way as to preserve the general rules mined from the original database. The procedure of transforming the source database into a new database that hides some sensitive patterns or rules is called the sanitization process[2]. To do so, a small number of transactions have to be modified by deleting one or more items from them, or even by adding noise to the data by turning some items from 0 to 1 in some transactions. The released database is called the sanitized database. On one hand, this approach slightly modifies some data, but this is perfectly acceptable in some real applications[3, 4]. On the other hand, such an approach must satisfy the following restrictions:
o The impact on the source database has to be minimal.
o An appropriate balance between the need for privacy and knowledge has to be guaranteed.
This study mainly focuses on the task of minimizing the impact on the source database by reducing the number of removed items, with only one scan of the database. Section 2 briefly summarizes the previous work done by various researchers; in Section 3, preliminaries are given. Section 4 states some basic definitions, of which Definition 5 is framed by us and used in the proposed heuristic-based algorithm. In Section 5, the proposed algorithm is presented with an illustration and example. As the detailed analysis of the experimental results on large databases is in process, only the basic measures of effectiveness are presented in this paper, after testing the algorithm on a sample generated database.
II. Related Work
Many researchers have addressed the problem of privacy preservation in association rule mining in recent years. The class of solutions for this problem has been restricted basically to randomization, data partitioning, and data sanitization.
The idea behind data sanitization, namely reducing the support values of restrictive itemsets, was first introduced by Atallah et al.[1], who proved that the optimal sanitization process is an NP-hard problem. In [4], the authors generalized the problem in the sense that they considered the hiding of both sensitive frequent itemsets and sensitive rules. Although these algorithms ensure privacy preservation, they are CPU-intensive, since they require multiple scans over a transactional database. In the same direction, Saygin [5] introduced a method for selectively removing individual values from a database to prevent the discovery of a set of rules, while preserving the data for other applications. They proposed some algorithms to obscure a given set of sensitive rules by replacing known values with unknowns, while minimizing the side effects on non-sensitive rules. These algorithms also require several scans to sanitize a database, depending on the number of association rules to be hidden.
Oliveira introduced many algorithms, of which IGA[6] and SWA[7] aim at multiple rule hiding. IGA has a low misses cost; it groups restrictive itemsets and assigns a victim item to each group. However, this clustering leads to overlap between groups and is not an efficient method for optimally clustering the itemsets; it can be improved further by reducing the number of deleted items. SWA, on the other hand, improves the balance between protection of sensitive knowledge and pattern discovery, but incurs an extra cost because some rules are removed inadvertently. In our work, we focus on the heuristic-based data sanitization approach.
III. Preliminaries
Transactional Database. A transactional database is a relation consisting of transactions in which each transaction t is characterized by an ordered pair, defined as t = <Tid, list-of-elements>, where Tid is a unique transaction identifier number and list-of-elements represents the list of items making up the transaction. For instance, in market basket data, a transactional database is composed of business transactions in which the list-of-elements represents the items purchased in a store.
Basics of Association Rules. One of the most studied problems in data mining is the process of discovering
association rules from large databases. Most of the existing algorithms for association rules rely on the support-
confidence framework introduced in [8].
Formally, association rules are defined as follows. Let I = {i1, ..., in} be a set of literals, called items. Let
D be a database of transactions, where each transaction t is an itemset such that t ⊆ I. A unique identifier, called
Tid, is associated with each transaction. A transaction t supports X, a set of items in I, if X ⊆ t. An association
rule is an implication of the form X ⇒ Y, where X ⊂ I, Y ⊂ I, and X ∩ Y = ∅. Thus, we say that a
rule X ⇒ Y holds in the database D with support s if s = freq(X ∪ Y) / N, where N is the number of transactions in D.
Similarly, we say that a rule X ⇒ Y holds in the database D with confidence c if c = freq(X ∪ Y) / freq(X), where freq(A) is the
number of occurrences of the set of items A in the set of transactions D. While the support is a measure of the
frequency of a rule, the confidence is a measure of the strength of the relation between sets of items.
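As a quick numeric illustration (with made-up figures): in a database of N = 5 transactions in which the itemset
{A, B} occurs in 3 transactions and {A} occurs in 4, the rule A ⇒ B holds with support s = 3/5 = 60% and
confidence c = 3/4 = 75%.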
Association rule mining algorithms rely on two thresholds, minimum support (minSup) and
minimum confidence (minConf). The problem of mining association rules was first proposed in 1993 [8].
Frequent Pattern. A pattern X is called a frequent pattern if sup(X) ≥ minSup, i.e., if the absolute support of X
satisfies the corresponding minimum support count threshold. [A pattern is an itemset; in this article, both terms
are used synonymously.] All association rules can be derived directly from the set of frequent patterns [8, 9]. The
conventions followed here are:
o Apriori property [10]: all non-empty subsets of a frequent itemset (pattern) must also be frequent.
o Antimonotone property: if a set cannot pass a test, then all of its supersets will fail the same test as well.
Privacy Preservation in Frequent Patterns. The most basic model of privacy preserving data processing is one
in which we erase the sensitive entries in the data. These erased entries are usually particular patterns which are
decided by the user, who may either be the owner or the contributor of the data.
IV. Problem Definition
In this approach, the goal is to hide a group of frequent patterns that contain highly sensitive
knowledge. Such sensitive patterns to be hidden are called restrictive patterns. Restrictive patterns can
always be generated from frequent patterns.
Definition 1. Let D be a source database, containing the set of all transactions. T denotes a set of transactions,
each transaction t containing an itemset X such that X ⊆ I. In addition, each k-itemset X has an associated set of
transactions T(X) = {t ∈ T | X ⊆ t}, where T(X) ⊆ T and |T(X)| is the support count of X.
Definition 2 : Restrictive Patterns : Let D be a source database, P be the set of all frequent patterns that can be
mined from D, and RulesH be a set of decision support rules that need to be hidden according to some security
policies. A set of patterns, denoted by RP, is said to be restrictive if RP ⊂ P and if and only if RP would
derive the set RulesH. ¬RP is the set of non-restrictive patterns, such that RP ∪ ¬RP = P.
Definition 3 : Sensitive Transactions : Let T be the set of all transactions in a source database D, and RP be the set
of restrictive patterns mined from D. A set of transactions is said to be sensitive, denoted by ST, if every t ∈ ST
contains at least one restrictive pattern, i.e., ST = {t ∈ T | ∃X ∈ RP, X ⊆ t}. Moreover, if ST ⊆ T, then all restrictive
patterns can be mined from, and only from, ST.
Definition 4 : Transaction Degree : Let D be a source database and ST be the set of all sensitive transactions in
D. The degree of a sensitive transaction t ∈ ST, denoted as deg(t), is defined as the number of
restrictive patterns that t contains.
Definition 5 : Cover : The cover of an item Ak can be defined as CAk = { rpi | Ak ∈ rpi, rpi ∈ RP, 1 ≤ i ≤ |RP| }, i.e., the
set of all restrictive patterns (rpi's) that contain Ak. The item that is included in the maximum number of rpi's is
the one with the maximal cover, or maxCover; i.e., maxCover = max( |CA1|, |CA2|, ..., |CAn| ), where each Ak belongs to at least one rpi ∈ RP.
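As a minimal sketch of Definition 5 in Java (illustrative code written against a current JDK, not the implementation of Section-6; the pattern contents R1 = {C, D}, R2 = {D, E}, R3 = {A, C} are inferred by us from the look-up tables of the illustration in Section-5 and are not spelled out in the paper):

    import java.util.*;

    public class CoverDemo {
        // Cover of an item (Definition 5): the restrictive patterns that contain it.
        static Map<String, Set<String>> covers(Map<String, Set<String>> rps) {
            Map<String, Set<String>> cover = new LinkedHashMap<>();   // item -> rpi-list
            for (Map.Entry<String, Set<String>> rp : rps.entrySet())
                for (String item : rp.getValue())
                    cover.computeIfAbsent(item, k -> new LinkedHashSet<>()).add(rp.getKey());
            return cover;
        }

        public static void main(String[] args) {
            Map<String, Set<String>> rps = new LinkedHashMap<>();     // inferred restrictive patterns
            rps.put("R1", Set.of("C", "D"));
            rps.put("R2", Set.of("D", "E"));
            rps.put("R3", Set.of("A", "C"));
            Map<String, Set<String>> cover = covers(rps);
            int maxCover = cover.values().stream().mapToInt(Set::size).max().orElse(0);
            System.out.println(cover + "  maxCover = " + maxCover);   // maxCover = 2 (items C and D)
        }
    }

The printed cover sizes (C and D covered by two patterns each, E and A by one) agree with LookUp Table-3 of Section-5.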
Based on the above definitions, the main strategy addressed in this work can be stated as follows:
if D is the source database of transactions and P is the set of relevant patterns that would be mined
from D, the goal is to transform D into a sanitized database D', so that the non-restrictive patterns in P can still
be mined from D' while the restrictive ones are hidden. In this case, D' becomes the released database.
V. Sanitization Algorithm
The optimal sanitization has been proved to be an NP-hard problem. To alleviate the complexity of the
optimal sanitization, some heuristics could be used. A heuristic does not guarantee the optimal solution but
usually finds a solution close to the best one in a faster response time. In this section, the proposed sanitizing
algorithm and the heuristics to sanitize a source database are introduced.
Given the source database (D) and the restrictive patterns (RP), the goal of the sanitization process is to
protect RP against the mining techniques used to disclose them. The sanitization process decreases the support
values of the restrictive patterns by removing items from sensitive transactions. This process mainly comprises four
sub-problems:
1. identifying the set of sensitive transactions for each restrictive pattern;
2. selecting the partial sensitive transactions to sanitize;
3. identifying the candidate item (victim item) to be removed;
4. rewriting the modified database after removing the victim items.
Basically, sanitizing algorithms differ only in sub-problems 2 and 3.
5.1. Heuristic approach
In this work, the proposed algorithm is based on the following heuristics:
Heuristic-1. To solve subproblem 2 stated above, ST is sorted in decreasing order of (deg + size), so that the
sensitive transactions containing the largest number of patterns are selected first; this enables multiple patterns to be
sanitized in a single iteration.
Heuristic-2. To solve subproblem 3, the following heuristic is used in the algorithm :
for every item Ak ∈ RP, find its cover, and, starting from the item with the maximal cover, find T = ∩ { t-list(rpi) | rpi ∈ CAk }, the set of sensitive transactions common to all restrictive patterns containing Ak;
for every t ∈ T, mark Ak as the victim item and remove it.
The rationale behind these two heuristics is to minimize the sanitization rate and thereby reduce the
impact on the source database.
Note : In this work, no sensitive transaction is completely removed (i.e., the number of transactions in the
source database is not altered).
5.2. Algorithm
Input : (i) D – Source Database (ii) RP – Set of all Restrictive Patterns
Output : D’ – Sanitized Database
Pre-requisites :
(i) Find the frequent patterns (itemsets) using the Matrix Apriori algorithm;
(ii) Form Look-up Table-1 : ∀ Ak ∈ RP, LT1(Ak) ← t-list(Ak), t ∈ D
(iii) Form Look-up Table-2 : ∀ rpi ∈ RP, LT2(rpi) ← t-list(rpi), t ∈ D
(iv) Form Look-up Table-3 : ∀ Ak ∈ RP, LT3(Ak) ← rpi-list(Ak), rpi ∈ RP
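As a minimal sketch, the following illustrative Java fragment shows how the three look-up tables might be materialized in the single scan of D (this is not the implementation of Section-6; the class and variable names are ours, and patterns and transactions are keyed by plain string identifiers):

    import java.util.*;

    class LookupTables {
        Map<String, Set<String>> lt1 = new LinkedHashMap<>(); // LT1 : item <- t-list
        Map<String, Set<String>> lt2 = new LinkedHashMap<>(); // LT2 : rpi  <- t-list
        Map<String, Set<String>> lt3 = new LinkedHashMap<>(); // LT3 : item <- rpi-list

        // db : Tid -> itemset ; rps : pattern id -> restrictive pattern (itemset)
        LookupTables(Map<String, Set<String>> db, Map<String, Set<String>> rps) {
            Set<String> rpItems = new HashSet<>();
            rps.values().forEach(rpItems::addAll);
            for (Map.Entry<String, Set<String>> t : db.entrySet()) {      // one scan of D
                for (String item : t.getValue())
                    if (rpItems.contains(item))                           // only items of RP matter
                        lt1.computeIfAbsent(item, k -> new LinkedHashSet<>()).add(t.getKey());
                for (Map.Entry<String, Set<String>> rp : rps.entrySet())
                    if (t.getValue().containsAll(rp.getValue()))          // t supports rpi
                        lt2.computeIfAbsent(rp.getKey(), k -> new LinkedHashSet<>()).add(t.getKey());
            }
            for (Map.Entry<String, Set<String>> rp : rps.entrySet())      // LT3 needs no scan of D
                for (String item : rp.getValue())
                    lt3.computeIfAbsent(item, k -> new LinkedHashSet<>()).add(rp.getKey());
        }
    }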
Algorithm maxCover1 : // based on Heuristics 1 & 2 //
Step 1 : calculate supCount(rpi) ∀ rpi ∈ RP and sort in decreasing order ;
Step 2 : find the Sensitive Transactions (ST) w.r.t. RP ;
a) calculate deg(t), size(t) ∀ t ∈ ST ;
b) sort t ∈ ST in decreasing order of (deg + size) ;
Step 3 : find ¬ST ← D − ST ; // ¬ST : non-sensitive transactions //
Step 4 : // Find ST' //
find the cover of every item Ak ∈ RP and sort in decreasing order of cover ;
for each item Ak ∈ RP do
{
repeat
find T = ∩ { t-list(rpi) | rpi ∈ LT3(Ak) } ;
for each t ∈ T do
{
delete item Ak in t, Ak ∈ rpi, rpi ∈ rpi-list(Ak) ; // Ak : victim item //
// initially all t are non-victim //
decrease supCount of the rpi's for which t is non-victim ;
mark t as a victim transaction in each t-list of rpi ∈ rpi-list(Ak) ;
}
until ( supCount(rpi) = 0 ∀ rpi ∈ RP )
}
Step 5 : D' ← ¬ST ∪ ST'
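The following illustrative Java fragment sketches Steps 1, 4 and 5 on top of the LookupTables fragment given after the pre-requisites (again, the code and its names are ours and are written against a current JDK rather than the JDK 1.7 implementation of Section-6; the explicit sorting of ST by (deg + size) is left implicit here, since T is obtained directly as an intersection of t-lists):

    import java.util.*;

    class MaxCoverSanitizer {
        // db : Tid -> itemset ; rps : pattern id -> restrictive pattern. Returns D'.
        static Map<String, Set<String>> sanitize(Map<String, Set<String>> db,
                                                 Map<String, Set<String>> rps) {
            LookupTables lt = new LookupTables(db, rps);
            Map<String, Integer> supCount = new HashMap<>();              // Step 1
            lt.lt2.forEach((rp, tids) -> supCount.put(rp, tids.size()));

            Map<String, Set<String>> dPrime = new LinkedHashMap<>();      // working copy of D
            db.forEach((tid, is) -> dPrime.put(tid, new LinkedHashSet<>(is)));
            Map<String, Set<String>> victim = new HashMap<>();            // rpi -> victim t-list

            List<String> items = new ArrayList<>(lt.lt3.keySet());        // Step 4 :
            items.sort(Comparator.comparingInt(i -> -lt.lt3.get(i).size())); // decreasing cover

            boolean progress = true;
            while (progress && supCount.values().stream().anyMatch(s -> s > 0)) {
                progress = false;
                for (String ak : items) {                                 // ak : candidate victim item
                    List<String> rpList = new ArrayList<>(lt.lt3.get(ak));
                    rpList.removeIf(rp -> supCount.get(rp) == 0);         // already hidden
                    if (rpList.isEmpty()) continue;
                    Set<String> common = new LinkedHashSet<>(lt.lt2.get(rpList.get(0)));
                    for (String rp : rpList) common.retainAll(lt.lt2.get(rp)); // T = intersection of t-lists
                    for (String tid : common) {
                        dPrime.get(tid).remove(ak);                       // remove the victim item
                        for (String rp : rpList)
                            if (victim.computeIfAbsent(rp, k -> new HashSet<>()).add(tid)) {
                                supCount.merge(rp, -1, Integer::sum);     // tid was non-victim for rpi
                                progress = true;
                            }
                    }
                }
            }
            return dPrime;                                                // Step 5 : D' = ¬ST ∪ ST'
        }
    }

The outer loop re-examines the items once some supCounts reach zero, because the intersection T then grows over the remaining patterns; on the illustration of Section 5.3 a single pass suffices.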
5.3. Illustration
The following examples help understand how the proposed Heuristics work. Refer the Source
Database(D) in Table-1. The set of all Restrictive Patterns to be hidden are given in Table-2. The sensitive
transactions- ST (transactions which include atleast one Restrictive Pattern) are identified from D and are
extracted. They are sorted in decreasing order of their deg and size(Table-3). Non sensitive transactions( ST)
are also filtered and stored separately(Table-5).
The proposed algorithm refer the Look-up Tables(listed below) to speed up the process.
LookUp Table-1 [ item ← t-list ]
Item    Transactions                No. of Trans.
C       T01, T04, T02, T03, T06     5
D       T01, T04, T02, T03, T06     5
E       T01, T04, T03, T06          4
A       T01, T04, T02               3

LookUp Table-2 [ rpi ← t-list ]
Pattern Transactions                SupCount
R1      T01, T04, T02, T03, T06     5
R2      T01, T04, T03, T06          4
R3      T01, T04, T02               3
LookUp Table-3 [ item ← rpi-list ]
Item    Rules       Cover
C       R1, R3      2
D       R1, R2      2
E       R2          1
A       R3          1

Table-6. Sanitized Database (D')
Tid     Pattern (Itemset)
T01     E, A, B
T02     D, A, B, F
T03     C, E
T04     E, A
T05     E, B, F
T06     C, E
The proposed algorithm is based on Heuristics 1 and 2. Here, for every item Ak, starting from the one with maxCover
(refer to LookUp Table-3), find the sensitive transactions that are common to all rules associated with Ak,
i.e., T = ∩ { t-list(rpi) | rpi ∈ rpi-list(Ak) }. In every transaction t ∈ T, remove Ak and decrease the supCount of all restrictive
patterns (rpi) that are associated with t and contain Ak.
Example : Item C, being the first item with maxCover, has rpi-list(C) = {R1, R3}, whose common
transactions are [T01, T04, T02]. Remove C from these transactions, which reduces the supCount of both R1
and R3 by 3, and mark these t as victim transactions for R1 and R3. When the next item D is considered, the common t-list of
rpi-list(D) is [T01, T04, T03, T06]. Removing D from T01 and T04 would not reduce the supCount of
R1 (they were already counted in the previous iteration) but would reduce the supCount of R2; hence
remove it and decrease only the supCount of R2. Removing D from T03 and T06, on the other hand, reduces the
supCount of both R1 and R2. This process is repeated until the supCount of every rpi is reduced to 0. The
modified form of the sensitive transactions is denoted ST'.
The sanitized database D' (refer to Table-6) is then formed by combining ¬ST (Table-5) and ST'.
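To connect the sketches with this illustration, the fragment below feeds them the source transactions reconstructed by putting the removed victim items back into Table-6 (the pattern contents R1 = {C, D}, R2 = {D, E}, R3 = {A, C} are likewise inferred from LookUp Tables 1-3; both reconstructions are ours and illustrative only):

    import java.util.*;

    public class Demo {
        public static void main(String[] args) {
            Map<String, Set<String>> db = new LinkedHashMap<>();          // inferred source D
            db.put("T01", new LinkedHashSet<>(List.of("A", "B", "C", "D", "E")));
            db.put("T02", new LinkedHashSet<>(List.of("A", "B", "C", "D", "F")));
            db.put("T03", new LinkedHashSet<>(List.of("C", "D", "E")));
            db.put("T04", new LinkedHashSet<>(List.of("A", "C", "D", "E")));
            db.put("T05", new LinkedHashSet<>(List.of("B", "E", "F")));   // non-sensitive
            db.put("T06", new LinkedHashSet<>(List.of("C", "D", "E")));
            Map<String, Set<String>> rps = new LinkedHashMap<>();         // inferred RP
            rps.put("R1", Set.of("C", "D"));
            rps.put("R2", Set.of("D", "E"));
            rps.put("R3", Set.of("A", "C"));
            MaxCoverSanitizer.sanitize(db, rps)
                    .forEach((tid, items) -> System.out.println(tid + " -> " + items));
            // Reproduces Table-6: C leaves T01, T02, T04 and D leaves T01, T03, T04, T06,
            // e.g. T01 -> [A, B, E] and T04 -> [A, E]
        }
    }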
VI. Implementation
The algorithm was tested on an Intel Core 2 Duo processor at 2.5 GHz with 4 GB RAM, running
Windows XP. We used NetBeans 6.9.1 to code the algorithm in Java (JDK 1.7) with SQL Server 2005.
It has been tested on a sample database to verify the main objectives (minimal removal of
items and no hiding failure), and it requires only one scan of the source database. However, the testing process
on very large real databases is in progress for a detailed analysis. Before the hiding process, the frequent
patterns are obtained using the Matrix Apriori algorithm [11], which is faster and uses a simpler data structure than
the Apriori algorithm [9, 10]; moreover, it scans the database only twice and works without candidate generation.
VII. Effectiveness Measures
(i) Sanitization Rate (SR) : It is defined as the ratio of the removed (victim) items to the total support value of the
restrictive patterns (rpi) in the source database D:
SR = (number of victim items removed) / Σ sup(rpi, D), the sum taken over all rpi ∈ RP.
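For instance, applying this measure to the small illustration of Section 5.3 (not to the tested sample database):
7 items are removed in all (C from three transactions and D from four) against a total restrictive support of
5 + 4 + 3 = 12 in D, giving SR = 7/12 ≈ 58%.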
With the sample database tested, it is found that the SR is less than 50%.
(ii) Hiding Failure (HF) : It is assessed by the restrictive patterns that fail to be hidden. In other words,
if a restrictive pattern cannot be extracted from the released database D' at an arbitrarily low minSup, that
pattern causes no hiding failure.
HF = |RP(D')| / |RP(D)|, where RP(X) denotes the set of restrictive patterns that can be mined from X.
As far as the algorithm maxCover1 is concerned, the restrictive patterns are sanitized until their
respective supCount becomes zero. Moreover, the algorithm ensures that no transaction is completely removed. Hence this
algorithm is 100% free of hiding failure.
(iii) Misses Cost (MC) : This measure deals with the legitimate (non-restrictive) patterns that are
accidentally missed.
MC = ( |¬RP(D)| − |¬RP(D')| ) / |¬RP(D)|, where ¬RP(X) denotes the set of non-restrictive patterns that can be mined from X.
Note : There is a trade-off between MC and HF; i.e., the more patterns we hide, the more legitimate
patterns we are likely to miss. With the sample database tested, however, this algorithm has 0% MC.
(iv) Artifactual Patterns (AP) : AP occurs when some artificial patterns are generated from D' as an outcome of
the sanitization process.
AP = ( |P'| − |P ∩ P'| ) / |P'|, where P and P' denote the sets of patterns that can be mined from D and D', respectively.
Since this algorithm hides restrictive patterns by selectively removing items from the source database (D),
instead of swapping or replacing them, it does not generate any artifactual patterns.
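Once the pattern sets mined from D and D' are available, the four measures reduce to simple set arithmetic; a minimal sketch is given below (illustrative method names of ours, with patterns encoded as canonical strings):

    import java.util.Set;

    class Effectiveness {
        // rpD, rpD1 : restrictive patterns minable from D and from D'
        static double hidingFailure(Set<String> rpD, Set<String> rpD1) {
            return rpD.isEmpty() ? 0.0 : (double) rpD1.size() / rpD.size();
        }
        // nrpD, nrpD1 : non-restrictive (legitimate) patterns minable from D and from D'
        static double missesCost(Set<String> nrpD, Set<String> nrpD1) {
            long missed = nrpD.stream().filter(p -> !nrpD1.contains(p)).count();
            return nrpD.isEmpty() ? 0.0 : (double) missed / nrpD.size();
        }
        // pD, pD1 : all patterns minable from D and from D'
        static double artifactualPatterns(Set<String> pD, Set<String> pD1) {
            long artificial = pD1.stream().filter(p -> !pD.contains(p)).count();
            return pD1.isEmpty() ? 0.0 : (double) artificial / pD1.size();
        }
    }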
VIII. Conclusion
In this competitive but cooperative business environment, companies need to share information with
others while, at the same time, protecting their own confidential knowledge. To facilitate this kind of data
sharing with privacy protection, the algorithm based on maxCover is proposed. This algorithm ensures that no
counterpart or adversary can mine the restrictive patterns, even with an arbitrarily small support threshold.
The algorithm is based on the strategy of simultaneously decreasing the support count of the maximum
number of sensitive patterns (itemsets) with the minimum possible number of item removals, thereby reducing the
impact on the source database. The proposed algorithm has a minimal sanitization rate, possibly with no hiding
failure and a low misses cost. Above all, this algorithm scans the original database only once. It is important to
note that the proposed algorithm is robust in the sense that no desanitization is possible: the alterations to
the original database are not saved anywhere, since the owner of the database keeps an intact original copy of the
database while distributing the sanitized database. Moreover, there is no way to reproduce the
original database from the sanitized one, as no encryption or other reversible transformation is involved.
As already mentioned, work on the time complexity analysis and on an elaborate dissimilarity study between
the original and sanitized databases over very large databases is in progress, for which
publicly available real databases are being used.
References
[1]. Verykios, V.S., Bertino, E., Fovino, I.N., Provenza, L.P., Saygin, Y. and Theodoridis, Y., "State-of-the-Art in Privacy Preserving Data Mining", ACM SIGMOD Record, vol. 33, no. 2, pp. 50-57, 2004.
[2]. Atallah, M., Bertino, E., Elmagarmid, A., Ibrahim, M. and Verykios, V.S., "Disclosure Limitation of Sensitive Rules", in Proc. of the IEEE Knowledge and Data Engineering Exchange Workshop, pp. 45-52, Chicago, Illinois, November 1999.
[3]. Clifton, C. and Marks, D., "Security and Privacy Implications of Data Mining", in Workshop on Data Mining and Knowledge Discovery, pp. 15-19, Montreal, Canada, February 1996.
[4]. Dasseni, E., Verykios, V.S., Elmagarmid, A.K. and Bertino, E., "Hiding Association Rules by Using Confidence and Support", in Proc. of the 4th Information Hiding Workshop, pp. 369-383, Pittsburgh, PA, April 2001.
[5]. Saygin, Y., Verykios, V.S. and Clifton, C., "Using Unknowns to Prevent Discovery of Association Rules", ACM SIGMOD Record, vol. 30, no. 4, pp. 45-54, December 2001.
[6]. Oliveira, S.R.M. and Zaiane, O.R., "Privacy Preserving Frequent Itemset Mining", in Proc. of the IEEE ICDM Workshop on Privacy, Security, and Data Mining, pp. 43-54, Maebashi City, Japan, December 2002.
[7]. Oliveira, S.R.M. and Zaiane, O.R., "An Efficient One-Scan Sanitization for Improving the Balance between Privacy and Knowledge Discovery", Technical Report TR 03-15, June 2003.
[8]. Agrawal, R., Imielinski, T. and Swami, A., "Mining Association Rules between Sets of Items in Large Databases", in Proc. of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, DC, pp. 207-216, 1993.
[9]. Agrawal, R. and Srikant, R., "Fast Algorithms for Mining Association Rules", in Proc. of the 20th International Conference on Very Large Data Bases, Santiago, Chile, pp. 487-499, 1994.
[10]. Han, J. and Kamber, M., "Data Mining: Concepts and Techniques", 2nd ed., Morgan Kaufmann, 2006.
[11]. Pavon, J., Viana, S. and Gomez, S., "Matrix Apriori: Speeding Up the Search for Frequent Patterns", in Proc. of the 24th IASTED International Conference on Databases and Applications, pp. 75-82, 2006.