Abstract Several techniques, such as generalization and bucketization, have been designed for privacy-preserving microdata publishing. Prior work has shown that generalization loses a considerable amount of information, especially for high-dimensional data, and is therefore inefficient in that setting. Bucketization, on the other hand, does not prevent membership disclosure and is not applicable to data without a clear separation between quasi-identifying attributes and sensitive attributes. In this paper, we present a technique called slicing, which partitions the data both horizontally and vertically. Slicing preserves better utility than generalization and does not require a clear separation between quasi-identifying and sensitive attributes. We show how slicing can be used to prevent attribute disclosure, and we develop an efficient algorithm for computing sliced data that obeys the l-diversity requirement. Workload experiments confirm that slicing preserves data utility better than generalization and is more effective than bucketization in workloads involving the sensitive attribute. Keywords – Privacy preserving, Data Security, Data Publishing, Microdata
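As a rough illustration of the slicing idea summarized in this abstract, the sketch below partitions a toy table vertically into attribute groups and horizontally into buckets, then permutes each group independently within every bucket to break cross-group linkage. The attribute names, bucket size, and grouping are illustrative assumptions, not the paper's actual algorithm (which additionally enforces l-diversity when forming buckets).

```python
import random

def slice_table(rows, column_groups, bucket_size, seed=0):
    """Anonymize `rows` (a list of dicts) by slicing: vertical partition into
    `column_groups`, horizontal partition into buckets of `bucket_size`, then
    an independent random permutation of each column group within every
    bucket, which breaks the linkage between groups."""
    rng = random.Random(seed)
    sliced = []
    for start in range(0, len(rows), bucket_size):
        bucket = rows[start:start + bucket_size]
        # For each column group, shuffle its sub-tuples independently.
        permuted = []
        for group in column_groups:
            sub = [tuple(r[a] for a in group) for r in bucket]
            rng.shuffle(sub)
            permuted.append(sub)
        # Re-assemble: the i-th published record pairs the i-th sub-tuple of
        # every group, which need not come from the same original row.
        for i in range(len(bucket)):
            rec = {}
            for group, sub in zip(column_groups, permuted):
                rec.update(dict(zip(group, sub[i])))
            sliced.append(rec)
    return sliced
```

Note that within each bucket the multiset of values per column group is preserved exactly, which is why slicing keeps more utility than generalization while still hiding which sub-tuples belong together.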
A Novel Method for Privacy Preserving Micro data Publishing using Slicing – IJMER
International Journal of Modern Engineering Research (IJMER) is a peer-reviewed online journal. It serves as an international archival forum of scholarly research related to engineering and science education.
International Journal of Modern Engineering Research (IJMER) covers all the fields of engineering and science: Electrical Engineering, Mechanical Engineering, Civil Engineering, Chemical Engineering, Computer Engineering, Agricultural Engineering, Aerospace Engineering, Thermodynamics, Structural Engineering, Control Engineering, Robotics, Mechatronics, Fluid Mechanics, Nanotechnology, Simulators, Web-based Learning, Remote Laboratories, Engineering Design Methods, Education Research, Students' Satisfaction and Motivation, Global Projects, Assessment, and many more.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Improved Slicing Algorithm For Greater Utility In Privacy Preserving Data Pub... – Waqas Tariq
Several algorithms and techniques have been proposed in recent years for the publication of sensitive microdata. However, there is a trade-off to be considered between the level of privacy offered and the usefulness of the published data. Recently, slicing was proposed as a novel technique for increasing the utility of an anonymized published dataset by partitioning the dataset vertically and horizontally. This work proposes a novel technique to increase the utility of a sliced dataset even further by allowing overlapped clustering while maintaining the prevention of membership disclosure. It is further shown that using an alternative algorithm to Mondrian increases the efficiency of slicing. This paper shows through workload experiments that these improvements help preserve data utility better than traditional slicing.
A review on anonymization techniques for privacy preserving data publishing – eSAT Journals
Abstract
With the increased use of online data, privacy in data publishing has become a very important issue. Nowadays, many organizations collect and store huge volumes of data in large databases. A data publisher collects data from data holders and releases it to data recipients for research, analysis, and mining purposes. The released data may reveal information that is considered private and personal, so the privacy of such information has become a subject of research, and in recent years data anonymization techniques in particular have received attention. In this paper, we provide a review of the statistical anonymization techniques that can be used to preserve the privacy of published data. Microdata publishing uses a variety of anonymization methods, such as generalization, bucketization, and suppression. For high-dimensional data, however, generalization loses a significant amount of information and suffers from the curse of dimensionality; bucketization fails to prevent membership disclosure and needs a clear separation between quasi-identifying and sensitive attributes; and suppression reduces data quality drastically. To deal with these problems, the slicing method was introduced. Slicing is a new anonymization technique for published data in which the data are partitioned both horizontally and vertically. Slicing decreases the dimensionality of the data, preserves utility, and maintains correlations between attributes that are highly correlated. Compared to generalization, slicing provides better data utility; compared to bucketization, slicing is more effective. Slicing can handle high-dimensional data and also protects against attribute disclosure and membership disclosure.
Key Words: Privacy Preserving Data Publishing, Microdata, Information Disclosure, Data Anonymization
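To make the contrast drawn in this review concrete, here is a minimal, hypothetical sketch of generalization: quasi-identifier values are replaced by coarser ones (an age range, a zip-code prefix) while the sensitive attribute is left unchanged. The attribute names and granularities are our own illustrative choices, not a specific scheme from the review.

```python
def generalize(rows, age_width=10, zip_prefix=3):
    """Toy generalization: coarsen Age into ranges of `age_width` years and
    truncate Zip to its first `zip_prefix` digits, so individual
    quasi-identifier values are replaced by less specific ones."""
    out = []
    for r in rows:
        lo = (r["Age"] // age_width) * age_width
        z = str(r["Zip"])
        out.append({
            "Age": f"{lo}-{lo + age_width - 1}",
            "Zip": z[:zip_prefix] + "*" * (len(z) - zip_prefix),
            "Disease": r["Disease"],  # sensitive attribute kept as-is
        })
    return out
```

The information loss the review criticizes is visible even here: every exact age in a decade maps to the same range, and with many quasi-identifier attributes the ranges must widen further to keep groups indistinguishable.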
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE – ijistjournal
Dichotomous data is a type of categorical data that is binary, with categories zero and one. Health care data is one of the most heavily used kinds of categorical data. Binary data are the simplest form of data used in health care databases, where close-ended questions can be used; they are very efficient in terms of computational cost and memory capacity for representing categorical data. Clustering health care or medical data is very tedious due to its complex data representation models, high dimensionality, and data sparsity. In this paper, clustering is performed after transforming the dichotomous data into real values by the Wiener transformation. The proposed algorithm can be used for determining the correlation of the health disorders and symptoms observed in large medical and health binary databases. Computational results show that clustering based on the Wiener transformation is very efficient in terms of objectivity and subjectivity.
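The abstract does not give the exact form of the Wiener transformation it uses, so the following is only a stand-in that conveys the central idea: a 0/1 vector is mapped to real values by local smoothing before any distance-based clustering is applied. It should not be read as the paper's actual transformation.

```python
def smooth_binary(bits, window=3):
    """Map a 0/1 vector to real values via a local-mean smoother. This is a
    simple stand-in for the Wiener transformation described in the abstract
    (whose exact formulation is not given here); the point is only that
    dichotomous data become real-valued, so Euclidean-style clustering
    becomes meaningful."""
    n = len(bits)
    out = []
    for i in range(n):
        lo, hi = max(0, i - window // 2), min(n, i + window // 2 + 1)
        out.append(sum(bits[lo:hi]) / (hi - lo))  # mean over the local window
    return out
```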
Various anonymization techniques, such as generalization and bucketization, have been designed for providing data privacy in microdata publishing. Recent work shows that generalization loses a considerable amount of information on high-dimensional data, while bucketization does not prevent membership disclosure and requires a clear separation between quasi-identifying attributes and sensitive attributes. We propose slicing, a technique that partitions the data both horizontally and vertically and is combined with entity resolution; it preserves data utility and protects against membership disclosure. An efficient slicing algorithm is developed that satisfies the l-diversity requirement. Workload experiments with the sensitive attribute confirm that slicing provides better utility than generalization and is more effective than bucketization. As an extension, we propose a technique called overlapped slicing, where an attribute can be placed in more than one column, so that the release in each column carries more attribute correlations.
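The overlapped-slicing extension, in which an attribute may appear in more than one column, can be sketched roughly as follows. A single bucket and invented attribute names are assumed; the point is that the shared attribute (Sex below) is published once per column group it belongs to, so each group retains correlations with more attributes.

```python
import random

def overlapped_slice(rows, groups, seed=0):
    """Overlapped slicing sketch: `groups` may share attributes. Each group's
    sub-tuples are permuted independently within the (single) bucket, and the
    shared attribute appears in every group that contains it."""
    rng = random.Random(seed)
    columns = []
    for g in groups:
        sub = [tuple(r[a] for a in g) for r in rows]
        rng.shuffle(sub)  # break linkage between this group and the others
        columns.append((g, sub))
    # Publish per-row records: one cell (a sub-tuple) per column group.
    return [{",".join(g): sub[i] for g, sub in columns} for i in range(len(rows))]
```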
Enhanced Privacy Preserving Access Control in Incremental Data using Microaggre... – rahulmonikasharma
In microdata releases, the main task is to protect the privacy of data subjects. Microaggregation is a disclosure-limitation technique for protecting the privacy of microdata. It is an alternative to generalization and suppression for generating k-anonymous data sets, in which the identity of each subject is hidden within a group of k subjects. Microaggregation perturbs the data, and additional masking allows data utility to be refined in several ways: increasing data granularity, avoiding discretization of numerical data, and reducing the impact of outliers. If the variability of the private data values in a group of k subjects is too small, k-anonymity does not provide protection against attribute disclosure. In this work, role-based access control is assumed: the access control policies assign selection predicates to roles, and the concept of an imprecision bound is used for each permission to define a threshold on the amount of imprecision that can be tolerated, so the proposed approach reduces the imprecision for each selection predicate. Whereas existing papers carry out anonymization only for a static relational table, here the privacy-preserving access control mechanism is applied to incremental data.
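A minimal sketch of univariate microaggregation as described above, where groups of at least k values are replaced by their mean so every published value is shared by at least k subjects, might look like this. The fixed-size grouping heuristic is a simplification of optimal microaggregation.

```python
def microaggregate(values, k):
    """Univariate microaggregation sketch: sort the values, form groups of at
    least k consecutive values, and replace every value in a group by the
    group mean, so each published value is shared by >= k subjects."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        # The last group absorbs the remainder so no group is smaller than k.
        j = len(order) if len(order) - i < 2 * k else i + k
        group = order[i:j]
        mean = sum(values[g] for g in group) / len(group)
        for g in group:
            out[g] = mean
        i = j
    return out
```

This also makes the attribute-disclosure caveat in the abstract visible: if all values in a group were already nearly identical, publishing their mean reveals almost the exact value despite k-anonymity.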
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
A Rule based Slicing Approach to Achieve Data Publishing and Privacy – ijsrd.com
Several anonymization techniques, such as generalization and bucketization, have been designed for privacy-preserving microdata publishing. Recent work has shown that generalization loses a considerable amount of information, especially for high-dimensional data. Bucketization, on the other hand, does not prevent membership disclosure and does not apply to data that do not have a clear separation between quasi-identifying attributes and sensitive attributes. An existing system proposed the slicing concept, with its tuple-based partitioning, to overcome the shortcomings of generalization and bucketization. In this paper, we present a novel technique called rule-based slicing, which partitions the data both horizontally and vertically. We show that slicing preserves better data utility than generalization and can be used for membership disclosure protection. Another important advantage of slicing is that it can handle high-dimensional data. We show how slicing can be used for attribute disclosure protection and develop an efficient algorithm for computing the sliced data that obeys the l-diversity requirement. Workload experiments confirm that slicing preserves better utility than generalization and is more effective than bucketization in workloads involving the sensitive attribute. The experiments also demonstrate that slicing can be used to prevent membership disclosure.
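The l-diversity requirement mentioned here can be checked, in its simplest "distinct" form, with a few lines; this is a generic check on the sensitive values of each bucket (equivalence class), not the paper's algorithm for constructing such buckets.

```python
def is_l_diverse(buckets, l):
    """Check distinct l-diversity: every bucket must contain at least `l`
    distinct sensitive values. `buckets` is a list of lists of sensitive
    values, one inner list per bucket/equivalence class."""
    return all(len(set(b)) >= l for b in buckets)
```

Stronger variants (entropy l-diversity, recursive (c, l)-diversity) tighten this by also bounding how skewed the value frequencies may be.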
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE – aciijournal
Today, an enormous amount of data is collected in medical databases. These databases may contain valuable information encapsulated in nontrivial relationships among symptoms and diagnoses. Extracting such dependencies from historical data is much easier to do using medical systems, and the resulting knowledge can be used in future medical decision making. In this paper, a new algorithm based on C4.5 to mine data for medicine applications is proposed and then evaluated against two datasets and the C4.5 algorithm in terms of accuracy.
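C4.5 chooses split attributes using entropy-based measures. The following sketch computes plain information gain (C4.5 itself refines this to gain ratio to penalize many-valued attributes); the toy symptom/diagnosis attributes are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, the impurity measure C4.5 builds on."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, label_key):
    """Information gain of splitting `rows` (a list of dicts) on `attr`:
    entropy before the split minus the weighted entropy after it."""
    labels = [r[label_key] for r in rows]
    before = entropy(labels)
    by_value = {}
    for r in rows:
        by_value.setdefault(r[attr], []).append(r[label_key])
    after = sum(len(v) / len(rows) * entropy(v) for v in by_value.values())
    return before - after
```

A tree builder would evaluate this gain for every candidate attribute at each node and split on the best one, recursing until the labels are pure.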
Abstract Learning Analytics by nature relies on computational information processing activities intended to extract from raw data some interesting aspects that can be used to obtain insights into the behaviors of learners, the design of learning experiences, and so on. There is a large variety of computational techniques that can be employed, all with interesting properties, but it is the interpretation of their results that really forms the core of the analytics process. As a rising subject, data mining and business intelligence are playing an increasingly important role in the decision support activity of every walk of life. The Variance Rover System (VRS) focuses on large data sets obtained from online web visits, categorizing them into clusters according to some similarity measure, and on the process of predicting customer behavior and selecting actions to influence that behavior to benefit the company, so as to take optimized and beneficial decisions about business expansion. Keywords: Analytics, Business intelligence, Clustering, Data Mining, Standard K-means, Optimized K-means
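The standard k-means baseline named in the keywords can be sketched in pure Python as follows; this is the textbook algorithm (assign each point to its nearest center, recompute centers as cluster means, repeat), not the paper's optimized variant.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Standard k-means sketch: pick k initial centers from the data, then
    alternate assigning each point to its nearest center and recomputing
    each center as the mean of its cluster."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Nearest center under squared Euclidean distance.
            j = min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        centers = [
            tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
    return centers, clusters
```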
Privacy Preserving of Data Mining Based on Enumeration and Concatenation of A... – AM Publications
Privacy-preserving data mining and data security have been developed for privacy protection and the protection of data during the data mining process and the discovery of knowledge. Individual personal data can be used in the data mining process without disclosing the individual's identity; any information regarding the individual's identity should be removed. Privacy-preserving data mining is concerned with producing valid data mining results. For example, data mining applications in banking, credit cards, and hospitals deal with individuals' sensitive personal information; for such applications, the sensitive data can be removed from the database, or the personal information can be encrypted and stored at different locations, in order to protect the privacy of the individuals. This paper presents a formal protection model based on enumerating and concatenating attributes using k-anonymity. The methods used in this paper are: 1. enumeration and concatenation of the age and sex attributes; 2. concatenation and encoding of the city and zip code attributes.
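A toy version of the concatenation idea, with an assumed encoding (decade of age concatenated with sex) and a distinct k-anonymity check, might look like the following. The encoding is our own illustrative choice, not the paper's actual scheme.

```python
from collections import Counter

def concat_quasi(rows):
    """Concatenate quasi-identifier attributes (here an age bracket and sex)
    into a single coded attribute, in the spirit of the
    enumeration-and-concatenation scheme; the exact encoding is assumed."""
    return [f"{(r['Age'] // 10) * 10}{r['Sex']}" for r in rows]

def is_k_anonymous(codes, k):
    """A table is k-anonymous if every quasi-identifier combination (here,
    every code) is shared by at least k records."""
    return all(c >= k for c in Counter(codes).values())
```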
An approach for discrimination prevention in data mining – eSAT Journals
Abstract In the age of database technologies, a large amount of data is collected and analyzed using data mining techniques. However, the main issues in data mining are potential privacy invasion and potential discrimination. One of the techniques used in data mining for making decisions is classification; if the dataset is biased, discriminatory decisions may occur. Therefore, in this paper we review recent state-of-the-art anti-discrimination approaches and focus on discrimination discovery and prevention in data mining. We also study a theoretical proposal for enhancing data quality. Keywords- Antidiscrimination, data mining, direct and indirect discrimination prevention, rule protection, rule generalization, privacy.
TWO PARTY HIERARCHICAL CLUSTERING OVER HORIZONTALLY PARTITIONED DATA SET – IJDKP
Data mining is a task in which data is extracted from a large database and put into an understandable form or structure so that it can be used further. In this paper we present an approach in which the concept of hierarchical clustering is applied over a horizontally partitioned data set. We also explain the required algorithms, such as hierarchical clustering and an algorithm for finding the closest pair of clusters, and we describe two-party computation. Since the privacy of data is of the utmost importance these days, we present an approach by which privacy preservation can be applied between two parties whose data is distributed horizontally. We also explain the hierarchical clustering that we apply in our method.
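The core step of agglomerative hierarchical clustering referenced above, finding the closest pair of clusters, can be sketched as follows (single linkage over squared Euclidean distance). The two-party privacy protocol itself is not modeled here; this is only the minimum-closest-cluster subroutine the abstract mentions.

```python
def closest_clusters(clusters):
    """Find the pair of clusters with minimum single-linkage distance, i.e.
    the smallest squared Euclidean distance between any two of their points.
    An agglomerative algorithm would merge this pair and repeat."""
    best, pair = float("inf"), None
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            d = min(sum((a - b) ** 2 for a, b in zip(p, q))
                    for p in clusters[i] for q in clusters[j])
            if d < best:
                best, pair = d, (i, j)
    return pair, best
```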
Conceptual model: think of the conceptual model as an architect's conceptual drawing of a house. It provides a good idea of what is required with very little additional detail.
Data modelling is a technique for defining business requirements for a database. It is sometimes called database modelling because a data model is eventually implemented in a database.
Purpose of the data base system, data abstraction, data model, data independence, data definition
language, data manipulation language, data base manager, data base administrator, data base users,
overall structure.
ER models, entities, mapping constraints, keys, E-R diagrams, reduction of E-R diagrams to tables,
generalization, aggregation, design of an E-R database scheme.
Oracle RDBMS, architecture, kernel, system global area (SGA), data base writer, log writer, process
monitor, archiver, database files, control files, redo log files, oracle utilities.
SQL: commands and data types, data definition language commands, data manipulation commands,
data query language commands, transaction language control commands, data control language
commands.
Joins, equi-joins, non-equi-joins, self joins, other joins, aggregate functions, math functions, string
functions, group by clause, data functions and concepts of null values, sub-queries, views.
PL/SQL: basics of PL/SQL, data types, control structures, database access with PL/SQL, database
connections, transaction management, database locking, cursor management.
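To ground the equi-join and group-by topics listed above, here is a small runnable example using Python's built-in sqlite3 in place of Oracle; the table and column names are invented for illustration.

```python
import sqlite3

# In-memory database with a classic dept/emp pair of tables.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, ename TEXT,
                       sal REAL, deptno INTEGER REFERENCES dept(deptno));
    INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH');
    INSERT INTO emp VALUES (1, 'KING', 5000, 10), (2, 'SMITH', 800, 20),
                           (3, 'FORD', 3000, 20);
""")
# Equi-join emp with dept on deptno, then aggregate per department.
rows = con.execute("""
    SELECT d.dname, COUNT(*), AVG(e.sal)
    FROM emp e JOIN dept d ON e.deptno = d.deptno
    GROUP BY d.dname ORDER BY d.dname
""").fetchall()
```

The same SELECT shape (join condition in ON, aggregates over GROUP BY groups) carries over to Oracle SQL with minor dialect differences.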
Joss Wright, Oxford Internet Institute (Plenary): Privacy-Preserving Data Ana... – i_scienceEU
Network of Excellence Internet Science Summer School. The theme of the summer school is "Internet Privacy and Identity, Trust and Reputation Mechanisms".
More information: http://www.internet-science.eu/
ARX - a comprehensive tool for anonymizing / de-identifying biomedical data – arx-deidentifier
Website with further information: http://arx.deidentifier.org
Description of this talk:
Collaboration and data sharing have become core elements of biomedical research. Especially when sensitive data from distributed sources are linked, privacy threats have to be considered. Statistical disclosure control allows the protection of sensitive data by introducing fuzziness. Reduction of data quality, however, needs to be balanced against gains in protection. Therefore, tools are needed which provide a good overview of the anonymization process to those responsible for data sharing. These tools require graphical interfaces and the use of intuitive and replicable methods. In addition, extensive testing, documentation and openness to reviews by the community are important. Existing publicly available software is limited in functionality, and often active support is lacking. We present the data anonymization tool ARX, which has been developed in close cooperation between the Chair for Biomedical Informatics, the Chair for IT Security and the Chair for Database Systems at Technische Universität München (TUM), Germany. ARX enables the de-identification of structured data (i.e., tabular data) and implements a wide variety of privacy methods in a highly efficient manner. It is extensible, well documented and actively supported. ARX provides an intuitive cross-platform graphical interface and offers a public API for integration with other software systems.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
A Rule based Slicing Approach to Achieve Data Publishing and Privacyijsrd.com
several anonymization techniques, such as generalization and bucketization, have been designed for privacy preserving micro data publishing. Recent work has shown that generalization loses considerable amount of information, especially for high dimensional data. Bucketization, on the other hand, does not prevent membership disclosure and does not apply for data that do not have a clear separation between quasi-identifying attributes and sensitive attributes. The existing system proposed slicing concept to overcome the tuple based partition this has been done to overcome the previous generalization and bucketization. In this paper, present a novel technique called rule based slicing, which partitions the data both horizontally and vertically. We show that slicing preserves better data utility than generalization and can be used for membership disclosure protection. Another important advantage of slicing is that it can handle high-dimensional data. We show how slicing can be used for attribute disclosure protection and develop an efficient algorithm for computing the sliced data that obey the l-diversity requirement. The workload experiments confirm that slicing preserves better utility than generalization and is more effective than bucketization in workloads involving the sensitive attribute. The experiments also demonstrate that slicing can be used to prevent membership disclosure
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINEaciijournal
Today, enormous amount of data is collected in medical databases. These databases may contain valuable
information encapsulated in nontrivial relationships among symptoms and diagnoses. Extracting such
dependencies from historical data is much easier to done by using medical systems. Such knowledge can be
used in future medical decision making. In this paper, a new algorithm based on C4.5 to mind data for
medince applications proposed and then it is evaluated against two datasets and C4.5 algorithm in terms of
accuracy.
Abstract Learning Analytics by nature relies on computational information processing activities intended to extract from raw data some interesting aspects that can be used to obtain insights into the behaviors of learners, the design of learning experiences, etc. There is a large variety of computational techniques that can be employed, all with interesting properties, but it is the interpretation of their results that really forms the core of the analytics process. As a rising subject, data mining and business intelligence are playing an increasingly important role in the decision support activity of every walk of life. The Variance Rover System (VRS) mainly focused on the large data sets obtained from online web visiting and categorizing this into clusters according some similarity and the process of predicting customer behavior and selecting actions to influence that behavior to benefit the company, so as to take optimized and beneficial decisions of business expansion. Keywords: Analytics, Business intelligence, Clustering, Data Mining, Standard K-means, Optimized K-means
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Privacy Preserving of Data Mining Based on Enumeration and Concatenation of A... – AM Publications
Privacy preserving data mining has been developed for the protection of privacy and of data during the data mining
process and the discovery of knowledge. Individual personal data can be used for the data mining process without
disclosing the individual's identity, or any information regarding the individual's identity should be removed.
Privacy preserving data mining is concerned with producing valid data mining results. For example, data mining
applications in banking, credit cards and hospitals deal with sensitive personal information; for such applications,
the sensitive individual data can be removed from the database, or the personal information can be encrypted and
stored at different locations, in order to protect the privacy of the individuals. This paper includes a formal
protection model based on enumerating and concatenating attributes using k-anonymity. The methods used in this
paper are: 1. enumeration and concatenation of the age and sex attributes; 2. concatenation and encoding of the city
and zipcode attributes.
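As a rough illustration of the two attribute-combination methods listed above, the sketch below buckets age, concatenates it with sex, and masks the zipcode suffix, then checks k-anonymity over the resulting quasi-identifier strings. The bucket width, masking depth and sample records are all hypothetical; the paper's exact encoding scheme may differ.

```python
from collections import Counter

def generalize(record):
    """Build a generalized quasi-identifier: 10-year age bucket
    concatenated with sex, plus a zipcode with its last 2 digits masked."""
    age, sex, zipc = record["age"], record["sex"], record["zip"]
    lo = (age // 10) * 10
    return f"{lo}-{lo + 9}|{sex}|{zipc[:3]}**"

def is_k_anonymous(records, k):
    """A table is k-anonymous if every quasi-identifier group has >= k rows."""
    counts = Counter(generalize(r) for r in records)
    return all(c >= k for c in counts.values())

data = [
    {"age": 23, "sex": "M", "zip": "56001"},
    {"age": 27, "sex": "M", "zip": "56004"},
    {"age": 31, "sex": "F", "zip": "56010"},
    {"age": 35, "sex": "F", "zip": "56012"},
]
print(is_k_anonymous(data, 2))  # True: each generalized QID group has 2 records
```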
An approach for discrimination prevention in data mining – eSAT Journals
Abstract In the age of database technologies a large amount of data is collected and analyzed by using data mining techniques. However, the main issues in data mining are potential privacy invasion and potential discrimination. One of the techniques used in data mining for making decisions is classification. If the dataset is biased, however, discriminatory decisions may occur. Therefore, in this paper we review the recent state-of-the-art approaches for antidiscrimination techniques and focus on discrimination discovery and prevention in data mining. We also study a theoretical proposal for enhancing data quality results. Keywords- Antidiscrimination, data mining, direct and indirect discrimination prevention, rule protection, rule generalization, privacy.
TWO PARTY HIERARCHICAL CLUSTERING OVER HORIZONTALLY PARTITIONED DATA SET – IJDKP
Data mining is a task in which data is extracted from a large database and put into an understandable
form or structure so that it can be used further. In this paper we present an approach by which the
concept of hierarchical clustering is applied over a horizontally partitioned data set. We also explain the
required algorithms, such as hierarchical clustering and algorithms for finding the minimum closest cluster. In this
paper we also explain two-party computation. Privacy of data is most important these days, hence we
present an approach by which we can apply privacy preservation over two parties which distribute their
data horizontally. We also explain the hierarchical clustering which we apply in the present method.
Conceptual model,
Think of the conceptual model as an architect’s conceptual drawing of a house. It provides a good idea of what is required with very little additional detail;
Data modelling is a technique for defining business requirements for a database. It is sometimes called database modelling because a data model is eventually implemented in a database
Purpose of the data base system, data abstraction, data model, data independence, data definition
language, data manipulation language, data base manager, data base administrator, data base users,
overall structure.
ER Models, entities, mapping constrains, keys, E-R diagram, reduction E-R diagrams to tables,
generatio, aggregation, design of an E-R data base scheme.
Oracle RDBMS, architecture, kernel, system global area (SGA), data base writer, log writer, process
monitor, archiver, database files, control files, redo log files, oracle utilities.
SQL: commands and data types, data definition language commands, data manipulation commands,
data query language commands, transaction language control commands, data control language
commands.
Joins, equi-joins, non-equi-joins, self joins, other joins, aggregate functions, math functions, string
functions, group by clause, data function and concepts of null values, sub-querries, views.
PL/SQL, basics of pl/sql, data types, control structures, database access with PL/SQL, data base
connections, transaction management, data base locking, cursor management.
Joss Wright, Oxford Internet Institute (Plenary): Privacy-Preserving Data Ana...i_scienceEU
Network of Excellence Internet Science Summer School. The theme of the summer school is "Internet Privacy and Identity, Trust and Reputation Mechanisms".
More information: http://www.internet-science.eu/
ARX - a comprehensive tool for anonymizing / de-identifying biomedical dataarx-deidentifier
Website with further information: http://arx.deidentifier.org
Description of this talk:
Collaboration and data sharing have become core elements of biomedical research. Especially when sensitive data from distributed sources are linked, privacy threats have to be considered. Statistical disclosure control allows the protection of sensitive data by introducing fuzziness. Reduction of data quality, however, needs to be balanced against gains in protection. Therefore, tools are needed which provide a good overview of the anonymization process to those responsible for data sharing. These tools require graphical interfaces and the use of intuitive and replicable methods. In addition, extensive testing, documentation and openness to reviews by the community are important. Existing publicly available software is limited in functionality, and often active support is lacking. We present the data anonymization tool ARX, which has been developed in close cooperation between the Chair for Biomedical Informatics, the Chair for IT Security and the Chair for Database Systems at Technische Universität München (TUM), Germany. ARX enables the de-identification of structured data (i.e., tabular data) and implements a wide variety of privacy methods in a highly efficient manner. It is extensible, well documented and actively supported. ARX provides an intuitive cross-platform graphical interface and offers a public API for integration with other software systems.
A Review Study on the Privacy Preserving Data Mining Techniques and Approaches14894
In this paper we review on the
various privacy preserving data mining techniques like data
modification and secure multiparty computation based on the
different aspects.
Index Terms– Privacy and Security, Data Mining, Privacy
Preserving, Secure Multiparty Computation (SMC) and Data
Modification
10 главных идей гибкой разработки. С элементами, потому что довольно сложно применять чистый Scrum к большому проекту, в котором есть много поддержки и форсмажора. Подготовлена на базе книги Scrum Джеффа Сазерленда и материалов компании ScrumTrek. Презентация родилась после прочтения книги и посещения антиконференции AgileCamp. Рассказал команде проектов Колёса, Крыша и Маркет, будем более активно применять идеи и методики, которые помогают в разработке проектов по всему миру.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
Privacy preservation techniques in data miningeSAT Journals
Abstract In this paper different privacy preservation techniques are compared. Classification is the most commonly applied data mining technique, which employs a set of pre-classified examples to develop a model that can classify the population of records at large. Fraud detection and credit risk applications are particularly well suited to this type of analysis. This approach frequently employs decision tree or neural network-based classification algorithms. The data classification process involves learning and classification. In Learning the training data are analyzed by classification algorithm. In classification test data are used to estimate the accuracy of the classification rules. If the accuracy is acceptable the rules can be applied to the new data tuples . For a fraud detection application, this would include complete records of both fraudulent and valid activities determined on a record-by-record basis. The classifier-training algorithm uses these pre-classified examples to determine the set of parameters required for proper discrimination. The algorithm then encodes these parameters into a model called a classifier Index Terms: Data Mining, Privacy Preservation, Clustering, Classification Techniques, Naive Bayes.
SCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMSijdkp
Subspace clustering discovers the clusters embedded in multiple, overlapping subspaces of high
dimensional data. Many significant subspace clustering algorithms exist, each having different
characteristics caused by the use of different techniques, assumptions, heuristics used etc. A comprehensive
classification scheme is essential which will consider all such characteristics to divide subspace clustering
approaches in various families. The algorithms belonging to same family will satisfy common
characteristics. Such a categorization will help future developers to better understand the quality criteria to
be used and similar algorithms to be used to compare results with their proposed clustering algorithms. In
this paper, we first proposed the concept of SCAF (Subspace Clustering Algorithms’ Family).
Characteristics of SCAF will be based on the classes such as cluster orientation, overlap of dimensions etc.
As an illustration, we further provided a comprehensive, systematic description and comparison of few
significant algorithms belonging to “Axis parallel, overlapping, density based” SCAF.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CAREijistjournal
Dichotomous data is a type of categorical data, which is binary with categories zero and one. Health care data is one of the heavily used categorical data. Binary data are the simplest form of data used for heath care databases in which close ended questions can be used; it is very efficient based on computational efficiency and memory capacity to represent categorical type data. Clustering health care or medical data is very tedious due to its complex data representation models, high dimensionality and data sparsity. In this paper, clustering is performed after transforming the dichotomous data into real by wiener transformation. The proposed algorithm can be usable for determining the correlation of the health disorders and symptoms observed in large medical and health binary databases. Computational results show that the clustering based on Wiener transformation is very efficient in terms of objectivity and subjectivity.
Mechanical properties of hybrid fiber reinforced concrete for pavementseSAT Journals
Abstract
The effect of addition of mono fibers and hybrid fibers on the mechanical properties of concrete mixture is studied in the present
investigation. Steel fibers of 1% and polypropylene fibers 0.036% were added individually to the concrete mixture as mono fibers and
then they were added together to form a hybrid fiber reinforced concrete. Mechanical properties such as compressive, split tensile and
flexural strength were determined. The results show that hybrid fibers improve the compressive strength marginally as compared to
mono fibers. Whereas, hybridization improves split tensile strength and flexural strength noticeably.
Keywords:-Hybridization, mono fibers, steel fiber, polypropylene fiber, Improvement in mechanical properties.
Material management in construction – a case studyeSAT Journals
Abstract
The objective of the present study is to understand about all the problems occurring in the company because of improper application
of material management. In construction project operation, often there is a project cost variance in terms of the material, equipments,
manpower, subcontractor, overhead cost, and general condition. Material is the main component in construction projects. Therefore,
if the material management is not properly managed it will create a project cost variance. Project cost can be controlled by taking
corrective actions towards the cost variance. Therefore a methodology is used to diagnose and evaluate the procurement process
involved in material management and launch a continuous improvement was developed and applied. A thorough study was carried
out along with study of cases, surveys and interviews to professionals involved in this area. As a result, a methodology for diagnosis
and improvement was proposed and tested in selected projects. The results obtained show that the main problem of procurement is
related to schedule delays and lack of specified quality for the project. To prevent this situation it is often necessary to dedicate
important resources like money, personnel, time, etc. To monitor and control the process. A great potential for improvement was
detected if state of the art technologies such as, electronic mail, electronic data interchange (EDI), and analysis were applied to the
procurement process. These helped to eliminate the root causes for many types of problems that were detected.
Managing drought short term strategies in semi arid regions a case studyeSAT Journals
Abstract
Drought management needs multidisciplinary action. Interdisciplinary efforts among the experts in various fields of the droughts
prone areas are helpful to achieve tangible and permanent solution for this recurring problem. The Gulbarga district having the total
area around 16, 240 sq.km, and accounts 8.45 per cent of the Karnataka state area. The district has been situated with latitude 17º 19'
60" North and longitude of 76 º 49' 60" east. The district is situated entirely on the Deccan plateau positioned at a height of 300 to
750 m above MSL. Sub-tropical, semi-arid type is one among the drought prone districts of Karnataka State. The drought
management is very important for a district like Gulbarga. In this paper various short term strategies are discussed to mitigate the
drought condition in the district.
Keywords: Drought, South-West monsoon, Semi-Arid, Rainfall, Strategies etc.
Life cycle cost analysis of overlay for an urban road in bangaloreeSAT Journals
Abstract
Pavements are subjected to severe condition of stresses and weathering effects from the day they are constructed and opened to traffic
mainly due to its fatigue behavior and environmental effects. Therefore, pavement rehabilitation is one of the most important
components of entire road systems. This paper highlights the design of concrete pavement with added mono fibers like polypropylene,
steel and hybrid fibres for a widened portion of existing concrete pavement and various overlay alternatives for an existing
bituminous pavement in an urban road in Bangalore. Along with this, Life cycle cost analyses at these sections are done by Net
Present Value (NPV) method to identify the most feasible option. The results show that though the initial cost of construction of
concrete overlay is high, over a period of time it prove to be better than the bituminous overlay considering the whole life cycle cost.
The economic analysis also indicates that, out of the three fibre options, hybrid reinforced concrete would be economical without
compromising the performance of the pavement.
Keywords: - Fatigue, Life cycle cost analysis, Net Present Value method, Overlay, Rehabilitation
Laboratory studies of dense bituminous mixes ii with reclaimed asphalt materialseSAT Journals
Abstract
The issue of growing demand on our nation’s roadways over that past couple of decades, decreasing budgetary funds, and the need to
provide a safe, efficient, and cost effective roadway system has led to a dramatic increase in the need to rehabilitate our existing
pavements and the issue of building sustainable road infrastructure in India. With these emergency of the mentioned needs and this
are today’s burning issue and has become the purpose of the study.
In the present study, the samples of existing bituminous layer materials were collected from NH-48(Devahalli to Hassan) site.The
mixtures were designed by Marshall Method as per Asphalt institute (MS-II) at 20% and 30% Reclaimed Asphalt Pavement (RAP).
RAP material was blended with virgin aggregate such that all specimens tested for the, Dense Bituminous Macadam-II (DBM-II)
gradation as per Ministry of Roads, Transport, and Highways (MoRT&H) and cost analysis were carried out to know the economics.
Laboratory results and analysis showed the use of recycled materials showed significant variability in Marshall Stability, and the
variability increased with the increase in RAP content. The saving can be realized from utilization of recycled materials as per the
methodology, the reduction in the total cost is 19%, 30%, comparing with the virgin mixes.
Keywords: Reclaimed Asphalt Pavement, Marshall Stability, MS-II, Dense Bituminous Macadam-II
Laboratory investigation of expansive soil stabilized with natural inorganic ...eSAT Journals
Abstract
Soil stabilization has proven to be one of the oldest techniques to improve the soil properties. Literature review conducted revealed
that uses of natural inorganic stabilizers are found to be one of the best options for soil stabilization. In this regard an attempt has
been made to evaluate the influence of RBI-81 stabilizer on properties of black cotton soil through laboratory investigations. Black
cotton soil with varying percentages of RBI-81 viz., 0, 0.5, 1, 1.5, 2, and 2.5 percent were studied for moisture density relationships
and strength behaviour of soils. Also the effect of curing period was evaluated as literature review clearly emphasized the strength
gain of soils stabilized with RBI-81 over a period of time. The results obtained shows that the unconfined compressive strength of
specimens treated with RBI-81 increased approximately by 250% for a curing period of 28 days as compared to virgin soil. Further
the CBR value improved approximately by 400%. The studies indicated an increasing trend for soil strength behaviour with
increasing percentage of RBI-81 suggesting its potential applications in soil stabilization.
Influence of reinforcement on the behavior of hollow concrete block masonry p...eSAT Journals
Abstract
Reinforced masonry was developed to exploit the strength potential of masonry and to solve its lack of tensile strength. Experimental
and analytical studies have been carried out to investigate the effect of reinforcement on the behavior of hollow concrete block
masonry prisms under compression and to predict ultimate failure compressive strength. In the numerical program, three dimensional
non-linear finite elements (FE) model based on the micro-modeling approach is developed for both unreinforced and reinforced
masonry prisms using ANSYS (14.5). The proposed FE model uses multi-linear stress-strain relationships to model the non-linear
behavior of hollow concrete block, mortar, and grout. Willam-Warnke’s five parameter failure theory has been adopted to model the
failure of masonry materials. The comparison of the numerical and experimental results indicates that the FE models can successfully
capture the highly nonlinear behavior of the physical specimens and accurately predict their strength and failure mechanisms.
Keywords: Structural masonry, Hollow concrete block prism, grout, Compression failure, Finite element method,
Numerical modeling.
Influence of compaction energy on soil stabilized with chemical stabilizereSAT Journals
Abstract
Increase in traffic along with heavier magnitude of wheel loads cause rapid deterioration in pavements. There is a need to improve
density, strength of soil subgrade and other pavement layers. In this study an attempt is made to improve the properties of locally
available loamy soil using twin approaches viz., i) increasing the compaction of soil and ii) treating the soil with chemical stabilizer.
Laboratory studies are carried out on both untreated and treated soil samples compacted by different compaction efforts. Studies
show that increase in compaction effort results in increase in density of soil. However in soil treated with chemical stabilizer, rate of
increase in density is not significant. The soil treated with chemical stabilizer exhibits improvement in both strength and performance
properties.
Keywords: compaction, density, subgradestabilization, resilient modulus
Geographical information system (gis) for water resources managementeSAT Journals
Abstract
Water resources projects are inherited with overlapping and at times conflicting objectives. These projects are often of varied sizes
ranging from major projects with command areas of millions of hectares to very small projects implemented at the local level. Thus,
in all these projects there is seldom proper coordination which is essential for ensuring collective sustainability.
Integrated watershed development and management is the accepted answer but in turn requires a comprehensive framework that can
enable planning process involving all the stakeholders at different levels and scales is compulsory. Such a unified hydrological
framework is essential to evaluate the cause and effect of all the proposed actions within the drainage basins.
The present paper describes a hydrological framework developed in the form of a Hydrologic Information System (HIS) which is
intended to meet the specific information needs of the various line departments of a typical State connected with water related aspects.
The HIS consist of a hydrologic information database coupled with tools for collating primary and secondary data and tools for
analyzing and visualizing the data and information. The HIS also incorporates hydrological model base for indirect assessment of
various entities of water balance in space and time. The framework would be maintained and updated to reflect fully the most
accurate ground truth data and the infrastructure requirements for planning and management.
Keywords: Hydrological Information System (HIS); WebGIS; Data Model; Web Mapping Services
Forest type mapping of bidar forest division, karnataka using geoinformatics ...eSAT Journals
Abstract
The study demonstrate the potentiality of satellite remote sensing technique for the generation of baseline information on forest types
including tree plantation details in Bidar forest division, Karnataka covering an area of 5814.60Sq.Kms. The Total Area of Bidar
forest division is 5814Sq.Kms analysis of the satellite data in the study area reveals that about 84% of the total area is Covered by
crop land, 1.778% of the area is covered by dry deciduous forest, 1.38 % of mixed plantation, which is very threatening to the
environmental stability of the forest, future plantation site has been mapped. With the use of latest Geo-informatics technology proper
and exact condition of the trees can be observed and necessary precautions can be taken for future plantation works in an appropriate
manner
Keywords:-RS, GIS, GPS, Forest Type, Tree Plantation
Factors influencing compressive strength of geopolymer concreteeSAT Journals
Abstract
To study effects of several factors on the properties of fly ash based geopolymer concrete on the compressive strength and also the
cost comparison with the normal concrete. The test variables were molarities of sodium hydroxide(NaOH) 8M,14M and 16M, ratio of
NaOH to sodium silicate (Na2SiO3) 1, 1.5, 2 and 2.5, alkaline liquid to fly ash ratio 0.35 and 0.40 and replacement of water in
Na2SiO3 solution by 10%, 20% and 30% were used in the present study. The test results indicated that the highest compressive
strength 54 MPa was observed for 16M of NaOH, ratio of NaOH to Na2SiO3 2.5 and alkaline liquid to fly ash ratio of 0.35. Lowest
compressive strength of 27 MPa was observed for 8M of NaOH, ratio of NaOH to Na2SiO3 is 1 and alkaline liquid to fly ash ratio of
0.40. Alkaline liquid to fly ash ratio of 0.35, water replacement of 10% and 30% for 8 and 16 molarity of NaOH and has resulted in
compressive strength of 36 MPa and 20 MPa respectively. Superplasticiser dosage of 2 % by weight of fly ash has given higher
strength in all cases.
Keywords: compressive strength, alkaline liquid, fly ash
Experimental investigation on circular hollow steel columns in filled with li...eSAT Journals
Abstract
Composite Circular hollow Steel tubes with and without GFRP infill for three different grades of Light weight concrete are tested for
ultimate load capacity and axial shortening , under Cyclic loading. Steel tubes are compared for different lengths, cross sections and
thickness. Specimens were tested separately after adopting Taguchi’s L9 (Latin Squares) Orthogonal array in order to save the initial
experimental cost on number of specimens and experimental duration. Analysis was carried out using ANN (Artificial Neural
Network) technique with the assistance of Mini Tab- a statistical soft tool. Comparison for predicted, experimental & ANN output is
obtained from linear regression plots. From this research study, it can be concluded that *Cross sectional area of steel tube has most
significant effect on ultimate load carrying capacity, *as length of steel tube increased- load carrying capacity decreased & *ANN
modeling predicted acceptable results. Thus ANN tool can be utilized for predicting ultimate load carrying capacity for composite
columns.
Keywords: Light weight concrete, GFRP, Artificial Neural Network, Linear Regression, Back propagation, orthogonal
Array, Latin Squares
Experimental behavior of circular hsscfrc filled steel tubular columns under ...eSAT Journals
Abstract
This paper presents an outlook on experimental behavior and a comparison with predicted formula on the behaviour of circular
concentrically loaded self-consolidating fibre reinforced concrete filled steel tube columns (HSSCFRC). Forty-five specimens were
tested. The main parameters varied in the tests are: (1) percentage of fiber (2) tube diameter or width to wall thickness ratio (D/t
from 15 to 25) (3) L/d ratio from 2.97 to 7.04 the results from these predictions were compared with the experimental data. The
experimental results) were also validated in this study.
Keywords: Self-compacting concrete; Concrete-filled steel tube; axial load behavior; Ultimate capacity.
Evaluation of punching shear in flat slabseSAT Journals
Abstract
Flat-slab construction has been widely used in construction today because of many advantages that it offers. The basic philosophy in
the design of flat slab is to consider only gravity forces; this method ignores the effect of punching shear due to unbalanced moments
at the slab column junction which is critical. An attempt has been made to generate generalized design sheets which accounts both
punching shear due to gravity loads and unbalanced moments for cases (a) interior column; (b) edge column (bending perpendicular
to shorter edge); (c) edge column (bending parallel to shorter edge); (d) corner column. These design sheets are prepared as per
codal provisions of IS 456-2000. These design sheets will be helpful in calculating the shear reinforcement to be provided at the
critical section which is ignored in many design offices. Apart from its usefulness in evaluating punching shear and the necessary
shear reinforcement, the design sheets developed will enable the designer to fix the depth of flat slab during the initial phase of the
design.
Keywords: Flat slabs, punching shear, unbalanced moment.
Evaluation of performance of intake tower dam for recent earthquake in indiaeSAT Journals
Abstract
Intake towers are typically tall, hollow, reinforced concrete structures and form entrance to reservoir outlet works. A parametric
study on dynamic behavior of circular cylindrical towers can be carried out to study the effect of depth of submergence, wall thickness
and slenderness ratio, and also effect on tower considering dynamic analysis for time history function of different soil condition and
by Goyal and Chopra accounting interaction effects of added hydrodynamic mass of surrounding and inside water in intake tower of
dam
Key words: Hydrodynamic mass, Depth of submergence, Reservoir, Time history analysis,
Data slicing technique to privacy preserving and data publishing
1. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
__________________________________________________________________________________________
Volume: 02 Issue: 10 | Oct-2013, Available @ http://www.ijret.org 120
DATA SLICING TECHNIQUE TO PRIVACY PRESERVING AND DATA
PUBLISHING
Alphonsa Vedangi¹, V.Anandam²
¹Student, ²Professor, Department of CSE, CMR Institute of Technology, Hyderabad, Andhra Pradesh, India
alphonsa.vedangi@yahoo.com, velaanand@yahoo.com
Abstract
Many techniques have been designed for privacy-preserving microdata publishing, such as generalization and bucketization. Several works have shown that generalization loses a considerable amount of information, especially for high-dimensional data, so it is not efficient in that setting. Bucketization, in turn, does not prevent membership disclosure and is not applicable to data that lack a clear separation between quasi-identifying attributes and sensitive attributes. In this paper, we present an innovative technique called data slicing, which partitions the data both horizontally and vertically. We develop an efficient algorithm for computing sliced data that obeys the l-diversity requirement, and we show how data slicing improves on generalization and bucketization: it preserves better utility than generalization, does not require a clear separation between quasi-identifying and sensitive attributes, and can also be used to prevent attribute disclosure. Experimental results confirm that data slicing preserves data utility better than generalization and is more effective than bucketization in workloads involving sensitive attributes, demonstrating the effectiveness of the method.
Keywords – Privacy Preserving, Data Security, Data Publishing, Microdata.
----------------------------------------------------------------------***-----------------------------------------------------------------------
1. INTRODUCTION
Privacy-preserving microdata publishing has been studied extensively in recent years, and today most organizations need to publish microdata. Microdata contain records, each of which carries information about an individual entity such as a person or a household. Many microdata anonymization techniques have been proposed; the most popular are generalization with k-anonymity and bucketization with l-diversity. In both methods, attributes fall into three categories. Identifiers, such as name or social security number, uniquely identify an individual. Quasi-identifiers are sets of attributes, such as birth date, sex, and zip code, that in combination can be linked with external information to re-identify an individual. Sensitive attributes, such as disease and salary, are unknown to the adversary and are considered sensitive. In both anonymization techniques, identifiers are first removed from the data and the tuples are then partitioned into buckets. Generalization transforms the quasi-identifying values in each bucket into values that are less specific but semantically consistent, so that tuples in the same bucket cannot be distinguished by their QI values. Bucketization separates the SA values from the QI values by randomly permuting the SA values within each bucket; the anonymized data consist of a set of buckets with permuted sensitive attribute values. The identity of patients must be protected when patient data are shared. Previous techniques based on k-anonymity and l-diversity mainly consider datasets with a single sensitive attribute, while patient data contain multiple sensitive attributes such as diagnosis and treatment, so neither technique is efficient for preserving patient data. We therefore present a new technique for preserving and publishing patient data that slices the data both horizontally and vertically. Data slicing can also be used to prevent membership disclosure, is efficient for high-dimensional data, and preserves better data utility.
1.1 Related Work
Compared with generalization and bucketization, sliced data are more efficient for limiting disclosure of patient data while preserving better data utility. For generalization [29, 31, 30], it has been shown that a considerable amount of information is lost, especially for high-dimensional data. In order to perform data analysis or data mining tasks on a generalized table, the data analyst has to make the uniform distribution assumption that every value in a generalized set is equally possible, since no other distribution assumption can be justified. This significantly reduces the utility of the generalized data. Moreover, because each attribute in a generalized table is generalized separately, correlations between different attributes are lost; this is an inherent problem of generalization. Bucketization has better data utility than generalization but does not prevent membership disclosure. Because bucketization publishes the QI values in their original form, an adversary can easily find out whether an individual has a record in the published data, which means that membership information for most individuals can be inferred from the bucketized table. Bucketization also requires a clear separation between QI and SA values: by separating the sensitive attributes from the quasi-identifying attributes, it breaks the attribute correlations between the QIs and the SAs. However, in many datasets it is unclear which attributes are QIs and which are SAs, so bucketization is likewise not efficient for preserving and publishing microdata. Slicing has some connections to marginal publication [15]; both release correlations among a set of attributes. Slicing is nevertheless quite different: marginal publication can be viewed as a special case of data slicing without horizontal partitioning, and correlations among attributes in different columns are lost in marginal publication.
2. BASIC IDEAS OF DATA SLICING
In this paper we introduce a new method called data slicing, which partitions the data both horizontally and vertically. Vertical partitioning groups attributes into columns based on the correlations among the attributes; each column contains a subset of attributes that are highly correlated. Horizontal partitioning groups tuples into buckets. Finally, within each bucket, the values in each column are randomly permuted to break the associations between different columns. The core idea of data slicing is to break the associations across columns while preserving the associations within each column. This reduces the dimensionality of the data and preserves better data utility than bucketization and generalization. Data analysis methods such as query answering can easily be performed on the sliced data.
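As a concrete illustration, the steps above can be sketched in a few lines of Python. This is a minimal sketch with hypothetical toy data, not the paper's implementation: two columns are assumed as the vertical partition, a single bucket is formed, and each column's values are permuted independently within the bucket.

```python
import random

# Toy microdata: (Age, Sex, Zipcode, Disease). Illustrative values only.
table = [
    (22, "M", 47906, "Cancer"),
    (22, "F", 47906, "Thyroid"),
    (33, "F", 47905, "Thyroid"),
    (52, "F", 47905, "Diabetes"),
]

# Vertical partitioning: (Age, Sex) are assumed highly correlated,
# as are (Zipcode, Disease). Each tuple of indices is one column.
columns = [(0, 1), (2, 3)]

def slice_bucket(bucket, columns, rng):
    """Permute each column independently within one bucket."""
    sliced = []
    for col in columns:
        values = [tuple(t[i] for i in col) for t in bucket]
        rng.shuffle(values)          # break cross-column associations
        sliced.append(values)
    # zip the permuted columns back into rows of the sliced bucket
    return list(zip(*sliced))

rng = random.Random(0)
sliced = slice_bucket(table, columns, rng)
for row in sliced:
    print(row)
```

Note that the association within each column (e.g. which Age goes with which Sex) is preserved exactly; only the pairing between columns is randomized.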
2.1 Overview
The overall method of slicing has been outlined above. The original microdata consist of quasi-identifying values and sensitive attributes. Figure 1 shows patient data in a hospital, with attributes Age, Sex, Zip code, and Disease; the QI values are {Age, Sex, Zip code} and the sensitive attribute is {Disease}. A generalized table replaces these values with less specific ones.
Figure 1: Original microdata published
In generalization there are several recoding schemes. The recoding that preserves the most information is local recoding. In local recoding, tuples are first grouped into buckets; then, for each bucket, all values of one attribute are replaced with a generalized value, so the same attribute value may be generalized differently when it appears in different buckets.
Figure 2: Generalized data
In bucketization, attributes are also partitioned into columns: one column contains the QI values and the other contains the SA values. Bucketization separates the QI and SA values by randomly permuting the SA values in each bucket. In some cases the two kinds of attributes cannot be cleanly distinguished, which is a drawback for microdata publishing; bucketization also does not prevent membership disclosure.
Figure 3: Bucketized data
Slicing does not require the separation of these two kinds of attributes. The basic idea of slicing is to break the associations across columns but preserve the associations within each column. This reduces the dimensionality of the data and preserves better utility. Slicing partitions the dataset both horizontally and vertically, can handle high-dimensional data, and provides attribute disclosure protection.
Table for Figure 1 (original microdata):

Age   Sex   Zip code   Disease
22    M     47906      Cancer
22    F     47906      Thyroid
33    F     47905      Thyroid
52    F     47905      Diabetes
54    M     47902      Thyroid
60    M     47902      Cancer
60    F     47904      Cancer
Table for Figure 2 (generalized data):

Age       Sex   Zip code   Disease
[20-52]   *     4790*      Cancer
[20-52]   *     4790*      Thyroid
[20-52]   *     4790*      Thyroid
[20-52]   *     4790*      Diabetes
[54-64]   *     4790*      Cancer
[54-64]   *     4790*      Nausea
[54-64]   *     4790*      Cancer
[54-64]   *     4790*      Thyroid
Table for Figure 3 (bucketized data):

Age   Sex   Zip code   Disease
22    M     47906      Thyroid
22    F     47906      Cancer
33    F     47905      Diabetes
52    F     47905      Thyroid
54    M     47902      Nausea
60    M     47902      Thyroid
60    M     47902      Cancer
64    F     47902      Cancer
Figure 4: Sliced data
3. SLICING ALGORITHM
Step 1: Initially we consider a queue of buckets Q and a set of sliced buckets SB. Q contains a single bucket that includes all tuples, and SB is empty: Q = {T}; SB = ∅.
Step 2: In each iteration the algorithm removes a bucket from Q and splits it into two buckets: Q = Q − {B}. It then runs the l-diversity check on (T, Q ∪ {B1, B2} ∪ SB, l); the main task of the tuple-partitioning algorithm is to check whether a sliced table satisfies l-diversity.
Step 3: In the diversity-check algorithm, for each tuple t a list of statistics L[t] is maintained; each entry contains statistics about one matching bucket B, namely the matching probability p(t,B) and the distribution of candidate sensitive values D(t,B). Initially, for all t ∈ T, L[t] = ∅.
Step 4: If the check succeeds, Q = Q ∪ {B1, B2}: the two new buckets are moved to the end of Q.
Step 5: Otherwise, SB = SB ∪ {B}: the bucket cannot be split further and is moved to SB.
Step 6: When Q becomes empty, the sliced table has been computed; the set of sliced buckets is SB, and the algorithm returns SB.
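A minimal Python sketch of this partitioning loop is shown below. It is not the authors' implementation: the split point is simply the middle of the bucket, and the l-diversity check is simplified to a per-bucket frequency test (no sensitive value may occur in more than a 1/l fraction of a bucket's tuples), whereas the real algorithm performs the full Diversity-Check of Section 3.1.

```python
from collections import Counter, deque

def satisfies_l_diversity(bucket, l, sensitive_index=-1):
    """Simplified check: no sensitive value occurs in more than a 1/l
    fraction of the bucket's tuples (a common instantiation of l-diversity)."""
    counts = Counter(t[sensitive_index] for t in bucket)
    return all(c <= len(bucket) / l for c in counts.values())

def partition_tuples(tuples, l):
    """Queue-based partitioning: split a bucket while both halves still
    satisfy l-diversity; otherwise move the bucket to the sliced set SB."""
    Q = deque([list(tuples)])   # Q = {T}
    SB = []                     # SB = empty set
    while Q:
        B = Q.popleft()         # Q = Q - {B}
        mid = len(B) // 2
        B1, B2 = B[:mid], B[mid:]
        if B1 and B2 and satisfies_l_diversity(B1, l) and satisfies_l_diversity(B2, l):
            Q.extend([B1, B2])  # Q = Q u {B1, B2}
        else:
            SB.append(B)        # cannot split further: SB = SB u {B}
    return SB

# Example with the Figure 1 tuples (Disease is the sensitive attribute)
data = [
    (22, "M", 47906, "Cancer"), (22, "F", 47906, "Thyroid"),
    (33, "F", 47905, "Thyroid"), (52, "F", 47905, "Diabetes"),
    (54, "M", 47902, "Thyroid"), (60, "M", 47902, "Cancer"),
    (60, "F", 47904, "Cancer"),
]
buckets = partition_tuples(data, l=2)
```

Because every split must keep both halves l-diverse, the loop always terminates: bucket sizes strictly decrease and any unsplittable bucket lands in SB.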
3.1 Algorithm for Diversity-Check
Step 1: For each tuple t ∈ T, L[t] = ∅.
Step 2: For each bucket B in T:
Step 3: Record f(v) for each column value v in bucket B.
Step 4: For each tuple t ∈ T:
Step 5: Calculate p(t,B) and find D(t,B).
Step 6: L[t] = L[t] ∪ {⟨p(t,B), D(t,B)⟩}.
Step 7: For each tuple t ∈ T:
Step 8: Calculate p(t,s) for each s based on L[t].
Step 9: If p(t,s) > 1/l, return false.
Step 10: Return true.
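The steps above can be rendered as a compact Python sketch. Several details that the pseudocode leaves open are filled in with assumptions here: the last field of each tuple is taken as the sensitive attribute, p(t,B) is taken proportional to f(t,B), the number of times t's QI values occur in bucket B (normalized by f(t) over all buckets, as in Section 4), and D(t,B) is taken as the empirical sensitive-value distribution of B.

```python
from collections import Counter

def diversity_check(tuples, buckets, l):
    """Simplified Diversity-Check: for each tuple t, accumulate statistics
    L[t] = {(f(t,B), D(t,B))} over its matching buckets, then verify that
    p(t,s) <= 1/l for every sensitive value s. Assumes the last field of
    each tuple is the sensitive attribute and the rest are QI values."""
    L = {t: [] for t in tuples}                       # L[t] = empty set
    for B in buckets:
        qi_counts = Counter(t[:-1] for t in B)        # f(v) per QI combination
        D_B = Counter(t[-1] for t in B)               # sensitive values in B
        total = sum(D_B.values())
        for t in tuples:
            f_tB = qi_counts.get(t[:-1], 0)           # matches of t's QI values in B
            if f_tB:
                L[t].append((f_tB, {s: c / total for s, c in D_B.items()}))
    for t in tuples:
        if not L[t]:
            continue                                  # t matches no bucket
        f_t = sum(f for f, _ in L[t])                 # f(t) over all buckets
        p_ts = Counter()
        for f_tB, D_tB in L[t]:
            for s, p in D_tB.items():
                p_ts[s] += (f_tB / f_t) * p           # p(t,B) * p(s|t,B)
        if any(p > 1.0 / l for p in p_ts.values()):
            return False
    return True
```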
4. ATTRIBUTE DISCLOSURE PROTECTION
Data slicing can be used to prevent attribute disclosure, based on the notion of l-diverse slicing. The sliced table in Figure 4 satisfies 2-diversity. Consider tuple t1 with QI values (22, M, 47906): to determine t1's sensitive value, one has to examine t1's matching buckets. Consider an adversary who knows all the QI values of a tuple t and attempts to infer t's sensitive value from the sliced table. Let p(t,B) be the probability that t is in bucket B. The adversary then computes p(t,s), the probability that t takes a sensitive value s. Specifically, let p(s|t,B) be the probability that t takes sensitive value s given that t is in bucket B; then, by the law of total probability, p(t,s) is
p(t,s) = Σ_B p(t,B) · p(s|t,B)        (1)

p(t,B) = f(t,B) / f(t)

Once p(t,B) and p(s|t,B) have been computed, the probability p(t,s) follows from eq. (1). Summing over all sensitive values,

Σ_s p(t,s) = Σ_s Σ_B p(t,B) · p(s|t,B) = 1        (2)
l-diverse slicing is thus based on the probability p(t,s). A tuple t satisfies l-diversity iff, for every sensitive value s, p(t,s) ≤ 1/l, and a sliced table satisfies l-diversity iff every tuple in it satisfies l-diversity. This analysis directly shows that, from an l-diverse sliced table, an adversary cannot learn the sensitive value of any individual with probability greater than 1/l.
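The computation in eqs. (1) and (2) can be made concrete with a small worked example. The numbers below are illustrative assumptions, not taken from the paper's tables: a tuple t is assumed to match two buckets with equal probability.

```python
# p(t,B): probability that tuple t is in bucket B (assumed values)
p_t_B = {"B1": 0.5, "B2": 0.5}
# p(s|t,B): sensitive-value distribution within each matching bucket (assumed)
p_s_given = {
    "B1": {"Cancer": 0.5, "Thyroid": 0.5},
    "B2": {"Cancer": 0.25, "Flu": 0.75},
}

def p_t_s(s):
    # eq. (1), law of total probability: p(t,s) = sum_B p(t,B) * p(s|t,B)
    return sum(p_t_B[B] * p_s_given[B].get(s, 0.0) for B in p_t_B)

l = 2
diseases = {s for d in p_s_given.values() for s in d}
# eq. (2): the probabilities p(t,s) sum to 1 over all sensitive values
assert abs(sum(p_t_s(s) for s in diseases) - 1.0) < 1e-9
# the l-diversity test: every p(t,s) must be at most 1/l
satisfies = all(p_t_s(s) <= 1.0 / l for s in diseases)
```

Here p(t,Cancer) = 0.5·0.5 + 0.5·0.25 = 0.375, and no sensitive value exceeds 1/2, so t satisfies 2-diversity under these assumed distributions.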
5. EXISTING SYSTEM
Many privacy-preserving techniques exist, and extensive workload experiments have been conducted on them. First, many existing clustering algorithms, such as k-means, require the calculation of centroids, but there is no notion of a centroid in our setting, where each attribute forms a data point in the clustering space. Second, the k-medoid method is very robust to the existence of outliers, i.e., data points that are very far away from the rest of the data points. Third, the order in which the data points are examined does not affect the clusters computed by the k-medoid method.
5.1 Disadvantages:
1. Existing anonymization algorithms can be used for column
generalization.
2. Existing data analysis such as query answering methods can
be easily used on the sliced data.
3. Existing privacy measures for membership disclosure
protection include differential privacy and presence.
6. PROPOSED SYSTEM
We present a novel technique called slicing, which partitions the data both horizontally and vertically. We show that slicing preserves better data utility than generalization and can be used for membership disclosure protection. Another important advantage of slicing is that it can handle high-dimensional data. We show how slicing can be used for attribute disclosure protection and develop an efficient algorithm for computing the sliced data that obeys the l-diversity requirement. Slicing preserves better utility than generalization and is more effective than bucketization in workloads involving the sensitive attribute.

Table for Figure 4 (sliced data):

(Age, Sex)   (Zip code, Disease)
(22,M)       (47905,Thyroid)
(22,F)       (47906,Cancer)
(33,F)       (47905,Diabetes)
(52,F)       (47906,Thyroid)
(54,M)       (47904,Nausea)
(60,M)       (47902,Thyroid)
(60,M)       (47902,Cancer)
(64,F)       (47904,Cancer)
6.1 Advantages:
1. Slicing can be effectively used to prevent attribute disclosure, based on the privacy requirement of l-diversity.
2. We develop an efficient algorithm for computing a sliced table that satisfies l-diversity. The proposed algorithm partitions attributes into columns, applies column generalization, and partitions tuples into buckets; attributes that are highly correlated are placed in the same column.
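The column-partitioning step relies on a measure of correlation between attributes. This section does not fix a particular measure, so the sketch below uses empirical mutual information as one reasonable stand-in and greedily picks the most-correlated attribute pair as a first column grouping; all names and data are illustrative.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(xs, ys):
    """Empirical mutual information between two categorical attributes,
    used here as a stand-in correlation measure (assumption, not the
    paper's prescribed measure)."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def most_correlated_pair(records):
    """Return the pair of attribute indices with the highest empirical
    mutual information: a greedy first step toward grouping highly
    correlated attributes into the same column."""
    num_attrs = len(records[0])
    cols = [[r[i] for r in records] for i in range(num_attrs)]
    return max(combinations(range(num_attrs), 2),
               key=lambda ij: mutual_information(cols[ij[0]], cols[ij[1]]))
```

Repeating this greedy pairing on the remaining attributes yields a full partition into columns; more elaborate clustering over the pairwise correlation matrix is equally possible.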
7. EXPERIMENTAL RESULTS
Here we show how doctors and patients register with their details, based on quasi-identifying attributes and sensitive attributes. After registering, a doctor logs in and sees the published patient details, organized by disease.
Figure 7(a): Doctor Registration form
Figure 7(b): Patient registration form
Figure 7(c): Doctor login page
Figure 7(d): Preserved data

PATIENT ID   NAME     DISEASE    DOB         AGE     SEX   ZIPCODE
454          Sita     Fever      21/1/19**   20-30   F     600***
459          Ali      Nausea     11/7/17**   40-50   M     600***
324          Raju     Flu        02/5/17**   1-30    M     479***
765          John     Cancer     21/2/19**   1-40    M     467***
321          Swathi   Fever      14/5/19**   10-30   F     600***
876          Harika   Thyroid    12/8/18**   20-60   F     600***
564          Amir     Cancer     22/4/19**   40-60   M     479***
234          David    Diabetes   19/1/19**   30-60   M     600***
123          Geeta    Nausea     16/5/19**   10-50   F     600***
932          Latha    Thyroid    22/8/19**   20-70   F     479***
645          Priya    Fever      16/1/19**   10-30   F     479***

Figure 7(e): Data searching (search form for patient details by disease; search term "Fever")

Figure 7(f): Published data (search results for "Fever")

PATIENT ID   NAME     DISEASE   DOB         AGE     SEX   ZIPCODE
454          Sita     Fever     21/1/19**   20-30   F     600***
321          Swathi   Fever     14/5/19**   10-30   F     600***
645          Priya    Fever     16/1/19**   10-30   F     479***

Figure 7(g): Sliced data

PATIENT ID   NAME     (AGE, SEX)   (ZIPCODE, DISEASE)
454          Sita     (20-30,F)    (600045,Fever)
459          Ali      (40-50,M)    (600045,Nausea)
324          Raju     (1-30,M)     (479801,Flu)
765          John     (1-40,M)     (467803,Cancer)
321          Swathi   (10-30,F)    (600045,Fever)
876          Harika   (20-60,F)    (600032,Thyroid)
564          Amar     (40-60,M)    (479863,Cancer)
234          David    (30-60,M)    (600045,Diabetes)
123          Geeta    (10-50,F)    (600045,Nausea)
932          Latha    (20-70,F)    (479650,Thyroid)
645          Priya    (10-30,F)    (479650,Fever)
8. FUTURE WORK
This work motivates several directions for further research. In this paper we considered slicing where each attribute belongs to exactly one column. A natural extension is the notion of overlapping slicing, which releases more attribute correlations: one could, for example, also include the Disease attribute in the first column. This could provide better data utility, but the privacy implications need to be carefully understood, and the trade-off between data utility and data privacy is very interesting. Finally, even though many anonymization techniques exist, how best to use the anonymized data remains an open problem. In our work, the associations between the column values of a bucket were generated randomly.
9. CONCLUSIONS
In this paper we presented a new anonymization method, data slicing, for privacy-preserving data publishing. Data slicing overcomes the limitations of generalization and bucketization and preserves better utility while protecting against privacy threats. We illustrated how slicing is used to prevent attribute disclosure. The general methodology of this work is that, before anonymizing data, one can analyze the data characteristics: better anonymization techniques can be designed when the data are well understood. Finally, we showed several advantages of data slicing compared with generalization and bucketization. Data slicing is a promising technique for handling high-dimensional data, and by partitioning attributes into columns it protects privacy.
10. REFERENCES
[1]. G. Ghinita, Y. Tao, and P. Kalnis. On the anonymization of sparse high-dimensional data. In ICDE, pages 715-724, 2008.
[2]. A. Inan, M. Kantarcioglu, and E. Bertino. Using anonymized data for classification. In ICDE, 2009.
[3]. J. Brickell and V. Shmatikov. The cost of privacy: destruction of data-mining utility in anonymized data publishing. In KDD, pages 70-78, 2008.
[4]. J. Li, Y. Tao, and X. Xiao. Preservation of proximity privacy in publishing numerical sensitive data. In SIGMOD, pages 473-486, 2008.
[5]. R. C.-W. Wong, J. Li, A. W.-C. Fu, and K. Wang. (α,k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing. In KDD, pages 754-759, 2006.
[6]. V. Rastogi, D. Suciu, and S. Hong. The boundary between privacy and utility in data publishing. In VLDB, pages 531-542, 2007.
[7]. T. Li and N. Li. On the tradeoff between privacy and utility in data publishing. In KDD, pages 517-526, 2009.
[8]. K. LeFevre, D. DeWitt, and R. Ramakrishnan. Mondrian multidimensional k-anonymity. In ICDE, page 25, 2006.
BIOGRAPHIES
ALPHONSA VEDANGI, Department of Computer Science and Engineering, CMR Institute of Technology, Hyderabad. Completed this research with guidance from the faculty. Interested in research and holds a first-class aggregate through the master's level.
V. ANANDAM, B.E., M.S., Ph.D., Professor, Department of Computer Science, CMR Institute of Technology, Hyderabad. Has extensive teaching and administrative experience in academic colleges, coupled with 26 years of national and international project experience in high-speed computer controls in process engineering.