Limitations of Privacy Solutions for Log Files
We have considered applying a range of privacy solutions to log files and found that methods such as differential privacy and k-anonymity are not suitable for them.
We propose replacing personal identifiers with ring signatures when collecting log files.
In particular, we offer a lightweight ring signature proposal that significantly improves privacy when collecting log files while still allowing
those logs to be processed for tasks such as identifying indicators of compromise (IoCs).
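The core idea of replacing personal identifiers while preserving linkability can be sketched as below. This is a minimal illustration, not the paper's ring-signature construction: an HMAC-based pseudonym stands in for the ring-signature tag, and the key name, log format, and field names are all hypothetical.

```python
import hmac, hashlib, re

# Sketch of the identifier-replacement idea (NOT the actual ring-signature
# construction): each personal identifier is replaced by a keyed pseudonym.
# HMAC stands in for the ring-signature tag; the key stays with the log
# producer, so the collector sees only pseudonyms but can still correlate
# events belonging to the same (hidden) user for IoC hunting.

SECRET_KEY = b"producer-side key (hypothetical)"

def pseudonymize(identifier: str) -> str:
    tag = hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256)
    return "user-" + tag.hexdigest()[:12]

def scrub_log_line(line: str) -> str:
    # Assume identifiers appear as user=<name>; the pattern is illustrative.
    return re.sub(r"user=(\w+)",
                  lambda m: "user=" + pseudonymize(m.group(1)), line)

log = ["user=alice action=login",
       "user=bob action=login",
       "user=alice action=download file=payload.exe"]
scrubbed = [scrub_log_line(l) for l in log]

# The same user maps to the same pseudonym, so correlation still works,
# but the real identifier never reaches the collector.
assert scrubbed[0].split()[0] == scrubbed[2].split()[0]
assert "alice" not in " ".join(scrubbed)
```

Because the mapping is deterministic per key, an analyst can still count events per (pseudonymous) user and chain related events, which is what IoC identification needs.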
This document summarizes technical privacy solutions. After an introduction and agenda, it discusses privacy use cases and types of data, gives legal and practical definitions of privacy, and outlines implementing privacy with security approaches such as network segmentation and data segmentation. The main body covers formal approaches to privacy, including differential privacy, k-anonymity, homomorphic encryption, Monero-style privacy, secure multiparty computation, and federated learning, describing each with examples and limitations. It concludes that privacy solutions should generate business value and notes that the tools are still maturing.
Utility-privacy tradeoff in databases: an information-theoretic approach (IEEEFINALYEARPROJECTS)
Here are the key points about security enhancements for IEEE 802.11 wireless LANs through the Wired Equivalent Privacy (WEP) protocol:
- WEP was the original security protocol for 802.11 wireless networks. It aimed to provide a level of security comparable to that of a wired network.
- WEP uses the RC4 stream cipher for confidentiality. A pre-shared key (PSK) is used by both the client and the access point to encrypt packets, combined with a 24-bit initialization vector (IV) sent in the clear.
- The main weaknesses of WEP are its small key size (40-bit or 104-bit), its use of static keys, the tiny IV space, and the lack of key management. These make it vulnerable to eavesdropping and traffic-injection attacks.
- To address these weaknesses, WEP was superseded by WPA and then WPA2 (IEEE 802.11i), which introduced per-packet keying (TKIP) and AES-based encryption (CCMP).
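The 24-bit IV weakness noted above can be quantified with a birthday-bound calculation; the sketch below shows why keystream reuse becomes likely after only a few thousand packets.

```python
# WEP prepends a 24-bit IV to the static RC4 key, so at most 2^24 distinct
# keystreams exist. By the birthday bound, an IV collision (and hence
# keystream reuse) becomes likely after only a few thousand packets.

def collision_probability(n_packets: int, iv_space: int = 2**24) -> float:
    # P(at least one repeated IV among n_packets randomly chosen IVs)
    p_no_collision = 1.0
    for i in range(n_packets):
        p_no_collision *= (iv_space - i) / iv_space
    return 1.0 - p_no_collision

# Around sqrt(2 * ln 2 * 2^24) ~ 4800 packets, a collision is more likely
# than not -- a busy access point sends that many packets in seconds.
assert collision_probability(5000) > 0.5
assert collision_probability(100) < 0.01
```

Once two packets share a keystream, XORing their ciphertexts cancels the keystream and leaks the XOR of the plaintexts, which is the basis of the classic WEP eavesdropping attacks.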
Data Leakage Detection and Security Using Cloud Computing (IJERA Editor)
The data owner stores data in the cloud, and every user must register with the cloud service; the cloud provider then verifies that a user is authorized. If someone breaks into an account, data can be leaked and end up in an unauthorized place (e.g., on the internet or on someone's laptop). In this paper, we propose Division and Replication of Data in the Cloud for Optimal Performance and Security (DROPS), which addresses the security and performance issues collectively. In the DROPS methodology, a file selected for storage in a cloud account is divided into fragments based on a threshold value. Each fragment is stored on a node chosen using T-coloring, and after placement each fragment is replicated once within the cloud.
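The fragment-and-place scheme can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the paper's algorithm: a fixed fragment size plays the role of the "threshold," and a rule that no node holds two adjacent fragments is a much-simplified stand-in for T-coloring.

```python
# Sketch of DROPS-style fragmentation and placement (simplified):
# split a file into fragments, then place them so no single node ever
# holds two consecutive fragments of the same file.

def fragment(data: bytes, threshold: int) -> list:
    return [data[i:i + threshold] for i in range(0, len(data), threshold)]

def place(fragments, nodes):
    placement = {n: [] for n in nodes}
    last_node = None
    for idx, _frag in enumerate(fragments):
        # choose the least-loaded node other than the previous fragment's node
        candidates = [n for n in nodes if n != last_node]
        node = min(candidates, key=lambda n: len(placement[n]))
        placement[node].append(idx)
        last_node = node
    return placement

data = b"confidential report " * 50
frags = fragment(data, threshold=64)
placement = place(frags, nodes=["n1", "n2", "n3"])

# Compromising any one node never yields two adjacent fragments...
for idxs in placement.values():
    assert all(b - a >= 2 for a, b in zip(idxs, idxs[1:]))
# ...and every fragment is placed exactly once.
assert sorted(i for idxs in placement.values() for i in idxs) == list(range(len(frags)))
```

The real scheme uses T-coloring to enforce a minimum graph distance between nodes holding fragments of the same file; the adjacency rule here captures only the flavor of that constraint.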
There is a growing need to secure data in distributed database systems. For collaborative data publishing, anonymization techniques such as generalization and bucketization are available. We consider an attack we call an "insider attack," in which colluding data providers use their own records to infer the records of others. To protect the database from such attacks we use the slicing technique for anonymization, since the techniques above are not suitable for high-dimensional data: they cause information loss and require a clear separation between quasi-identifiers and sensitive attributes. We consider this threat and make several contributions. First, we introduce a notion of data privacy and apply the slicing technique, which partitions the data both vertically and horizontally and shows that the anonymized data satisfies privacy and security requirements. Second, we present verification algorithms that prove security against a number of colluding data providers and ensure high utility and privacy of the anonymized data with good efficiency. In experiments on hospital patient datasets, our slicing approach achieves better or comparable utility and efficiency than baseline algorithms while satisfying data security. Our experiments also demonstrate the difference in computation time between our system and the encryption algorithm used to secure the data.
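The vertical-plus-horizontal partitioning of slicing can be sketched as below. This is a minimal illustration on a made-up table: attributes are split into a quasi-identifier group and a sensitive group, tuples are grouped into buckets, and sensitive values are permuted within each bucket to break the exact linkage.

```python
import random

# Minimal slicing sketch: vertical partition (QI columns vs. sensitive
# columns), horizontal partition (buckets), then a random permutation of
# the sensitive group inside each bucket. Exact row linkage is broken,
# but per-bucket statistics are preserved.

random.seed(7)

records = [
    {"age": 23, "zip": "47906", "disease": "flu"},
    {"age": 27, "zip": "47906", "disease": "cancer"},
    {"age": 35, "zip": "47905", "disease": "flu"},
    {"age": 59, "zip": "47902", "disease": "diabetes"},
]

def slice_table(rows, qi_cols, sens_cols, bucket_size):
    sliced = []
    for start in range(0, len(rows), bucket_size):
        bucket = rows[start:start + bucket_size]
        sens = [{c: r[c] for c in sens_cols} for r in bucket]
        random.shuffle(sens)  # permute sensitive values inside the bucket
        for r, s in zip(bucket, sens):
            sliced.append({**{c: r[c] for c in qi_cols}, **s})
    return sliced

out = slice_table(records, ["age", "zip"], ["disease"], bucket_size=2)

# The multiset of diseases per bucket is unchanged, so aggregate analysis
# still works even though an attacker cannot pin a disease to a row.
assert sorted(r["disease"] for r in out[:2]) == ["cancer", "flu"]
assert sorted(r["disease"] for r in out[2:]) == ["diabetes", "flu"]
```

Because slicing keeps attribute values intact (rather than generalizing them away), it avoids the heavy information loss that generalization suffers on high-dimensional data.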
Intrusion Detection and Discovery via Log Correlation to support HIPAA Securi... (David Sweigert)
This document discusses log correlation and network forensics. It covers ensuring log integrity, managing timestamps, and normalization and filtering of logs. Log integrity can be compromised during transmission between acquisition and collection points. Normalization is needed to correlate different log formats. Correlation and filtering tools use either a top-down or bottom-up approach to interpret logs. Ensuring log reliability and integrity is important for network forensics investigation and attribution.
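The normalization step described above can be sketched as follows. The two log formats and field names are hypothetical; the point is mapping heterogeneous sources into one schema with UTC timestamps so events line up on a single timeline.

```python
import re
from datetime import datetime, timezone

# Sketch of log normalization for correlation: parse two hypothetical log
# formats into a common schema with UTC timestamps, so events from
# different sources can be ordered and correlated together.

def parse_apache(line: str) -> dict:
    # e.g. 10.0.0.5 - - [12/Mar/2021:06:25:24 +0000] "GET /admin"
    m = re.match(r'(\S+) - - \[([^\]]+)\] "(\S+) (\S+)"', line)
    ts = datetime.strptime(m.group(2), "%d/%b/%Y:%H:%M:%S %z")
    return {"ts": ts.astimezone(timezone.utc), "src": m.group(1),
            "event": f"{m.group(3)} {m.group(4)}"}

def parse_firewall(line: str) -> dict:
    # e.g. 2021-03-12T06:25:23+0000 DROP 10.0.0.5
    iso, action, src = line.split()
    ts = datetime.strptime(iso, "%Y-%m-%dT%H:%M:%S%z")
    return {"ts": ts.astimezone(timezone.utc), "src": src, "event": action}

events = [
    parse_apache('10.0.0.5 - - [12/Mar/2021:06:25:24 +0000] "GET /admin"'),
    parse_firewall("2021-03-12T06:25:23+0000 DROP 10.0.0.5"),
]
timeline = sorted(events, key=lambda e: e["ts"])

assert timeline[0]["event"] == "DROP"            # the firewall drop came first
assert timeline[0]["src"] == timeline[1]["src"]  # same source: events correlate
```

Without consistent timestamp handling (time zones, clock skew), this ordering step silently produces wrong attributions, which is why the document treats timestamp management as a first-class concern.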
This document discusses seven key security concepts: authentication, authorization, confidentiality, data/message integrity, accountability, availability, and non-repudiation. It defines each concept and provides examples to illustrate how they are implemented in technological security systems like web servers. Physical security and security policies/procedures are also discussed as important components of a holistic security approach.
This document discusses the concepts and implementation of a Security Operations Center (SOC). It defines the key modules of a SOC as: event generators (E boxes), event collectors (C boxes), message databases (D boxes), analysis engines (A boxes), and reaction management software (R boxes).
The document outlines the challenges with each module, such as performance issues with event collection and ensuring availability of the database. It then proposes a global architecture with these modules, detailing how the knowledge base (K boxes) would store information on systems, vulnerabilities, and security policies to aid analysis. Event filtering strategies are also discussed to balance exhaustiveness of logs with performance.
This document presents research on compressing encrypted data. The researchers investigate reversing the traditional order of compressing data before encrypting it. They show that by using principles of coding with side information, it is possible to first encrypt data and then compress it without loss of optimal compression efficiency or security. They prove the theoretical feasibility of this approach and describe a system to implement compression of encrypted data. Computer simulations demonstrate the performance of the proposed system. The researchers identify connections to distributed source coding theory and demonstrate that in some scenarios, reversing the order of encryption and compression does not compromise effectiveness or security.
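What makes the reversed order surprising can be seen with a standard compressor: a generic tool sees encrypted data as random noise and cannot shrink it. The sketch below demonstrates that baseline (a one-time-pad XOR stands in for the cipher); the paper's contribution is that a decoder with side information can still compress, which plain zlib cannot.

```python
import os, zlib

# Baseline demonstration: compress-then-encrypt works, but naively
# compressing AFTER encryption gains nothing, because ciphertext is
# indistinguishable from random noise to a generic compressor.

plaintext = b"the quick brown fox jumps over the lazy dog " * 100
keystream = os.urandom(len(plaintext))
ciphertext = bytes(p ^ k for p, k in zip(plaintext, keystream))  # one-time pad

compressed_plain = zlib.compress(plaintext)
compressed_cipher = zlib.compress(ciphertext)

# Redundant plaintext compresses dramatically...
assert len(compressed_plain) < len(plaintext) // 10
# ...but the ciphertext is essentially incompressible.
assert len(compressed_cipher) > len(ciphertext) - 64
```

The paper's result is that with distributed source coding (coding with side information at the decoder), this apparent barrier can be bypassed without losing compression efficiency or security.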
This document discusses database management systems (DBMS). A DBMS is software that allows for the storage, management, and retrieval of large amounts of data. It provides benefits like data independence, concurrency control, crash recovery, and security. A DBMS typically uses a multi-layer architecture and implements concepts like transactions, locking, and logging to ensure data integrity and consistency when multiple users access the database concurrently. Database administrators are responsible for designing database schemas and tuning the system to meet evolving needs.
The document discusses data security and provides an overview of key concepts including security measures, policies, principles, and technologies and threats related to data security. It covers topics such as the definition of security and data, how computers are used to store important data, sensitive information, and the threats to security including natural disasters, human errors, hackers, and more. Security services like secrecy, integrity, availability, and access control are explained. The presentation also discusses security policies and models.
Database Private Security Jurisprudence: A Case Study Using Oracle (ijdms)
Oracle is one of the largest vendors and offers a leading object-relational DBMS in the IT world. Oracle Database is one of the three market-leading database technologies, along with Microsoft SQL Server and IBM Db2. In this paper, we try to answer the million-dollar question: what is the user's responsibility in hardening an Oracle database for its security? The paper gives practical guidelines for hardening Oracle so that attackers are prevented from gaining access to the database. Practical measures for protecting the TNS listener, accessing remote servers and preventing unauthorized access, accessing files on a remote server, fetching environment variables, privileges and authorizations, access control, writing a security policy, database encryption, Oracle Data Masking, standard built-in auditing, and Fine-Grained Auditing (FGA) are illustrated with SQL syntax and executed with suitable real-life examples, and their output is tested and verified. This structured method acts as a "Data Invictus" wall against attackers and protects the user's database.
This document discusses privacy-preserving data mining and cryptography. It explains that separate medical institutions may want to conduct joint research while preserving patient privacy. It also discusses how ultra-large databases hold transaction records and how privacy-preserving protocols are needed to limit information leaks during distributed computations, even from adversarial participants. Finally, it discusses how cryptography can enable functions to be computed securely in a way that preserves individual privacy and reveals only the final results of data mining computations.
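One classic way that "functions can be computed securely," mentioned above, is a secure sum via additive secret sharing. The hospital names and counts below are made up; the point is that only the total is ever revealed.

```python
import random

# Minimal secure-sum sketch (additive secret sharing): three hospitals
# learn the total patient count for a joint study without any hospital
# revealing its individual count. Any n-1 shares look uniformly random.

random.seed(1)
MOD = 2**61 - 1  # arithmetic is done modulo a large prime

def share(secret: int, n_parties: int) -> list:
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MOD)
    return shares

counts = {"hospital_a": 120, "hospital_b": 75, "hospital_c": 310}

# Each party splits its count into one share per party...
all_shares = {h: share(c, 3) for h, c in counts.items()}
# ...each party sums the shares it receives (one "column" each)...
partial_sums = [sum(all_shares[h][i] for h in counts) % MOD for i in range(3)]
# ...and combining the partial sums reveals only the total.
total = sum(partial_sums) % MOD

assert total == sum(counts.values())  # correct total, no count disclosed
```

This is the simplest instance of the distributed computations the document describes: even adversarial participants who see their own shares learn nothing about the other parties' inputs beyond the final result.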
Data Sharing: Ensure Accountability Distribution in the Cloud (Suraj Mehta)
The document proposes a system for ensuring distributed accountability and security for user data stored in the cloud. The system encrypts user data and wraps it in a JAR file along with access policies. It uses DES for encryption, RSA for JAR file security, and MD5 for authentication. Log records of access are generated, encrypted, and stored in log files. A log harmonizer tracks the logs and can push or pull them to ensure the data owner's data is secure. The system aims to provide accountability, enforce access controls, and prevent attacks like copying or disassembling protected data.
Collusion Attack: A Kernel-Based Privacy Preserving Technique in Data Mining (dbpublications)
Data leakage happens whenever a system designed to be closed to eavesdroppers reveals information to unauthorized parties. For business purposes it is often necessary to transfer important data among many business partners and employees, and during these transfers information can reach unauthorized persons. It is therefore challenging and necessary to detect leakage and identify the guilty party responsible for it. This system identifies the party responsible for leaking both text and image files. We model a distributor (the owner of the data) and agents (trusted parties to whom the data is sent), and the system detects insider collusion attacks. An insider attack is a malicious threat to an organization that comes from people within it, such as employees, contractors, or business associates, who have inside information about the organization's security practices, data, and computer systems. The main aim of the system is to determine which of the owner's data has been leaked and to detect the agent who leaked it. For text data we use a kernel-based algorithm, and for image files we use a steganography-based approach.
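The distributor/agent idea can be sketched as below. This is an illustration only, not the paper's kernel-based or steganographic embedding: each agent's copy carries an invisible per-agent marker (zero-width spaces), so a leaked copy can be traced back to the agent who received it.

```python
import hashlib

# Sketch of leaker identification: the distributor gives each agent a copy
# of the document with a unique, invisible fingerprint (a pattern of
# zero-width spaces derived from the agent's ID). A leaked copy is matched
# against each agent's fingerprint to find the guilty agent.

ZWSP = "\u200b"  # zero-width space: invisible in most text renderers

def fingerprint(document: str, agent_id: str) -> str:
    # 32 marker bits derived from the agent's ID
    bits = bin(int(hashlib.sha256(agent_id.encode()).hexdigest()[:8], 16))[2:].zfill(32)
    words = document.split(" ")
    # append a zero-width space after word i iff bit i is 1
    marked = [w + ZWSP if i < len(bits) and bits[i] == "1" else w
              for i, w in enumerate(words)]
    return " ".join(marked)

def identify_leaker(leaked: str, agents, original: str) -> str:
    return next(a for a in agents if fingerprint(original, a) == leaked)

original = " ".join(f"token{i}" for i in range(40))
agents = ["agent_x", "agent_y", "agent_z"]
copies = {a: fingerprint(original, a) for a in agents}

assert copies["agent_x"] != copies["agent_y"]            # marks are distinct
assert identify_leaker(copies["agent_y"], agents, original) == "agent_y"
assert copies["agent_y"].replace(ZWSP, "") == original   # content unchanged
```

Real schemes must also survive collusion (agents diffing their copies) and transformation of the leaked file, which is what the paper's kernel-based matching and image steganography address.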
With the growth of cloud technologies, computing resources and cloud storage have become the most in-demand online services, and many companies wish to outsource their data storage and resources as well. When private and sensitive data are stored in a third-party data center, security and privacy become major issues. In this paper, a novel Double Encryption with Single Decryption (DESD) cryptographic technique is proposed to secure data in cloud storage. The technique comprises encryption and decryption phases. In the encryption phase the data is randomly partitioned into multiple fragments, and each fragment is doubly encrypted using prime numbers and an Invertible Non-linear Function (INF). The encrypted fragments are stored across multiple cloud storages with the help of the cloud service provider (CSP). After the verification process, the data user obtains the key from the data owner and decrypts the gathered data from the cloud using knowledge of the inverse INF. The proposed technique provides stronger security and privacy for cloud data, so that illegitimate users cannot retrieve the original data. The performance of the proposed DESD technique is compared with AES and Triple DES, and the plotted experimental results show that the proposed technique is efficient and faster.
The document discusses various security threats and protection mechanisms. It covers basics of cryptography including symmetric and public key cryptography. It also discusses digital signatures, user authentication, and threats from intruders both internal and external to a system. Protection mechanisms aim to achieve goals of data confidentiality, integrity, and system availability despite security threats.
This document summarizes a research paper on secured authorized deduplication in a hybrid cloud system. The system aims to provide data deduplication, differential authorization for access, and confidentiality of data files. It involves a public cloud for storage, a private cloud for managing access tokens, and users who generate keys for files stored on the public cloud. When uploading a file, the user encrypts it and sends it to the public cloud along with the key to the private cloud. To download, the user must provide the correct key to the private cloud to gain access to encrypted files from the public cloud. This hybrid cloud model uses deduplication for storage optimization while controlling access through differential authorization of private keys.
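A standard building block behind secure deduplication of the kind described above is convergent encryption, shown here as an illustration (the paper's token-based scheme differs): the key is derived from the file's own hash, so identical plaintexts produce identical ciphertexts that the public cloud can deduplicate without seeing content.

```python
import hashlib

# Convergent-encryption sketch: key = H(plaintext), so two users uploading
# the same file produce the same ciphertext, and the cloud can store it
# once -- without ever learning the plaintext. The SHA-256-based XOR
# keystream below is a toy stand-in for a real block cipher.

def keystream(key: bytes, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def convergent_encrypt(data: bytes):
    key = hashlib.sha256(data).digest()          # key derived from content
    ct = bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))
    return key, ct

k1, c1 = convergent_encrypt(b"backup of design.doc")
k2, c2 = convergent_encrypt(b"backup of design.doc")
k3, c3 = convergent_encrypt(b"something else entirely")

assert c1 == c2                        # same file -> same ciphertext: dedup works
assert c1 != c3                        # different files stay distinguishable
assert c1 != b"backup of design.doc"   # content is not stored in the clear
```

The hybrid-cloud design in the paper layers differential authorization on top: the private cloud's tokens decide *who* may claim a deduplicated file, which plain convergent encryption alone does not control.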
Secured Authorized Deduplication Based Hybrid Cloud (theijes)
Privacy-Preserving Updates to Anonymous and Confidential Databases (ijdmtaiir)
The current trend in the application space toward systems of loosely coupled, dynamically bound components that enable just-in-time integration jeopardizes the security of information shared among the broker, the requester, and the provider at runtime. In particular, new advances in data mining and knowledge discovery that allow the extraction of hidden knowledge from enormous amounts of data impose new threats on the seamless integration of information. We consider the problem of building privacy-preserving algorithms for one category of data mining techniques, association rule mining. Suppose Alice owns a k-anonymous database and needs to determine whether her database, when a tuple owned by Bob is inserted, remains k-anonymous. Suppose also that access to the database is strictly controlled, for example because the data are used for experiments that must remain confidential. Clearly, allowing Alice to read the contents of the tuple directly breaks Bob's privacy (e.g., a patient's medical record); on the other hand, the confidentiality of the database managed by Alice is violated once Bob has access to its contents. The problem, then, is to check whether the database with the tuple inserted is still k-anonymous without letting Alice and Bob learn the contents of the tuple and the database, respectively. In this paper, we propose two protocols that solve this problem for suppression-based and generalization-based k-anonymous and confidential databases. The protocols rely on well-known cryptographic assumptions, and we provide theoretical analyses to prove their soundness and experimental results to illustrate their efficiency. Because the proposed protocols ensure the updated database remains k-anonymous, the results returned from a user's (or a medical researcher's) query are also k-anonymous, so the patient's or data provider's privacy cannot be violated by any query. As long as the database is updated properly using the proposed protocols, user queries in our application domain are always privacy-preserving.
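The property being decided is worth making concrete. The protocols in the paper compute this check *without* revealing the data; the plain version below (on a made-up generalized table) just shows what "still k-anonymous after insertion" means.

```python
from collections import Counter

# A table is k-anonymous when every quasi-identifier combination appears
# in at least k rows. The paper's protocols decide this for an updated
# database WITHOUT either party seeing the other's data; this plain check
# only illustrates the property itself.

def is_k_anonymous(rows, quasi_identifiers, k):
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in rows)
    return all(count >= k for count in groups.values())

table = [
    {"age": "2*", "zip": "479**", "disease": "flu"},
    {"age": "2*", "zip": "479**", "disease": "cancer"},
    {"age": "3*", "zip": "479**", "disease": "flu"},
    {"age": "3*", "zip": "479**", "disease": "diabetes"},
]
qi = ["age", "zip"]
assert is_k_anonymous(table, qi, k=2)

# Inserting Bob's tuple can silently break the property -- exactly what
# Alice must detect without reading the tuple:
table.append({"age": "5*", "zip": "478**", "disease": "flu"})
assert not is_k_anonymous(table, qi, k=2)
```

The second assertion is the failure case the protocols guard against: a single ill-fitting insertion creates an equivalence class of size one, making that individual re-identifiable.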
Bt0088 Cryptography and Network Security 2 (Techglyphs)
Military security policy ranks information sensitivity levels from unclassified to top secret and limits access based on a need-to-know principle. The Chinese Wall security policy groups related company information and prevents accessing competing company data after accessing one group. Impersonation threats are more significant in wide area networks where attackers can obtain another's identity details. Link encryption protects data in transit while end-to-end encryption protects data throughout its network path. Security associations connect security services and keys to traffic between IPSec peers.
This document provides an overview of security layers and principles, including confidentiality, integrity, availability, threats, risks, and attack surfaces. It discusses social engineering, site security, computer security, operating system security using Active Directory, security policies like passwords and account lockout. It also covers security software like DMZ, NAT, IPsec, SSH, and protecting wireless networks. Common attacks, malware, Windows updates, and phishing/pharming are described. The document emphasizes the importance of security for computers and networks in organizations.
This document proposes a new encryption scheme called compact summation key encryption for secure data sharing in hybrid cloud storage. It aims to address limitations of existing approaches like predefined hierarchical schemes, attribute-based encryption, and identity-based encryption which cannot provide security to individual files or have non-constant size keys. The new scheme uses five algorithms: setup, key generation, encryption, extraction and decryption. It generates constant size public and master secret keys. Encryption uses file indexes and bilinear groups to create ciphertexts. Extraction combines decryption keys into a single compact summation key using bilinear pairing operations. This key can then decrypt ciphertexts for multiple file indexes, improving flexibility and efficiency of secure data sharing in cloud storage.
Secure Data Sharing Using Compact Summation key in Hybrid Cloud Storage (IOSR Journals)
Improving Cloud Security Using Multi Level Encryption and Authentication (AM Publications, India)
As people have become more social and electronically connected, concerns about information sharing over the internet persist. Many powerful cryptographic approaches have been proposed that are practically impossible to break, yet the total encryption and decryption time remains a major concern: when encrypting a large chunk of data, a traditional asymmetric-key algorithm may be 1,000 times or more slower than a symmetric-key algorithm. Hence this paper proposes a hierarchical structure in which the parties are first authenticated, then exchange keys using an asymmetric-key algorithm, and then perform the actual encryption and decryption with a symmetric-key algorithm. This is useful for improving security in cloud applications.
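The proposed hierarchy (asymmetric key exchange, then fast symmetric bulk encryption) can be sketched as a toy. The modulus is a small demonstration prime and the SHA-256 XOR keystream is a stand-in cipher; a real deployment would use vetted primitives such as ECDH over a standard curve and AES-GCM.

```python
import hashlib, secrets

# Toy hybrid scheme: Diffie-Hellman stands in for the "asymmetric key
# algorithm" phase, then the agreed key drives fast symmetric bulk
# encryption. Parameters are for demonstration only.

p = 2**127 - 1   # a toy prime modulus; real DH uses vetted 2048-bit+ groups
g = 3

# --- asymmetric phase: key agreement ---
a_priv = secrets.randbelow(p - 2) + 1
b_priv = secrets.randbelow(p - 2) + 1
a_pub, b_pub = pow(g, a_priv, p), pow(g, b_priv, p)
shared_a = pow(b_pub, a_priv, p)   # Alice's view of the shared secret
shared_b = pow(a_pub, b_priv, p)   # Bob's view
assert shared_a == shared_b

key = hashlib.sha256(str(shared_a).encode()).digest()

# --- symmetric phase: fast bulk encryption under the agreed key ---
def xor_stream(key: bytes, data: bytes) -> bytes:
    ks = b""
    i = 0
    while len(ks) < len(data):
        ks += hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        i += 1
    return bytes(d ^ k for d, k in zip(data, ks))

msg = b"large payload " * 1000
ct = xor_stream(key, msg)
assert xor_stream(key, ct) == msg   # symmetric decrypt recovers the data
assert ct != msg
```

The design point the paper makes is visible here: the expensive public-key operations touch only a tiny secret, while the bulk data flows through the cheap symmetric path.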
The document analyzes a spam campaign from April to June 2012 that distributed malware via the Blackhole Exploit Kit. It found 245 separate spam runs spoofing 17-40 organizations each month. The spam used social engineering to trick users into clicking links that led to compromised websites and exploit pages hosting the Blackhole Exploit Kit. These pages attempted to exploit vulnerabilities in browsers and software to download malware like ZeuS and Cridex. The campaign was highly effective due to its scale and use of redirection, compromised sites and thousands of URLs daily, making it difficult for traditional security methods to keep up.
Similarity digests have gained popularity for many security applications like blacklisting/whitelisting and finding similar variants of malware. TLSH has been shown to be particularly good at hunting similar malware, and is resistant to evasion compared to other similarity digests like ssdeep and sdhash. Searching and clustering are fundamental tools which help security analysts and security operations center (SOC) operators in hunting and analyzing malware. Current approaches which aim to cluster malware are not scalable enough to keep up with the vast amount of malware and goodware available in the wild. In this paper, we present techniques which allow for fast search and clustering of TLSH hash digests, which can aid analysts in inspecting large amounts of malware and goodware. Our approach builds on fast nearest-neighbor search techniques to build a tree-based index which performs fast search based on TLSH hash digests. The tree-based index is used in our threshold-based Hierarchical Agglomerative Clustering (HAC-T) algorithm, which is able to cluster digests in a scalable manner; our clustering technique can cluster digests in O(n log n) time on average. We performed an empirical evaluation comparing our approach with many standard and recent clustering techniques, and demonstrate that our approach is much more scalable while still producing good cluster quality. We measured cluster quality using purity on 10 million samples obtained from VirusTotal, obtaining high purity scores in the range 0.97 to 0.98 using labels from five major anti-virus vendors (Kaspersky, Microsoft, Symantec, Sophos, and McAfee), which demonstrates the effectiveness of the proposed method.
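The threshold-based clustering idea can be sketched as below. Hamming distance over short binary fingerprints stands in for TLSH distance scores, and a linear scan stands in for the paper's tree-based index; a real implementation would use the TLSH library and the index structure described in the paper.

```python
# Sketch of threshold-based clustering in the spirit of HAC-T: each digest
# joins the first existing cluster whose representative is within the
# distance threshold, otherwise it starts a new cluster. Hamming distance
# on toy 16-bit fingerprints stands in for TLSH distance.

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

def threshold_cluster(digests, threshold):
    clusters = []  # each cluster is a list; clusters[i][0] is its representative
    for d in digests:
        for cluster in clusters:
            if hamming(d, cluster[0]) <= threshold:
                cluster.append(d)
                break
        else:
            clusters.append([d])
    return clusters

# Two tight families of "malware variants" plus one outlier.
family_a = [0b1111000011110000, 0b1111000011110001, 0b1111000011111000]
family_b = [0b0000111100001111, 0b0000111100001110]
outlier  = [0b1010101010101010]

clusters = threshold_cluster(family_a + family_b + outlier, threshold=2)

assert len(clusters) == 3
assert sorted(len(c) for c in clusters) == [1, 2, 3]
```

The linear scan over cluster representatives is the O(n) step that the paper's tree-based nearest-neighbor index replaces, which is what brings the average clustering cost down to O(n log n).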
Collusion Attack: A Kernel-Based Privacy Preserving Techniques in Data Miningdbpublications
Data leakage happens whenever a system that is designed to be closed to an eavesdropper reveals some information to unauthorized parties. We know that for business purpose, it is necessary to transfer important data among many business partner and between the numbers of employees. But during this transfer of data, information is reach to unauthorized person. So it is very challenging and necessary to find leakage and guilty person responsible for information leakage. In this system we find the person which is responsible for the leakage of text as well as image file. For this we used distributor and agent. Distributor means owner of data and agents means trusted parties to whom we send data. This system finds the insider collusion attack. An insider attack is a malicious threat to an organization that come from people within the organization such as employees, contractor or business associate, who have inside information concerning the organization’s security practices, data and computer system. The main aim of this system is to find data of owner which is leaked and detect agent who leaked data. Here for text data we used kernel based algorithm and for image file we used steganography concept.
With the growth of cloud technologies, computing
resources and cloud storage have become the most
demanding online services. There are several companies
desiring to outsource their data storage and resources as
well. While storing private and sensitive data on a third
party data center, it is necessary to consider security and
privacy which become major issues. In this paper, a novel
Double Encryption with Single Decryption (DESD) crypto
technique is proposed to secure the data in cloud storage.
The proposed technique comprises of encryption and
decryption phases where in the encryption phase the data is
randomly partitioned into multiple fragments. Double
encryption is done on each fragment by prime numbers, as
well as Invertible Non-linear Function (INF). These
multiple encrypted data are stored at the multiple cloud
storages with the help of cloud service provider (CSP).
After all verification process the data user collects the key
from the data owner and decrypts the gathered data from
the cloud with the knowledge of inverse INF. The proposed
crypto technique provides more security and privacy to
cloud data and any illegitimate users cannot retrieve the
original data. The performance of the proposed DESD
technique is compared with AES and Triple DES
techniques and the experimental results are plotted which
shows the proposed technique is efficient and faster.
The document discusses various security threats and protection mechanisms. It covers basics of cryptography including symmetric and public key cryptography. It also discusses digital signatures, user authentication, and threats from intruders both internal and external to a system. Protection mechanisms aim to achieve goals of data confidentiality, integrity, and system availability despite security threats.
This document summarizes a research paper on secured authorized deduplication in a hybrid cloud system. The system aims to provide data deduplication, differential authorization for access, and confidentiality of data files. It involves a public cloud for storage, a private cloud for managing access tokens, and users who generate keys for files stored on the public cloud. When uploading a file, the user encrypts it and sends it to the public cloud along with the key to the private cloud. To download, the user must provide the correct key to the private cloud to gain access to encrypted files from the public cloud. This hybrid cloud model uses deduplication for storage optimization while controlling access through differential authorization of private keys.
Secured Authorized Deduplication Based Hybrid Cloudtheijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering& Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
Theoretical work submitted to the Journal should be original in its motivation or modeling structure. Empirical analysis should be based on a theoretical framework and should be capable of replication. It is expected that all materials required for replication (including computer programs and data sets) should be available upon request to the authors.
The International Journal of Engineering & Science would take much care in making your article published without much delay with your kind cooperation
Privacy-Preserving Updates to Anonymous and Confidential Databaseijdmtaiir
The current trend in the application space towards
systems of loosely coupled and dynamically bound
components that enables just-in-time integration jeopardizes
the security of information that is shared between the broker,
the requester, and the provider at runtime. In particular, new
advances in data mining and knowledge discovery that allow
for the extraction of hidden knowledge in an enormous amount
of data impose new threats on the seamless integration of
information. We consider the problem of building privacy
preserving algorithms for one category of data mining
techniques, association rule mining.Suppose Alice owns a kanonymous database and needs to determine whether her
database, when inserted with a tuple owned by Bob, is still kanonymous. Also, suppose that access to the database is strictly
controlled, because for example data are used for certain
experiments that need to be maintained confidential. Clearly,
allowing Alice to directly read the contents of the tuple breaks
the privacy of Bob (e.g., a patient’s medical record); on the
other hand, the confidentiality of the database managed by
Alice is violated once Bob has access to the contents of the
database. Thus, the problem is to check whether the database
inserted with the tuple is still k-anonymous, without letting
Alice and Bob know the contents of the tuple and the database,
respectively. In this paper, we propose two protocols solving
this problem on suppression-based and generalization-based kanonymous and confidential databases. The protocols rely on
well-known cryptographic assumptions, and we provide
theoretical analyses to proof their soundness and experimental
results to illustrate their efficiency.We have presented two
secure protocols for privately checking whether a kanonymous database retains its anonymity once a new tuple is
being inserted to it. Since the proposed protocols ensure the
updated database remains K-anonymous, the results returned
from a user’s (or a medical researcher’s) query are also kanonymous. Thus, the patient or the data provider’s privacy
cannot be violated from any query. As long as the database is
updated properly using the proposed protocols, the user queries
under our application domain are always privacy-preserving
Bt0088 cryptography and network security2Techglyphs
Military security policy ranks information sensitivity levels from unclassified to top secret and limits access based on a need-to-know principle. The Chinese Wall security policy groups related company information and prevents accessing competing company data after accessing one group. Impersonation threats are more significant in wide area networks where attackers can obtain another's identity details. Link encryption protects data in transit while end-to-end encryption protects data throughout its network path. Security associations connect security services and keys to traffic between IPSec peers.
This document provides an overview of security layers and principles, including confidentiality, integrity, availability, threats, risks, and attack surfaces. It discusses social engineering, site security, computer security, operating system security using Active Directory, security policies like passwords and account lockout. It also covers security software like DMZ, NAT, IPsec, SSH, and protecting wireless networks. Common attacks, malware, Windows updates, and phishing/pharming are described. The document emphasizes the importance of security for computers and networks in organizations.
This document proposes a new encryption scheme called compact summation key encryption for secure data sharing in hybrid cloud storage. It aims to address limitations of existing approaches like predefined hierarchical schemes, attribute-based encryption, and identity-based encryption which cannot provide security to individual files or have non-constant size keys. The new scheme uses five algorithms: setup, key generation, encryption, extraction and decryption. It generates constant size public and master secret keys. Encryption uses file indexes and bilinear groups to create ciphertexts. Extraction combines decryption keys into a single compact summation key using bilinear pairing operations. This key can then decrypt ciphertexts for multiple file indexes, improving flexibility and efficiency of secure data sharing in cloud storage.
Secure Data Sharing Using Compact Summation key in Hybrid Cloud StorageIOSR Journals
This document proposes a new encryption scheme called compact summation key encryption for secure data sharing in hybrid cloud storage. It aims to address limitations of existing approaches like predefined hierarchical schemes, attribute-based encryption, and identity-based encryption which cannot provide security to individual files or have non-constant size keys. The new scheme uses five algorithms: setup, key generation, encryption, extraction and decryption. It generates constant size public and master secret keys. Encryption uses file indexes and bilinear groups to create ciphertexts. Extraction combines decryption keys into a single compact summation key using bilinear pairing operations. This key can then decrypt ciphertexts for multiple file indexes, improving flexibility and efficiency of secure data sharing in cloud storage.
Improving Cloud Security Using Multi Level Encryption and AuthenticationAM Publications,India
As people have become more social and electronically attached, the concern for information sharing over the internet still persist. As known many powerful cryptographical approaches have been proposed in the past which are practically impossible to break, yet there exists a major concern of total encryption and decryption time taken as a whole. It is a known fact that in encrypting a large chunk of data, traditional asymmetric key algorithm may be slower to symmetric key algorithm by 1000 times or more. Hence this paper proposes a hierarchical structure in which the parties are first authenticated, then exchange keys by asymmetric key algorithm, then do actual encryption and decryption by the symmetric key algorithm. This will be useful to improve the security in cloud applications.
The document analyzes a spam campaign from April to June 2012 that distributed malware via the Blackhole Exploit Kit. It found 245 separate spam runs spoofing 17-40 organizations each month. The spam used social engineering to trick users into clicking links that led to compromised websites and exploit pages hosting the Blackhole Exploit Kit. These pages attempted to exploit vulnerabilities in browsers and software to download malware like ZeuS and Cridex. The campaign was highly effective due to its scale and use of redirection, compromised sites and thousands of URLs daily, making it difficult for traditional security methods to keep up.
Similarity digests have gained popularity for many
security applications like blacklisting/whitelisting, and finding
similar variants of malware. TLSH has been shown to be
particularly good at hunting similar malware, and is resistant to
evasion as compared to other similarity digests like ssdeep and
sdhash. Searching and clustering are fundamental tools which
help the security analysts and security operations center (SOC)
operators in hunting and analyzing malware. Current approaches
which aim to cluster malware are not scalable enough to keep
up with the vast amount of malware and goodware available
in the wild. In this paper, we present techniques which allow
for fast search and clustering of TLSH hash digests which
can aid analysts to inspect large amounts of malware/goodware.
Our approach builds on fast nearest neighbor search techniques
to build a tree-based index which performs fast search based
on TLSH hash digests. The tree-based index is used in our
threshold based Hierarchical Agglomerative Clustering (HAC-T)
algorithm which is able to cluster digests in a scalable manner.
Our clustering technique can cluster digests in O(n logn) time on
average. We performed an empirical evaluation by comparing our
approach with many standard and recent clustering techniques.
We demonstrate that our approach is much more scalable and
still is able to produce good cluster quality. We measured
cluster quality using purity on 10 million samples obtained from
VirusTotal. We obtained a high purity score in the range from
0.97 to 0.98 using labels from five major anti-virus vendors
(Kaspersky, Microsoft, Symantec, Sophos, and McAfee) which
demonstrates the effectiveness of the proposed method.
- The document discusses TLSH (Trendmicro Locality Sensitive Hash), a fuzzy hashing algorithm invented by the author that is useful for processing malware at scale.
- It provides an example of using TLSH to cluster samples from Malware Bazaar into 16452 clusters and predict the family of new samples with 80% accuracy.
- TLSH works by generating k-skip n-grams from files and calculating the distance between hashes to measure similarity, allowing large malware databases to be clustered and labeled for analysis.
2019 TrustCom: The role of ML and AI in SecurityJonathanOliver26
Discusses the role of ML and AI in Security.
Discusses some problems with training and decision surfaces.
Explains why ML models in security overestimate their accuracy.
Using lexigraphical distancing, an algorithm that estimates the probability that one term is an edited version of another, can help identify variants of spam terms like "Viagra" and improve spam detection rates. The algorithm identified 51 out of 60 variants of Viagra, while spell checking only caught 24. When included as a pre-processing step in a naive Bayes classifier, it incorrectly flagged few additional good emails as spam but caught 27% more spam messages, improving spam detection rates. Lexigraphical distancing provides a robust way to identify term variants for applications like spam filtering and content analysis.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise boosts blood flow, releases endorphins, and promotes changes in the brain which help regulate emotions and stress levels.
Blood finder application project report (1).pdfKamal Acharya
Blood Finder is an emergency time app where a user can search for the blood banks as
well as the registered blood donors around Mumbai. This application also provide an
opportunity for the user of this application to become a registered donor for this user have
to enroll for the donor request from the application itself. If the admin wish to make user
a registered donor, with some of the formalities with the organization it can be done.
Specialization of this application is that the user will not have to register on sign-in for
searching the blood banks and blood donors it can be just done by installing the
application to the mobile.
The purpose of making this application is to save the user’s time for searching blood of
needed blood group during the time of the emergency.
This is an android application developed in Java and XML with the connectivity of
SQLite database. This application will provide most of basic functionality required for an
emergency time application. All the details of Blood banks and Blood donors are stored
in the database i.e. SQLite.
This application allowed the user to get all the information regarding blood banks and
blood donors such as Name, Number, Address, Blood Group, rather than searching it on
the different websites and wasting the precious time. This application is effective and
user friendly.
Build the Next Generation of Apps with the Einstein 1 Platform.
Rejoignez Philippe Ozil pour une session de workshops qui vous guidera à travers les détails de la plateforme Einstein 1, l'importance des données pour la création d'applications d'intelligence artificielle et les différents outils et technologies que Salesforce propose pour vous apporter tous les bénéfices de l'IA.
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)GiselleginaGloria
3rd International Conference on Artificial Intelligence Advances (AIAD 2024) will act as a major forum for the presentation of innovative ideas, approaches, developments, and research projects in the area advanced Artificial Intelligence. It will also serve to facilitate the exchange of information between researchers and industry professionals to discuss the latest issues and advancement in the research area. Core areas of AI and advanced multi-disciplinary and its applications will be covered during the conferences.
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...DharmaBanothu
The Network on Chip (NoC) has emerged as an effective
solution for intercommunication infrastructure within System on
Chip (SoC) designs, overcoming the limitations of traditional
methods that face significant bottlenecks. However, the complexity
of NoC design presents numerous challenges related to
performance metrics such as scalability, latency, power
consumption, and signal integrity. This project addresses the
issues within the router's memory unit and proposes an enhanced
memory structure. To achieve efficient data transfer, FIFO buffers
are implemented in distributed RAM and virtual channels for
FPGA-based NoC. The project introduces advanced FIFO-based
memory units within the NoC router, assessing their performance
in a Bi-directional NoC (Bi-NoC) configuration. The primary
objective is to reduce the router's workload while enhancing the
FIFO internal structure. To further improve data transfer speed,
a Bi-NoC with a self-configurable intercommunication channel is
suggested. Simulation and synthesis results demonstrate
guaranteed throughput, predictable latency, and equitable
network access, showing significant improvement over previous
designs
Impartiality as per ISO /IEC 17025:2017 StandardMuhammadJazib15
This document provides basic guidelines for imparitallity requirement of ISO 17025. It defines in detial how it is met and wiudhwdih jdhsjdhwudjwkdbjwkdddddddddddkkkkkkkkkkkkkkkkkkkkkkkwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwioiiiiiiiiiiiii uwwwwwwwwwwwwwwwwhe wiqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq gbbbbbbbbbbbbb owdjjjjjjjjjjjjjjjjjjjj widhi owqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq uwdhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhwqiiiiiiiiiiiiiiiiiiiiiiiiiiiiw0pooooojjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj whhhhhhhhhhh wheeeeeeee wihieiiiiii wihe
e qqqqqqqqqqeuwiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiqw dddddddddd cccccccccccccccv s w c r
cdf cb bicbsad ishd d qwkbdwiur e wetwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww w
dddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddfffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffw
uuuuhhhhhhhhhhhhhhhhhhhhhhhhe qiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii iqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc ccccccccccccccccccccccccccccccccccc bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbu uuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuum
m
m mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm m i
g i dijsd sjdnsjd ndjajsdnnsa adjdnawddddddddddddd uw
Open Channel Flow: fluid flow with a free surfaceIndrajeet sahu
Open Channel Flow: This topic focuses on fluid flow with a free surface, such as in rivers, canals, and drainage ditches. Key concepts include the classification of flow types (steady vs. unsteady, uniform vs. non-uniform), hydraulic radius, flow resistance, Manning's equation, critical flow conditions, and energy and momentum principles. It also covers flow measurement techniques, gradually varied flow analysis, and the design of open channels. Understanding these principles is vital for effective water resource management and engineering applications.
Sachpazis_Consolidation Settlement Calculation Program-The Python Code and th...Dr.Costas Sachpazis
Consolidation Settlement Calculation Program-The Python Code
By Professor Dr. Costas Sachpazis, Civil Engineer & Geologist
This program calculates the consolidation settlement for a foundation based on soil layer properties and foundation data. It allows users to input multiple soil layers and foundation characteristics to determine the total settlement.
Limitations of Privacy Solutions for Log Files
Jonathan Oliver jon oliver@trendmicro.com
31 August 2021
1 Introduction
In this paper we consider collecting log files (in particular, log files for security purposes)
and the storage / processing of those log files. Some use cases include:
• Working with data which has PII (personally identifiable information) embedded in it.
For example, data with email addresses in it.
• When data is processed in a 3rd party country. For example, data which is collected in
country A may be hosted on cloud servers in country B. Complex situations may arise
because the data may fall under the laws of country B.
• Extracting IoCs (indicators of compromise) from data. We are interested in IoCs which
are public knowledge and do not uniquely identify a victim.
1.0.1 Privacy Example
Consider a situation with 3 people: Alice, Bob and Charlie. Each person generates log files
which track various events which occur on their computers.
Attackers send personalized malware with the string XYZZY (the malicious IoC) and the
name of the victim encoded. So the logs look like
Person Computer Event Data
------ -------- ----- ----
Alice Computer1 EventA-1 XYZZY-abc
Alice Computer1 EventA-2 XYZZY-abc
Alice Computer1 EventA-3 XYZZY-abc
...
Bob Computer2 EventB-1 XYZZY-def
Bob Computer2 EventB-2 XYZZY-def
Bob Computer2 EventB-3 XYZZY-def
...
Charlie Computer3 EventC-1 XYZZY-ghj
Charlie Computer3 EventC-2 XYZZY-ghj
Charlie Computer3 EventC-3 XYZZY-ghj
where
abc = encrypted(Alice)
def = encrypted(Bob)
ghj = encrypted(Charlie)
We want to extract an IoC associated with this malware (XYZZY in this case) while maximising
the privacy afforded to Alice / Bob / Charlie.
This example is typical of various log files which are generated by security products such as:
• Email logs;
• Window events logs;
• Firewall logs;
• . . .
1.1 Desirable Properties
We desire a privacy solution which allows us to collect the logs from various machines / computers
and process them in a way that protects the privacy of the individuals. Specifically, we want to
meet the following privacy requirements:
• Collect these logs from multiple computers into a single repository;
• Transform / delete the parts of the data which identify a person;
• Retain data which occurs across multiple people (and hence may be considered public
data);
in a reasonable amount of computation.
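As a sketch of how these requirements might be met, the following processes the toy logs from Section 1.0.1, keeps only data tokens seen across multiple people, and suppresses the rest. The tokenization on "-" and the two-person threshold are assumptions made for illustration, not part of the proposal itself:

```python
from collections import defaultdict

# Toy logs from Section 1.0.1: (person, computer, event, data) rows.
logs = [
    ("Alice",   "Computer1", "EventA-1", "XYZZY-abc"),
    ("Alice",   "Computer1", "EventA-2", "XYZZY-abc"),
    ("Bob",     "Computer2", "EventB-1", "XYZZY-def"),
    ("Charlie", "Computer3", "EventC-1", "XYZZY-ghj"),
]

# Count how many distinct people each data token appears for.
people_per_token = defaultdict(set)
for person, _, _, data in logs:
    for token in data.split("-"):
        people_per_token[token].add(person)

# Tokens seen for 2+ people are treated as public (candidate IoCs);
# everything else is assumed to identify a person and is suppressed.
public_tokens = {t for t, ppl in people_per_token.items() if len(ppl) >= 2}

anonymized = [
    (event, "-".join(t if t in public_tokens else "<SUPPRESSED>"
                     for t in data.split("-")))
    for _, _, event, data in logs
]

print(sorted(public_tokens))   # the shared IoC "XYZZY" survives
print(anonymized[0])           # ('EventA-1', 'XYZZY-<SUPPRESSED>')
```

The person and computer columns are dropped entirely, since both identify an individual; only the event label and the filtered data field are retained.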
1.2 Review of Privacy Approaches
Here we review the various privacy methods and attempt to apply them to our example
above. We distinguish between 2 types of data:
• Descriptive data: which has one row per person (the majority of privacy methods
adequately address this case);
• Log files: where a person may contribute multiple rows (typically many rows). This covers
the various log files mentioned above (event logs, firewall logs, etc.), and we discuss below
why privacy solutions (such as differential privacy or k-anonymity) do not adequately
address this type of data.
1.2.1 Descriptive Data
A typical list of people might look like:
Person Country Industry
Id Name Email
1 Person A a@abc.company Argentina Accounting
2 Person B b@b.company Brazil Manufacturing
. . . . . .
100 Person Z z@z.company USA Health
This type of data can be made “private” using differential privacy or k-anonymity (well-respected
privacy approaches used around the world).
1.2.2 Log Files
Log files consist of 2 separate tables (explicitly or implicitly). Most log files take the form where
the first table defines the people under consideration, and the second table defines events or
transactions for each person in the first table.
The first table is a list of people:
Table 1
PID Col1 . . . ColMax1
P1 . . .
. . . . . .
PMax . . .
Column 1 is a PID which defines each person.
The second table is a list of events (or transactions) from the people in Table 1:
Table 2
PID Event Id Col1 . . . ColMax2
P1 Event1 . . .
P1 Event2 . . .
P1 Event3 . . .
. . . . . . . . .
Pj EventMax . . .
In the second table, we allow multiple events associated with a personal identifier. For example,
Table 2 above has 3 events associated with PID P1.
1.2.3 Privacy Approaches
We review a range of privacy mechanisms in this paper, and consider how they can be applied
to the log file problem. We consider:
• Differential Privacy [1, 2]
• k-anonymity [3, 4]
• Homomorphic Encryption [5, 6]
• Monero style privacy [7]
• Secure Multiparty Computation [8, 9] (which also covers Federated Machine Learning [10])
• Secret Sharing Schemes [11]
1.2.4 Privacy Operations
The operations used by privacy mechanisms (including those listed above) include:
• Suppressing data (either deleting it or replacing it with NULL values);
• Generalizing data (e.g., transforming a person’s age into an age range);
• Encrypting data;
• Hashing data; and
• Adding errors to data.
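A minimal illustration of these operations on a single hypothetical record (the field names and parameters are invented for the example; encryption is omitted since it would require key management):

```python
import hashlib
import random

record = {"name": "Alice", "age": 34, "email": "alice@example.com", "logins": 12}

# Suppress: delete the direct identifier (or replace it with NULL).
suppressed = dict(record, name=None)

# Generalize: map an exact age to a coarse age range.
decade = record["age"] // 10 * 10
generalized = dict(record, age=f"{decade}-{decade + 9}")

# Hash: replace the email with a one-way digest (note: still linkable
# across datasets, and dictionary attacks work without a secret salt).
digest = hashlib.sha256(record["email"].encode()).hexdigest()
hashed = dict(record, email=digest)

# Add errors: perturb a numeric field with random noise.
noisy = dict(record, logins=record["logins"] + random.randint(-3, 3))

print(generalized["age"])   # 30-39
```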
2 Differential Privacy
Differential privacy is a system for publicly sharing information about a dataset by describing
the patterns of groups within the dataset while withholding information about individuals in
the dataset.
Consider the situation where we have a data row of interest. If errors are added in a
systematic way so that you get similar or the same answers with / without the row in question,
then we have protected the privacy of that row.
The definition and maths can extend to making 2 rows, 3 rows, ... private. This covers the
case that we may want to allow groups of individuals up to some size N to remain private. So
given N a maximum number of rows that we need to make private at once, we can determine
the error distribution to achieve that.
Differential Privacy is not suited for the log file problem. The amount of error required to
achieve privacy on a log file depends on the number of rows which may be associated with
a person. So an email log file for one day might contain 100 emails from a user. To ensure the
privacy of this data would require an extraordinary amount of error to be added, which would
almost certainly make any analysis useless.
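This scaling problem can be seen with the standard Laplace mechanism: the noise scale must grow with the sensitivity of the query, which for a log file is the maximum number of rows one person can contribute. A sketch (the ε = 1.0 and the counts are arbitrary choices for illustration):

```python
import random

def noisy_count(true_count, rows_per_person, epsilon=1.0):
    """Laplace mechanism: the sensitivity of a row count is the maximum
    number of rows a single person can contribute."""
    scale = rows_per_person / epsilon
    # A Laplace(0, scale) sample is the difference of two exponentials.
    noise = random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)
    return true_count + noise

# Descriptive data: one row per person, sensitivity 1 -> noise around 1.
print(noisy_count(1000, rows_per_person=1))
# Log files: one user may contribute 100 rows, so every count must absorb
# noise on the order of 100 to hide that user's presence entirely.
print(noisy_count(1000, rows_per_person=100))
```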
3 K-Anonymity
k-anonymity is a property possessed by certain anonymized data. A release of data is said to
have the k-anonymity property if the information for each person contained in the release cannot
be distinguished from at least k − 1 individuals whose information also appear in the release.
At first glance, k-anonymity appears relevant to the log file problem.
3.1 Limitations of k-Anonymity
k-anonymity suffers from the following limitations:
• Background knowledge may be available that is not in the dataset which allows
identification.
• k-anonymity is not a good method to anonymize high dimensional data. For example,
researchers from MIT [12] showed that, given 4 locations, the unicity¹ of mobile phone
timestamp-location datasets can be as high as 95%.
k-anonymity is not suited to the log file problem, or to checking IoCs. The k value in k-
anonymity needs to be replaced by the maximum number of rows that we associate with a
person. So if we are analysing network logs where a single user has 100 rows, then we would
need to apply k-anonymity with k = 100, which would probably result in nearly all data in the
log being suppressed.
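The k-anonymity property itself is straightforward to check. A minimal sketch, with hypothetical quasi-identifier columns and rows (the `is_k_anonymous` helper is illustrative, not from a library):

```python
from collections import Counter

def is_k_anonymous(rows, quasi_ids, k):
    """True if every combination of quasi-identifier values appears in
    at least k rows of the release."""
    counts = Counter(tuple(row[c] for c in quasi_ids) for row in rows)
    return min(counts.values()) >= k

# Hypothetical release: ages generalized to ranges, ZIP codes truncated.
release = [
    {"age": "20-29", "zip": "123**", "event": "login"},
    {"age": "20-29", "zip": "123**", "event": "logout"},
    {"age": "30-39", "zip": "456**", "event": "login"},
    {"age": "30-39", "zip": "456**", "event": "upload"},
]
print(is_k_anonymous(release, ["age", "zip"], k=2))  # True
print(is_k_anonymous(release, ["age", "zip"], k=3))  # False
```

With log files the grouping would have to hold across the 100 rows a single user generates, which is why nearly everything ends up suppressed.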
4 Homomorphic Encryption
Homomorphic encryption involves performing computation directly on encrypted data. Microsoft
in 2012 reported a slowdown of 6-7 orders of magnitude (https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/323.pdf),
and UPenn in 2016 reported a slowdown of 9 orders of magnitude
(https://haeberlen.cis.upenn.edu/papers/seabed-osdi2016.pdf). It would appear that
homomorphic encryption is not yet feasible for working with data at scale or for processing
large log files.
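For intuition about what "computation on encrypted data" means, textbook RSA happens to be multiplicatively homomorphic: multiplying two ciphertexts yields a ciphertext of the product. A toy sketch with tiny, insecure parameters, for illustration only (this is not the lattice-based schemes benchmarked in the reports above):

```python
# Textbook RSA with classic toy parameters -- insecure, illustration only.
p, q = 61, 53
n = p * q                 # 3233
e, d = 17, 413            # e*d = 1 mod lcm(p-1, q-1) = 780

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

a, b = 7, 12
c = (enc(a) * enc(b)) % n  # multiply the ciphertexts only
print(dec(c))              # 84 == a * b, computed "under encryption"
```

The server multiplying the ciphertexts never sees 7, 12 or 84; the slowdowns cited above come from doing this kind of arithmetic at scale with schemes that also support addition.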
¹ Unicity is measured by the number of points needed to uniquely identify an individual in a data set.
5 Monero Style Privacy
Monero is a crypto-currency where the key features are those around privacy and anonymity:
• The value of transactions is obfuscated.
• Sending addresses are hidden in combination with other addresses (in a "ring signature")
so it is not clear exactly who sent a transaction.
• Receiving addresses are hidden using stealth addresses, which are generated using a secret
sharing scheme.
There has been a back and forth between Monero and researchers who have pointed out
privacy weaknesses in the approaches used by Monero. More recently (September 2020), the
United States IRS offered a bounty of up to USD $625,000 for the development of tools to help
trace Monero and related crypto-currencies.
6 Secure Multi-party Computation / Federated Learning
The example in Section 1.0.1 highlights the problem with Federated Learning.
• A learner at Computer1 cannot distinguish between the IoC (XYZZY) and an encoded
version of the first victim (abc).
• A learner at Computer2 cannot distinguish between the IoC (XYZZY) and an encoded
version of the second victim (def).
• A learner at Computer3 cannot distinguish between the IoC (XYZZY) and an encoded
version of the third victim (ghj).
We need to merge the records from different people to identify which elements are private and
which elements are suitable as public IoCs. But the very process of merging the records breaks
the privacy that we are attempting to create.
7 An Approach for Making Log Files Private
7.1 Proposal Step 1: Rewrite Identifiers with a Ring Signature
We may have sensitive data sets where we want/need to replace a personal identifier with another
token for the purposes of clustering / pivoting / identifying IoCs / etc.
The problematic table in a log file is Table 2:

Table 2

PID    Event Id   Col1 ... ColMax2
---    --------   ----------------
P1     Event1     ...
P1     Event2     ...
P1     Event3     ...
...    ...        ...
Pj     EventMax   ...
We replace the PID with a Ring Signature for that data row. We define a parameter R to
determine how imprecise each Ring Signature will be. The Ring Signature for EventE, which
came from person Pi, is created as follows:
1. SetE = randomly generate a set of R − 1 other people;
2. RSE = generate a ring signature for the set Pi + SetE
This gives us the following Table:

Table 3

Ring Signature   Event Id   Col1 ... ColMax2
--------------   --------   ----------------
RS1              Event1     ...
RS2              Event2     ...
RS3              Event3     ...
...              ...        ...
RSj              EventMax   ...
7.2 Proposal Step 2: Apply a modified k-anonymity
We now apply a modified k-anonymity procedure to Table 3. We apply a range of feature
extraction approaches (from security or machine learning). Each of these methods gives us a
candidate feature, F, with a group of rows, G.
We apply the following steps to determine if F is potentially a privacy violation.
1. get the set of ring signatures for group G
2. MinPID(F) = process this set of ring signatures to determine the minimum number of
identities in the group
3. If MinPID(F) ≤ k then feature F is a privacy violation and needs to be suppressed or
deleted.
If MinPID(F) > k, then F (independent of other features) can be considered anonymous, since
in isolation it is associated with more than k identities.
7.3 Properties of Table 3
Table 3 is a useful table for identifying pivots and IoCs.
Let's consider the situation where we have logs from 100 people and each person has 100
events in Table 3. Let the ring imprecision parameter R = 5. Table 3 has 10,000 events. Let's
consider what an attacker who obtained the entire contents of Table 3 might do:
• They may try to extract information about a specific event. Due to the ring signature,
there are R = 5 unidentified people that it may have come from.
• They may try to extract all the events for person Pi. They would get a collection of 100
events from Pi and a collection of 400 events which were not generated by person Pi.
All they could determine is that each event had a chance of 1/R of really being from some
unidentified person.
7.4 Light Weight Ring Signatures (LWRS)
Most ring signature approaches create large signatures; the size of the cryptographic signature
increases linearly with the number of people (identifiers) that you are anonymizing [13, Section
Efficiency]. This makes their use for large log files and large sets of people more difficult.
Many aspects of the above proposal can be satisfied by the following approach:
• Allocate each person a large prime (a few hundred bits);
• The ring signature for a set of people is the product of the primes for each person;
• Given two light weight ring signatures, we can determine if they have one or more people
in common by performing a greatest common divisor (GCD) operation.
If GCD(LWRS1, LWRS2) = 1 then we know that these two rows came from different
identities. We can perform pairwise GCD calculations to show that a group of LWRS values
came from more than k identities.
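The LWRS construction and the pairwise-GCD identity bound can be sketched as follows. The small primes, helper names (`lwrs`, `min_pid`, `violates_privacy`) and the decoy pool are illustrative assumptions; a real deployment would use primes of 200+ bits:

```python
import math
import random

def lwrs(person_prime, decoy_primes, ring_size, rng):
    """Light weight ring signature for one row: the product of the real
    person's prime and R-1 randomly chosen decoy primes."""
    ring = rng.sample(decoy_primes, ring_size - 1)
    return person_prime * math.prod(ring)

def min_pid(signatures):
    """Lower bound on the number of distinct identities behind a set of
    signatures: mutually coprime signatures share no ring members, so
    they must have distinct authors."""
    kept = []
    for s in signatures:
        if all(math.gcd(s, t) == 1 for t in kept):
            kept.append(s)
    return len(kept)

def violates_privacy(signatures, k):
    """A feature must be suppressed unless we can show > k identities."""
    return min_pid(signatures) <= k

# Hypothetical setup: Alice's prime is 3; the decoys stand in for others.
rng = random.Random(42)
decoys = [5, 7, 11, 17, 23, 29, 31, 37, 41, 43]
alice_rows = [lwrs(3, decoys, ring_size=5, rng=rng) for _ in range(3)]
print(violates_privacy(alice_rows, k=1))  # True: all rows share factor 3
```

The bound is conservative by design: signatures that share a factor might still come from different people, but we can never prove it, so the feature is suppressed.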
7.5 Worked Example
We now apply the proposal to the example from Section 1.0.1.
The data:
Person Location data Event Data
------ -------- ----- ----
Alice Computer1 EventA-1 XYZZY-abc
Alice Computer1 EventA-2 XYZZY-abc
Alice Computer1 EventA-3 XYZZY-abc
...
Bob Computer2 EventB-1 XYZZY-def
Bob Computer2 EventB-2 XYZZY-def
Bob Computer2 EventB-3 XYZZY-def
...
Charlie Computer3 EventC-1 XYZZY-ghj
Charlie Computer3 EventC-2 XYZZY-ghj
Charlie Computer3 EventC-3 XYZZY-ghj
where
abc = encrypted(Alice)
def = encrypted(Bob)
ghj = encrypted(Charlie)
7.6 Step 1: Rewrite Identifiers with a Ring Signature
We assign the following primes²:
Alice 3
Bob 13
Charlie 19
We generate Light Weight Ring Signatures for each person.
This results in an intermediate data set:
LW Ring Signature Data
----------------- ----
3 x 11 x 23 XYZZY-abc
3 x 29 x 31 XYZZY-abc
3 x 29 x 37 XYZZY-abc
...
5 x 13 x 17 XYZZY-def
13 x 19 x 59 XYZZY-def
13 x 7 x 61 XYZZY-def
...
19 x 59 x 67 XYZZY-ghj
19 x 5 x 71 XYZZY-ghj
11 x 19 x 73 XYZZY-ghj
² In this example we use small primes, but in a real application we would use large primes with 200+ binary digits.
7.7 Step 2: Apply a modified k-anonymity
We define the GCD of a feature:
GCD(F) = GCD(set of LWRS for Feature F)
We now evaluate the GCD for a range of features:
• “XYZZY-abc”
• “XYZZY-def”
• “XYZZY-ghj”
• “XYZZY”
• “abc”
• “def”
• “ghj”
The group of data associated with feature “XYZZY-abc” has
GCD(“XYZZY-abc”) = GCD(3×11×23, 3×29×31, 3×29×37) = 3
and hence these data rows most likely came from a single person. Thus this feature should be
rejected.
Similarly,
GCD(“XYZZY-def”) = 13 and GCD(“XYZZY-ghj”) = 19
and hence these strings must not be retained.
When we apply common string algorithms to the data, we also consider the strings “abc”,
“def”, “ghj” and “XYZZY”. We find that
GCD(“abc”) = 3, GCD(“def”) = 13 and GCD(“ghj”) = 19
so these strings must not be retained. We find
GCD(“XYZZY”) = 1
so this feature can be used - we know it comes from multiple people.
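These GCD calculations can be reproduced directly. The products follow the intermediate data set (with 59 used as a prime decoy factor; the exact decoy primes are illustrative):

```python
from functools import reduce
from math import gcd

# Ring signature products for each candidate feature's group of rows.
rows = {
    "XYZZY-abc": [3*11*23, 3*29*31, 3*29*37],    # Alice's rows (factor 3)
    "XYZZY-def": [5*13*17, 13*19*59, 13*7*61],   # Bob's rows (factor 13)
    "XYZZY-ghj": [19*59*67, 19*5*71, 11*19*73],  # Charlie's rows (factor 19)
}

for feature, sigs in rows.items():
    print(feature, reduce(gcd, sigs))  # 3, 13, 19 -> suppress each feature

all_sigs = [s for sigs in rows.values() for s in sigs]
print("XYZZY", reduce(gcd, all_sigs))  # 1 -> "XYZZY" can be retained
```

A GCD greater than 1 flags a likely single-person feature for suppression, while GCD("XYZZY") = 1 shows that the string spans multiple identities.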
The final transformed data set is:
LW Ring Signature Data
----------------- ----
3 x 11 x 23 XYZZY
3 x 29 x 31 XYZZY
3 x 29 x 37 XYZZY
...
5 x 13 x 17 XYZZY
13 x 19 x 57 XYZZY
13 x 7 x 61 XYZZY
...
19 x 57 x 67 XYZZY
19 x 5 x 71 XYZZY
11 x 19 x 73 XYZZY
8 Conclusion
We have considered applying a range of privacy solutions to log files. We found that methods
such as differential privacy and k-anonymity are not suitable for log files. We make a proposal
that replaces personal identifiers with ring signatures when collecting log files. In particular we
offer a light weight ring signature proposal which significantly improves the privacy for collecting
log files while allowing processing of those log files for tasks such as identifying IoCs.
References
[1] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in
private data analysis,” in Theory of Cryptography Conference. Springer, 2006, pp. 265–284,
https://link.springer.com/content/pdf/10.1007/11681878_14.pdf.
[2] “Differential privacy,” https://en.wikipedia.org/wiki/Differential_privacy, [Online; accessed
17-May-2020].
[3] P. Samarati and L. Sweeney, “Protecting privacy when disclosing information:
k-anonymity and its enforcement through generalization and suppression,” 1998,
https://dataprivacylab.org/dataprivacy/projects/kanonymity/paper3.pdf.
[4] “k-anonymity,” https://en.wikipedia.org/wiki/K-anonymity, [Online; accessed 17-May-
2020].
[5] C. Gentry, “Fully homomorphic encryption using ideal lattices,” in Proceedings of the
Forty-First Annual ACM Symposium on Theory of Computing, 2009, pp. 169–178.
[6] “Homomorphic encryption,” https://en.wikipedia.org/wiki/Homomorphic_encryption,
[Online; accessed 17-May-2020].
[7] “Monero,” https://en.wikipedia.org/wiki/Monero, [Online; accessed 17-May-2020].
[8] A. C. Yao, “Protocols for secure computations,” in 23rd Annual Symposium on Foundations
of Computer Science (SFCS 1982). IEEE, 1982, pp. 160–164.
[9] “Secure multi-party computation,” https://en.wikipedia.org/wiki/Secure_multi-party_computation,
[Online; accessed 17-May-2020].
[10] “Federated learning,” https://en.wikipedia.org/wiki/Federated_learning, [Online; accessed
17-May-2020].
[11] “Secret sharing,” https://en.wikipedia.org/wiki/Secret_sharing, [Online; accessed 17-May-
2020].
[12] Y.-A. de Montjoye, C. A. Hidalgo, M. Verleysen, and V. D. Blondel, “Unique in the crowd:
The privacy bounds of human mobility,” Scientific Reports, vol. 3, no. 1, pp. 1–5, 2013,
https://www.nature.com/articles/srep01376.
[13] “Ring signature,” https://en.wikipedia.org/wiki/Ring_signature, [Online; accessed 17-May-
2020].