Top Ten Big Data Security
and Privacy Challenges
November 2012
© Copyright 2012, Cloud Security Alliance. All rights reserved. 2
CLOUD SECURITY ALLIANCE Top Ten Big Data Security and Privacy Challenges
© 2012 Cloud Security Alliance
All rights reserved. You may download, store, display on your computer, view, print, and link to the Cloud
Security Alliance Security as a Service Implementation Guidance at,
subject to the following: (a) the Guidance may be used solely for your personal, informational, non-commercial
use; (b) the Guidance may not be modified or altered in any way; (c) the Guidance may not be redistributed; and
(d) the trademark, copyright or other notices may not be removed. You may quote portions of the Guidance as
permitted by the Fair Use provisions of the United States Copyright Act, provided that you attribute the portions
to the Cloud Security Alliance Security as a Service Implementation Guidance Version 1.0 (2012).
1.0 Abstract
2.0 Introduction
3.0 Secure Computations in Distributed Programming Frameworks
3.1 Use Cases
4.0 Security Best Practices for Non-Relational Data Stores
4.1 Use Cases
5.0 Secure Data Storage and Transactions Logs
5.1 Use Cases
6.0 End-Point Input Validation/Filtering
6.1 Use Cases
7.0 Real-time Security/Compliance Monitoring
7.1 Use Cases
8.0 Scalable and Composable Privacy-Preserving Data Mining and Analytics
8.1 Use Cases
9.0 Cryptographically Enforced Access Control and Secure Communication
9.1 Use Cases
10.0 Granular Access Control
10.1 Use Cases
11.0 Granular Audits
11.1 Use Cases
12.0 Data Provenance
12.1 Use Cases
13.0 Conclusion
CSA Big Data Working Group Co-Chairs
Lead: Sreeranga Rajan, Fujitsu
Co-Chair: Wilco van Ginkel, Verizon
Co-Chair: Neel Sundaresan, eBay
Alvaro Cardenas Mora, Fujitsu
Yu Chen, SUNY Binghamton
Adam Fuchs, Sqrrl
Adrian Lane, Securosis
Rongxing Lu, University of Waterloo
Pratyusa Manadhata, HP Labs
Jesus Molina, Fujitsu
Praveen Murthy, Fujitsu
Arnab Roy, Fujitsu
Shiju Sathyadevan, Amrita University
CSA Global Staff
Aaron Alva, Graduate Research Intern
Luciano JR Santos, Research Director
Evan Scoboria, Webmaster
Kendall Scoboria, Graphic Designer
John Yeoh, Research Analyst
1.0 Abstract
Security and privacy issues are magnified by velocity, volume, and variety of big data, such as large-scale cloud
infrastructures, diversity of data sources and formats, streaming nature of data acquisition, and high volume
inter-cloud migration. Therefore, traditional security mechanisms, which are tailored to securing small-scale
static (as opposed to streaming) data, are inadequate. In this paper, we highlight the top ten big data-specific
security and privacy challenges. We expect that highlighting these challenges will bring renewed
focus on fortifying big data infrastructures.
2.0 Introduction
The term big data refers to the massive amounts of digital information companies and governments collect
about us and our surroundings. Every day, we create 2.5 quintillion bytes of data—so much that 90% of the
data in the world today has been created in the last two years alone. Security and privacy issues are magnified
by velocity, volume, and variety of big data, such as large-scale cloud infrastructures, diversity of data sources
and formats, streaming nature of data acquisition and high volume inter-cloud migration. The use of large scale
cloud infrastructures, with a diversity of software platforms, spread across large networks of computers, also
increases the attack surface of the entire system.
Traditional security mechanisms, which are tailored to securing small-scale static (as opposed to streaming)
data, are inadequate. For example, analytics for anomaly detection would generate too many outliers.
Similarly, it is not clear how to retrofit provenance in existing cloud infrastructures. Streaming data demands
ultra-fast response times from security and privacy solutions.
In this paper, we highlight the top ten big data specific security and privacy challenges. We interviewed Cloud
Security Alliance members and surveyed security practitioner-oriented trade journals to draft an initial list of
high-priority security and privacy problems, studied published research, and arrived at the following top ten
challenges:
1. Secure computations in distributed programming frameworks
2. Security best practices for non-relational data stores
3. Secure data storage and transactions logs
4. End-point input validation/filtering
5. Real-time security/compliance monitoring
6. Scalable and composable privacy-preserving data mining and analytics
7. Cryptographically enforced access control and secure communication
8. Granular access control
9. Granular audits
10. Data provenance
In the rest of the paper, we provide brief descriptions and narrate use cases.
3.0 Secure Computations in Distributed Programming Frameworks
Distributed programming frameworks utilize parallelism in computation and storage to process massive
amounts of data. A popular example is the MapReduce framework, which splits an input file into multiple
chunks. In the first phase of MapReduce, a Mapper for each chunk reads the data, performs some computation,
and outputs a list of key/value pairs. In the next phase, a Reducer combines the values belonging to each
distinct key and outputs the result. There are two major attack prevention measures: securing the mappers and
securing the data in the presence of an untrusted mapper.
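The two MapReduce phases described above can be sketched in a few lines of Python. This is a minimal single-process illustration, not a distributed implementation: the function names (split_into_chunks, mapper, reducer, map_reduce) and the word-count task are assumptions chosen for the example.

```python
from collections import defaultdict


def split_into_chunks(lines, chunk_size):
    """Split the input into fixed-size chunks, one per (simulated) mapper."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def mapper(chunk):
    """Map phase: read a chunk and emit a list of (key, value) pairs."""
    pairs = []
    for line in chunk:
        for word in line.split():
            pairs.append((word.lower(), 1))
    return pairs


def reducer(key, values):
    """Reduce phase: combine all values that belong to one distinct key."""
    return key, sum(values)


def map_reduce(lines, chunk_size=2):
    """Run the mappers, group their output by key, then run the reducers."""
    grouped = defaultdict(list)
    for chunk in split_into_chunks(lines, chunk_size):
        for key, value in mapper(chunk):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in grouped.items())


# Hypothetical input records, used only for illustration.
records = ["big data", "big security", "data privacy"]
counts = map_reduce(records)
```

Note that the reducer simply combines whatever the mappers emit. It has no way to tell a correct pair from a wrong one, which is why the trustworthiness of the mappers matters for the final result.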
3.1 Use Cases
Untrusted mappers could return wrong results, which will in turn generate incorrect aggregate results. With
large data sets, such errors are next to impossible to identify, and they can cause significant damage, especially in
scientific and financial computations.
Retailer consumer data is often analyzed by marketing agencies for targeted advertising or customer-
segmenting. These tasks involve highly parallel computations over large data sets, and are particularly suited for
MapReduce frameworks such as Hadoop. However, the data mappers may contain intentional or unintentional
leakages. For example, a mapper may emit a unique value by analyzing a private record, undermining
users’ privacy.
4.0 Security Best Practices for Non-Relational Data Stores
Non-relational data stores popularized by NoSQL databases are still evolving with respect to security
infrastructure. For instance, robust solutions to NoSQL injection are still not mature. Each NoSQL database was
built to tackle different challenges posed by the analytics world, and hence security was never part of the model
at any point of its design stage. Developers using NoSQL databases usually embed security in the middleware.
NoSQL databases do not provide any support for enforcing it explicitly in the database. However, the clustering
aspect of NoSQL databases poses additional challenges to the robustness of such security practices.
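As one illustration of security embedded in the middleware, the sketch below rejects user input that tries to smuggle query operators into a document-store filter. It assumes a store whose query operators are dictionary keys beginning with "$" (as in MongoDB-style query documents); the function names is_safe_filter and build_equality_filter are hypothetical, and the check is a minimal sketch rather than a complete defense against NoSQL injection.

```python
def is_safe_filter(value):
    """Return True if no key anywhere in a nested filter starts with '$'."""
    if isinstance(value, dict):
        return all(
            isinstance(key, str)
            and not key.startswith("$")
            and is_safe_filter(inner)
            for key, inner in value.items()
        )
    if isinstance(value, (list, tuple)):
        return all(is_safe_filter(item) for item in value)
    return True


def build_equality_filter(field, user_value):
    """Build an equality filter, accepting only scalar user input.

    If the middleware forwarded a decoded JSON object as-is, a client could
    send {"$ne": None} instead of a string and turn an equality test into
    "match anything".
    """
    if not isinstance(user_value, (str, int, float, bool)):
        raise ValueError("user input must be a scalar value")
    return {field: user_value}


# Hypothetical inputs, used only for illustration.
benign = build_equality_filter("user", "alice")
try:
    build_equality_filter("user", {"$ne": None})
    rejected = False
except ValueError:
    rejected = True
```

Because this check lives outside the database, every code path that reaches the store has to go through it, which is exactly the robustness concern raised above.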
4.1 Use Cases
Companies dealing with big unstructured data sets may benefit by migrating from a traditional relational
database to a NoSQL database in terms of accommodating/processing huge volumes of data. In general, the
security philosophy of NoSQL databases relies on external enforcement mechanisms. To reduce security incidents,
the company must review security policies for the middleware, adding items to its engine, and at the same time
toughen the NoSQL database itself to match its relational counterparts without compromising its operational features.
5.0 Secure Data Storage and Transactions Logs
Data and transaction logs are stored in multi-tiered storage media. Manually moving data between tiers gives
the IT manager direct control over exactly what data is moved and when. However, as the size of data sets has
been, and continues to be, growing exponentially, scalability and availability have necessitated auto-tiering for
big data storage management. Auto-tiering solutions do not keep track of where the data is stored, which poses
new challenges to secure data storage. New mechanisms are imperative to thwart unauthorized access and
maintain 24/7 availability.
5.1 Use Cases
A manufacturer wants to integrate data from different divisions. Some of this data is rarely retrieved, while
some divisions constantly utilize the same data pools. An auto-tier storage system will save the manufacturer
money by pulling the rarely utilized data to a lower (and cheaper) tier. However, this data may consist of R&D
results, which are not popular but contain critical information. As lower tiers often provide decreased security, the
company should carefully study its tiering strategies.
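One way to think about such a tiering strategy is as a placement rule that looks at sensitivity before access frequency. The sketch below is an assumption made for illustration only: the two tier names, the DataSet fields, and the threshold value are hypothetical and do not describe any particular auto-tiering product.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    accesses_per_month: int
    sensitive: bool  # e.g. R&D results, flagged by the data owner


def choose_tier(data_set, hot_threshold=100):
    """Pick a storage tier for one data set.

    A purely frequency-based policy would demote rarely read data.
    Here, sensitive data is pinned to the upper tier regardless of
    how often it is accessed.
    """
    if data_set.sensitive:
        return "upper"
    if data_set.accesses_per_month >= hot_threshold:
        return "upper"
    return "lower"


# Hypothetical data sets, used only for illustration.
rd_results = DataSet("rd_results", accesses_per_month=1, sensitive=True)
old_logs = DataSet("old_logs", accesses_per_month=1, sensitive=False)
sales_pool = DataSet("sales_pool", accesses_per_month=5000, sensitive=False)
placement = {d.name: choose_tier(d) for d in (rd_results, old_logs, sales_pool)}
```

The trade-off is cost: every data set pinned to the upper tier gives up the savings that motivated auto-tiering in the first place.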
6.0 End-Point Input Validation/Filtering
Many big data use cases in enterprise settings require data collection from many sources, such as end-point
devices. For example, a security information and event management system (SIEM) may collect event logs from
millions of hardware devices and software applications in an enterprise network. A key challenge in the data
collection process is input validation: how can we trust the data? How can we validate that a source of input
data is not malicious and how can we filter malicious input from our collection? Input validation and filtering is a
daunting challenge posed by untrusted input sources, especially with the bring your own device (BYOD) model.
6.1 Use Cases
Both data retrieved from weather sensors and feedback votes sent by an iPhone application share a similar
validation problem. A motivated adversary may be able to create “rogue” virtual sensors, or spoof iPhone IDs to
rig the results. This is further complicated by the amount of data collected, which may exceed millions of
readings/votes. To perform these tasks effectively, algorithms need to be created to validate the input for large
data sets.
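As a small illustration of input validation on sensor data, the sketch below drops readings that lie far from the median of a batch, measured in units of the median absolute deviation (MAD). This is one simple robust-statistics filter chosen for the example, not a method prescribed by this paper; the function name filter_readings and the cutoff of five deviations are assumptions.

```python
from statistics import median


def filter_readings(readings, max_deviations=5.0):
    """Keep readings within max_deviations MADs of the batch median."""
    center = median(readings)
    mad = median(abs(r - center) for r in readings)
    if mad == 0:
        # More than half of the readings are identical, so the spread
        # cannot be estimated; return the batch unchanged.
        return list(readings)
    limit = max_deviations * mad
    return [r for r in readings if abs(r - center) <= limit]


# Hypothetical temperature readings; two of them are implausible.
batch = [20.1, 19.8, 20.3, 20.0, 95.0, 19.9, -40.0]
accepted = filter_readings(batch)
```

A filter like this only helps while honest sources are the majority. An adversary who controls enough rogue sensors can shift the median itself, so source authentication is still needed alongside statistical checks.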
7.0 Real-time Security/Compliance Monitoring
Real-time security monitoring has always been a challenge, given the number of alerts generated by (security)
devices. These alerts (correlated or not) lead to many false positives, which are mostly ignored or simply
“clicked away,” as humans cannot cope with the sheer amount. This problem might even increase with big data,
given the volume and velocity of data streams. However, big data technologies might also provide an
opportunity, in the sense that these technologies do allow for fast processing and analytics of different types of
data, which in turn can be used to provide, for instance, real-time anomaly detection based on scalable
security analytics.
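To make the idea of real-time anomaly detection concrete, the sketch below scores each event as it arrives, using a running mean and variance (Welford's online algorithm) so that no history has to be stored. The class name, the threshold of three standard deviations, and the warm-up length are assumptions for illustration; production security analytics would be considerably more elaborate.

```python
import math


class StreamingAnomalyDetector:
    """Flag values that deviate strongly from the running mean."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # allowed distance in standard deviations
        self.warmup = warmup        # observations needed before flagging
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def observe(self, x):
        """Return True if x looks anomalous, then fold x into the statistics."""
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) > self.threshold * std:
                is_anomaly = True
        # Welford's update: constant memory and constant time per event.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_anomaly


# Hypothetical stream, e.g. records retrieved per minute by one account.
stream = [10, 12, 11, 9, 10, 11, 10, 250, 11]
detector = StreamingAnomalyDetector()
flags = [detector.observe(value) for value in stream]
```

In this sketch a flagged value is still folded into the statistics, which inflates the variance afterwards. Whether to exclude flagged values from the update is a design choice that depends on how much the detector should adapt to a changed baseline.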
7.1 Use Cases
Most industries and government (agencies) will benefit from real-time security analytics, although the use cases
may differ. There are use cases which are common, like, “Who is accessing which data from which resource at
what time”; “Are we under attack?” or “Do we have a breach of compliance standard C because of action A?”
These are not really new, but the difference is that we have more data at our disposal to make faster and better
decisions (e.g., fewer false positives) in that regard. However, new use cases can be defined or we can redefine
existing use cases in light of big data. For example, the health industry largely benefits from big data
technologies, potentially saving taxpayers billions, becoming more accurate with the payment of claims
and reducing the fraud related to claims. However, at the same time, the records stored may be extremely
sensitive and have to be compliant with HIPAA or regional/local regulations, which call for careful protection of
that same data. Detecting in real-time the anomalous retrieval of personal information, intentional or
unintentional, allows the health care provider to repair the damage in a timely manner and to prevent further
misuse.
8.0 Scalable and Composable Privacy-Preserving Data Mining and Analytics
Big data can be seen as a troubling manifestation of Big Brother by potentially enabling invasions of privacy,
invasive marketing, decreased civil freedoms, and increased state and corporate control.
A recent analysis of how companies are leveraging data analytics for marketing purposes identified an example
of how a retailer was able to identify that a teenager was pregnant before her father knew. Similarly,
anonymizing data for analytics is not enough to maintain user privacy. For example, AOL released anonymized
search logs for academic purposes, but users were easily identified by their searches. Netflix faced a similar
problem when users of their anonymized data set were identified by correlating their Netflix movie scores with
IMDB scores.
Therefore, it is important to establish guidelines and recommendations for preventing inadvertent privacy
disclosures.
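The re-identification problem in the movie-rating example can be illustrated with a toy linkage attack: records in an "anonymized" data set are matched against a public data set by counting identical (movie, score) pairs. All names, identifiers, and ratings below are invented for the sketch, and the matching rule (at least three identical pairs and exactly one candidate) is an assumption rather than the method used in any real study.

```python
def reidentify(anonymized, public, min_matches=3):
    """Link anonymized rating records to public profiles.

    Both arguments map an identifier to a {movie: score} dictionary.
    A link is reported only when exactly one public profile shares at
    least min_matches identical (movie, score) pairs with the record.
    """
    links = {}
    for anon_id, anon_ratings in anonymized.items():
        candidates = []
        for name, public_ratings in public.items():
            overlap = sum(
                1
                for movie, score in public_ratings.items()
                if anon_ratings.get(movie) == score
            )
            if overlap >= min_matches:
                candidates.append(name)
        if len(candidates) == 1:
            links[anon_id] = candidates[0]
    return links


# Invented data: the anonymized set has no names, only opaque IDs.
anonymized = {
    "user_0417": {"Movie A": 5, "Movie B": 2, "Movie C": 4, "Movie D": 1},
    "user_0932": {"Movie A": 3, "Movie C": 4, "Movie E": 5},
}
public = {
    "alice": {"Movie A": 5, "Movie B": 2, "Movie D": 1},
    "bob": {"Movie A": 3, "Movie E": 4},
}
links = reidentify(anonymized, public)
```

Once a record is linked, every other rating in it (here, the rating for "Movie C") becomes attributable to a named person, even though that rating never appeared in the public profile.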
8.1 Use Cases
User data collected by companies and government agencies are constantly mined and analyzed by inside
analysts and also potentially outside contractors or business partners. A malicious insider or untrusted partner
can abuse these datasets and extract private information from customers.
Similarly, intelligence agencies require the collection of vast amounts of data. The data sources are numerous
and may include chat-rooms, personal blogs and network routers. Most collected data is, however, innocent in
nature and need not be retained, and the anonymity of its owners should be preserved.
Robust and scalable privacy preserving mining algorithms will increase the chances of collecting relevant
information to increase user safety.
9.0 Cryptographically Enforced Access Control and Secure Communication
To ensure that the most sensitive private data is end-to-end secure and only accessible to the authorized
entities, data has to be encrypted based on access control policies. Specific research in this area such as
attribute-based encryption (ABE) has to be made richer, more efficient, and scalable. To ensure authentication,
agreement and fairness among the distributed entities, a cryptographically secure communication framework
has to be implemented.
9.1 Use Cases
Sensitive data is routinely stored unencrypted in the cloud. The main problem with encrypting data, especially large
data sets, is the all-or-nothing retrieval policy of encrypted data, which prevents users from easily performing fine-grained
actions such as sharing records or searches. ABE alleviates this problem by utilizing a public key cryptosystem
where attributes related to the data encrypted serve to unlock the keys. On the other hand, we have
unencrypted less sensitive data as well, such as data useful for analytics. Such data has to be communicated in a
secure and agreed-upon way using a cryptographically secure communication framework.
10.0 Granular Access Control
The security property that matters from the perspective of access control is secrecy—preventing access to data
by people who should not have access. The problem with coarse-grained access mechanisms is that data that
could otherwise be shared is often swept into a more restrictive category to guarantee sound security. Granular
access control gives data managers a scalpel instead of a sword to share data as much as possible without
compromising secrecy.
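A minimal sketch of the "scalpel" idea is to attach the access requirement to each individual cell instead of to a whole table or file. The Cell type, the label names, and the rule that a reader must hold every required label are assumptions made for this illustration; real systems typically support richer label expressions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: str
    column: str
    value: str
    required: frozenset  # labels a reader must hold to see this cell


def visible_cells(cells, reader_labels):
    """Return only the cells whose required labels the reader holds."""
    held = frozenset(reader_labels)
    return [cell for cell in cells if cell.required <= held]


# Hypothetical patient record with a different requirement per cell.
cells = [
    Cell("patient-1", "zip_code", "13902", frozenset()),
    Cell("patient-1", "name", "J. Doe", frozenset({"pii"})),
    Cell("patient-1", "diagnosis", "asthma", frozenset({"pii", "medical"})),
]
analyst_view = [c.column for c in visible_cells(cells, {"medical"})]
doctor_view = [c.column for c in visible_cells(cells, {"pii", "medical"})]
```

With a coarse-grained mechanism the whole record would take on the strictest requirement, and the analyst would lose access to the zip code as well.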
10.1 Use Cases
Big data analysis and cloud computing are increasingly focused on handling diverse data sets, both in terms of
variety of schemas and variety of security requirements. Legal and policy restrictions on data come from
numerous sources. The Sarbanes-Oxley Act levies requirements to protect corporate financial information, and
the Health Insurance Portability and Accountability Act includes numerous restrictions on sharing personal
health records. Executive Order 13526 outlines an elaborate system of protecting national security information.
Privacy policies, sharing agreements, and corporate policy also impose requirements on data handling.
Managing this plethora of restrictions has so far resulted in increased costs for developing applications and a
walled garden approach in which few people can participate in the analysis. Granular access control is necessary
for analytical systems to adapt to this increasingly complex security environment.
11.0 Granular Audits
With real-time security monitoring (see section 7.0), we try to be notified at the moment an attack takes place.
In reality, this will not always be the case (e.g., new attacks, missed true positives). In order to get to the bottom
of a missed attack, we need audit information. This is not only relevant because we want to understand what
happened and what went wrong, but also for compliance, regulatory, and forensic reasons. In that regard,
auditing is not something new, but the scope and granularity might be different. For example, we have to deal
with more data objects, which probably are (but not necessarily) distributed.
11.1 Use Cases
Compliance requirements (e.g., HIPAA, PCI, Sarbanes-Oxley) require financial firms to provide granular auditing
records. Additionally, the loss of records containing private information is estimated at $200/record. Legal
action – depending on the geographic region – might follow in case of a data breach. Key personnel at financial
institutions require access to large data sets containing PI, such as SSN. Marketing firms want access, for
instance, to personal social media information to optimize their customer-centric approach regarding online ads.
12.0 Data Provenance
Provenance metadata will grow in complexity due to large provenance graphs generated from provenance-
enabled programming environments in big data applications. Analysis of such large provenance graphs to detect
metadata dependencies for security/confidentiality applications is computationally intensive.
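Detecting dependencies in a provenance graph amounts to a graph traversal. The sketch below stores provenance as a mapping from each record to the records it was derived from and walks that mapping breadth-first to collect everything a given record depends on. The function name, the dictionary representation, and the record names are assumptions for illustration.

```python
from collections import deque


def upstream_sources(derived_from, record):
    """Return every record that the given record depends on.

    derived_from maps a record name to the names of the records it was
    computed from (its direct parents in the provenance graph).
    """
    seen = set()
    queue = deque([record])
    while queue:
        current = queue.popleft()
        for parent in derived_from.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical provenance graph for a financial report.
derived_from = {
    "report": ["summary", "prices"],
    "summary": ["trades"],
    "prices": ["feed"],
    "trades": ["feed"],
}
sources = upstream_sources(derived_from, "report")
```

A single query like this visits each node and edge at most once, but provenance graphs in big data applications can be very large, and answering many such queries quickly is where the computational cost arises.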
12.1 Use Cases
Several key security applications require the history of a digital record – such as details about its creation.
Examples include detecting insider trading for financial companies or determining the accuracy of the data
source for research investigations. These security assessments are time sensitive in nature, and require fast
algorithms to handle the provenance metadata containing this information. In addition, data provenance
complements audit logs for compliance requirements, such as PCI or Sarbanes-Oxley.
13.0 Conclusion
Big data is here to stay. It is practically impossible to imagine the next application without it consuming data,
producing new forms of data, and containing data-driven algorithms. As compute environments become
cheaper, application environments become networked, and system and analytics environments become shared
over the cloud, security, access control, compression, encryption, and compliance introduce challenges that
have to be addressed in a systematic way. The Cloud Security Alliance (CSA) Big Data Working Group
(BDWG) recognizes these challenges and has a mission for addressing these in a standardized and systematic
way.
In this paper, we have highlighted the top ten security and privacy problems that need to be addressed for
making big data processing and computing infrastructure more secure. Some common elements in this list of
top ten issues that are specific to big data arise from the use of multiple infrastructure tiers (both storage and
computing) for processing big data, the use of new compute infrastructures such as NoSQL databases (for fast
throughput necessitated by big data volumes) that have not been thoroughly vetted for security issues, the non-
scalability of encryption for large data sets, non-scalability of real-time monitoring techniques that might be
practical for smaller volumes of data, the heterogeneity of devices that produce the data, and confusion with
the plethora of diverse legal and policy restrictions that leads to ad hoc approaches for ensuring security and
privacy. Many of the items in the list of top ten challenges also serve to clarify specific aspects of the attack
surface of the entire big data processing infrastructure that should be analyzed for these types of threats. We
plan to use OpenMobius, an open-source, large scale, distributed data processing, analytics, and tools platform
from eBay Research Labs as an experimental test bed.
Our hope is that this paper will spur action in the research and development community to collaboratively
increase focus on the top ten challenges, leading to greater security and privacy in big data platforms.

More Related Content

What's hot

Addressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop WayAddressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop Way
Xoriant Corporation
Big data - what, why, where, when and how
Big data - what, why, where, when and howBig data - what, why, where, when and how
Big data - what, why, where, when and how
Big Data-Survey
Big Data-SurveyBig Data-Survey
Big Data-Survey
5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance 5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance
The book of elephant tattoo
The book of elephant tattooThe book of elephant tattoo
The book of elephant tattoo
Mohamed Magdy
Big Data and Health Care
Big Data and Health CareBig Data and Health Care
Big Data and Health Care
Jeffrey Funk
Data lake ppt
Data lake pptData lake ppt
Data lake ppt
Big data storage
Big data storageBig data storage
Big data storage
Vikram Nandini
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
Motivation for big data
Motivation for big dataMotivation for big data
Motivation for big data
Arockiaraj Durairaj
Big Data
Big DataBig Data
Big Data
Kirubaburi R
Decision trees in hadoop
Decision trees in hadoopDecision trees in hadoop
Decision trees in hadoop
Revolution Analytics
Big data ppt
Big data pptBig data ppt
Big data ppt
Yash Raj
Lesson 1 introduction to_big_data_and_hadoop.pptx
Lesson 1 introduction to_big_data_and_hadoop.pptxLesson 1 introduction to_big_data_and_hadoop.pptx
Lesson 1 introduction to_big_data_and_hadoop.pptx
Big data analysis
Big data analysisBig data analysis
Big data analysis
Big Data Mining, Techniques, Handling Technologies and Some Related Issues: A...
Big Data Mining, Techniques, Handling Technologies and Some Related Issues: A...Big Data Mining, Techniques, Handling Technologies and Some Related Issues: A...
Big Data Mining, Techniques, Handling Technologies and Some Related Issues: A...
Hadoop(Term Paper)
Hadoop(Term Paper)Hadoop(Term Paper)
Hadoop(Term Paper)
Dux Chandegra
Better Architecture for Data: Adaptable, Scalable, and Smart
Better Architecture for Data: Adaptable, Scalable, and SmartBetter Architecture for Data: Adaptable, Scalable, and Smart
Better Architecture for Data: Adaptable, Scalable, and Smart
Paul Boal
Big data issues and challenges
Big data issues and challengesBig data issues and challenges
Big data issues and challenges
Dilpreet kaur Virk

What's hot (20)

Addressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop WayAddressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop Way
Big data - what, why, where, when and how
Big data - what, why, where, when and howBig data - what, why, where, when and how
Big data - what, why, where, when and how
Big Data-Survey
Big Data-SurveyBig Data-Survey
Big Data-Survey
5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance 5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance
The book of elephant tattoo
The book of elephant tattooThe book of elephant tattoo
The book of elephant tattoo
Big Data and Health Care
Big Data and Health CareBig Data and Health Care
Big Data and Health Care
Data lake ppt
Data lake pptData lake ppt
Data lake ppt
Big data storage
Big data storageBig data storage
Big data storage
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
Motivation for big data
Motivation for big dataMotivation for big data
Motivation for big data
Big Data
Big DataBig Data
Big Data
Decision trees in hadoop
Decision trees in hadoopDecision trees in hadoop
Decision trees in hadoop
Big data ppt
Big data pptBig data ppt
Big data ppt
Lesson 1 introduction to_big_data_and_hadoop.pptx
Lesson 1 introduction to_big_data_and_hadoop.pptxLesson 1 introduction to_big_data_and_hadoop.pptx
Lesson 1 introduction to_big_data_and_hadoop.pptx
Big data analysis
Big data analysisBig data analysis
Big data analysis
Big Data Mining, Techniques, Handling Technologies and Some Related Issues: A...
Big Data Mining, Techniques, Handling Technologies and Some Related Issues: A...Big Data Mining, Techniques, Handling Technologies and Some Related Issues: A...
Big Data Mining, Techniques, Handling Technologies and Some Related Issues: A...
Hadoop(Term Paper)
Hadoop(Term Paper)Hadoop(Term Paper)
Hadoop(Term Paper)
Better Architecture for Data: Adaptable, Scalable, and Smart
Better Architecture for Data: Adaptable, Scalable, and SmartBetter Architecture for Data: Adaptable, Scalable, and Smart
Better Architecture for Data: Adaptable, Scalable, and Smart
Big data issues and challenges
Big data issues and challengesBig data issues and challenges
Big data issues and challenges

Viewers also liked

7 Characteristics of a Bad (Big) Data Platform
7 Characteristics of a Bad (Big) Data Platform7 Characteristics of a Bad (Big) Data Platform
7 Characteristics of a Bad (Big) Data Platform
Harshal Deo (HD)
Providing Proofs of Past Data Possession in Cloud Forensics
Providing Proofs of Past Data Possession in Cloud Forensics Providing Proofs of Past Data Possession in Cloud Forensics
Big data analysis concepts and references by Cloud Security Alliance

  • 1. Top Ten Big Data Security and Privacy Challenges November 2012
  • 2. © Copyright 2012, Cloud Security Alliance. All rights reserved. CLOUD SECURITY ALLIANCE Top Ten Big Data Security and Privacy Challenges
      © 2012 Cloud Security Alliance. All rights reserved. You may download, store, display on your computer, view, print, and link to the Cloud Security Alliance Security as a Service Implementation Guidance at, subject to the following: (a) the Guidance may be used solely for your personal, informational, non-commercial use; (b) the Guidance may not be modified or altered in any way; (c) the Guidance may not be redistributed; and (d) the trademark, copyright or other notices may not be removed. You may quote portions of the Guidance as permitted by the Fair Use provisions of the United States Copyright Act, provided that you attribute the portions to the Cloud Security Alliance Security as a Service Implementation Guidance Version 1.0 (2012).
  • 3. Contents
      Acknowledgments (p. 4)
      1.0 Abstract (p. 5)
      2.0 Introduction (p. 5)
      3.0 Secure Computations in Distributed Programming Frameworks (p. 6)
        3.1 Use Cases (p. 6)
      4.0 Security Best Practices for Non-Relational Data Stores (p. 6)
        4.1 Use Cases (p. 6)
      5.0 Secure Data Storage and Transactions Logs (p. 7)
        5.1 Use Cases (p. 7)
      6.0 End-Point Input Validation/Filtering (p. 7)
        6.1 Use Cases (p. 7)
      7.0 Real-time Security/Compliance Monitoring (p. 7)
        7.1 Use Cases (p. 8)
      8.0 Scalable and Composable Privacy-Preserving Data Mining and Analytics (p. 8)
        8.1 Use Cases (p. 8)
      9.0 Cryptographically Enforced Access Control and Secure Communication (p. 9)
        9.1 Use Cases (p. 9)
      10.0 Granular Access Control (p. 9)
        10.1 Use Cases (p. 9)
      11.0 Granular Audits (p. 10)
        11.1 Use Cases (p. 10)
      12.0 Data Provenance (p. 10)
        12.1 Use Cases (p. 10)
      13.0 Conclusion (p. 11)
  • 4. Acknowledgments. CSA Big Data Working Group Co-Chairs: Lead: Sreeranga Rajan, Fujitsu; Co-Chair: Wilco van Ginkel, Verizon; Co-Chair: Neel Sundaresan, eBay. Contributors: Alvaro Cardenas Mora, Fujitsu; Yu Chen, SUNY Binghamton; Adam Fuchs, Sqrrl; Adrian Lane, Securosis; Rongxing Lu, University of Waterloo; Pratyusa Manadhata, HP Labs; Jesus Molina, Fujitsu; Praveen Murthy, Fujitsu; Arnab Roy, Fujitsu; Shiju Sathyadevan, Amrita University. CSA Global Staff: Aaron Alva, Graduate Research Intern; Luciano JR Santos, Research Director; Evan Scoboria, Webmaster; Kendall Scoboria, Graphic Designer; John Yeoh, Research Analyst.
  • 5. 1.0 Abstract Security and privacy issues are magnified by the velocity, volume, and variety of big data, along with factors such as large-scale cloud infrastructures, diversity of data sources and formats, the streaming nature of data acquisition, and high-volume inter-cloud migration. Therefore, traditional security mechanisms, which are tailored to securing small-scale static (as opposed to streaming) data, are inadequate. In this paper, we highlight the top ten big data-specific security and privacy challenges. We expect that highlighting these challenges will bring renewed focus on fortifying big data infrastructures. 2.0 Introduction The term big data refers to the massive amounts of digital information companies and governments collect about us and our surroundings. Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data in the world today has been created in the last two years alone. Security and privacy issues are magnified by the velocity, volume, and variety of big data, along with factors such as large-scale cloud infrastructures, diversity of data sources and formats, the streaming nature of data acquisition, and high-volume inter-cloud migration. The use of large-scale cloud infrastructures, with a diversity of software platforms spread across large networks of computers, also increases the attack surface of the entire system. Traditional security mechanisms, which are tailored to securing small-scale static (as opposed to streaming) data, are inadequate. For example, analytics for anomaly detection would generate too many outliers. Similarly, it is not clear how to retrofit provenance in existing cloud infrastructures. Streaming data demands ultra-fast response times from security and privacy solutions. In this paper, we highlight the top ten big data-specific security and privacy challenges. 
We interviewed Cloud Security Alliance members and surveyed security practitioner-oriented trade journals to draft an initial list of high-priority security and privacy problems, studied published research, and arrived at the following top ten challenges: 1. Secure computations in distributed programming frameworks 2. Security best practices for non-relational data stores 3. Secure data storage and transactions logs 4. End-point input validation/filtering 5. Real-time security/compliance monitoring 6. Scalable and composable privacy-preserving data mining and analytics 7. Cryptographically enforced access control and secure communication 8. Granular access control 9. Granular audits 10. Data provenance In the rest of the paper, we provide brief descriptions and narrate use cases.
  • 6. 3.0 Secure Computations in Distributed Programming Frameworks Distributed programming frameworks utilize parallelism in computation and storage to process massive amounts of data. A popular example is the MapReduce framework, which splits an input file into multiple chunks. In the first phase of MapReduce, a Mapper for each chunk reads the data, performs some computation, and outputs a list of key/value pairs. In the next phase, a Reducer combines the values belonging to each distinct key and outputs the result. There are two major attack prevention measures: securing the mappers and securing the data in the presence of an untrusted mapper. 3.1 Use Cases Untrusted mappers could return wrong results, which will in turn generate incorrect aggregate results. With large data sets, such errors are next to impossible to identify, and they can cause significant damage, especially in scientific and financial computations. Retailer consumer data is often analyzed by marketing agencies for targeted advertising or customer segmentation. These tasks involve highly parallel computations over large data sets, and are particularly suited for MapReduce frameworks such as Hadoop. However, the data mappers may contain intentional or unintentional leakages. For example, a mapper may emit a unique value by analyzing a private record, undermining users’ privacy. 4.0 Security Best Practices for Non-Relational Data Stores Non-relational data stores popularized by NoSQL databases are still evolving with respect to security infrastructure. For instance, robust solutions to NoSQL injection are still not mature. Each NoSQL database was built to tackle a different challenge posed by the analytics world, and security was never part of the model at any point in its design stage. Developers using NoSQL databases usually embed security in the middleware. 
NoSQL databases do not provide any support for enforcing security explicitly in the database. Moreover, the clustering aspect of NoSQL databases poses additional challenges to the robustness of such security practices. 4.1 Use Cases Companies dealing with big unstructured data sets may benefit from migrating from a traditional relational database to a NoSQL database, which can better accommodate and process huge volumes of data. In general, the security philosophy of NoSQL databases relies on external enforcement mechanisms. To reduce security incidents, the company must review the security policies of the middleware, adding items to its engine, and at the same time harden the NoSQL database itself to match its relational counterparts without compromising its operational features.
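To make the two MapReduce phases described in section 3.0 concrete, the following is a minimal single-process sketch in Python. It is not Hadoop code: the function names (`mapper`, `reducer`, `map_reduce`, `untrusted_mapper`) and the word-count task are assumptions chosen for illustration. It shows how a mapper that returns wrong key/value pairs silently corrupts the aggregate result, which is the concern raised in section 3.1.

```python
from collections import defaultdict
from typing import Callable, Iterable

Pair = tuple[str, int]


def mapper(chunk: str) -> list[Pair]:
    """Honest mapper: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def untrusted_mapper(chunk: str) -> list[Pair]:
    """Hypothetical faulty mapper: emits inflated counts for every word."""
    return [(word.lower(), 2) for word in chunk.split()]


def reducer(key: str, values: Iterable[int]) -> Pair:
    """Combine all values that belong to one distinct key."""
    return key, sum(values)


def map_reduce(
    chunks: Iterable[str],
    map_fn: Callable[[str], list[Pair]] = mapper,
    reduce_fn: Callable[[str, Iterable[int]], Pair] = reducer,
) -> dict[str, int]:
    # Map phase: run the mapper on each chunk and group values by key.
    grouped: dict[str, list[int]] = defaultdict(list)
    for chunk in chunks:
        for key, value in map_fn(chunk):
            grouped[key].append(value)
    # Reduce phase: one reducer call per distinct key.
    return dict(reduce_fn(key, values) for key, values in grouped.items())


chunks = ["big data big", "data security"]
honest = map_reduce(chunks)
tampered = map_reduce(chunks, map_fn=untrusted_mapper)
```

Here `honest` and `tampered` have the same keys and look equally plausible; nothing in the reducer output reveals that one mapper misbehaved, which is why detection is hard on large data sets.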
  • 7. 5.0 Secure Data Storage and Transactions Logs Data and transaction logs are stored in multi-tiered storage media. Manually moving data between tiers gives the IT manager direct control over exactly what data is moved and when. However, as data sets have grown, and continue to grow, exponentially, scalability and availability have necessitated auto-tiering for big data storage management. Auto-tiering solutions do not keep track of where the data is stored, which poses new challenges to secure data storage. New mechanisms are imperative to thwart unauthorized access and maintain 24/7 availability. 5.1 Use Cases A manufacturer wants to integrate data from different divisions. Some of this data is rarely retrieved, while some divisions constantly utilize the same data pools. An auto-tier storage system will save the manufacturer money by moving the rarely utilized data to a lower (and cheaper) tier. However, this data may consist of R&D results that are rarely accessed but contain critical information. Because lower tiers often provide weaker security, the company should study its tiering strategies carefully. 6.0 End-Point Input Validation/Filtering Many big data use cases in enterprise settings require data collection from many sources, such as end-point devices. For example, a security information and event management (SIEM) system may collect event logs from millions of hardware devices and software applications in an enterprise network. A key challenge in the data collection process is input validation: how can we trust the data? How can we validate that a source of input data is not malicious, and how can we filter malicious input from our collection? Input validation and filtering is a daunting challenge posed by untrusted input sources, especially with the bring your own device (BYOD) model. 
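As a rough illustration of the filtering question raised in section 6.0, the sketch below applies three simple checks to incoming readings: the source must be a registered device, the value must fall in a plausible range, and a device may report only once per timestamp. The `Reading` type, the `filter_readings` function, and the thresholds are all hypothetical. Checks like these do not defeat an adversary who spoofs a legitimate device ID; they only remove the most obvious bad input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    device_id: str
    timestamp: int   # seconds since some epoch (assumed format)
    value: float     # e.g. a temperature in degrees Celsius


def filter_readings(
    readings: list[Reading],
    known_devices: set[str],
    low: float,
    high: float,
) -> list[Reading]:
    """Keep readings from registered devices, within range, without duplicates."""
    seen: set[tuple[str, int]] = set()
    accepted: list[Reading] = []
    for reading in readings:
        key = (reading.device_id, reading.timestamp)
        if reading.device_id not in known_devices:
            continue  # unregistered ("rogue") source
        if not (low <= reading.value <= high):
            continue  # physically implausible value
        if key in seen:
            continue  # same device reporting twice for one timestamp
        seen.add(key)
        accepted.append(reading)
    return accepted


readings = [
    Reading("sensor-1", 100, 21.5),
    Reading("sensor-1", 100, 22.0),   # duplicate (device, timestamp)
    Reading("sensor-2", 100, 950.0),  # out of range
    Reading("rogue-9", 100, 20.0),    # unknown device
    Reading("sensor-2", 101, 19.0),
]
accepted = filter_readings(readings, {"sensor-1", "sensor-2"}, -50.0, 60.0)
```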
6.1 Use Cases Both data retrieved from weather sensors and feedback votes sent by an iPhone application share a similar validation problem. A motivated adversary may be able to create “rogue” virtual sensors, or spoof iPhone IDs to rig the results. This is further complicated by the amount of data collected, which may exceed millions of readings/votes. To perform these tasks effectively, algorithms need to be created to validate the input for large data sets. 7.0 Real-time Security/Compliance Monitoring Real-time security monitoring has always been a challenge, given the number of alerts generated by (security) devices. These alerts (correlated or not) lead to many false positives, which are mostly ignored or simply “clicked away,” as humans cannot cope with the sheer amount. This problem might even increase with big data,
  • 8. given the volume and velocity of data streams. However, big data technologies might also provide an opportunity, in the sense that these technologies allow for fast processing and analytics of different types of data. These capabilities can in turn be used to provide, for instance, real-time anomaly detection based on scalable security analytics. 7.1 Use Cases Most industries and government agencies will benefit from real-time security analytics, although the use cases may differ. Some use cases are common, such as “Who is accessing which data from which resource at what time?”; “Are we under attack?”; or “Do we have a breach of compliance standard C because of action A?” These are not really new, but the difference is that we have more data at our disposal to make faster and better decisions (e.g., fewer false positives) in that regard. However, new use cases can be defined, or we can redefine existing use cases in light of big data. For example, the health industry benefits greatly from big data technologies, potentially saving taxpayers billions by paying claims more accurately and reducing claim-related fraud. However, at the same time, the records stored may be extremely sensitive and have to be compliant with HIPAA or regional/local regulations, which call for careful protection of that same data. Detecting in real time the anomalous retrieval of personal information, intentional or unintentional, allows the health care provider to repair the damage in a timely manner and to prevent further misuse. 8.0 Scalable and Composable Privacy-Preserving Data Mining and Analytics Big data can be seen as a troubling manifestation of Big Brother by potentially enabling invasions of privacy, invasive marketing, decreased civil freedoms, and increased state and corporate control. 
A recent analysis of how companies are leveraging data analytics for marketing purposes identified an example of how a retailer was able to identify that a teenager was pregnant before her father knew. Similarly, anonymizing data for analytics is not enough to maintain user privacy. For example, AOL released anonymized search logs for academic purposes, but users were easily identified by their searches. Netflix faced a similar problem when users of their anonymized data set were identified by correlating their Netflix movie scores with IMDb scores. Therefore, it is important to establish guidelines and recommendations for preventing inadvertent privacy disclosures. 8.1 Use Cases User data collected by companies and government agencies is constantly mined and analyzed by inside analysts and also potentially by outside contractors or business partners. A malicious insider or untrusted partner can abuse these data sets and extract private information about customers.
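The re-identification risk described in section 8.0, where an anonymized data set is correlated with a public one, can be sketched as a simple linkage check. The code below is a toy illustration only: the `link_records` function, the `min_matches` threshold, and all records are made up, and real correlation attacks use more tolerant matching than exact score equality.

```python
def link_records(
    anonymized: dict[str, dict[str, int]],
    public: dict[str, dict[str, int]],
    min_matches: int = 2,
) -> dict[str, list[str]]:
    """Link pseudonymous users to named users who gave identical scores.

    Both inputs map a user identifier to {item: score}. A pseudonym is linked
    to a public name when at least `min_matches` items carry the same score.
    """
    matches: dict[str, list[str]] = {}
    for anon_id, ratings in anonymized.items():
        for name, public_ratings in public.items():
            shared = sum(
                1
                for item, score in ratings.items()
                if public_ratings.get(item) == score
            )
            if shared >= min_matches:
                matches.setdefault(anon_id, []).append(name)
    return matches


# Made-up example data: pseudonymous ratings and a public review site.
anonymized = {
    "u1": {"film_a": 5, "film_b": 2, "film_c": 4},
    "u2": {"film_a": 3, "film_d": 1},
}
public = {
    "alice": {"film_a": 5, "film_c": 4},
    "bob": {"film_d": 5},
}
linked = link_records(anonymized, public)
```

Even with the names removed, the combination of a few scores acts as a fingerprint, which is why removing identifiers alone does not preserve privacy.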
  • 9. Similarly, intelligence agencies require the collection of vast amounts of data. The data sources are numerous and may include chat rooms, personal blogs, and network routers. Most collected data is, however, innocent in nature; it need not be retained, and anonymity should be preserved. Robust and scalable privacy-preserving mining algorithms will increase the chances of collecting relevant information to increase user safety. 9.0 Cryptographically Enforced Access Control and Secure Communication To ensure that the most sensitive private data is end-to-end secure and only accessible to the authorized entities, data has to be encrypted based on access control policies. Specific research in this area, such as attribute-based encryption (ABE), has to be made richer, more efficient, and scalable. To ensure authentication, agreement, and fairness among the distributed entities, a cryptographically secure communication framework has to be implemented. 9.1 Use Cases Sensitive data is routinely stored unencrypted in the cloud. The main problem with encrypting data, especially large data sets, is the all-or-nothing retrieval policy of encrypted data, which prevents users from easily performing fine-grained actions such as sharing records or searches. ABE alleviates this problem by utilizing a public key cryptosystem where attributes related to the encrypted data serve to unlock the keys. On the other hand, there is also unencrypted, less sensitive data, such as data useful for analytics. Such data has to be communicated in a secure and agreed-upon way using a cryptographically secure communication framework. 10.0 Granular Access Control The security property that matters from the perspective of access control is secrecy: preventing access to data by people who should not have access. 
The problem with coarse-grained access mechanisms is that data that could otherwise be shared is often swept into a more restrictive category to guarantee sound security. Granular access control gives data managers a scalpel instead of a sword to share data as much as possible without compromising secrecy. 10.1 Use Cases Big data analysis and cloud computing are increasingly focused on handling diverse data sets, both in terms of variety of schemas and variety of security requirements. Legal and policy restrictions on data come from numerous sources. The Sarbanes-Oxley Act levies requirements to protect corporate financial information, and the Health Insurance Portability and Accountability Act includes numerous restrictions on sharing personal health records. Executive Order 13526 outlines an elaborate system of protecting national security information.
  • 10. Privacy policies, sharing agreements, and corporate policy also impose requirements on data handling. Managing this plethora of restrictions has so far resulted in increased costs for developing applications and a walled garden approach in which few people can participate in the analysis. Granular access control is necessary for analytical systems to adapt to this increasingly complex security environment. 11.0 Granular Audits With real-time security monitoring (see section 7.0), we try to be notified at the moment an attack takes place. In reality, this will not always be the case (e.g., new attacks, missed true positives). In order to get to the bottom of a missed attack, we need audit information. This is relevant not only because we want to understand what happened and what went wrong, but also because of compliance, regulatory, and forensic requirements. In that regard, auditing is not something new, but the scope and granularity might be different. For example, we have to deal with more data objects, which are probably (but not necessarily) distributed. 11.1 Use Cases Compliance requirements (e.g., HIPAA, PCI, Sarbanes-Oxley) require financial firms to provide granular auditing records. Additionally, the loss of records containing private information is estimated at $200/record. Legal action, depending on the geographic region, might follow in case of a data breach. Key personnel at financial institutions require access to large data sets containing personal information (PI), such as Social Security numbers (SSNs). Marketing firms want access, for instance, to personal social media information to optimize their customer-centric approach regarding online ads. 12.0 Data Provenance Provenance metadata will grow in complexity due to large provenance graphs generated from provenance-enabled programming environments in big data applications. 
Analysis of such large provenance graphs to detect metadata dependencies for security/confidentiality applications is computationally intensive. 12.1 Use Cases Several key security applications require the history of a digital record, such as details about its creation. Examples include detecting insider trading for financial companies and determining the accuracy of the data source for research investigations. These security assessments are time-sensitive in nature, and require fast algorithms to handle the provenance metadata containing this information. In addition, data provenance complements audit logs for compliance requirements, such as PCI or Sarbanes-Oxley.
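The dependency analysis mentioned in section 12.0 can be pictured as a graph traversal. The sketch below assumes, purely for illustration, that provenance is stored as a mapping from each record to the records it was directly derived from; the `lineage` function and the record names are hypothetical. It walks the graph to collect every upstream source of a record.

```python
def lineage(derived_from: dict[str, list[str]], record: str) -> set[str]:
    """Return every record that `record` directly or transitively depends on.

    `derived_from` maps a record to the records it was directly derived from.
    An iterative depth-first walk is used so deep graphs do not hit Python's
    recursion limit; the `ancestors` set also guards against cycles.
    """
    ancestors: set[str] = set()
    stack = list(derived_from.get(record, []))
    while stack:
        node = stack.pop()
        if node in ancestors:
            continue  # already visited via another path
        ancestors.add(node)
        stack.extend(derived_from.get(node, []))
    return ancestors


# Hypothetical provenance graph for a financial report.
derived_from = {
    "report": ["aggregate"],
    "aggregate": ["trades_raw", "prices_raw"],
    "prices_raw": ["vendor_feed"],
}
sources = lineage(derived_from, "report")
```

Each query visits every reachable node and edge once, so the cost grows with the size of the upstream subgraph; on very large provenance graphs this is where the computational burden noted above comes from.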
  • 11. 13.0 Conclusion Big data is here to stay. It is practically impossible to imagine the next application without it consuming data, producing new forms of data, and containing data-driven algorithms. As compute environments become cheaper, application environments become networked, and system and analytics environments become shared over the cloud, security, access control, compression, encryption, and compliance introduce challenges that have to be addressed in a systematic way. The Cloud Security Alliance (CSA) Big Data Working Group (BDWG) recognizes these challenges and has made it its mission to address them in a standardized and systematic way. In this paper, we have highlighted the top ten security and privacy problems that need to be addressed to make big data processing and computing infrastructure more secure. Some common elements in this list of top ten issues that are specific to big data arise from the use of multiple infrastructure tiers (both storage and computing) for processing big data; the use of new compute infrastructures such as NoSQL databases (for the fast throughput necessitated by big data volumes) that have not been thoroughly vetted for security issues; the non-scalability of encryption for large data sets; the non-scalability of real-time monitoring techniques that might be practical for smaller volumes of data; the heterogeneity of devices that produce the data; and confusion over the plethora of diverse legal and policy restrictions, which leads to ad hoc approaches for ensuring security and privacy. Many of the items in the list of top ten challenges also serve to clarify specific aspects of the attack surface of the entire big data processing infrastructure that should be analyzed for these types of threats. 
We plan to use OpenMobius, an open-source, large scale, distributed data processing, analytics, and tools platform from eBay Research Labs as an experimental test bed. Our hope is that this paper will spur action in the research and development community to collaboratively increase focus on the top ten challenges, leading to greater security and privacy in big data platforms.