SlideShare a Scribd company logo
1 of 19
Security Threats to Hadoop: Data Leakage Attacks
and Investigation
Presented By
KIRAN GAJBHIYE
Contents
• Introduction
• Hadoop security
• Hadoop Components
• Data Leakage Attack
• Analyze Data Leakage in Data analyzer
• Disadvantages of Data Leakage
• overcome the threat
• Conclusion
Introduction
• Hadoop is an open-source framework that allows to store and process big
data
• popular platforms for big data storage and analysis
ex - manufacturing, healthcare, insurance, and retail
• Hadoop consists of core libraries
• A distributed file system (HDFS) decrease data losses
• YARN for resource management and application scheduling
• Map Reduce for data processing
• powerful processing capacity, huge storage capacity, scalability
Hadoop Security
• Hadoop Contains Sensitive Data
As Hadoop adoption grows too has the types of data organizations look to store.
Often the data is proprietary or personal and it must be protected
• Add Kerberos-based authentication to server
– Establishes identity for clients, hosts and services
– Prevents impersonation/passwords are never sent over the wire
• Data stored in encrypted form
• MySQL is often used
Hadoop Components
 MapReduce (version 1.2.1)
• Scheduling
• Map Reduce is a framework used to write the applications that process
large amounts of data on hardware
• MapReduce framework divides an input file into many chunks and then a
mapper for each chunk reads the data, does computations and provides
outputs in the form of key/value pairs
Continue……
 HDFS(Hadoop Distributed File System)
• The HDFS is the Java portable file system which is more scalable, reliable
and distributed in the Hadoop framework environment
• Store large data sets and cope with hardware failure
• Namenode
consists of a single Namenode, a master server that manages the file system
namespaces and regulates the access to files by clients
• DataNode
responsible for serving read and write requests from file system clients
Fig 1 Architecture of HDFS (hadoop distributed file system)
Data Leakage Attack
 application layer data leakage
attackers can obtain private data by application-layer vulnerability or
malware
ex - A vulnerability in the current Hadoop audit mechanism is that it only
records the operation type, time, and content, but no information on who did this
operation
Continue…
 operating system layer data leakage
• if attackers have the permission to log-in to the host operating
system of a Hadoop node
• They can bypass the monitor of Hadoop and steal data directly
in the OS layer
Investigate on Data Leakage
• an investigation framework aimed at data leakage attacks in Hadoop was
proposed
• This framework is composed of many data collectors and one data analyzer
• The data collector is located in the kernel of the host operating system in Hadoop
nodes
• It actively monitors the accesses to important data on each node and transmits
these behavior logs to the data analyzer
• The data it collects includes Hadoop logs, the Fsimage file, our own monitor (i.e.,
HProgger) logs and images or logs of files, processes, networks, and system
Fig 2 Data Collector
Fig 3 Data Analyzer
Continue…
 OS-layer data leakage attack, the detection algorithm works in
four dimensions:
• Abnormal directory
In collected logs, if any directory that is out of this range is found, an attack may
have happened.
• Abnormal user
if any other user instead of the Hadoop super user is found in HProgger logs, an
attack may have happened
Continue…
• Abnormal operation
if an attacker wants to steal a block from the physical machine, they need to copy
(move) it to another directory and sometimes may rename the block, so if any of
these operations are found, the attack may have happened
• Block Proportion
SusBlockNum (FileID) represent the number of suspicious blocks in a file dataset
and AllBlockNumber (FileID) represent the total number of blocks in a file dataset.
We define block proportion (BP) as the result of dividing SusBlockNum (FileID) by
AllBlockNumber (FileID).
Disadvantages of data leakage
• Lost of important and confidential documents
• Hack the whole system
•
Overcome the threat
• Transport-level encryption
• Kerberos used for strong authentication
• Apache Knox used for perimeter security
• Data-centric security
• Data-at-Rest Encryption
• Data-in-Motion Encryption
Conclusion
This including an on-demand data collection method and an automatic analysis
method for data leakage attacks in Hadoop. It collects data from the machines in the
Hadoop cluster to our forensic server and then analyzes them.
With the automatic detection algorithm, it can find out whether there exist
suspicious data leakage behaviors and give warnings and evidence to users. This
collected evidence can be used to find the attackers and reconstruct the attack
scenarios.
References
• D. Das and O. O’Malley, “Adding Security to Apache Hadoop,” Hortonworks
Technical Report, 2011.
• S. Park and Y. Lee, “Secure Hadoop with Encrypted HDFS,” GPC 2013, Seoul,
Korea, 2013, May 9–11.
• R. K. L. Ko and M. A. Will. “Progger: An Efficient, Tamper-Evident Kernel-
Space Logger for Cloud Data Provenance Tracking,” CLOUD 2014, Anchorage,
AK, 2014, June 27– July 2.
THANK YOU !!!

More Related Content

What's hot

CNIT 121: 14 Investigating Applications
CNIT 121: 14 Investigating ApplicationsCNIT 121: 14 Investigating Applications
CNIT 121: 14 Investigating ApplicationsSam Bowne
 
CNIT 152: 4 Starting the Investigation & 5 Leads
CNIT 152: 4 Starting the Investigation & 5 LeadsCNIT 152: 4 Starting the Investigation & 5 Leads
CNIT 152: 4 Starting the Investigation & 5 LeadsSam Bowne
 
CNIT 152: 13 Investigating Mac OS X Systems
CNIT 152: 13 Investigating Mac OS X SystemsCNIT 152: 13 Investigating Mac OS X Systems
CNIT 152: 13 Investigating Mac OS X SystemsSam Bowne
 
CNIT 152 12. Investigating Windows Systems (Part 3)
CNIT 152 12. Investigating Windows Systems (Part 3)CNIT 152 12. Investigating Windows Systems (Part 3)
CNIT 152 12. Investigating Windows Systems (Part 3)Sam Bowne
 
Big Data Security on Microsoft Azure - HDInsight and HortonWorks
Big Data Security on Microsoft Azure - HDInsight and HortonWorksBig Data Security on Microsoft Azure - HDInsight and HortonWorks
Big Data Security on Microsoft Azure - HDInsight and HortonWorksLuan Moreno Medeiros Maciel
 
Datafoucs 2014 on line digital forensic investigations damir delija 2
Datafoucs 2014 on line digital forensic investigations damir delija 2Datafoucs 2014 on line digital forensic investigations damir delija 2
Datafoucs 2014 on line digital forensic investigations damir delija 2Damir Delija
 
CNIT 152: 10 Enterprise Services
CNIT 152: 10 Enterprise ServicesCNIT 152: 10 Enterprise Services
CNIT 152: 10 Enterprise ServicesSam Bowne
 
EnCase Enterprise Basic File Collection
EnCase Enterprise Basic File Collection EnCase Enterprise Basic File Collection
EnCase Enterprise Basic File Collection Damir Delija
 
[NetApp] Simplified HA:DR Using Storage Solutions
[NetApp] Simplified HA:DR Using Storage Solutions[NetApp] Simplified HA:DR Using Storage Solutions
[NetApp] Simplified HA:DR Using Storage SolutionsPerforce
 
Cloudera Impala - HUG Karlsruhe, July 04, 2013
Cloudera Impala - HUG Karlsruhe, July 04, 2013Cloudera Impala - HUG Karlsruhe, July 04, 2013
Cloudera Impala - HUG Karlsruhe, July 04, 2013Alexander Alten-Lorenz
 
CNIT 152: 9 Network Evidence
CNIT 152: 9 Network EvidenceCNIT 152: 9 Network Evidence
CNIT 152: 9 Network EvidenceSam Bowne
 
Solving Hadoop Replication Challenges with an Active-Active Paxos Algorithm
Solving Hadoop Replication Challenges with an Active-Active Paxos AlgorithmSolving Hadoop Replication Challenges with an Active-Active Paxos Algorithm
Solving Hadoop Replication Challenges with an Active-Active Paxos AlgorithmDataWorks Summit
 
Selective Data Replication with Geographically Distributed Hadoop
Selective Data Replication with Geographically Distributed HadoopSelective Data Replication with Geographically Distributed Hadoop
Selective Data Replication with Geographically Distributed HadoopDataWorks Summit
 
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data Collection
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data CollectionCNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data Collection
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data CollectionSam Bowne
 
Usage aspects techniques for enterprise forensics data analytics tools
Usage aspects techniques for enterprise forensics data analytics toolsUsage aspects techniques for enterprise forensics data analytics tools
Usage aspects techniques for enterprise forensics data analytics toolsDamir Delija
 
Flashy prefetching for high performance flash drives
Flashy prefetching for high performance flash drivesFlashy prefetching for high performance flash drives
Flashy prefetching for high performance flash drivesPratik Bhat
 
CNIT 152: 12 Investigating Windows Systems (Part 2 of 3)
CNIT 152: 12 Investigating Windows Systems (Part 2 of 3)CNIT 152: 12 Investigating Windows Systems (Part 2 of 3)
CNIT 152: 12 Investigating Windows Systems (Part 2 of 3)Sam Bowne
 
Designing and Implementing a cloud-hosted SaaS for data movement and Sharing ...
Designing and Implementing a cloud-hosted SaaS for data movement and Sharing ...Designing and Implementing a cloud-hosted SaaS for data movement and Sharing ...
Designing and Implementing a cloud-hosted SaaS for data movement and Sharing ...Haripds Shrestha
 

What's hot (20)

CNIT 121: 14 Investigating Applications
CNIT 121: 14 Investigating ApplicationsCNIT 121: 14 Investigating Applications
CNIT 121: 14 Investigating Applications
 
CNIT 152: 4 Starting the Investigation & 5 Leads
CNIT 152: 4 Starting the Investigation & 5 LeadsCNIT 152: 4 Starting the Investigation & 5 Leads
CNIT 152: 4 Starting the Investigation & 5 Leads
 
Forensics Analysis and Validation
Forensics Analysis and Validation  Forensics Analysis and Validation
Forensics Analysis and Validation
 
CNIT 152: 13 Investigating Mac OS X Systems
CNIT 152: 13 Investigating Mac OS X SystemsCNIT 152: 13 Investigating Mac OS X Systems
CNIT 152: 13 Investigating Mac OS X Systems
 
Beyond Hadoop and MapReduce
Beyond Hadoop and MapReduceBeyond Hadoop and MapReduce
Beyond Hadoop and MapReduce
 
CNIT 152 12. Investigating Windows Systems (Part 3)
CNIT 152 12. Investigating Windows Systems (Part 3)CNIT 152 12. Investigating Windows Systems (Part 3)
CNIT 152 12. Investigating Windows Systems (Part 3)
 
Big Data Security on Microsoft Azure - HDInsight and HortonWorks
Big Data Security on Microsoft Azure - HDInsight and HortonWorksBig Data Security on Microsoft Azure - HDInsight and HortonWorks
Big Data Security on Microsoft Azure - HDInsight and HortonWorks
 
Datafoucs 2014 on line digital forensic investigations damir delija 2
Datafoucs 2014 on line digital forensic investigations damir delija 2Datafoucs 2014 on line digital forensic investigations damir delija 2
Datafoucs 2014 on line digital forensic investigations damir delija 2
 
CNIT 152: 10 Enterprise Services
CNIT 152: 10 Enterprise ServicesCNIT 152: 10 Enterprise Services
CNIT 152: 10 Enterprise Services
 
EnCase Enterprise Basic File Collection
EnCase Enterprise Basic File Collection EnCase Enterprise Basic File Collection
EnCase Enterprise Basic File Collection
 
[NetApp] Simplified HA:DR Using Storage Solutions
[NetApp] Simplified HA:DR Using Storage Solutions[NetApp] Simplified HA:DR Using Storage Solutions
[NetApp] Simplified HA:DR Using Storage Solutions
 
Cloudera Impala - HUG Karlsruhe, July 04, 2013
Cloudera Impala - HUG Karlsruhe, July 04, 2013Cloudera Impala - HUG Karlsruhe, July 04, 2013
Cloudera Impala - HUG Karlsruhe, July 04, 2013
 
CNIT 152: 9 Network Evidence
CNIT 152: 9 Network EvidenceCNIT 152: 9 Network Evidence
CNIT 152: 9 Network Evidence
 
Solving Hadoop Replication Challenges with an Active-Active Paxos Algorithm
Solving Hadoop Replication Challenges with an Active-Active Paxos AlgorithmSolving Hadoop Replication Challenges with an Active-Active Paxos Algorithm
Solving Hadoop Replication Challenges with an Active-Active Paxos Algorithm
 
Selective Data Replication with Geographically Distributed Hadoop
Selective Data Replication with Geographically Distributed HadoopSelective Data Replication with Geographically Distributed Hadoop
Selective Data Replication with Geographically Distributed Hadoop
 
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data Collection
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data CollectionCNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data Collection
CNIT 121: 6 Discovering the Scope of the Incident & 7 Live Data Collection
 
Usage aspects techniques for enterprise forensics data analytics tools
Usage aspects techniques for enterprise forensics data analytics toolsUsage aspects techniques for enterprise forensics data analytics tools
Usage aspects techniques for enterprise forensics data analytics tools
 
Flashy prefetching for high performance flash drives
Flashy prefetching for high performance flash drivesFlashy prefetching for high performance flash drives
Flashy prefetching for high performance flash drives
 
CNIT 152: 12 Investigating Windows Systems (Part 2 of 3)
CNIT 152: 12 Investigating Windows Systems (Part 2 of 3)CNIT 152: 12 Investigating Windows Systems (Part 2 of 3)
CNIT 152: 12 Investigating Windows Systems (Part 2 of 3)
 
Designing and Implementing a cloud-hosted SaaS for data movement and Sharing ...
Designing and Implementing a cloud-hosted SaaS for data movement and Sharing ...Designing and Implementing a cloud-hosted SaaS for data movement and Sharing ...
Designing and Implementing a cloud-hosted SaaS for data movement and Sharing ...
 

Similar to Security Threats to Hadoop: Data Leakage Attacks and Investigation

Combat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCombat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCloudera, Inc.
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of ViewKaran Alang
 
Application Architectures with Hadoop
Application Architectures with HadoopApplication Architectures with Hadoop
Application Architectures with Hadoophadooparchbook
 
Application Architectures with Hadoop | Data Day Texas 2015
Application Architectures with Hadoop | Data Day Texas 2015Application Architectures with Hadoop | Data Day Texas 2015
Application Architectures with Hadoop | Data Day Texas 2015Cloudera, Inc.
 
Hadoop File System.pptx
Hadoop File System.pptxHadoop File System.pptx
Hadoop File System.pptxAakashBerlia1
 
Application Architectures with Hadoop
Application Architectures with HadoopApplication Architectures with Hadoop
Application Architectures with Hadoophadooparchbook
 
Introduction to hadoop and hdfs
Introduction to hadoop and hdfsIntroduction to hadoop and hdfs
Introduction to hadoop and hdfsshrey mehrotra
 
Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoopmarkgrover
 
HDFS_architecture.ppt
HDFS_architecture.pptHDFS_architecture.ppt
HDFS_architecture.pptvijayapraba1
 
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for women
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for womenHadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for women
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for womenmaharajothip1
 
Big Data and Hadoop Introduction
 Big Data and Hadoop Introduction Big Data and Hadoop Introduction
Big Data and Hadoop IntroductionDzung Nguyen
 
Big data and Hadoop introduction
Big data and Hadoop introductionBig data and Hadoop introduction
Big data and Hadoop introductionDzung Nguyen
 
Unit-1 Introduction to Big Data.pptx
Unit-1 Introduction to Big Data.pptxUnit-1 Introduction to Big Data.pptx
Unit-1 Introduction to Big Data.pptxAnkitChauhan817826
 
Architecting application with Hadoop - using clickstream analytics as an example
Architecting application with Hadoop - using clickstream analytics as an exampleArchitecting application with Hadoop - using clickstream analytics as an example
Architecting application with Hadoop - using clickstream analytics as an examplehadooparchbook
 
Hops - Distributed metadata for Hadoop
Hops - Distributed metadata for HadoopHops - Distributed metadata for Hadoop
Hops - Distributed metadata for HadoopJim Dowling
 
Introduction to Data Storage and Cloud Computing
Introduction to Data Storage and Cloud ComputingIntroduction to Data Storage and Cloud Computing
Introduction to Data Storage and Cloud ComputingRutuja751147
 
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.MaharajothiP
 
Analytics with unified file and object
Analytics with unified file and object Analytics with unified file and object
Analytics with unified file and object Sandeep Patil
 

Similar to Security Threats to Hadoop: Data Leakage Attacks and Investigation (20)

Combat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCombat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of View
 
Hadoop data management
Hadoop data managementHadoop data management
Hadoop data management
 
Application Architectures with Hadoop
Application Architectures with HadoopApplication Architectures with Hadoop
Application Architectures with Hadoop
 
Application Architectures with Hadoop | Data Day Texas 2015
Application Architectures with Hadoop | Data Day Texas 2015Application Architectures with Hadoop | Data Day Texas 2015
Application Architectures with Hadoop | Data Day Texas 2015
 
Hadoop File System.pptx
Hadoop File System.pptxHadoop File System.pptx
Hadoop File System.pptx
 
Application Architectures with Hadoop
Application Architectures with HadoopApplication Architectures with Hadoop
Application Architectures with Hadoop
 
Introduction to hadoop and hdfs
Introduction to hadoop and hdfsIntroduction to hadoop and hdfs
Introduction to hadoop and hdfs
 
Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoop
 
HDFS_architecture.ppt
HDFS_architecture.pptHDFS_architecture.ppt
HDFS_architecture.ppt
 
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for women
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for womenHadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for women
Hadoop Maharajathi,II-M.sc.,Computer Science,Bonsecours college for women
 
Hadoop training in bangalore
Hadoop training in bangaloreHadoop training in bangalore
Hadoop training in bangalore
 
Big Data and Hadoop Introduction
 Big Data and Hadoop Introduction Big Data and Hadoop Introduction
Big Data and Hadoop Introduction
 
Big data and Hadoop introduction
Big data and Hadoop introductionBig data and Hadoop introduction
Big data and Hadoop introduction
 
Unit-1 Introduction to Big Data.pptx
Unit-1 Introduction to Big Data.pptxUnit-1 Introduction to Big Data.pptx
Unit-1 Introduction to Big Data.pptx
 
Architecting application with Hadoop - using clickstream analytics as an example
Architecting application with Hadoop - using clickstream analytics as an exampleArchitecting application with Hadoop - using clickstream analytics as an example
Architecting application with Hadoop - using clickstream analytics as an example
 
Hops - Distributed metadata for Hadoop
Hops - Distributed metadata for HadoopHops - Distributed metadata for Hadoop
Hops - Distributed metadata for Hadoop
 
Introduction to Data Storage and Cloud Computing
Introduction to Data Storage and Cloud ComputingIntroduction to Data Storage and Cloud Computing
Introduction to Data Storage and Cloud Computing
 
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
P.Maharajothi,II-M.sc(computer science),Bon secours college for women,thanjavur.
 
Analytics with unified file and object
Analytics with unified file and object Analytics with unified file and object
Analytics with unified file and object
 

Recently uploaded

Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAbhinavSharma374939
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSCAESB
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidNikhilNagaraju
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 

Recently uploaded (20)

Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog Converter
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
 
main PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfidmain PPT.pptx of girls hostel security using rfid
main PPT.pptx of girls hostel security using rfid
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 

Security Threats to Hadoop: Data Leakage Attacks and Investigation

  • 1. Security Threats to Hadoop: Data Leakage Attacks and Investigation Presented By KIRAN GAJBHIYE
  • 2. Contents • Introduction • Hadoop security • Hadoop Components • Data Leakage Attack • Analyze Data Leakage in Data analyzer • Disadvantages of Data Leakage • overcome the threat • Conclusion
  • 3. Introduction • Hadoop is an open-source framework that allows to store and process big data • popular platforms for big data storage and analysis ex - manufacturing, healthcare, insurance, and retail • Hadoop consists of core libraries • A distributed file system (HDFS) decrease data losses • YARN for resource management and application scheduling • Map Reduce for data processing • powerful processing capacity, huge storage capacity, scalability
  • 4. Hadoop Security • Hadoop Contains Sensitive Data As Hadoop adoption grows too has the types of data organizations look to store. Often the data is proprietary or personal and it must be protected • Add Kerberos-based authentication to server – Establishes identity for clients, hosts and services – Prevents impersonation/passwords are never sent over the wire • Data stored in encrypted form • MySQL is often used
  • 5. Hadoop Components  MapReduce (version 1.2.1) • Scheduling • Map Reduce is a framework used to write the applications that process large amounts of data on hardware • MapReduce framework divides an input file into many chunks and then a mapper for each chunk reads the data, does computations and provides outputs in the form of key/value pairs
  • 6. Continue……  HDFS(Hadoop Distributed File System) • The HDFS is the Java portable file system which is more scalable, reliable and distributed in the Hadoop framework environment • Store large data sets and cope with hardware failure • Namenode consists of a single Namenode, a master server that manages the file system namespaces and regulates the access to files by clients • DataNode responsible for serving read and write requests from file system clients
  • 7. Fig 1 Architecture of HDFS (hadoop distributed file system)
  • 8. Data Leakage Attack  application layer data leakage attackers can obtain private data by application-layer vulnerability or malware ex - A vulnerability in the current Hadoop audit mechanism is that it only records the operation type, time, and content, but no information on who did this operation
  • 9. Continue…  operating system layer data leakage • if attackers have the permission to log-in to the host operating system of a Hadoop node • They can bypass the monitor of Hadoop and steal data directly in the OS layer
  • 10. Investigate on Data Leakage • an investigation framework aimed at data leakage attacks in Hadoop was proposed • This framework is composed of many data collectors and one data analyzer • The data collector is located in the kernel of the host operating system in Hadoop nodes • It actively monitors the accesses to important data on each node and transmits these behavior logs to the data analyzer • The data it collects includes Hadoop logs, the Fsimage file, our own monitor (i.e., HProgger) logs and images or logs of files, processes, networks, and system
  • 11. Fig 2 Data Collector
  • 12. Fig 3 Data Analyzer
  • 13. Continue…  OS-layer data leakage attack, the detection algorithm works in four dimensions: • Abnormal directory In collected logs, if any directory that is out of this range is found, an attack may have happened. • Abnormal user if any other user instead of the Hadoop super user is found in HProgger logs, an attack may have happened
  • 14. Continue… • Abnormal operation if an attacker wants to steal a block from the physical machine, they need to copy (move) it to another directory and sometimes may rename the block, so if any of these operations are found, the attack may have happened • Block Proportion SusBlockNum (FileID) represent the number of suspicious blocks in a file dataset and AllBlockNumber (FileID) represent the total number of blocks in a file dataset. We define block proportion (BP) as the result of dividing SusBlockNum (FileID) by AllBlockNumber (FileID).
  • 15. Disadvantages of data leakage • Lost of important and confidential documents • Hack the whole system •
  • 16. Overcome the threat • Transport-level encryption • Kerberos used for strong authentication • Apache Knox used for perimeter security • Data-centric security • Data-at-Rest Encryption • Data-in-Motion Encryption
  • 17. Conclusion This including an on-demand data collection method and an automatic analysis method for data leakage attacks in Hadoop. It collects data from the machines in the Hadoop cluster to our forensic server and then analyzes them. With the automatic detection algorithm, it can find out whether there exist suspicious data leakage behaviors and give warnings and evidence to users. This collected evidence can be used to find the attackers and reconstruct the attack scenarios.
  • 18. References • D. Das and O. O’Malley, “Adding Security to Apache Hadoop,” Hortonworks Technical Report, 2011. • S. Park and Y. Lee, “Secure Hadoop with Encrypted HDFS,” GPC 2013, Seoul, Korea, 2013, May 9–11. • R. K. L. Ko and M. A. Will. “Progger: An Efficient, Tamper-Evident Kernel- Space Logger for Cloud Data Provenance Tracking,” CLOUD 2014, Anchorage, AK, 2014, June 27– July 2.