The rise of “Big Data” on
cloud computing: Review and
open research issues
• Ibrahim Abaker Targio Hashem
• Ibrar Yaqoob
• Nor Badrul Anuar
• Salimah Mokhtar
• Abdullah Gani
• Samee Ullah Khan
Presented By
Kazi Mojammel Hossen
ID: B130305001
Minhazul Arefin
ID: B130305003
Outline
◎Introduction
◎Definition & Characteristics of Big Data
◎Cloud Computing
◎Relationship between cloud computing & big data
◎Case studies
◎Big data storage system
◎Hadoop background
◎Research challenges
◎Open research issues
◎Conclusion
Introduction
◎The continuous increase in the volume and detail of
data captured by organizations has produced an
overwhelming flow of data in either structured or
unstructured format.
◎Virtualization is a process of resource sharing and
isolation of underlying hardware to increase computer
resource utilization, efficiency, and scalability.
◎The goal of this study is to conduct a
comprehensive investigation of the status of big data
in cloud computing environments.
What is Big Data?
Big data is a term used
to refer to the growing
volume of data that is
difficult to store, process,
and analyze with
traditional database
technologies.
Characteristics of big data
◎Big data are characterized by three aspects:
i. data are numerous
ii. data cannot be categorized into regular
relational databases
iii. data are generated, captured, and
processed rapidly.
Volume
◎Processing Performance
◎Class Imbalance
◎Feature Engineering
◎Non-Linearity
Velocity
◎Data Availability
◎Real-Time Processing/Streaming
◎Independent and Identically Distributed (IID)
Random Variables
Variety
◎Data Locality
◎Data Heterogeneity
◎Dirty and Noisy Data
Veracity
◎Data Provenance
◎Data Uncertainty
◎Dirty and Noisy Data
Classification of Big Data
◎Web & Social Media
◎Machine
◎Sensing
◎Transaction
◎IoT
1. Data sources
Classification of Big Data
◎Structured
○ SQL Server
○ Oracle
○ Access, Excel
◎Semi-structured
○ Text Analytics
○ Blogs
○ Social Authority
○ Video
○ Audio
◎Unstructured
○ Weather data
○ Currency Conversion
○ Demographic
○ E-Commerce
2. Content Format
Classification of Big Data
◎Document-oriented
◎Column-oriented
◎Graph database
◎Key-value
3. Data Stores
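The four store types listed on this slide differ mainly in how a record is laid out. The following is a minimal sketch that imitates each data model with plain Python structures. The user records and field names are hypothetical, and no real database API is used.

```python
# Hypothetical records, used only for illustration.
users = [
    {"id": "u1", "name": "Ada", "city": "Dhaka", "follows": ["u2"]},
    {"id": "u2", "name": "Lin", "city": "Khulna", "follows": []},
]

# Document-oriented: one self-contained document per record.
document_store = {u["id"]: dict(u) for u in users}

# Column-oriented: the values of each field are stored together.
column_store = {
    field: [u[field] for u in users] for field in ("id", "name", "city")
}

# Graph database: nodes plus explicit edges between them.
graph_store = {
    "nodes": [u["id"] for u in users],
    "edges": [(u["id"], v) for u in users for v in u["follows"]],
}

# Key-value: an opaque value looked up by a single key.
key_value_store = {f"user:{u['id']}:name": u["name"] for u in users}
```

Reading one whole user is cheapest in the document layout, scanning one field across all users is cheapest in the column layout, and following relationships is cheapest in the graph layout.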
Classification of Big Data
◎Cleaning
◎Transform
◎Normalization
4. Data Staging
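The three staging steps can be sketched as a small pipeline. This is only an illustration under stated assumptions: the sensor records, the Fahrenheit-to-Celsius transform, and the min-max normalization are hypothetical choices, not taken from the paper.

```python
def clean(records):
    """Cleaning: drop records whose reading is missing or not numeric."""
    cleaned = []
    for r in records:
        try:
            cleaned.append(
                {"sensor": r["sensor"].strip(), "temp_f": float(r["temp_f"])}
            )
        except (KeyError, TypeError, ValueError, AttributeError):
            continue
    return cleaned


def transform(records):
    """Transform: convert Fahrenheit readings to Celsius."""
    return [
        {"sensor": r["sensor"], "temp_c": (r["temp_f"] - 32) * 5 / 9}
        for r in records
    ]


def normalize(records):
    """Normalization: min-max scale temp_c into the range [0, 1]."""
    if not records:
        return []
    values = [r["temp_c"] for r in records]
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid division by zero for constant data
    return [{**r, "temp_norm": (r["temp_c"] - low) / span} for r in records]


# Hypothetical raw input with two unusable rows.
raw = [
    {"sensor": " a ", "temp_f": "32"},
    {"sensor": "b", "temp_f": "212"},
    {"sensor": "c", "temp_f": None},
    {"sensor": "d", "temp_f": "n/a"},
]
staged = normalize(transform(clean(raw)))
```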
Classification of Big Data
◎Batch
○ Uses a MapReduce-based system
◎Real Time
○ Uses a scalable streaming system
5. Data Processing
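Batch and real-time processing differ in when the result becomes available. A minimal sketch, assuming a hypothetical list of events with a `type` field: the batch function sees the whole dataset at once, while the streaming function updates a running result as each event arrives.

```python
from collections import Counter


def batch_count(events):
    """Batch: the whole dataset is available before processing starts."""
    return Counter(e["type"] for e in events)


def stream_count(event_iter):
    """Real time: update a running result as each event arrives."""
    running = Counter()
    for e in event_iter:
        running[e["type"]] += 1
        yield dict(running)  # snapshot after every event


# Hypothetical event log.
events = [{"type": "click"}, {"type": "view"}, {"type": "click"}]
batch_result = batch_count(events)
snapshots = list(stream_count(iter(events)))
```

Both paths end with the same totals. The streaming path also exposes every intermediate state, which is what a real-time consumer would act on.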
What is Cloud Computing?
Cloud computing is a fast-growing
technology that has established
itself in the next generation of
the IT industry and business.
Cloud Service Model
◎The cloud service model typically consists of Platform
as a Service (PaaS), Software as a Service (SaaS), and
Infrastructure as a Service (IaaS)
Relationship between Cloud Computing & Big Data
Organization case Studies from vendors
◎A language technology that
aids touchscreen typing by
providing personalized
predictions and corrections
◎Collects & analyzes terabytes
of data to create language
models
◎Used Apache Hadoop
running on Amazon Simple
Storage Service
1. Swiftkey
Organization case Studies from vendors
◎Maker of Halo, a science
fiction media franchise
◎The developers analyzed
data to obtain insights into
player preferences and
online tournaments
◎Used Windows Azure
HDInsight Service, which is
based on Apache Hadoop big
data framework
2. 343 Industries
Organization case Studies from vendors
◎Online travel agency
◎Unifies tens of
thousands of bus
schedules into a single
booking operation
◎Implemented Google
BigQuery to analyze
large datasets on Google's
data processing
infrastructure
3. redBus
Organization case Studies from vendors
◎A mobile communication
company
◎Gathers and analyzes large
amounts of data from
mobile phones
◎Used Hadoop Distributed
File System (HDFS)
4. Nokia
Organization case Studies from vendors
◎An online retailer
◎Was losing revenue
because of unreliable
real-time notifications of
service problems
◎Used big data
algorithms to create a
cloud monitoring
system that delivers
notifications
5. Alacer
Case Studies from Scholarly/Academic Source
◎Case 1
○ Situation/context: Massively parallel DNA sequencing generates staggering amounts of data
○ Objective: Provide accurate & reproducible genomic results
○ Approach: Develop a Mercury analysis pipeline and deploy it in the Amazon Web Services cloud via the DNAnexus platform
○ Result: Established a powerful combination of a robust and fully validated software pipeline and a scalable computational resource
◎Case 2
○ Situation/context: Conducting analyses on large social networks such as Twitter
○ Objective: To use cloud services as a possible solution for the analysis of large amounts of data
○ Approach: Use the PageRank algorithm on the Twitter user base to obtain a user ranking
○ Result: Implemented a relatively cheap solution for data acquisition and analysis by using Amazon cloud infrastructure
◎Case 3
○ Situation/context: To study the complex molecular interactions that regulate biological systems
○ Objective: To develop a Hadoop-based cloud computing application that processes sequences of microscopic images
○ Approach: Use the Hadoop cloud computing framework
○ Result: Allows users to submit data processing jobs in the cloud
◎Case 4
○ Situation/context: Applications running on cloud computing are likely to fail
○ Objective: Design a failure scenario
○ Approach: Create a series of failure scenarios on an Amazon cloud computing platform
○ Result: Helps to identify vulnerabilities in Hadoop applications running in the cloud
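One of the academic case studies above applies the PageRank algorithm to the Twitter user base. The sketch below shows the basic power-iteration form of PageRank on a tiny, hypothetical follower graph. It is not the implementation used in that study; the damping factor of 0.85 is the conventional default, and the iteration count is an arbitrary assumption.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    graph maps each user to the list of users they follow; every
    followed user must also appear as a key.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, followed in graph.items():
            if followed:
                share = damping * rank[node] / len(followed)
                for target in followed:
                    new_rank[target] += share
            else:
                # A user who follows nobody spreads rank evenly.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical follower graph: alice and bob both follow carol.
follows = {
    "alice": ["carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": [],
}
ranks = pagerank(follows)
top_user = max(ranks, key=ranks.get)
```

On a real social graph the same update would be distributed, for example as repeated MapReduce jobs, because the rank vector and edge list no longer fit on one machine.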
“There were 5 exabytes of information created
between the dawn of civilization through 2003,
but that much information is now created
every 2 days.”
- Eric Schmidt,
Executive Chairman, Google
Big data storage system
◎Traditional storage systems store data through
structured RDBMS
◎A storage architecture needs to achieve availability &
reliability
◎It must store and manage large datasets
◎The organizational systems of data storage can be
divided into three parts:
○ Disk arrays
○ Connection and network subsystems
○ Storage management software
Comparison of Storage Media
◎Hard drives
○ Specific use: Store data up to four terabytes
○ Advantages: Density; cost per bit of storage; speedy start-up
○ Disadvantages: Require special cooling; high read latency; produce more heat
◎Solid state memory
○ Specific use: Store data up to two terabytes
○ Advantages: Fast access to data; fast movement of huge data; fast start-up time
○ Disadvantages: More expensive than hard drives
◎Object storage
○ Specific use: Store data as variable-size objects rather than fixed-size blocks
○ Advantages: Easy to find information; unique identifier to find data objects; ensures security
○ Disadvantages: Complexity in tracking indices
◎Optical storage
○ Specific use: Store data at different angles throughout the storage medium
○ Advantages: Least expensive; removable storage medium
○ Disadvantages: Complex; ability to produce multiple optical disks in a single unit is yet to be proven
◎Cloud storage
○ Specific use: Serve as a provisioning & storage model
○ Advantages: Useful for small organizations that do not have sufficient storage capacity; can store large amounts of data
○ Disadvantages: Less security
Hadoop
◎A free, Java-based
programming
framework that
supports the processing
of large data sets in a
distributed computing
environment
◎Implements MapReduce,
the computation model
introduced by Google
HDFS (Hadoop Distributed File System)
◎A scalable distributed file system for applications
dealing with large data sets
○ Distributed: runs in a cluster
○ Scalable: 10K nodes, 100K files, 10 PB storage
◎ Storage space is seamless for the whole cluster
◎ Files broken into blocks
◎ Typical block size: 128 MB.
◎ Replication: Each block copied to multiple data
nodes.
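The block and replication bullets above can be made concrete with a little arithmetic. This sketch splits a file into 128 MB blocks and assigns each block to several data nodes. The replication factor of 3 is assumed here (it is the usual HDFS default), the node names are hypothetical, and the round-robin placement is a toy stand-in for the rack-aware policy of real HDFS.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the typical block size quoted above
REPLICATION = 3                 # assumption: the usual HDFS default


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the byte length of each block; the last one may be shorter."""
    full, rest = divmod(file_size, block_size)
    return [block_size] * full + ([rest] if rest else [])


def place_replicas(num_blocks, data_nodes, replication=REPLICATION):
    """Toy round-robin placement; real HDFS placement is rack-aware."""
    return {
        b: [data_nodes[(b + i) % len(data_nodes)] for i in range(replication)]
        for b in range(num_blocks)
    }


one_gb = 1024 ** 3
blocks = split_into_blocks(one_gb + 1)  # 1 GB plus one extra byte
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
raw_bytes_stored = sum(blocks) * REPLICATION
```

A file one byte larger than 1 GB needs nine blocks rather than eight, and the cluster stores three times the logical file size once every block is replicated.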
What is MapReduce?
◎A programming model
◎A programming framework
◎Used to develop solutions that will
○ Process large amounts of data in a parallelized fashion
○ In clusters of computing nodes
◎Originally a closed-source implementation at Google
◎Hadoop: Open source implementation of the
algorithms described in the scientific papers
MapReduce
◎The model is broken down into two phases:
○ Map: Non-overlapping sets of input data (<key, value> records) are
assigned to different processes (mappers) that produce a set of
intermediate <key, value> results
○ Reduce: The output of the Map phase is fed to a typically smaller number of
processes (reducers) that aggregate the intermediate results into a smaller
number of <key, value> records.
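The two phases can be sketched in a single process with the classic word-count example. This is an illustration of the programming model only, not Hadoop's API: the function names and the `shuffle` helper, which stands in for the grouping step the framework performs between the phases, are hypothetical.

```python
from collections import defaultdict


def map_phase(split):
    """Mapper: emit an intermediate <word, 1> pair for every word."""
    for line in split:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group intermediate values by key, as the framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: aggregate all values for one key into a single record."""
    return key, sum(values)


# Two non-overlapping input splits, one per (simulated) mapper.
splits = [["big data on cloud"], ["cloud computing and big data"]]
intermediate = [pair for split in splits for pair in map_phase(split)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())
```

Nine intermediate pairs are reduced to six output records, which matches the slide's point that reducers produce a smaller number of <key, value> records.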
Research Challenges
◎Ability to handle increasing
amounts of data in an
appropriate manner
◎NoSQL databases store and
retrieve large volumes of
distributed data.
◎Wang et al. proposed a new
scalable data cube analysis
technique called HaCube in
big data clusters to
overcome the challenges of
large-scale data.
1. Scalability
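For readers unfamiliar with data cube analysis, the sketch below shows what a data cube computes: one aggregate table for every subset of the chosen dimensions. This is a naive single-machine version for illustration only and is not the HaCube technique; the sales rows and column names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def data_cube(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (each cuboid)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                totals[tuple(row[d] for d in dims)] += row[measure]
            cube[dims] = dict(totals)
    return cube


# Hypothetical fact table.
sales = [
    {"region": "east", "year": 2013, "amount": 10},
    {"region": "east", "year": 2014, "amount": 20},
    {"region": "west", "year": 2014, "amount": 5},
]
cube = data_cube(sales, ("region", "year"), "amount")
```

The number of cuboids doubles with every added dimension, and each one requires a pass over the data, which is why cube computation becomes a scalability problem on large datasets.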
Research Challenges
◎Refers to the resources of the
system accessible on
demand by an authorized
individual
◎Mobile users need data
within a short amount of
time
◎Services must remain
operational even in the case
of a security breach
2. Availability
Research Challenges
◎Preventing improper or
unauthorized change or
access
◎Must ensure the
correctness of user data
◎Should provide a
mechanism for the user
to check whether the
data is maintained
3. Data Integrity
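One simple mechanism that lets a user check whether outsourced data has been maintained is to keep a cryptographic digest locally and compare it after download. The sketch below uses SHA-256 from Python's standard `hashlib` module. The sample payload is hypothetical, and the slides do not prescribe this particular scheme.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_digest: str) -> bool:
    """True if the data still matches the digest recorded earlier."""
    return fingerprint(data) == expected_digest


# Hypothetical payload; the digest is kept by the user before upload.
original = b"patient-id,reading\n17,98.6\n"
stored_digest = fingerprint(original)

downloaded_ok = original
downloaded_bad = original.replace(b"98.6", b"99.6")  # simulated tampering
```

A plain digest detects accidental or unauthorized changes only if the digest itself is stored somewhere the cloud provider cannot modify.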
Research Challenges
◎Transforming data into a
form suitable for analysis
is an obstacle in the
adoption of big data
◎Owing to the variety of
data formats, big data can
be transformed into an
analysis workflow in two
ways
4. Transformation
Transforming big data for analysis.
◎Structured data are pre-processed before they are stored
in relational databases to meet the constraints of
schema-on-write; they can then be retrieved for analysis
◎Unstructured data must first be stored in distributed
databases, such as HBase, before they are processed
for analysis
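The two paths can be contrasted in a short sketch. Structured rows are validated against a schema before they are stored, while unstructured payloads are stored as-is and interpreted only when analyzed (often called schema-on-read, a term not used on the slide). The schema, the field names, and the use of Python lists in place of a relational database and HBase are all assumptions made for illustration.

```python
import json

SCHEMA = {"id": int, "amount": float}  # hypothetical relational schema


def write_structured(table, row):
    """Schema-on-write: validate and coerce before the row is stored."""
    if set(row) != set(SCHEMA):
        raise ValueError(f"row does not match schema: {sorted(row)}")
    table.append({col: typ(row[col]) for col, typ in SCHEMA.items()})


def write_unstructured(store, blob):
    """Store first: keep the raw payload untouched."""
    store.append(blob)


def read_amounts(store):
    """Interpret raw payloads only at analysis time."""
    for blob in store:
        try:
            doc = json.loads(blob)
        except json.JSONDecodeError:
            continue  # skip payloads that cannot be parsed
        if isinstance(doc, dict) and "amount" in doc:
            yield float(doc["amount"])


table, raw_store = [], []
write_structured(table, {"id": "1", "amount": "9.5"})
write_unstructured(raw_store, '{"amount": 3, "note": "free text"}')
write_unstructured(raw_store, "not json at all")
amounts = list(read_amounts(raw_store))
```

The structured path rejects bad input at write time, whereas the unstructured path accepts everything and pays the parsing cost, and the risk of unusable records, at read time.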
Research Challenges
◎Defined as “any
difficulty encountered
along one or more
quality dimensions that
render data completely
or largely unfit for use”
◎High-quality data in the
cloud is characterized
by data consistency
5. Data quality
Research Challenges
◎Variety, one of the major aspects
of big data characterization
◎In a cloud environment, users
can store data
◎Structured data formats are
appropriate for database
systems
◎Semi-structured data formats
are appropriate only to some
extent
◎Unstructured data are
inappropriate for such systems
6. Heterogeneity
Research Challenges
◎Privacy concerns continue
to hamper users who
outsource their private
data to cloud storage
◎Encryption is utilized by
most researchers to
ensure data privacy in
the cloud
7. Privacy
Research Challenges
◎Specific laws &
regulations must be
established to preserve
sensitive information
◎Monitoring of company
staff communications is
not legal
◎Electronic monitoring is
permitted under special
circumstances
8. Legal/regulatory issues
Research Challenges
◎Design and operation of a
management system to
assure that data delivers
value and is not a cost
◎Who can do what to the
organization's data and how.
◎ Ensuring standards are set
and met
◎ A strategic & high level view
across the organization
9. Governance
Open research issues
◎Heterogeneous nature of
data
◎Data gathered from
different sources in
unstructured format
◎Hadoop and MapReduce
simplify the distributed
processing of unstructured
data formats
1. Data Staging
Open research issues
◎Provide capacity to
address massive
amount of data
◎Optimization of existing
file systems
◎Store data in a manner
that allows them to be
retrieved and migrated
easily
2. Distributed storage systems
Open research issues
◎Should obtain
information from large
amounts of data in a
limited time
◎Need better algorithms
◎Data sources may
contain different
formats which makes
interrogation for
analysis a complex task
3. Data Analysis
Open research issues
◎Need policies that cover
all user privacy
◎Utilizing strong
cryptography to
encapsulate sensitive
data
◎Need algorithms for
secure key
management and
exchange
4. Data Security
Future of Cloud Computing & Big Data
◎Stream computing
◎Dramatically improved forecasting and predictive
analysis across all scientific disciplines
◎The rise of the Social Graph
– Battle lines are drawn
◎ Individually tailored and personalized solutions,
services and experiences
– Medical diagnosis and treatment
– Lifestyle management
– Targeted marketing and advertising
Limitation of Cloud Computing & Big Data
◎Querying encrypted data is time consuming
◎Difficult to handle such a variety of data
◎Normally there is only one destination from which to
secure data
◎Less concerns with the safety and privacy of
important data stored remotely
◎Unable to access data without internet
Conclusion
◎The size of data at present is huge and continues to
increase every day
◎Presented a review of the rise of big data in cloud
computing
◎Reviewed some of the challenges in big data
processing
◎The key issues in big data in clouds were highlighted
◎Researchers should collaborate to ensure the long-
term success of data management in a cloud
computing environment
Thanks!
Any questions?
50

More Related Content

What's hot

Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
Md. Salman Ahmed
 
Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)
Yaman Hajja, Ph.D.
 
Big data
Big dataBig data
Cloud computing
Cloud computingCloud computing
Cloud computing
Shiva Prasad
 
Cloud Computing and Big Data
Cloud Computing and Big DataCloud Computing and Big Data
Cloud Computing and Big Data
Robert Keahey
 
Presentation About Big Data (DBMS)
Presentation About Big Data (DBMS)Presentation About Big Data (DBMS)
Presentation About Big Data (DBMS)
SiamAhmed16
 
What is big data?
What is big data?What is big data?
What is big data?
David Wellman
 
Big Data
Big DataBig Data
Big Data Applications | Big Data Analytics Use-Cases | Big Data Tutorial for ...
Big Data Applications | Big Data Analytics Use-Cases | Big Data Tutorial for ...Big Data Applications | Big Data Analytics Use-Cases | Big Data Tutorial for ...
Big Data Applications | Big Data Analytics Use-Cases | Big Data Tutorial for ...
Edureka!
 
Digital Transformation and Data Science
Digital Transformation and Data ScienceDigital Transformation and Data Science
Digital Transformation and Data Science
Matthew W. Bowers
 
Big data storage
Big data storageBig data storage
Big data storage
Vikram Nandini
 
Big data ppt
Big data pptBig data ppt
Big data ppt
Deepika ParthaSarathy
 
Cloud migration slides
Cloud migration slidesCloud migration slides
Cloud migration slides
Erika Barron
 
Big Data
Big DataBig Data
Big Data
Seminar Links
 
Big data and Hadoop
Big data and HadoopBig data and Hadoop
Big data and Hadoop
Rahul Agarwal
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
Dataflair Web Services Pvt Ltd
 
Big_data_ppt
Big_data_ppt Big_data_ppt
Big_data_ppt
Sadhana Singh
 
Big Data
Big DataBig Data
Big Data
Vinayak Kamath
 
Tips & tricks to drive effective Master Data Management & ERP harmonization
Tips & tricks to drive effective Master Data Management & ERP harmonizationTips & tricks to drive effective Master Data Management & ERP harmonization
Tips & tricks to drive effective Master Data Management & ERP harmonization
Verdantis
 

What's hot (20)

Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
 
Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)
 
Big data
Big dataBig data
Big data
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Cloud Computing and Big Data
Cloud Computing and Big DataCloud Computing and Big Data
Cloud Computing and Big Data
 
Presentation About Big Data (DBMS)
Presentation About Big Data (DBMS)Presentation About Big Data (DBMS)
Presentation About Big Data (DBMS)
 
What is big data?
What is big data?What is big data?
What is big data?
 
Big Data
Big DataBig Data
Big Data
 
Big Data Applications | Big Data Analytics Use-Cases | Big Data Tutorial for ...
Big Data Applications | Big Data Analytics Use-Cases | Big Data Tutorial for ...Big Data Applications | Big Data Analytics Use-Cases | Big Data Tutorial for ...
Big Data Applications | Big Data Analytics Use-Cases | Big Data Tutorial for ...
 
Digital Transformation and Data Science
Digital Transformation and Data ScienceDigital Transformation and Data Science
Digital Transformation and Data Science
 
Big data storage
Big data storageBig data storage
Big data storage
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Cloud migration slides
Cloud migration slidesCloud migration slides
Cloud migration slides
 
Big Data
Big DataBig Data
Big Data
 
Big data and Hadoop
Big data and HadoopBig data and Hadoop
Big data and Hadoop
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
 
Big_data_ppt
Big_data_ppt Big_data_ppt
Big_data_ppt
 
Big Data
Big DataBig Data
Big Data
 
Tips & tricks to drive effective Master Data Management & ERP harmonization
Tips & tricks to drive effective Master Data Management & ERP harmonizationTips & tricks to drive effective Master Data Management & ERP harmonization
Tips & tricks to drive effective Master Data Management & ERP harmonization
 

Similar to The rise of “Big Data” on cloud computing

B1803031217
B1803031217B1803031217
B1803031217
IOSR Journals
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Denodo
 
Fundamentals of Big Data
Fundamentals of Big DataFundamentals of Big Data
Fundamentals of Big Data
The Wisdom Daily
 
The rise of big data on cloud computing
The rise of big data on cloud computing The rise of big data on cloud computing
The rise of big data on cloud computing
Muhammad Maaz Irfan
 
The elephantintheroom bigdataanalyticsinthecloud
The elephantintheroom bigdataanalyticsinthecloudThe elephantintheroom bigdataanalyticsinthecloud
The elephantintheroom bigdataanalyticsinthecloudKhazret Sapenov
 
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
Jeff Hung
 
BIG DATA SECURITY AND PRIVACY ISSUES IN THE CLOUD
BIG DATA SECURITY AND PRIVACY ISSUES IN THE CLOUD BIG DATA SECURITY AND PRIVACY ISSUES IN THE CLOUD
BIG DATA SECURITY AND PRIVACY ISSUES IN THE CLOUD
IJNSA Journal
 
Big data security and privacy issues in the
Big data security and privacy issues in theBig data security and privacy issues in the
Big data security and privacy issues in the
IJNSA Journal
 
Geo-distributed Analytics with NetApp StorageGRID and Alluxio
Geo-distributed Analytics with NetApp StorageGRID and AlluxioGeo-distributed Analytics with NetApp StorageGRID and Alluxio
Geo-distributed Analytics with NetApp StorageGRID and Alluxio
Alluxio, Inc.
 
A scalabl e and cost effective framework for privacy preservation over big d...
A  scalabl e and cost effective framework for privacy preservation over big d...A  scalabl e and cost effective framework for privacy preservation over big d...
A scalabl e and cost effective framework for privacy preservation over big d...amna alhabib
 
Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017
Dr. Anita Goel
 
E018142329
E018142329E018142329
E018142329
IOSR Journals
 
Introduction to Cloud computing and Big Data-Hadoop
Introduction to Cloud computing and  Big Data-HadoopIntroduction to Cloud computing and  Big Data-Hadoop
Introduction to Cloud computing and Big Data-Hadoop
Nagarjuna D.N
 
A Review Paper on Big Data and Hadoop for Data Science
A Review Paper on Big Data and Hadoop for Data ScienceA Review Paper on Big Data and Hadoop for Data Science
A Review Paper on Big Data and Hadoop for Data Science
ijtsrd
 
Big Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyBig Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyRohit Dubey
 
Big data analytics, survey r.nabati
Big data analytics, survey r.nabatiBig data analytics, survey r.nabati
Big data analytics, survey r.nabati
nabati
 
Distributed Framework for Data Mining As a Service on Private Cloud
Distributed Framework for Data Mining As a Service on Private CloudDistributed Framework for Data Mining As a Service on Private Cloud
Distributed Framework for Data Mining As a Service on Private Cloud
IJERA Editor
 
Long Live Posix - HPC Storage and the HPC Datacenter
Long Live Posix - HPC Storage and the HPC DatacenterLong Live Posix - HPC Storage and the HPC Datacenter
Long Live Posix - HPC Storage and the HPC Datacenter
inside-BigData.com
 

Similar to The rise of “Big Data” on cloud computing (20)

B1803031217
B1803031217B1803031217
B1803031217
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
 
Fundamentals of Big Data
Fundamentals of Big DataFundamentals of Big Data
Fundamentals of Big Data
 
The rise of big data on cloud computing
The rise of big data on cloud computing The rise of big data on cloud computing
The rise of big data on cloud computing
 
The elephantintheroom bigdataanalyticsinthecloud
The elephantintheroom bigdataanalyticsinthecloudThe elephantintheroom bigdataanalyticsinthecloud
The elephantintheroom bigdataanalyticsinthecloud
 
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
[DataCon.TW 2017] Data Lake: centralize in on-prem vs. decentralize on cloud
 
BIG DATA SECURITY AND PRIVACY ISSUES IN THE CLOUD
BIG DATA SECURITY AND PRIVACY ISSUES IN THE CLOUD BIG DATA SECURITY AND PRIVACY ISSUES IN THE CLOUD
BIG DATA SECURITY AND PRIVACY ISSUES IN THE CLOUD
 
Big data security and privacy issues in the
Big data security and privacy issues in theBig data security and privacy issues in the
Big data security and privacy issues in the
 
Geo-distributed Analytics with NetApp StorageGRID and Alluxio
Geo-distributed Analytics with NetApp StorageGRID and AlluxioGeo-distributed Analytics with NetApp StorageGRID and Alluxio
Geo-distributed Analytics with NetApp StorageGRID and Alluxio
 
A scalabl e and cost effective framework for privacy preservation over big d...
A  scalabl e and cost effective framework for privacy preservation over big d...A  scalabl e and cost effective framework for privacy preservation over big d...
A scalabl e and cost effective framework for privacy preservation over big d...
 
Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017Big data and cloud computing 9 sep-2017
Big data and cloud computing 9 sep-2017
 
E018142329
E018142329E018142329
E018142329
 
Big Data & Hadoop
Big Data & HadoopBig Data & Hadoop
Big Data & Hadoop
 
Introduction to Cloud computing and Big Data-Hadoop
Introduction to Cloud computing and  Big Data-HadoopIntroduction to Cloud computing and  Big Data-Hadoop
Introduction to Cloud computing and Big Data-Hadoop
 
A Review Paper on Big Data and Hadoop for Data Science
A Review Paper on Big Data and Hadoop for Data ScienceA Review Paper on Big Data and Hadoop for Data Science
A Review Paper on Big Data and Hadoop for Data Science
 
Big data business case
Big data   business caseBig data   business case
Big data business case
 
Big Data PPT by Rohit Dubey
Big Data PPT by Rohit DubeyBig Data PPT by Rohit Dubey
Big Data PPT by Rohit Dubey
 
Big data analytics, survey r.nabati
Big data analytics, survey r.nabatiBig data analytics, survey r.nabati
Big data analytics, survey r.nabati
 
Distributed Framework for Data Mining As a Service on Private Cloud
Distributed Framework for Data Mining As a Service on Private CloudDistributed Framework for Data Mining As a Service on Private Cloud
Distributed Framework for Data Mining As a Service on Private Cloud
 
Long Live Posix - HPC Storage and the HPC Datacenter
Long Live Posix - HPC Storage and the HPC DatacenterLong Live Posix - HPC Storage and the HPC Datacenter
Long Live Posix - HPC Storage and the HPC Datacenter
 

More from Minhazul Arefin

Controlling Home Appliances adopting Chatbot using Machine Learning Approach
Controlling Home Appliances adopting Chatbot using Machine Learning ApproachControlling Home Appliances adopting Chatbot using Machine Learning Approach
Controlling Home Appliances adopting Chatbot using Machine Learning Approach
Minhazul Arefin
 
Object Detection on Dental X-ray Images using R-CNN
Object Detection on Dental X-ray Images using R-CNNObject Detection on Dental X-ray Images using R-CNN
Object Detection on Dental X-ray Images using R-CNN
Minhazul Arefin
 
Natural Language Query to SQL conversion using Machine Learning Approach
Natural Language Query to SQL conversion using Machine Learning ApproachNatural Language Query to SQL conversion using Machine Learning Approach
Natural Language Query to SQL conversion using Machine Learning Approach
Minhazul Arefin
 
Efficient estimation of word representations in vector space (2013)
Efficient estimation of word representations in vector space (2013)Efficient estimation of word representations in vector space (2013)
Efficient estimation of word representations in vector space (2013)
Minhazul Arefin
 
Semantic scaffolds for pseudocode to-code generation (2020)
Semantic scaffolds for pseudocode to-code generation (2020)Semantic scaffolds for pseudocode to-code generation (2020)
Semantic scaffolds for pseudocode to-code generation (2020)
Minhazul Arefin
 
Recurrent neural networks (rnn) and long short term memory networks (lstm)
Recurrent neural networks (rnn) and long short term memory networks (lstm)Recurrent neural networks (rnn) and long short term memory networks (lstm)
Recurrent neural networks (rnn) and long short term memory networks (lstm)
Minhazul Arefin
 
SPoC: search-based pseudocode to code
SPoC: search-based pseudocode to codeSPoC: search-based pseudocode to code
SPoC: search-based pseudocode to code
Minhazul Arefin
 

More from Minhazul Arefin (7)

Controlling Home Appliances adopting Chatbot using Machine Learning Approach
Controlling Home Appliances adopting Chatbot using Machine Learning ApproachControlling Home Appliances adopting Chatbot using Machine Learning Approach
Controlling Home Appliances adopting Chatbot using Machine Learning Approach
 
Object Detection on Dental X-ray Images using R-CNN
Object Detection on Dental X-ray Images using R-CNNObject Detection on Dental X-ray Images using R-CNN
Object Detection on Dental X-ray Images using R-CNN
 
Natural Language Query to SQL conversion using Machine Learning Approach
Natural Language Query to SQL conversion using Machine Learning ApproachNatural Language Query to SQL conversion using Machine Learning Approach
Natural Language Query to SQL conversion using Machine Learning Approach
 
Efficient estimation of word representations in vector space (2013)
Efficient estimation of word representations in vector space (2013)Efficient estimation of word representations in vector space (2013)
Efficient estimation of word representations in vector space (2013)
 
Semantic scaffolds for pseudocode to-code generation (2020)
Semantic scaffolds for pseudocode to-code generation (2020)Semantic scaffolds for pseudocode to-code generation (2020)
Semantic scaffolds for pseudocode to-code generation (2020)
 
Recurrent neural networks (rnn) and long short term memory networks (lstm)
Recurrent neural networks (rnn) and long short term memory networks (lstm)Recurrent neural networks (rnn) and long short term memory networks (lstm)
Recurrent neural networks (rnn) and long short term memory networks (lstm)
 
SPoC: search-based pseudocode to code
SPoC: search-based pseudocode to codeSPoC: search-based pseudocode to code
SPoC: search-based pseudocode to code
 

Recently uploaded

DfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributionsDfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributions
gestioneergodomus
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)
MdTanvirMahtab2
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
camseq
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
Osamah Alsalih
 
Building Electrical System Design & Installation
Building Electrical System Design & InstallationBuilding Electrical System Design & Installation
Building Electrical System Design & Installation
symbo111
 
PPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testingPPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testing
anoopmanoharan2
 
The rise of “Big Data” on cloud computing

  • 1. The rise of “Big Data” on cloud computing: Review and open research issues • Ibrahim Abaker Targio Hashem • Ibrar Yaqoob • Nor Badrul Anuar • Salimah Mokhtar • Abdullah Gani • Samee Ullah Khan
  • 2. Presented By: Kazi Mojammel Hossen (ID: B130305001), Minhazul Arefin (ID: B130305003)
  • 3. Outline ◎Introduction ◎Definition & Characteristics of Big Data ◎Cloud Computing ◎Relationship between cloud computing & big data ◎Case studies ◎Big data storage system ◎Hadoop background ◎Research challenges ◎Open research issues ◎Conclusion
  • 4. Introduction ◎The continuous increase in the volume and detail of data captured by organizations has produced an overwhelming flow of data in either structured or unstructured format. ◎Virtualization is a process of resource sharing and isolation of underlying hardware to increase computer resource utilization, efficiency, and scalability. ◎The goal of this study is to conduct a comprehensive investigation of the status of big data in cloud computing environments.
  • 5. What is Big Data? Big data is a term used to refer to the increase in the volume of data that are difficult to store, process, and analyze through traditional database technologies.
  • 6. Characteristics of big data ◎Big data are characterized by three aspects: i. data are numerous ii. data cannot be categorized into regular relational databases iii. data are generated, captured, and processed rapidly.
  • 12. Classification of Big Data: 1. Data sources ◎Web & Social Media ◎Machine ◎Sensing ◎Transaction ◎IoT
  • 13. Classification of Big Data: 2. Content Format ◎Structured ○ SQL Server ○ Oracle ○ Access, Excel ◎Semi-structured ○ Text Analytics ○ Blogs ○ Social Authority ○ Video ○ Audio ◎Unstructured ○ Weather data ○ Currency Conversion ○ Demographic ○ E-Commerce
  • 14. Classification of Big Data: 3. Data Stores ◎Document-oriented ◎Column-oriented ◎Graph database ◎Key-value
  • 15. Classification of Big Data: 4. Data Staging ◎Cleaning ◎Transform ◎Normalization
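
The three staging steps named on this slide (cleaning, transformation, normalization) can be illustrated with a minimal Python sketch. The record layout (`sensor`, `reading`), the function names, and the choice of min-max scaling are assumptions made for illustration; the slides do not prescribe a particular method.

```python
def clean(records):
    """Cleaning: keep only records whose 'reading' field parses as a number."""
    kept = []
    for rec in records:
        try:
            float(rec["reading"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or malformed reading: drop the record
        kept.append(rec)
    return kept


def transform(records):
    """Transformation: standardize sensor names and cast readings to float."""
    return [
        {"sensor": rec["sensor"].strip().upper(), "reading": float(rec["reading"])}
        for rec in records
    ]


def normalize(records):
    """Normalization: min-max scale readings into the range [0, 1]."""
    if not records:
        return []
    values = [rec["reading"] for rec in records]
    lo, hi = min(values), max(values)
    span = hi - lo
    return [
        {**rec, "reading": (rec["reading"] - lo) / span if span else 0.0}
        for rec in records
    ]


# Hypothetical raw input with one unusable record (reading is None).
raw = [
    {"sensor": " a ", "reading": "10"},
    {"sensor": "b", "reading": None},
    {"sensor": "c", "reading": "30"},
    {"sensor": "d", "reading": "20"},
]
staged = normalize(transform(clean(raw)))
```

Running the steps in this order means normalization only ever sees values that have already been validated and cast, which is one common reason staging is described as a pipeline.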
  • 16. Classification of Big Data: 5. Data Processing ◎Batch ○ Uses MapReduce-based systems ◎Real Time ○ Uses scalable streaming systems
  • 17. What is Cloud Computing? Cloud computing is a fast-growing technology that has established itself in the next generation of the IT industry and business.
  • 18. Cloud Service Model ◎The cloud service model typically consists of PaaS, SaaS, and IaaS
  • 19. Relationship between Cloud Computing & Big Data
  • 20. Organization Case Studies from Vendors: 1. SwiftKey ◎A language technology that aids touchscreen typing by providing personalized predictions and corrections ◎Collects and analyzes terabytes of data to create language models ◎Used Apache Hadoop running on Amazon Simple Storage Service
  • 21. Organization Case Studies from Vendors: 2. 343 Industries ◎Maker of Halo, a science fiction media franchise ◎The developers analyzed data to obtain insights into player preferences and online tournaments ◎Used Windows Azure HDInsight Service, which is based on the Apache Hadoop big data framework
  • 22. Organization Case Studies from Vendors: 3. redBus ◎Online travel agency ◎Unified tens of thousands of bus schedules into a single booking operation ◎Implemented Google BigQuery to analyze large datasets in the Google data processing infrastructure
  • 23. Organization Case Studies from Vendors: 4. Nokia ◎A mobile communication company ◎Gathers and analyzes large amounts of data from mobile phones ◎Used Hadoop Distributed File System (HDFS)
  • 24. Organization Case Studies from Vendors: 5. Alacer ◎An online retailer ◎Experienced revenue leakage because of unreliable real-time notifications of service problems ◎Used big data algorithms to create a cloud monitoring system that delivers notifications
  • 25. Case Studies from Scholarly/Academic Sources
      Situation/context | Objective | Approach | Result
      Massively parallel DNA sequencing generates staggering amounts of data | Provide accurate and reproducible genomic results | Develop the Mercury analysis pipeline and deploy it in the Amazon Web Services cloud via the DNAnexus platform | Established a powerful combination of a robust, fully validated software pipeline and a scalable computational resource
      Conducting analyses on large social networks such as Twitter | Use cloud services as a possible solution for the analysis of large amounts of data | Use the PageRank algorithm on the Twitter user base to obtain user rankings | Implemented a relatively cheap solution for data acquisition and analysis by using Amazon cloud infrastructure
      Studying the complex molecular interactions that regulate biological systems | Develop a Hadoop-based cloud computing application that processes sequences of microscopic images | Use the Hadoop cloud computing framework | Allows users to submit data processing jobs in the cloud
      Applications running on cloud computing are likely to fail | Design a failure scenario | Create a series of failure scenarios on an Amazon cloud computing platform | Helps to identify vulnerabilities in Hadoop applications running in the cloud
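
The second case study above ranks Twitter users with PageRank. As a rough illustration of the idea only, here is a small power-iteration sketch in plain Python. The follower graph, the user names, the damping factor of 0.85, and the iteration count are all assumptions for illustration; the slides give no details of the implementation used in that study.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Compute PageRank by power iteration.

    graph maps each user to the list of users they follow. Every user that
    appears as a target must also appear as a key (possibly with an empty list).
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                # Pass this node's rank on in equal parts to the users it follows.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node (follows nobody): spread its rank over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical follower graph: an edge u -> v means "u follows v".
follows = {
    "alice": ["carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": ["carol", "alice"],
}
ranks = pagerank(follows)
top_user = max(ranks, key=ranks.get)
```

On a graph the size of a real social network, the per-node loop would typically be distributed across a cluster, which is where the cloud infrastructure in the case study comes in.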
  • 26. “There were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days.” - Eric Schmidt, Executive Chairman, Google
  • 27. Big data storage system ◎Traditional storage systems store data through structured RDBMS ◎A storage architecture needs to achieve availability and reliability ◎It must store and manage large datasets ◎The organizational systems of data storage can be divided into three parts: ○ Disk array ○ Connection and network subsystems ○ Storage management software
  • 28. Comparison of Storage Media
      Type | Specific use | Advantages | Disadvantages
      Hard drives | Store data up to four terabytes | Density; cost per bit of storage; speedy start-up | Require special cooling; high read latency; produce more heat
      Solid-state memory | Store data up to two terabytes | Fast access to data; fast movement of huge data; fast start-up time | More expensive than hard drives
      Object storage | Store data as variable-size objects rather than fixed-size blocks | Easy to find information; unique identifier to find data objects; ensures security | Complexity in tracking indices
      Optical storage | Store data at different angles throughout the storage medium | Least expensive; removable storage medium | Complex; ability to produce multiple optical disks in a single unit is yet to be proven
      Cloud storage | Serve as a provisioning and storage model | Useful for small organizations that do not have sufficient storage capacity; can store large amounts of data | Less security
  • 29. Hadoop ◎A free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment ◎Implements the MapReduce computation model introduced by Google
  • 30. HDFS (Hadoop Distributed File System) ◎A scalable distributed file system for applications dealing with large data sets ○ Distributed: runs in a cluster ○ Scalable: 10K nodes, 100K files, 10 PB storage ◎Storage space is seamless for the whole cluster ◎Files are broken into blocks ◎Typical block size: 128 MB ◎Replication: each block is copied to multiple data nodes
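
To make the block and replication bullets concrete, the following Python sketch splits a file into 128 MB blocks and assigns each block to several data nodes. It is a toy model, not HDFS code: the function names, the node names, the replication factor of 3, and the round-robin placement are assumptions for illustration (HDFS itself uses a rack-aware placement policy).

```python
BLOCK_SIZE_MB = 128  # typical block size quoted on the slide


def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the size of each block; only the last block may be smaller."""
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    return [block_size_mb] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, data_nodes, replication=3):
    """Assign each block to `replication` distinct data nodes, round-robin.

    Simplified placement for illustration only; it ignores racks, free space,
    and node health.
    """
    if replication > len(data_nodes):
        raise ValueError("replication factor exceeds number of data nodes")
    return {
        block: [data_nodes[(block + i) % len(data_nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


# A hypothetical 300 MB file on a hypothetical four-node cluster.
blocks = split_into_blocks(300)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
```

Because every block lives on more than one node, losing a single data node in this model never makes a block unavailable, which is the point of the replication bullet.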
  • 31. What is MapReduce? ◎A programming model ◎A programming framework ◎Used to develop solutions that will ○ Process large amounts of data in a parallelized fashion ○ In clusters of computing nodes ◎Originally a closed-source implementation at Google ◎Hadoop: open-source implementation of the algorithms described in the scientific papers
  • 32. MapReduce ◎The model is broken down into two phases: ○ Map: Non-overlapping sets of input data (<key, value> records) are assigned to different processes (mappers) that produce a set of intermediate <key, value> results ○ Reduce: Data from the Map phase are fed to a typically smaller number of processes (reducers) that aggregate the input results into a smaller number of <key, value> records.
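
The two phases can be imitated in a single Python process with the classic word-count example. This is a sketch of the programming model only, not of Hadoop's API: the function names and the explicit `shuffle` step (grouping intermediate pairs by key, which a real framework performs between the phases) are assumptions for illustration.

```python
from collections import defaultdict


def map_phase(split):
    """Mapper: emit an intermediate <word, 1> pair for every word in one input split."""
    for word in split.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: aggregate all values for one key into a single <key, value> record."""
    return key, sum(values)


def word_count(splits):
    """Run map, shuffle, and reduce in sequence over non-overlapping splits."""
    intermediate = [pair for split in splits for pair in map_phase(split)]
    return dict(reduce_phase(key, values) for key, values in shuffle(intermediate).items())


# Three non-overlapping input splits, each of which could go to a different mapper.
counts = word_count(["big data cloud", "cloud data", "data"])
```

In a cluster, each call to `map_phase` and each call to `reduce_phase` could run on a different node, since mappers only see their own split and reducers only see the values for their own key.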
  • 33. Research Challenges: 1. Scalability ◎Ability to handle increasing amounts of data in an appropriate manner ◎NoSQL databases store and retrieve large volumes of distributed data ◎Wang et al. proposed a new scalable data cube analysis technique called HaCube in big data clusters to overcome the challenges of large-scale data
  • 34. Research Challenges: 2. Availability ◎Refers to the resources of the system being accessible on demand by an authorized individual ◎Mobile users need data within a short amount of time ◎Services must remain operational even in the case of a security breach
  • 35. Research Challenges: 3. Data Integrity ◎Preventing improper or unauthorized change or access ◎Must ensure the correctness of user data ◎Should provide a mechanism for the user to check whether the data is maintained
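
One simple way a user could check whether outsourced data has been maintained unchanged is to keep a cryptographic digest locally and compare it against the data later returned by the cloud. The sketch below uses SHA-256 from Python's standard library; the function names and the sample payload are assumptions for illustration, and the slides do not name a specific integrity mechanism.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_digest: str) -> bool:
    """Check retrieved data against a digest the user stored before upload."""
    return hmac.compare_digest(digest(data), expected_digest)


# Before upload: the user keeps the digest in a place the cloud provider cannot modify.
original = b"quarterly-report-v1"
stored_digest = digest(original)

# After download: any change to the data produces a different digest.
unchanged_ok = verify(b"quarterly-report-v1", stored_digest)
tampered_ok = verify(b"quarterly-report-v2", stored_digest)
```

This only detects changes; it does not prevent them, and it relies on the digest itself being kept somewhere trustworthy.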
  • 36. Research Challenges: 4. Transformation ◎Transforming data into a form suitable for analysis is an obstacle in the adoption of big data ◎Owing to the variety of data formats, big data can be transformed into an analysis workflow in two ways
  • 37. Transforming big data for analysis ◎Structured data are pre-processed before being stored in relational databases to meet the constraints of schema-on-write; they can then be retrieved for analysis ◎Unstructured data must first be stored in distributed databases, such as HBase, before they are processed for analysis
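
The two paths on this slide differ in when structure is enforced: before storage for structured data, and at analysis time for unstructured data. The Python sketch below contrasts the two using in-memory lists as stand-ins for a relational table and a distributed store. The schema, the field names, and the use of JSON for the raw records are assumptions for illustration; no real database API is shown.

```python
import json

# Hypothetical schema for the structured path.
SCHEMA = {"user_id": int, "amount": float}


def write_structured(table, record):
    """Schema-on-write: validate the record against the schema before storing it."""
    if set(record) != set(SCHEMA):
        raise ValueError("record fields do not match the schema")
    for field, expected_type in SCHEMA.items():
        if not isinstance(record[field], expected_type):
            raise TypeError(f"field {field!r} must be {expected_type.__name__}")
    table.append(record)


def write_raw(store, blob):
    """Unstructured path: store the raw text as-is, with no validation."""
    store.append(blob)


def read_amounts(store):
    """Interpret raw records only at analysis time, skipping what does not fit."""
    for blob in store:
        try:
            doc = json.loads(blob)
            yield float(doc["amount"])
        except (ValueError, KeyError, TypeError):
            continue  # not JSON, not an object, or no usable 'amount' field


table, store = [], []
write_structured(table, {"user_id": 1, "amount": 9.5})
for blob in ['{"amount": 3}', "free text note", '{"other": 1}']:
    write_raw(store, blob)
amounts = list(read_amounts(store))
```

The structured path rejects bad input early at the cost of up-front pre-processing, while the raw path accepts everything and pushes the interpretation cost to each analysis job.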
  • 38. Research Challenges: 5. Data quality ◎Defined as “any difficulty encountered along one or more quality dimensions that render data completely or largely unfit for use” ◎High-quality data in the cloud is characterized by data consistency
  • 39. Research Challenges: 6. Heterogeneity ◎Variety is one of the major aspects of big data characterization ◎In a cloud environment, users can store data in structured, semi-structured, or unstructured formats ◎Structured data formats are appropriate for database systems ◎Semi-structured data formats are appropriate only to some extent ◎Unstructured data are inappropriate
  • 40. Research Challenges: 7. Privacy ◎Privacy concerns continue to hamper users who outsource their private data to cloud storage ◎Encryption is utilized by most researchers to ensure data privacy in the cloud
  • 41. Research Challenges: 8. Legal/regulatory issues ◎Specific laws and regulations must be established to preserve sensitive information ◎Monitoring of company staff communications is not legal ◎Electronic monitoring is permitted under special circumstances
  • 42. Research Challenges: 9. Governance ◎Design and operation of a management system to assure that data delivers value and is not a cost ◎Who can do what to the organization's data, and how ◎Ensuring standards are set and met ◎A strategic, high-level view across the organization
  • 43. Open research issues: 1. Data Staging ◎Heterogeneous nature of data ◎Data gathered from different sources in unstructured format ◎Hadoop and MapReduce simplify the distributed processing of unstructured data formats
  • 44. Open research issues: 2. Distributed storage systems ◎Provide capacity to address massive amounts of data ◎Optimization of existing file systems ◎Store data in a manner that allows easy retrieval and migration
  • 45. Open research issues: 3. Data Analysis ◎Should obtain information from large amounts of data in limited time ◎Need better algorithms ◎Data sources may contain different formats, which makes interrogation for analysis a complex task
  • 46. Open research issues: 4. Data Security ◎Need policies that cover all user privacy ◎Utilizing strong cryptography to encapsulate sensitive data ◎Need algorithms for secure key management and exchange
  • 47. Future of Cloud Computing & Big Data ◎Stream computing ◎Dramatically improved forecasting and predictive analysis across all scientific disciplines ◎The rise of the Social Graph – Battle lines are drawn ◎Individually tailored and personalized solutions, services and experiences – Medical diagnosis and treatment – Lifestyle management – Targeted marketing and advertising
  • 48. Limitations of Cloud Computing & Big Data ◎Querying encrypted data is time consuming ◎Difficult to handle such a variety of data ◎Normally there is only one destination from which to secure data ◎Less concern for the safety and privacy of important data stored remotely ◎Unable to access data without internet
  • 49. Conclusion ◎The size of data at present is huge and continues to increase every day ◎Presented a review on the rise of big data in cloud computing ◎Reviewed some of the challenges in big data processing ◎The key issues in big data in clouds were highlighted ◎Researchers should collaborate to ensure the long-term success of data management in a cloud computing environment