6. 6
What is Big Data?
Big data is data that exceeds the processing capacity of
conventional database systems.
The data is too big, moves too fast,
or doesn’t fit the structures of your database architectures.
To gain value from this data,
you must choose an alternative way to process it.
Big Data Now: O'Reilly Media
13. 13
Big Data Landscape
Source: Big Data in the Enterprise. When to Use What?
14. 14
Big Data Solution
[Architecture diagram. Data sources: sensors, devices, bots, crawlers, ERP, CRM, LOB apps (unstructured and structured data). Storage: Hadoop on cloud and Hadoop on private server (petabytes of unstructured data), with connectors to a Parallel Data Warehouse (hundreds of TB of structured data). Access: BI platform and data marketplace, feeding familiar end-user tools (spreadsheet, predictive analytics, embedded BI).]
15. 15
“The market for big data
will reach $16.1 billion in 2014,
growing 6 times faster than the overall IT market.”
IDC
17. 17
What is Hadoop?
A scalable, fault-tolerant distributed system
for data storage and processing
Written entirely in Java
Open source, distributed under the Apache License
18. 18
Hadoop is growing
Hadoop will continue to displace other IT spending,
disrupting enterprise data warehouse and enterprise
storage.
IDC predicts that RDBMSs will coexist with the newer Hadoop
ecosystem and NoSQL databases for the foreseeable future.
Hadoop software revenue was $209.2 million, or 11 percent
of the total big data software market, in 2012.
The comprehensive Hadoop market (combined hardware,
software, and services) accounted for 23 percent of the big data
market in 2012 and was projected to grow to 31 percent
in 2013. [IDC]
20. 20
Big Data Technologies Adopted or To Be
Adopted in Next 24 Months
Source: 2013 Big Data Opportunities Survey, Unisphere Research May 2013
21. 21
SQL development for Hadoop
Hadoop uses MapReduce to process Big Data.
SQL development for Hadoop enables business
analysts to use their skills and SQL tools of choice
for big data projects.
Developers can now choose among:
– Hive
– Impala
– Jaql
– Hadapt
Source: www.eweek.com
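To make the MapReduce model mentioned above concrete, here is a minimal single-process sketch of a word count in plain Python. It does not use Hadoop; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made for illustration. On a real cluster the framework runs the map and reduce steps in parallel across nodes and performs the shuffle itself.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key (done by the framework on a cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample input, not taken from the slides.
sample = ["big data big value", "data moves fast"]
counts = word_count(sample)
print(counts)
```

With a SQL-on-Hadoop tool, the same aggregation would typically be written as a single GROUP BY query rather than as hand-coded map and reduce functions, which is the appeal for business analysts.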
26. 26
Hadoop clone wars end
Consolidation is expected among big data
startups.
Some companies will start to close their
doors, while others will probably get acquired.
Cloudera competes against tier-one
megavendors such as IBM and Oracle.
29. 29
Internet of Things
The Internet is expanding beyond PCs and mobile
devices into enterprise assets such as field
equipment, and consumer items such as cars and
televisions.
Over 50% of Internet connections are things.
Enterprises should not limit themselves to thinking
that only the Internet of Things (i.e., assets and
machines) has the potential to leverage the four
“internets” (people, things, information and places).
31. 31
Prediction #5
More data warehouses will deploy
enterprise data hubs
32. 32
Hadoop roles in data warehouses
Data hubs offload ETL processing and data from
enterprise data warehouses to Hadoop.
Hadoop acts as a central enterprise hub.
The hub is 10 times cheaper and can perform more
analytics for additional processing or new apps.
Source: www.eweek.com
35. 35
Prediction #6
Business intelligence (BI) will be
embedded on smart systems
36. 36
Embedded BI
Embedded data analytics and “business
intelligence” begin to emerge.
Sales forces may manage their customer
relationships through embedded, smart apps
with built-in analytics to make decisions
Progressively, smart software in mobile and
enterprise systems will make decisions and
make data scientists redundant.
Source: http://www.experfy.com
37. 37
Evolution of Embedded BI
Source: http://www.b-eye-network.com/
41. 41
NoSQL
NoSQL means “Not only SQL”, rather than
“the absence of SQL”
There are many ways to look at data other
than the structured and ordered approach that
SQL requires.
The industry is beginning to settle on a few
major players.
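As a rough illustration of looking at data without a fixed relational schema, the sketch below models document-style and key-value access using plain Python dictionaries. The records and field names are hypothetical, and no particular NoSQL product's API is implied.

```python
# Document-style records: each one carries its own fields, with no shared schema.
documents = [
    {"id": 1, "type": "sensor", "temperature_c": 21.5},
    {"id": 2, "type": "tweet", "text": "big data", "tags": ["hadoop"]},
    {"id": 3, "type": "sensor", "temperature_c": 19.0, "location": "plant-7"},
]

# Key-value view: look a record up directly by its key.
by_id = {doc["id"]: doc for doc in documents}

# Query without a schema: select only the records that happen to have a field.
sensor_readings = [d["temperature_c"] for d in documents if "temperature_c" in d]

print(by_id[2]["tags"])
print(sensor_readings)
```

In a relational table, every row would need the same columns, so the tweet and the sensor readings would either live in separate tables or leave many columns empty.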
45. 45
Limitations of Hadoop 1.x
No horizontal scalability of the NameNode
Does not support NameNode high availability
Not possible to run non-MapReduce Big Data
applications on HDFS
Runs only as batch jobs
Does not support multi-tenancy
48. 48
Analytics Software as a Service
Data as a Service
(Database, NoSQL, Hadoop, In-Memory)
Storage as a Service
Compute as a Service
49. 49
Big Data as a Service
IDC estimated the Hadoop-as-a-service market
at about $130 million in 2012, projected
to grow by 145 percent to $318 million in 2013.
More cloud providers will offer Hadoop as a service:
– Amazon AWS
– Microsoft Azure HDInsight
– IBM Bluemix
– Qubole
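The growth figures quoted above can be cross-checked with simple arithmetic. The sketch below uses only the numbers on the slide ($130 million, 145 percent, $318 million); the variable names are illustrative.

```python
base_2012 = 130.0       # USD millions, 2012 estimate quoted on the slide
growth_pct = 145.0      # projected growth in percent
quoted_2013 = 318.0     # USD millions, 2013 projection quoted on the slide

# Growing BY 145 percent means multiplying by (1 + 1.45), not by 1.45.
projected_2013 = base_2012 * (1 + growth_pct / 100)

# Growth rate implied by the two quoted dollar figures.
implied_growth_pct = (quoted_2013 / base_2012 - 1) * 100

print(round(projected_2013, 1))
print(round(implied_growth_pct, 1))
```

The computed projection is about $318.5 million, which agrees with the quoted $318 million once "about $130 million" is allowed for rounding.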
54. 54
External Data
The explosive growth of social media, mobile devices,
and machine sensors is generating a wealth of bits.
Some of this data is generated within an organization,
but a larger percentage comes from the outside
In 2014, businesses will find more ways to harness this
mix of structured and unstructured data
55. 55
Hadoop & BI
[Diagram labels: Hadoop, Fast Database, BI Tool; data sources: Internal, External]
Source: Big Data and BI Best Practices: YellowFin