Forecast of 
Big Data Trends 
Assoc. Prof. Dr. Thanachart Numnonda 
Executive Director 
IMC Institute 
3 September 2014
2 
Big Data transforms Business
3 
Data created every minute 
Source http://mashable.com/2012/06/22/data-created-every-minute/
4 
The Rise of Big Data
5 
Data Growth
6 
What is Big Data? 
Big data is data that exceeds the processing capacity of 
conventional database systems. 
The data is too big, moves too fast, 
or doesn’t fit the structures of your database architectures. 
To gain value from this data, 
you must choose an alternative way to process it. 
Big Data Now: O'Reilly Media
7 
Three Characteristics of Big Data 
Source Introduction to Big Data: Dr. Putchong Uthayopas
8 
Big Data Supply Chain
9 
Big Data Application Area 
Source: BIG DATA Case Study, Anju Singh
10 
Big Data Use Cases
11 
Hospitality Industry Captures 
Source McKinsey & Company
12 
Next Product to Buy 
Source McKinsey & Company
13 
Big Data Landscape 
Source: Big Data in the Enterprise. When to Use What?
14 
Big Data Solution 
Spreadsheet, Predictive Analytics, Embedded BI 
Petabytes of Data (Unstructured) 
Sensors, Devices, Bots, Crawlers, ERP, CRM, LOB Apps 
Unstructured and Structured Data 
Parallel Data Warehouse 
Hadoop on Cloud 
Hadoop on Private Server 
Connectors 
BI Platform 
Familiar End User Tools 
Data Marketplace 
Data Market 
Hundreds of TB of Data (Structured)
15 
“ The market for big data 
will reach $16.1 billion in 2014, 
growing 6 times faster than the overall IT market. ” 
IDC
16 
Prediction #1 
Hadoop will gain in stature
17 
What is Hadoop? 
A scalable fault-tolerant distributed system 
for data storage and processing 
Written primarily in Java 
Open source, distributed under the Apache License
18 
Hadoop is growing 
Hadoop will continue to displace other IT spending, 
disrupting enterprise data warehouse and enterprise 
storage. 
IDC predicts that RDBMSs will coexist with the newer 
Hadoop ecosystem and NoSQL databases for the foreseeable 
future. 
Hadoop software revenue was $209.2 million, or 11 percent 
of the total big data software market, in 2012. 
The comprehensive Hadoop market (combined hardware, 
software, and services) accounted for 23 percent of the big 
data market in 2012 and was projected to grow to 31 percent 
in 2013. [IDC]
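The revenue figure and its share imply a size for the whole big data software market in 2012. The sketch below works that out. It assumes the 11 percent on the slide is rounded to the nearest whole percent, so it also reports a range; the function name is illustrative only.

```python
def implied_total(part, share_percent):
    """Total market size implied by a part and its percentage share."""
    return part / (share_percent / 100)

# Figure from the slide, in millions of US dollars.
hadoop_software_2012 = 209.2

total = implied_total(hadoop_software_2012, 11.0)
# Assumption: "11 percent" is rounded, so the true share lies in 10.5..11.5.
low = implied_total(hadoop_software_2012, 11.5)
high = implied_total(hadoop_software_2012, 10.5)

print(f"Implied total: about {total:.0f} million USD")
print(f"Plausible range: {low:.0f} to {high:.0f} million USD")
```

Under that assumption the total big data software market in 2012 comes out at roughly $1.9 billion.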
19 
Prediction #2 
SQL holds biggest promise 
for Big Data
20 
Big Data Technologies Adopted or To Be 
Adopted in Next 24 Months 
Source: 2013 Big Data Opportunities Survey, Unisphere Research May 2013
21 
SQL development for Hadoop 
Hadoop uses MapReduce to process Big Data. 
SQL development for Hadoop enables business 
analysts to use their skills and SQL tools of choice 
for big data projects. 
Developers can now choose 
– Hive 
– Impala 
– Jaql 
– Hadapt 
Source: www.eweek.com
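To show why SQL on Hadoop appeals to analysts, here is a minimal sketch of the MapReduce pattern in plain Python, with the comparable SQL in a comment. It runs in a single process and does not use the Hadoop API; the function names and the `words` table are hypothetical.

```python
from collections import defaultdict
from itertools import chain

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Roughly the same question as one SQL statement (hypothetical table `words`):
#   SELECT word, COUNT(*) FROM words GROUP BY word;
counts = word_count(["big data", "Big SQL on Hadoop", "data hub"])
print(counts)
```

The MapReduce version needs three hand-written functions, while the SQL version is one statement. That gap is what tools such as Hive and Impala aim to close.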
22 
Prediction #3 
Big Data vendor 
consolidation begins
23 
Worldwide Big Data Revenue 2013 
Source: Wikibon.org
24 
Hadoop Distribution 
Amazon 
Cloudera 
MapR 
Microsoft Windows Azure 
IBM InfoSphere BigInsights 
EMC Greenplum HD Hadoop distribution 
Hortonworks
25
26 
Hadoop clone wars end 
Consolidation is expected among big data 
startups. 
Some companies will start to close their 
doors, while others will probably get acquired. 
Cloudera competes against tier-one 
megavendors such as IBM and Oracle.
27 
Prediction #4 
Internet of Things grows
28
29 
Internet of things 
The Internet is expanding beyond PCs and mobile 
devices into enterprise assets such as field 
equipment, and consumer items such as cars and 
televisions. 
Over 50% of Internet connections are things. 
Enterprises should not limit themselves to thinking 
that only the Internet of Things (i.e., assets and 
machines) has the potential to leverage the four 
"internets" (people, things, information and places).
30
31 
Prediction #5 
More data warehouses will deploy 
enterprise data hubs
32 
Hadoop roles in data warehouses 
Data hubs offload ETL processing and data from 
enterprise data warehouses to Hadoop. 
Hadoop acts as a central enterprise data hub. 
It is cited as 10 times cheaper and can perform more 
analytics for additional processing or new apps. 
Source: www.eweek.com
33 
Data Warehouse Offload
34 
Enterprise Data Hub
35 
Prediction #6 
Business intelligence (BI) will be 
embedded on smart systems
36 
Embedded BI 
Embedded data analytics and “business 
intelligence” are beginning to emerge. 
Sales forces may manage their customer 
relationships through embedded, smart apps 
with built-in analytics to make decisions. 
Progressively, smart software in mobile and 
enterprise systems will make decisions and 
make data scientists redundant. 
Source: http://www.experfy.com
37 
Evolution of Embedded BI 
Source: http://www.b-eye-network.com/
38 
Source: Jaspersoft
39 
Prediction #7 
Less relational SQL, 
more NoSQL
40 
Data Management Trends 
Source KMS Technology
41 
NoSQL 
NoSQL means “Not only SQL”, rather than 
“the absence of SQL”. 
There are many ways to look at data other 
than the structured and ordered approach that 
SQL requires. 
The industry is beginning to settle on a few 
major players.
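One of the non-relational views of data is the document model, where records in the same collection need not share a fixed set of columns. The sketch below is a toy in-memory illustration, not the API of any NoSQL product; `store`, `put`, and `find` are hypothetical names.

```python
import json

# A toy "collection": document id -> JSON text. No schema is declared.
store = {}

def put(doc_id, document):
    """Save a document as JSON under its id."""
    store[doc_id] = json.dumps(document)

def find(predicate):
    """Return every stored document for which predicate(doc) is true."""
    return [doc for doc in map(json.loads, store.values()) if predicate(doc)]

# Two documents with different fields live side by side.
put("u1", {"name": "Ann", "email": "ann@example.com"})
put("u2", {"name": "Bob", "devices": ["phone", "tv"]})

with_devices = find(lambda doc: "devices" in doc)
print(with_devices)
```

In a relational table both rows would need the same columns, with NULLs or a separate join table for the missing fields. Here each document carries only the fields it has.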
42 
Popular NoSQL/New SQL Distributions
43 
Prediction #8 
Hadoop will shift to 
real-time processing
44 
Hadoop 1.0 Ecosystem 
Pig 
Hive 
MapReduce (Job Scheduling/Execution System) 
HDFS (Hadoop Distributed File System) 
ZooKeeper 
Flume 
HBase 
Source Big Data Hadoop: Danairat Thanabodithammachari
45 
Limitation of Hadoop 1.x 
No horizontal scalability of the NameNode 
No support for NameNode high availability 
Not possible to run non-MapReduce big data 
applications on HDFS 
Runs only as batch jobs 
No support for multi-tenancy
46 
Hadoop 2.0
47 
Prediction #9 
Big Data as a Service (BDaaS)
48 
Analytics Software as a Service 
Data as a Service 
(Database, NoSQL, Hadoop, In-Memory) 
Storage as a Service 
Compute as a Service
49 
Big Data as a Service 
IDC estimated the Hadoop-as-a-service market at about 
$130 million in 2012, projected to grow by 145 percent 
to $318 million in 2013. 
More cloud providers will offer Hadoop as a service: 
– Amazon AWS 
– Microsoft Azure HDInsight 
– IBM Bluemix 
– Qubole
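The growth figures above can be checked with a few lines of arithmetic. This sketch uses only the numbers on the slide; the function names are illustrative.

```python
def grow(base, percent_growth):
    """Apply a percentage increase to a base value."""
    return base * (1 + percent_growth / 100)

def implied_growth(start, end):
    """Percentage increase needed to get from start to end."""
    return (end - start) / start * 100

# Figures from the slide, in millions of US dollars.
projected_2013 = grow(130.0, 145.0)
growth_from_rounded = implied_growth(130.0, 318.0)

print(f"130 grown by 145% = {projected_2013:.1f}")
print(f"130 -> 318 implies {growth_from_rounded:.1f}% growth")
```

Growing $130 million by 145 percent gives about $318.5 million, so the quoted $318 million and 145 percent agree once rounding is allowed for.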
50
51
52
53 
Prediction #10 
External data is as important 
as internal data
54 
External Data 
The explosive growth of social media, mobile devices, 
and machine sensors is generating a wealth of bits. 
Some of this data is generated within an organization, 
but a larger percentage comes from the outside. 
In 2014, businesses will find more ways to harness this 
mix of structured and unstructured data.
55 
Hadoop & BI 
Hadoop 
Fast Database BI Tool 
Internal 
External 
Source: Big Data and BI Best Practices: YellowFin
56 
www.facebook.com/imcinstitute
57 
Thank you 
thanachart@imcinstitute.com 
www.facebook.com/imcinstitute 
www.slideshare.net/imcinstitute
