2. In the last 5 years we have generated more data than
humanity had generated in all the years before!
Since you started reading this line:
More than 150k searches have been made on Google.
More than 34k GB of data has been generated in the world.
More than 50 hours of video have been uploaded to
YouTube.
Big data is huge!
Reference : http://www.internetlivestats.com/one-second/
3. (C)2016 - 2017 Aru Learning System
Reference : Excelacom
5.
20 million requests.
750 GB of logs.
60,000 requests sent in the first hour.
30k users online.
3 days' worth of seats sold in 6 minutes.
6. In 5 years there will be 21 billion connected devices; 6.1 billion will
be mobile phones.
Big opportunities for everyone, from Fortune 1000 companies to
small-scale companies.
At the moment only 0.5% of all data is ever analyzed. Imagine
the potential.
73% of organizations have invested or are going to invest in
Hadoop by 2016.
There will never be a better time than now to learn
Hadoop!
Reference : http://www.informationweek.com/mobile/mobile-devices/gartner-21-billion-iot-devices-to-invade-by-2020/d/d-id/1323081
7. Areas Where Big Data Was Successfully Implemented - Politics
World's biggest democracy, election 2014:
1.2 billion people.
24 crore (240 million) internet users.
100 million smartphones.
Huge amounts of data to be analyzed in a short time.
SAP, Oracle and InMobi partnered with the BJP.
They used cookies and targeted advertising.
Obama used it in the 2012 U.S. elections.
Reference : http://www.firstpost.com/india/big-data-analysis-how-it-firms-like-sap-oracle-helped-modi-win-1576355.html
8. How has big data been used in politics?
http://www.deccanherald.com/content/530569/big-data-tech-flavour-tn.html
9. Big Data Implemented in Other Areas
Digitalized industries:
Health care – Siemens, 800 million users
Entertainment – Netflix
Banking sector – Credit risk simulation
Telecom – Vodafone
Mobile – Real-time location-based promotion
Reference : http://www-01.ibm.com/software/data/bigdata/use-cases.html
10. Big Data Definition
3+1 V’s by IBM:
Volume
Velocity
Variety
Veracity
Data that exceeds the available storage and processing power is called
big data.
11. Meet Hadoop
Hadoop is a framework written in Java to process
big data.
The name Hadoop comes from a toy elephant.
Developed by Doug Cutting.
13. Hadoop began while Doug Cutting was working on the Nutch
project; he later joined Yahoo.
Nutch ran into problems scaling up.
In 2003 and 2004 Google published white papers (on the Google File
System and MapReduce) describing how they handle their data problem.
Based on those ideas, Hadoop was developed as an open-source
system under the Apache license.
It has two main concepts that made it a multi-billion-dollar
industry:
• HDFS and MapReduce
Link to white paper : http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
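To make the MapReduce idea concrete, here is a minimal sketch of the classic word count, written in plain Python rather than against the real Hadoop API. The function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration. On a real cluster the map and reduce steps run in parallel on many machines and the framework does the grouping in between.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key (done by the framework in Hadoop)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce: combine all values seen for one key into a single result."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on a small in-memory input."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data is huge", "hadoop processes big data"]
    print(word_count(sample))
```

Each map call only looks at its own line and each reduce call only looks at its own word, which is what lets the work be spread across many machines.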
14. Why Hadoop?
Scale out instead of scale up.
Commodity hardware.
Unstructured data processing.
16. What is Hadoop not?
It is not going to replace the RDBMS.
It is not suitable for real-time operational workloads. Why?
Hadoop is not suitable for all problems.
Hadoop is not the best solution for many small files.
Reference : https://wiki.apache.org/hadoop/HadoopIsNot
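A rough way to see why many small files hurt: the NameNode keeps every file and every block as an object in memory. The sketch below is a back-of-the-envelope estimate only. It assumes the commonly quoted rule of thumb of about 150 bytes per object and a 128 MB block size; both are assumptions for illustration, not measured values, and namenode_bytes is a made-up helper.

```python
BYTES_PER_OBJECT = 150          # assumed rule of thumb, not a measured value
BLOCK_SIZE = 128 * 1024 ** 2    # assumed HDFS block size of 128 MB


def namenode_bytes(num_files, file_size):
    """Estimate NameNode memory for num_files files of file_size bytes each."""
    blocks_per_file = max(1, -(-file_size // BLOCK_SIZE))  # ceiling division
    objects = num_files * (1 + blocks_per_file)            # one file object plus its blocks
    return objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    # Roughly the same 1 TB of data, stored in two different ways.
    small = namenode_bytes(10_000_000, 100 * 1024)   # 10 million files of 100 KB
    large = namenode_bytes(8192, 128 * 1024 ** 2)    # 8192 files of 128 MB
    print(f"small files: {small / 1024 ** 2:.0f} MB of NameNode memory")
    print(f"large files: {large / 1024 ** 2:.1f} MB of NameNode memory")
```

Under these assumptions the same amount of data costs the NameNode over a thousand times more memory when it is stored as small files.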
17. Assignment
Think about your company's data and how Hadoop could
be used for it.