Phone : +91 9619 66 3272
Twitter : @aruforchange
(C)2016 - 2017 Tamilboomi E-Learning
In the last 5 years we have generated more data than
humanity generated in all of its earlier history!
• In the time since you started reading this line:
• More than 150k searches have been made on Google.
• More than 34k GB of data have been generated in the world.
• More than 50 hours of video have been uploaded to
YouTube.
Big data is huge!
Reference : http://www.internetlivestats.com/one-second/
(C)2016 - 2017 Aru Learning System
Reference : Excelacom
• 20 million requests.
• 750 GB of logs.
• 60,000 requests sent in the first hour.
• 30k users online.
• 3 days' worth of seats sold in 6 minutes.
• In 5 years there will be 21 billion connected devices, of
which 6.1 billion will be mobile phones.
• Big opportunities for everyone, from Fortune 1000 companies
to small-scale companies.
At the moment only 0.5% of data is ever analyzed. Imagine
the potential.
• 73% of organizations have invested or are going to invest in
Hadoop by 2016.
There will never be a better time than now to learn
Hadoop!
Reference : http://www.informationweek.com/mobile/mobile-devices/gartner-21-billion-iot-devices-to-invade-by-2020/d/d-id/1323081
Areas where big data has been successfully
implemented - Politics
• The world's biggest democracy, 2014 election:
• 1.2 billion people.
• 24 crore (240 million) internet users.
• 100 million smartphones.
• A huge amount of data to be analyzed in a short time.
• SAP, Oracle and InMobi partnered with the BJP.
• Used cookies and targeted advertising.
• Obama used it in the 2012 U.S. elections.
Reference : http://www.firstpost.com/india/big-data-analysis-how-it-firms-like-sap-oracle-helped-modi-win-1576355.html
How has big data been used in politics?
http://www.deccanherald.com/content/530569/big-data-tech-flavour-tn.html
Big data implemented in other
areas
• Digitalized industry
• Health care – Siemens, 800 million users
• Entertainment – Netflix
• Banking sector – Credit risk simulation
• Telecom – Vodafone
• Mobile – Real-time, location-based promotion
Reference : http://www-01.ibm.com/software/data/bigdata/use-cases.html
Big data definition
• The 3 + 1 V's by IBM:
• Volume
• Velocity
• Variety
• Veracity
• Any data that exceeds the available storage and processing
power is called big data.
Meet Hadoop
• Hadoop is a framework written in Java to process
big data.
• The name Hadoop comes from a toy elephant.
• Developed by Doug Cutting.
• Doug Cutting was working on the Nutch search engine project,
later continuing the work at Yahoo.
• He faced a scaling problem.
• In 2003 and 2004 Google published papers on how they
handle their data problem (the Google File System and MapReduce).
• Based on those ideas, Hadoop was developed as an open-
source system under the Apache license.
• It has two main concepts, which made it a multi-
billion-dollar industry: HDFS and MapReduce.
Link to white paper : http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
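To make the MapReduce idea from this slide concrete, here is a minimal sketch of a word count in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and everything runs in one process, whereas Hadoop would spread the map and reduce work across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all values by key, as a framework would do
    between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # Run map over every line, then shuffle, then reduce each group.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data is huge", "Hadoop processes big data"])
print(counts)
```

Each map call only looks at one line and each reduce call only looks at one word, which is what allows the real framework to run them in parallel on different nodes.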
Why Hadoop?
• Scale out (add more machines) instead of scale up (buy a bigger machine).
• Runs on commodity hardware.
• Processes unstructured data.
Current Career choices
Reference : karmasphere
What is Hadoop not?
• It is not going to replace the RDBMS.
• It is not suitable for real-time operational workloads. Why?
• Hadoop is not suitable for all problems.
• Hadoop is not the best solution for many small files.
Reference : https://wiki.apache.org/hadoop/HadoopIsNot
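One way to see why many small files are a poor fit is a back-of-the-envelope estimate of NameNode memory. The sketch below is only an illustration: the figure of about 150 bytes of NameNode memory per file or block object is a commonly quoted rule of thumb, not an exact number, and the 128 MB block size and the helper name namenode_bytes are assumptions made for this example.

```python
import math

BYTES_PER_OBJECT = 150          # assumed rule of thumb, not an exact figure
BLOCK_SIZE = 128 * 1024 * 1024  # assumed HDFS block size of 128 MB
MB = 1024 * 1024


def namenode_bytes(num_files, file_size):
    """Rough NameNode memory estimate: one object per file
    plus one object per block of that file."""
    blocks_per_file = max(1, math.ceil(file_size / BLOCK_SIZE))
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


# The same total volume of data stored in two different ways.
small = namenode_bytes(10_240_000, 1 * MB)  # 10,240,000 files of 1 MB each
large = namenode_bytes(10_000, 1024 * MB)   # 10,000 files of 1 GB each

print(f"small files: {small / MB:.0f} MB of NameNode memory")
print(f"large files: {large / MB:.0f} MB of NameNode memory")
```

Under these assumptions the same data costs the NameNode a couple of hundred times more memory when it is stored as small files, because every file and every block needs its own metadata entry.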
Assignment
• Think about your company's data: how could Hadoop
be used for it?

Hadoop journey
