Big data

Learn about big data and basics of Hadoop

Published in: Technology

  1. 1. ©copyright Ankur Raina 2012
  2. 2. • 3 million lines of code are tracking your checked baggage.
        • A billion lines of code go into the working of the latest Airbus plane.
        • A billion transistors per person.
        • 4 billion mobile phone subscribers.
        • St. Anthony Falls Bridge (Minneapolis) is fitted with 200 embedded sensors.
  4. 4. • 2001: 800,000 petabytes (8 lakh PB) of data
        • 2020 (projected): 35 zettabytes of data
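
To put the two figures on the same scale, here is a minimal sketch of the unit conversion. It assumes decimal units (1 PB = 10^15 bytes, 1 ZB = 10^21 bytes), which the slide does not state.

```python
PB = 10**15  # bytes in a petabyte (decimal units, an assumption)
ZB = 10**21  # bytes in a zettabyte (decimal units, an assumption)

earlier_bytes = 800_000 * PB   # 8 lakh petabytes
later_bytes = 35 * ZB          # 35 zettabytes

print(earlier_bytes / ZB)            # earlier volume expressed in zettabytes
print(later_bytes / earlier_bytes)   # growth factor between the two figures
```

Under these assumptions the earlier figure is 0.8 ZB, so the 2020 figure is roughly 44 times larger.
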
  5. 5. 7 TB/day 10 TB/day
  6. 6. The trouble begins here…
        • 80% of the world’s information is unstructured.
        • Unstructured information is growing at 15 times the rate of structured information.
  8. 8. Contents
        • What is Big Data?
        • The 3 Vs.
        • What is a Big Data platform?
        • Needle in a haystack problem.
        • Big Data & Social Media.
        • The Call Centre mantra.
        • ABCs of Hadoop.
  9. 9. Big Data
        Information that cannot be processed or analyzed using traditional processes or tools.
        • Instrumentation
        • Interconnection
        • M2M interconnectivity
        • Intelligent machines
  11. 11. Big Data Platform
          • Lets you store data in its native business object format & get value out of it through massive parallelism on readily available components.
          • It is not a replacement for a data warehouse.
  12. 12. Service Oriented Architecture (SOA)
          This is what I need!
  13. 13. Social Media
          We know…
          • What people are saying.
          But…
          • Why are people saying what they are saying & behaving the way they are behaving?
  14. 14. • Super Bowl 2011 (4064 Ttps, i.e. tweets per second; Feb 2011)
          • Bin Laden’s death (5106 Ttps)
          • Japan earthquake (6939 Ttps)
          • Paraguay’s football penalty shootout win over Brazil in the Copa America quarter-final peaked at 7166 Ttps
          • The same day, the U.S. win in the FIFA Women’s World Cup reached 7196 Ttps
          • Singer Beyonce’s pregnancy announcement (8868 Ttps)
  15. 15. • In-motion analytics (Streams Computing)
          • Analytics on data at rest (BigInsights)
  16. 16. HADOOP
          • Creator: Doug Cutting
          • Top-level Apache project.
          • Inspired by Google’s work on its GFS (Google File System).
          • Function-to-data model, not data-to-function model.
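
The function-to-data idea can be illustrated with a toy comparison. This is only a sketch, not Hadoop code: the node names, the data, and both helper functions are hypothetical, and "network traffic" is simply counted as the number of values that would leave a node.

```python
# Toy cluster: each node holds one partition of the data (hypothetical values).
partitions = {
    "node-a": [3, 8, 1],
    "node-b": [7, 2],
    "node-c": [9, 4, 6, 5],
}

def data_to_function(parts):
    """Move every record to one place, then compute there."""
    moved = [x for records in parts.values() for x in records]
    return sum(moved), len(moved)  # (result, values sent over the network)

def function_to_data(parts):
    """Run the function where each partition lives; move only partial results."""
    partials = [sum(records) for records in parts.values()]
    return sum(partials), len(partials)  # (result, values sent over the network)

central_total, central_moved = data_to_function(partitions)
local_total, local_moved = function_to_data(partitions)
print(central_total, central_moved)
print(local_total, local_moved)
```

Both approaches produce the same total, but the second one moves one partial sum per node instead of every record.
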
  18. 18. Hadoop
          • HDFS
          • Map Reduce
          • Hadoop Common Components
  19. 19. Hadoop Distributed File System
          • Data is broken into blocks & distributed throughout the cluster.
          • Data locality.
          • Mean Time To Failure (MTTF)
          • Block size (64 MB default)
          • Larger block sizes are available for larger files to reduce the amount of metadata (BigInsights: 128 MB).
          • Redundancy
          • Name Node server
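
The link between block size and metadata can be sketched with a little arithmetic. The file size and the replication factor of 3 are assumptions chosen for illustration; the slide only mentions redundancy without giving a number.

```python
MB = 1024 * 1024

def block_count(file_size_bytes, block_size_bytes):
    """Number of blocks needed to hold a file (ceiling division)."""
    return -(-file_size_bytes // block_size_bytes)

file_size = 1024 * MB   # hypothetical 1 GB file
replication = 3         # assumed replication factor, not stated on the slide

for block_mb in (64, 128):
    blocks = block_count(file_size, block_mb * MB)
    print(f"{block_mb} MB blocks: {blocks} blocks, {blocks * replication} replicas stored")
```

Doubling the block size halves the number of blocks for the same file, which is why larger blocks mean fewer block entries to track.
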
  21. 21. Map Reduce
          • The map job takes a set of data and converts it into another set of data, where individual elements are broken down into tuples.
          • The reduce job takes the output from a map as input & combines those data tuples into a smaller set of tuples.
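
A word count is the usual way to picture these two steps. The following is a single-process sketch in plain Python, not the Hadoop API; the function names and the input lines are made up for illustration.

```python
from collections import defaultdict

def map_phase(line):
    """Map: break one input record into (key, value) tuples."""
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single tuple."""
    return (key, sum(values))

lines = ["big data big cluster", "data in the cluster"]  # made-up input

# Group the map output by key so each reduce call sees one key's values.
grouped = defaultdict(list)
for line in lines:
    for key, value in map_phase(line):
        grouped[key].append(value)

result = dict(reduce_phase(k, v) for k, v in grouped.items())
print(result)
```

Eight input words become eight tuples after the map step, and the reduce step shrinks them to five tuples, one per distinct word.
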
  22. 22. Map Reduce
          • Job
          • Tasks
          • Job Tracker
          • Task Tracker Agents
          • Shuffle
          • Combiner
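
Two of the terms above, shuffle and combiner, can be shown in a small simulation. This is a sketch under simplifying assumptions: each inner list stands for the input of one map task, everything runs in one process, and the Job Tracker and Task Tracker agents that schedule tasks in a real cluster are not modelled. All names are hypothetical.

```python
from collections import Counter, defaultdict

# Each inner list stands for the input split handled by one map task (made-up data).
splits = [["a", "b", "a", "a"], ["b", "b", "c"]]

def map_task(split):
    """Emit one (word, 1) tuple per record."""
    return [(word, 1) for word in split]

def combine(pairs):
    """Combiner: pre-aggregate one map task's output before it is shuffled."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())

def shuffle(all_pairs):
    """Shuffle: group tuples from every map task by key for the reducers."""
    grouped = defaultdict(list)
    for key, value in all_pairs:
        grouped[key].append(value)
    return grouped

raw = [pair for split in splits for pair in map_task(split)]
combined = [pair for split in splits for pair in combine(map_task(split))]

counts_raw = {k: sum(v) for k, v in shuffle(raw).items()}
counts_combined = {k: sum(v) for k, v in shuffle(combined).items()}
print(len(raw), len(combined))
```

The final counts are identical either way; the combiner only reduces how many tuples have to pass through the shuffle.
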
  24. 24. Hadoop Common Components
          • Set of libraries that support various Hadoop subprojects.
          • /bin/hdfs dfs <args>
          Command        Function
          chmod          Changes the permissions for reading & writing to a given file or set of files.
          chown          Changes the owner of a given file or set of files.
          copyFromLocal  Copies a file from the local file system into HDFS.
  25. 25. Command        Function
          copyToLocal    Copies a file from HDFS to the local file system.
          cp             Copies HDFS files from one directory to another.
          expunge        Empties the trash.
          cat            Copies the files to standard output.
          ls             Displays a listing of files in a given directory.
          mkdir          Creates a directory in HDFS.
          mv             Moves files from one directory to another.
          rm             Deletes a file & sends it to the trash (use the -skipTrash option to delete permanently).
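
As an illustration of how the commands in the two tables fit together, the sketch below builds a short sequence of `hdfs dfs` command lines and prints them without executing anything. The helper `hdfs_dfs`, the file name, and the `/user/demo` paths are hypothetical, and it assumes the `hdfs` launcher is on the PATH rather than invoked as `/bin/hdfs`.

```python
import shlex

def hdfs_dfs(command, *args):
    """Build the argument list for one `hdfs dfs` invocation (nothing is executed)."""
    return ["hdfs", "dfs", f"-{command}", *args]

# Hypothetical paths, used only for illustration.
steps = [
    hdfs_dfs("mkdir", "/user/demo/input"),
    hdfs_dfs("copyFromLocal", "sales.csv", "/user/demo/input"),
    hdfs_dfs("ls", "/user/demo/input"),
    hdfs_dfs("cat", "/user/demo/input/sales.csv"),
    hdfs_dfs("rm", "-skipTrash", "/user/demo/input/sales.csv"),
]

for step in steps:
    print(shlex.join(step))
```

On a machine with Hadoop installed, each list could be passed to `subprocess.run(step, check=True)` instead of being printed.
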
