Introduction to Big Data

A primer slide for Big Data. Talks Basics. Gives Pointers.

Published in: Data & Analytics

  1. Big Data Introduction
  2. Practical Examples • Auto Suggestion in Google Search • Google Translation • Luis von Ahn’s reCAPTCHA (CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart)
  3. Practical Examples • Auto Tagging Photos • Facebook • Google Plus • Google Image Search • Twitter Follow Suggestions • Flipkart, Amazon Product Recommendations
  4. Big Data in Sports • Cricket • Duckworth-Lewis System • Resources (wickets left and overs left) • Target • ODI rule of thumb: runs at the end of 50 overs = double the runs at 30 overs = 2.5 times the runs at 25 overs • Baseball • Billy Beane and Paul DePodesta putting together a strong baseball team of underdogs for the 2002 American League season • ‘Moneyball’ book and movie
  5. So, what is in it? • Frame a question, then analyze a large set of data to find patterns and make predictions that serve as a possible answer to the question. • Improve upon it.
  6. What is the most important component in Big Data Analysis? The DATA
  7. What Else? • Large Computing Power • Efficient Algorithms • Example: MapReduce • File System tuned for large scale data • HDFS • Hive Data warehouse • Statistical Analysis • Correlation • Regression • Clustering • Principal Component Analysis • Discriminant Analysis • Queuing • ANOVA • Hypothesis Testing • Optimization Techniques • Linear Programming • Mixed Integer Programming • Constraint Programming • MATHEMATICS
  8. Closely Related Topics • Artificial Intelligence • Machine Learning • Natural Language Processing • Learning Syntax and Semantics of Human Languages • Data Analysis • Algorithms
  9. Turing Test • Alan Turing (considered the Father of Artificial Intelligence) posed the question “Can Machines Think?” in his 1950 research paper • Put another way: if a question is asked, can a computer imitate a human being and deceive the person who asked it into believing that the answer came from a human being instead of a computer? • Rock Paper Scissors Game • By Machine: http://www.nytimes.com/interactive/science/rock-paper-scissors.html?_r=0
  10. Difficult Tasks in Machine Learning • Judgement • Taking decisions with hypothetically contradictory outcomes • Responding to a new situation • Natural Language Processing • Understanding Syntax, Semantics • Responding to Emotional Tones, Different Accents • Imagination • So, naturally, the task of narrating a story is very difficult • Concept of Truth / Good or Bad / Philosophy / Principles / Ethics
  11. In other words, • Big Data Analysis - AI - Machine Learning • Imparts “intelligence” to the computer. • Makes it learn, one input at a time. • Improves the learning with new inputs and with possible, expected, and actually realized outputs.
  12. Suggestions: • Online Learning / Courses • MOOC – Massive Open Online Courses • Coursera • edX • Udacity • Khan Academy • Books / Movies • Big Data: A Revolution That Will Transform How We Live, Work and Think • Viktor Mayer-Schönberger and Kenneth Cukier • The Robot – 2010 Hindi / Tamil Movie • Moneyball – 2011 Hollywood Movie (or 2003 book by Michael Lewis) • I, Robot – 2004 Hollywood Movie
  13. Where is this Content? • http://www.slideshare.net/arunramatma
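
The ODI rule of thumb from the Big Data in Sports slide (final total = double the runs at 30 overs = 2.5 times the runs at 25 overs) can be sketched in a few lines of Python. This is only an illustration of that arithmetic, not the Duckworth-Lewis method: the function name `project_odi_total` and the multiplier table are hypothetical names introduced here, and the sketch ignores wickets left, which the slide lists as a resource.

```python
# Multipliers taken from the slide's rule of thumb (overs completed -> factor).
# The table and function name are assumptions made for this illustration.
RULE_OF_THUMB = {30: 2.0, 25: 2.5}


def project_odi_total(runs: int, overs_completed: int) -> float:
    """Project a 50-over total from the score at 25 or 30 overs."""
    if runs < 0:
        raise ValueError("runs must be non-negative")
    if overs_completed not in RULE_OF_THUMB:
        raise ValueError("the rule of thumb only covers 25 or 30 overs")
    return runs * RULE_OF_THUMB[overs_completed]


if __name__ == "__main__":
    # A side on 150 after 30 overs, or on 120 after 25 overs,
    # projects to the same 50-over total under this rule.
    print(project_odi_total(150, 30))
    print(project_odi_total(120, 25))
```

Both example calls project to 300 runs, which is the point of the slide's two equalities. A real target calculation would also need the wickets-left resource, which this sketch deliberately leaves out.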
