Introduction to Big Data

A primer on Big Data. It covers the basics and gives pointers for further learning.

Published in Data & Analytics

Transcript

  • 1. Big Data Introduction
  • 2. Practical Examples • Auto-suggestion in Google Search • Google Translate • Luis von Ahn’s reCAPTCHA • CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart
  • 3. Practical Examples • Auto-tagging photos (Facebook, Google Plus, Google Image Search) • Twitter follow suggestions • Flipkart and Amazon product recommendations
  • 4. Big Data in Sports • Cricket: the Duckworth-Lewis system sets the target from the resources remaining (wickets left and overs left) • Rule of thumb for an ODI: runs at the end of 50 overs = double the runs at 30 overs = 2.5 times the runs at 25 overs • Baseball: Billy Beane and Paul DePodesta put together a strong team of underdogs for the 2002 American League season • The ‘Moneyball’ book and movie
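
The ODI rule of thumb on slide 4 is plain arithmetic, so it can be sketched in a few lines. This is only the slide's shortcut, not the actual Duckworth-Lewis method, which relies on published resource tables. The function name `project_odi_total` and the sample scores are hypothetical, chosen for illustration.

```python
# Multipliers taken from the slide's rule of thumb (assumption: they apply
# only at exactly 30 and 25 overs; this is NOT the Duckworth-Lewis table).
RULE_OF_THUMB = {30: 2.0, 25: 2.5}


def project_odi_total(runs_so_far, overs_bowled):
    """Project the 50-over total from the score at 25 or 30 overs."""
    if overs_bowled not in RULE_OF_THUMB:
        raise ValueError("rule of thumb is only stated for 25 and 30 overs")
    return round(runs_so_far * RULE_OF_THUMB[overs_bowled])


print(project_odi_total(150, 30))  # 300
print(project_odi_total(110, 25))  # 275
```

Comparing such projections with real final scores over many matches is one way to check how well the rule of thumb holds.
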
  • 5. So, what is in it? • Frame a question, then analyze a large set of data to find patterns and make predictions that serve as a possible answer to the question. • Improve upon it.
  • 6. What is the most important component in Big Data analysis? The DATA
  • 7. What Else? • Large computing power • Efficient algorithms (example: MapReduce) • File system tuned for large-scale data (HDFS) • Hive data warehouse • Statistical analysis: correlation, regression, clustering, principal component analysis, discriminant analysis, queuing, ANOVA, hypothesis testing • Optimization techniques: linear programming, mixed integer programming, constraint programming • Mathematics
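
Slide 7 names MapReduce as an example of an efficient algorithm. The sketch below imitates its three stages (map, shuffle, reduce) with a word count in plain Python on one machine. It is an illustration of the idea only, not Hadoop's API; the function names and the sample documents are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data is big", "data beats opinion"])
print(counts["big"], counts["data"])  # 2 2
```

In a real cluster the map and reduce calls run in parallel on different machines, which is why each one only looks at its own document or its own key.
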
  • 8. Closely Related Topics • Artificial Intelligence • Machine Learning • Natural Language Processing (learning the syntax and semantics of human languages) • Data Analysis • Algorithms
  • 9. Turing Test • Alan Turing (considered the father of Artificial Intelligence) posed the question “Can machines think?” in his 1950 research paper • Put differently: when asked a question, can a computer imitate a human being well enough that the person who asked believes the answer came from a human instead of a computer? • Rock Paper Scissors game played by a machine: http://www.nytimes.com/interactive/science/rock-paper-scissors.html?_r=0
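
The rock-paper-scissors demo on slide 9 is a machine that learns a human's habits. How that demo works internally is not stated in the slides, so the following is only a minimal sketch under one assumption: the machine counts which move the human tends to play after each previous move and then plays the counter to the most frequent one. The class name `FrequencyPlayer` is hypothetical.

```python
from collections import Counter, defaultdict

# For each move (key), the move that beats it (value).
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


class FrequencyPlayer:
    """Predicts the human's next move from what followed their last move."""

    def __init__(self):
        self.transitions = defaultdict(Counter)  # last move -> next-move counts
        self.last_move = None

    def choose(self):
        history = self.transitions[self.last_move]
        if not history:
            return "rock"  # arbitrary default when nothing has been seen yet
        predicted = history.most_common(1)[0][0]
        return BEATS[predicted]

    def observe(self, human_move):
        self.transitions[self.last_move][human_move] += 1
        self.last_move = human_move


player = FrequencyPlayer()
for move in ["rock", "paper", "rock", "paper", "rock"]:
    player.observe(move)
print(player.choose())  # scissors (the human usually plays paper after rock)
```
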
  • 10. Difficult Tasks in Machine Learning • Judgement: taking decisions with hypothetically contradictory outcomes • Responding to a new situation • Natural Language Processing: understanding syntax and semantics; responding to emotional tones and different accents • Imagination, which is why story narration is a very difficult task • Concepts of truth, good and bad, philosophy, principles, ethics
  • 11. In other words, • Big Data analysis, AI, and Machine Learning • Impart “intelligence” to the computer. • Make it learn, one example at a time. • Improve the learning with new inputs and with the possible, expected, or actually realized outputs.
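
Learning "one at a time" from inputs and actual outputs, as slide 11 describes, can be illustrated with a perceptron that updates itself after every single example. The perceptron is a technique chosen here for illustration; the slides do not name it. The class name `OnlinePerceptron` and the AND-gate training data are assumptions.

```python
class OnlinePerceptron:
    """A linear classifier that is corrected after every single example."""

    def __init__(self, n_inputs, learning_rate=1):
        self.weights = [0] * n_inputs
        self.bias = 0
        self.learning_rate = learning_rate

    def predict(self, inputs):
        total = self.bias + sum(w * x for w, x in zip(self.weights, inputs))
        return 1 if total > 0 else 0

    def learn_one(self, inputs, actual):
        # Compare the expected output with the actual one and nudge the
        # weights in the direction that reduces the error.
        error = actual - self.predict(inputs)
        self.weights = [
            w + self.learning_rate * error * x
            for w, x in zip(self.weights, inputs)
        ]
        self.bias += self.learning_rate * error


# Hypothetical training data: the logical AND of two binary inputs.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

model = OnlinePerceptron(n_inputs=2)
for _ in range(10):  # show the same examples several times
    for inputs, actual in examples:
        model.learn_one(inputs, actual)

print([model.predict(inputs) for inputs, _ in examples])  # [0, 0, 0, 1]
```

Each call to `learn_one` changes the model only when its prediction was wrong, which matches the slide's point that learning improves as new inputs and their actual outputs arrive.
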
  • 12. Suggestions: • Online learning / courses • MOOCs (Massive Open Online Courses): Coursera, edX, Udacity, Khan Academy • Books / Movies • Big Data: A Revolution That Will Transform How We Live, Work, and Think, by Viktor Mayer-Schönberger and Kenneth Cukier • The Robot – 2010 Hindi / Tamil movie • Moneyball – 2011 Hollywood movie (or the 2003 book by Michael Lewis) • I, Robot – 2004 Hollywood movie
  • 13. Where is this Content? • http://www.slideshare.net/arunramatma