Faiz ul haque Zeya
MS CS, University of Tulsa, OK, USA
Big data introduction
An introduction to what big data is, for beginners.

Published in: Technology, Business

  1. Faiz ul haque Zeya, MS CS, University of Tulsa, OK, USA
  2. Topics covered
      1. Introduction
      2. Big data: how big it is
      3. Big data technologies
      4. A few examples of big data
      5. Airline reservation system
      6. Google Translate
      7. Amazon recommendation
      8. Netflix recommendation
      9. Hadoop, MapReduce
      10. Q&A
  3. Introduction
      • Very large data sets, on the order of petabytes or exabytes.
      • Not stored in relational databases.
      • Requires computation at massive scale.
      • Not queried with SQL.
      • Uses newer technologies such as MapReduce and Hadoop.
      • Reason: relational systems scale poorly and perform badly on data this large.
  4. How large it is
      • Petabyte: 10^15 bytes
      • Exabyte: 10^18 bytes
      • Zettabyte: 10^21 bytes
      • Google processed about 24 petabytes of data per day in 2009.
      • Yahoo stores 2 petabytes of data on user behavior.
      • eBay.com uses two data warehouses, of 7.5 PB and 40 PB, as well as a 40 PB Hadoop cluster, for search, consumer recommendations, and merchandising.
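To give a feel for these magnitudes, the following minimal sketch converts the 24 petabytes per day figure quoted above into a sustained rate. It assumes decimal (SI) units, and the function name is illustrative only.

```python
# Decimal (SI) byte units, as listed on the slide.
PETABYTE = 10**15
EXABYTE = 10**18
ZETTABYTE = 10**21

SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_second(petabytes_per_day):
    """Convert a daily volume in petabytes into an average byte rate."""
    return petabytes_per_day * PETABYTE / SECONDS_PER_DAY


# The 24 PB/day figure quoted for Google in 2009.
rate = bytes_per_second(24)
print(f"{rate / 10**9:.1f} GB/s")  # 277.8 GB/s

# Days needed to accumulate one exabyte at that rate.
days_to_exabyte = EXABYTE / (24 * PETABYTE)
print(f"{days_to_exabyte:.1f} days")  # 41.7 days
```

At that pace a single exabyte accumulates in roughly six weeks, which is one way to see why single-machine storage and processing are not an option.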
  5. Big data technologies
      • Relational databases and SQL queries cannot handle this amount of data.
      • Other technologies are therefore required.
      • MapReduce: parallel computation.
  6. A few examples of big data
      • Airline reservation system
      • Google Translate
      • Netflix movie recommendation
      • Amazon book recommendation
  7. Airline reservation system
      • Oren Etzioni of the University of Washington founded Farecast, a venture-capital-backed startup.
      • It predicts, based on past data, whether airline prices will go up or down.
      • Etzioni uses a predictive model for that.
      • Microsoft purchased it for $110 million.
      • Microsoft made it part of the Bing search engine.
  8. Google Translate
      • Uses the whole internet as its training corpus.
      • Google released a trillion-word corpus in 2006.
      • It accepts messy data.
      • Candide used 3 million translated sentences.
      • Google uses billions of pages from the internet.
  9. Netflix $1 million prize
      • Netflix announced a $1 million prize for the team that improved its recommendation algorithm by 10%.
      • Netflix recommends movies to its users.
      • Most of the sales are due to recommendations from the site.
      • The reason is that there are so many shows that users do not even know about most of them.
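The Netflix Prize scored entries by the root mean squared error (RMSE) of their predicted ratings, so "improving the algorithm" by some percentage means lowering RMSE by that percentage relative to the baseline. The sketch below shows that calculation; the function names and the sample ratings are hypothetical and only illustrate the arithmetic.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("rating lists must be non-empty and the same length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def improvement_percent(baseline_rmse, new_rmse):
    """Relative RMSE reduction of a new model over a baseline, in percent."""
    return 100 * (baseline_rmse - new_rmse) / baseline_rmse


# Made-up ratings on a 1-5 star scale, for illustration only.
actual = [4, 3, 5, 2]
baseline_predictions = [3, 3, 4, 3]
new_predictions = [4, 3, 4, 2]

baseline = rmse(baseline_predictions, actual)
candidate = rmse(new_predictions, actual)
print(f"baseline RMSE {baseline:.3f}, candidate RMSE {candidate:.3f}")
print(f"improvement {improvement_percent(baseline, candidate):.1f}%")
```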
  10. Amazon's recommendation
      • Amazon uses item-to-item recommendation instead of traditional collaborative recommendation.
      • Item-to-item recommendation searches for similar items rather than similar users.
      • This approach scales to large data sets.
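The idea of comparing items rather than users can be sketched as follows. This is a toy illustration, not Amazon's actual implementation: it assumes each item is represented by the set of users who bought it and uses cosine similarity between those sets. All names and the sample purchase data are hypothetical.

```python
import math
from collections import defaultdict


def buyers_by_item(purchases):
    """Invert {user: set of items} into {item: set of users who bought it}."""
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    return dict(buyers)


def cosine(users_a, users_b):
    """Cosine similarity between two items seen as binary user vectors."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / math.sqrt(len(users_a) * len(users_b))


def similar_items(purchases, target, top_n=3):
    """Return up to top_n items most similar to target, best first."""
    buyers = buyers_by_item(purchases)
    target_buyers = buyers.get(target, set())
    scored = []
    for item, users in buyers.items():
        if item == target:
            continue
        score = cosine(target_buyers, users)
        if score > 0:
            scored.append((score, item))
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored[:top_n]]


# Hypothetical purchase history.
purchases = {
    "ann": {"book_a", "book_b"},
    "bob": {"book_a", "book_b", "book_c"},
    "cat": {"book_b", "book_c"},
    "dan": {"book_d"},
}
print(similar_items(purchases, "book_a"))  # ['book_b', 'book_c']
```

Because the item-to-item similarities depend only on the catalog and purchase history, they can be computed ahead of time, which is one reason the approach scales better than searching for similar users on every request.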
  11. MapReduce
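The slide gives no details, so the following is a minimal single-process sketch of the MapReduce programming model, using the classic word-count example. It only imitates the three phases (map, shuffle, reduce) in plain Python; a real framework such as Hadoop distributes these phases across many machines. All function names here are illustrative.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce over a collection of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(word_count(["big data", "Big clusters process big data"]))
# {'big': 3, 'data': 2, 'clusters': 1, 'process': 1}
```

Each call to `map_phase` and each call to `reduce_phase` is independent of the others, which is what allows a cluster to run them in parallel.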
  12. Q&A