Faiz ul haque Zeya
MS CS, University of Tulsa, OK, USA
Topics covered
1. Introduction
2. Big Data: how big it is
3. Big Data technology
4. A few examples of Big Data
5. Airline reservation system
6. Google Translate
7. Amazon recommendation
8. Netflix recommendation
9. Hadoop, MapReduce
10. Q&A
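Introduction
• Big Data refers to very large data sets, on the order of petabytes to exabytes.
• The data is typically not stored in relational form.
• Processing it requires massive-scale computation.
• It is queried with NoSQL approaches rather than traditional SQL.
• New technologies such as MapReduce and Hadoop are used.
• Reason: relational systems scale poorly and perform badly at this size.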
How large it is
• Petabyte: 10^15 bytes
• Exabyte: 10^18 bytes
• Zettabyte: 10^21 bytes
• Google processed about 24 petabytes of data per day in 2009.
• Yahoo stores 2 petabytes of behavioral data.
• eBay.com uses two data warehouses, at 7.5 petabytes and 40 PB, as well as a 40 PB Hadoop cluster for search, consumer recommendations, and merchandising.
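
To give a feel for these magnitudes, here is a small arithmetic sketch. It assumes the decimal (SI) definitions listed above and uses only the 24 petabytes per day figure quoted for Google; the function and variable names are illustrative.

```python
# Decimal (SI) byte units, as listed in the slide.
PETABYTE = 10**15
EXABYTE = 10**18
ZETTABYTE = 10**21


def petabytes_to_bytes(pb: float) -> int:
    """Convert petabytes to bytes using the decimal definition."""
    return int(pb * PETABYTE)


# Daily volume quoted in the slide for Google in 2009.
google_daily_bytes = petabytes_to_bytes(24)

# How long would it take, at that constant rate, to accumulate
# one exabyte and one zettabyte?
days_per_exabyte = EXABYTE / google_daily_bytes
days_per_zettabyte = ZETTABYTE / google_daily_bytes

print(f"{google_daily_bytes:.2e} bytes per day")
print(f"{days_per_exabyte:.1f} days to reach 1 EB")
print(f"{days_per_zettabyte / 365:.0f} years to reach 1 ZB")
```

At a constant 24 PB per day, an exabyte takes a little over a month to accumulate, while a zettabyte takes more than a century.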
Big Data Technologies
• Relational databases and SQL queries cannot handle this amount of data efficiently.
• Other technologies are therefore required.
• MapReduce provides parallel computation across many machines.
A Few Examples of Big Data
• Airline reservation system
• Google Translate
• Netflix movie recommendation
• Amazon book recommendation
Airline reservation system
• Farecast is a venture-capital-backed startup founded by Oren Etzioni of the University of Washington.
• It predicts, based on past data, whether airline prices will go up or down.
• Etzioni uses a predictive model for this.
• Microsoft purchased it for about $110 million.
• Microsoft made it part of the Bing search engine.
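
The slides do not describe Farecast's actual model, so the following is only a toy illustration of the general idea of predicting a price direction from past data. It assumes a simple least-squares trend line over recent fares; the function names, the `tolerance` parameter, and the sample fares are all hypothetical.

```python
from statistics import mean


def fit_slope(prices: list[float]) -> float:
    """Least-squares slope of price against observation index (one per day)."""
    n = len(prices)
    if n < 2:
        raise ValueError("need at least two observations")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(prices)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, prices))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def predict_direction(prices: list[float], tolerance: float = 0.5) -> str:
    """Classify the fitted trend as 'up', 'down', or 'flat'.

    `tolerance` (price units per day) is an assumed threshold below which
    the trend is treated as noise.
    """
    slope = fit_slope(prices)
    if slope > tolerance:
        return "up"
    if slope < -tolerance:
        return "down"
    return "flat"


# Made-up fares for one route over five consecutive days.
past_fares = [310.0, 305.0, 312.0, 320.0, 331.0]
print(predict_direction(past_fares))
```

A production system would draw on far more history and many more features than a single trend line; the sketch only shows the shape of the problem: past observations in, a direction out.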
Google Translate
• Uses the whole internet as its training corpus.
• Google released a trillion-word corpus in 2006.
• Messy data is accepted.
• By comparison, IBM's Candide system used about 3 million translated sentence pairs.
• Google uses billions of pages from the internet.
Netflix $1 Million Prize
• Netflix announced a $1 million prize for the team that improved its recommendation algorithm by 10%.
• Netflix is a movie recommender.
• Most of its sales come from recommendations made by the site.
• The reason is that the catalog holds so many shows that users do not even know about most of them.
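
The Netflix Prize scored entries by root-mean-square error (RMSE) on held-out ratings, so "improving the algorithm" by some percentage meant lowering RMSE by that percentage relative to the baseline. The sketch below shows that calculation; the ratings and predictions are made up for illustration, and the function names are not from any Netflix code.

```python
import math


def rmse(predicted: list[float], actual: list[float]) -> float:
    """Root-mean-square error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


def improvement_percent(baseline_rmse: float, candidate_rmse: float) -> float:
    """Relative RMSE reduction of the candidate over the baseline, in percent."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


# Made-up star ratings and two sets of predictions.
actual = [4.0, 3.0, 5.0, 2.0]
baseline = [3.0, 3.0, 4.0, 3.0]
candidate = [3.5, 3.0, 4.5, 2.5]

gain = improvement_percent(rmse(baseline, actual), rmse(candidate, actual))
print(f"improvement: {gain:.1f}%")
```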
Amazon’s recommendation
• Amazon uses item-to-item recommendation instead of traditional (user-based) collaborative recommendation.
• Item-to-item recommendation searches for similar items rather than similar users.
• This approach scales to large data sets.
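
A minimal sketch of the item-to-item idea, assuming cosine similarity over the sets of users who bought each item. This is one common way to define item similarity, not necessarily Amazon's production method, and the purchase data and names below are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical purchase history: user -> set of items bought.
purchases = {
    "u1": {"book_a", "book_b"},
    "u2": {"book_a", "book_b", "book_c"},
    "u3": {"book_b", "book_c"},
    "u4": {"book_d"},
}


def build_item_buyers(purchases: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert user -> items into item -> set of users who bought it."""
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    return dict(buyers)


def cosine(a: set[str], b: set[str]) -> float:
    """Cosine similarity of two binary vectors represented as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def similar_items(item: str, buyers: dict[str, set[str]], top_n: int = 2):
    """Return up to top_n (item, score) pairs most similar to `item`."""
    target = buyers.get(item, set())
    scores = [
        (other, cosine(target, users))
        for other, users in buyers.items()
        if other != item
    ]
    scores = [pair for pair in scores if pair[1] > 0]
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))[:top_n]


buyers = build_item_buyers(purchases)
print(similar_items("book_a", buyers))
```

Because the comparison is between items, the similarity table can be computed ahead of time, and a recommendation then becomes a lookup keyed by what the user has already bought.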
MapReduce
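
The classic way to illustrate the model is a word count. The sketch below runs the three phases (map, shuffle, reduce) in a single Python process to show the data flow only; it does not use Hadoop, and the function names are illustrative. In a real cluster the map and reduce calls would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(document: str) -> list[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents: list[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce over a list of input splits."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


print(word_count(["big data big", "data hadoop"]))
```

Each call to `map_phase` depends only on its own input split, and each call to `reduce_phase` only on its own key, which is what allows the work to be spread across machines.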
Q&A
