Big Data
me@zynick.com
26th Dec 2013
Google Flu Trends Prediction (2008)
● Epidemiologists use early detection of disease outbreaks to reduce the
number of people affected

● CDC (Centers for Disease Control and Prevention) collects influenza-like
illness (ILI) data from its surveillance network and publishes it weekly
Hurricane in 2004
Result: 7 times their normal sales rate!
Grammar Checking
(Machine Learning) Algorithms
● Improve the algorithm, or pump in more data?
● Testing
○ 1 million, 10 million, 100 million, 1 billion data points

● Result
○ The algorithm that was worst on little data performs best once it
has a billion data points
■ Accuracy rate from 75% to 95%
○ The algorithm that was best on little data performs worst once it
has a billion data points
■ Accuracy rate from 85% to 94%
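
The comparison above comes down to simple arithmetic on the quoted accuracy figures. A minimal Python sketch, using only the percentages on the slide (the labels `worst_on_small_data` and `best_on_small_data` are illustrative names, not taken from the study):

```python
# Accuracy figures (percent) as quoted on the slide.
# The dictionary keys are illustrative labels, not names from the study.
results = {
    "worst_on_small_data": {"small": 75, "billion": 95},
    "best_on_small_data": {"small": 85, "billion": 94},
}

# Improvement in percentage points when moving to a billion data points.
gains = {name: r["billion"] - r["small"] for name, r in results.items()}

# Which algorithm leads at each data size?
leader_small = max(results, key=lambda name: results[name]["small"])
leader_billion = max(results, key=lambda name: results[name]["billion"])

print(gains)           # {'worst_on_small_data': 20, 'best_on_small_data': 9}
print(leader_small)    # best_on_small_data
print(leader_billion)  # worst_on_small_data
```

The weaker algorithm gains 20 points against 9 for the stronger one, so the ranking flips once enough data is available.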
Farecast.com (2006)
● Flight Price Prediction
○ Model had no understanding of why, only what.

● Accuracy of 74.5%
● Average saving of $50 per ticket
● $10 million in potential customer savings
● Acquired by Microsoft
○ Bing.com/travel

http://www.prnewswire.com/news-releases/farecast-launches-new-tools-to-help-savvy-travelers-catchelusive-airfare-price-drops-this-summer-58165652.html
Decide.com (2011)
● Analyzes 4 million products using 25 billion
price observations
○ Identifies patterns that people had never been able to
‘see’ before, e.g. prices might temporarily increase
for older models once new ones are introduced

● Price prediction 77% accurate
● Average savings of $87 per product
● Total savings of $72 million+
● Acquired by eBay

[1]http://techcrunch.com/2012/05/03/decide-com-brings-its-price-comparisons-to-ipad-reveals-plansto-expand-to-household-goods-cars/
[2]http://newbooksinbrief.com/2013/03/21/31-a-summary-of-big-data-a-revolution-that-will-transformhow-we-live-work-and-think-by-viktor-mayer-schonberger-and-kenneth-cukier/
UPS
● Uses geolocation data in multiple ways
○ Sensors, wireless modules, GPS
○ Predict engine trouble
○ Know the trucks' whereabouts (in case of delays)

● Monitor employees
● Scrutinize itineraries to optimize routes
● Result (2011):
○ 30 million miles and 3 million gallons of fuel saved

● Safety and efficiency: fewer turns, since turns tend to
lead to accidents, waste time, and consume
more fuel when stuck in traffic
Pregnancy Prediction
● Shopping behavior is about to change: expectant parents explore new
brands and form new loyalties
● Signals: baby gift registry, lotions (around the 3rd month),
supplements (magnesium, calcium, zinc, etc.)
● Pregnancy Prediction Score
● Sends coupons

* http://icebreakerconsulting.com/target-predicts-pregnancy-with-big-data
Geo Local Data
● Targeted advertising based on where a person is,
or where they are heading
● Aggregated to reveal trends
● Detect traffic jams without seeing the cars: use the number and speed
of smartphones travelling on a highway
● Estimate how many protesters turn out at a
demonstration
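
The traffic-jam idea above can be sketched as a simple aggregation. This is a minimal illustration, not how any real navigation service works: the road segment IDs, the sample pings, and the thresholds `min_phones` and `jam_speed_kmh` are all assumptions made up for the example.

```python
from collections import defaultdict
from statistics import median

def find_jams(pings, min_phones=5, jam_speed_kmh=20.0):
    """Return road segments that look congested.

    pings: iterable of (segment_id, speed_kmh) pairs, one per smartphone.
    A segment counts as jammed when enough phones report from it and
    their median speed is below the threshold (both values hypothetical).
    """
    speeds_by_segment = defaultdict(list)
    for segment, speed in pings:
        speeds_by_segment[segment].append(speed)
    return sorted(
        segment
        for segment, speeds in speeds_by_segment.items()
        if len(speeds) >= min_phones and median(speeds) < jam_speed_kmh
    )

# Made-up pings: segment A is crawling, B is flowing, C has too few phones.
pings = [
    ("A", 12), ("A", 8), ("A", 15), ("A", 10), ("A", 9),
    ("B", 95), ("B", 102), ("B", 88), ("B", 110), ("B", 97),
    ("C", 5),
]
jams = find_jams(pings)
print(jams)  # ['A']
```

Only aggregated counts and speeds are used, so no individual car needs to be observed.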
Data Reuse (Secondary Usage)
● Google Street View
○ Primary Usage: Street View
○ Secondary Usage: Collecting geolocation data and open
Wi-Fi connection points to improve GPS location

● Amazon
○ Primary Usage: Sales
○ Secondary Usage: Book Recommendation
Values of Big Data
● Data can be grabbed easily and cheaply
● What > Why (correlation vs causation)
● Traditional Sampling (n), Big Data (n=ALL)
● Quantification > Qualification
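
One way to see the difference between sampling (n) and using everything (n=ALL) is to count a rare event both ways. The sketch below uses a made-up population of 100,000 records with exactly 10 rare ones; the sizes, the 1% sample rate, and the fixed seed are all assumptions chosen for illustration.

```python
import random

# Hypothetical population: 100,000 records, exactly 10 of them "rare" (1).
population = [1 if i % 10_000 == 0 else 0 for i in range(100_000)]

rng = random.Random(0)                  # fixed seed so the run is repeatable
sample = rng.sample(population, 1_000)  # traditional 1% random sample

rare_in_all = sum(population)           # exact count from n=ALL
rare_in_sample = sum(sample)            # whatever the sample happened to catch

# Scale the sample count up to estimate the population count.
estimated_rare = rare_in_sample * len(population) // len(sample)

print(rare_in_all)     # 10
print(estimated_rare)  # always a multiple of 100, so never exactly 10
```

A 1% sample can only estimate the rare count in steps of 100, so it either misses the rare records entirely or overstates them, while the full scan counts them exactly.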
Values of Big Data
● Data Driven
○ Less Bias
○ More Accurate
○ Faster Result

● Pattern Prediction
○ Saves lives
○ Predict problems and correct them before the user
realizes something is wrong
Big Data: 3 Major Shifts
● Ability to analyze vast amounts of data
about a topic rather than settling for a smaller
set
● Willingness to embrace messy data
rather than privileging exactitude
● Growing respect for correlation rather than a
continuing quest for causality
Correlation vs Causation
● Cause → Effect
● Correlation → Effect
○ Correlation → Cause? Optional

● Chris Anderson
○ Big data makes the scientific method obsolete
○ “With enough data, the numbers speak for
themselves”

* http://www.wired.com/science/discoveries/magazine/16-07/pb_theory
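
To make "correlation" concrete, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The two series (`searches` and `cases`) are made-up numbers for illustration only, not real flu or search data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly figures: search volume vs. reported cases.
searches = [10, 20, 30, 40, 50]
cases = [2, 4, 5, 4, 5]

r = pearson(searches, cases)
print(round(r, 2))  # 0.77
```

A value near 1 says the two series move together; it says nothing about whether one causes the other, which is exactly the "what, not why" trade-off on this slide.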
Is Correlation Good Enough?
It Depends.
“For many everyday needs, knowing what not why is good
enough.” The book is full of such examples from making
better diagnostic decisions when caring for premature
babies to which flavor Pop-Tarts to stock at the front of the
Walmart store before a hurricane. Big data can help answer
these questions, but they never required “knowing why.”
Big data analysis can be about correlations OR causation—
it all depends, as it has always been, on what question we
are asking, what problem we are solving, and what goal
we are trying to achieve.
Is Correlation Good Enough?
“If millions of electronic medical records reveal that cancer
sufferers who take a certain combination of aspirin and
orange juice see their disease go into remission, then the
exact cause for the improvement in health may be less
important than the fact that they lived. Likewise, if we
can save money by knowing the best time to buy a plane
ticket without understanding the method behind airfare
madness, that’s good enough.”
Risk (The Dark Side of Big Data)
● Privacy Invasion
○ Viewing data at a lower level
○ NSA, GCHQ
○ Dangerous when it falls into the wrong hands

● Minority Report (2002)
○ “If we hold people responsible for predicted future
acts, ones they may never commit, we also deny
that humans have a capacity for moral choice.”
Embracing Big Data
● Data
● Skills
● Ideas (Big Data Mindset)
Things to Be Aware Of
● Data Validity
○ Books you read 10 years ago may no longer be relevant
to Amazon recommendations
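
One common way to handle ageing data is to down-weight old records instead of discarding them. The sketch below uses exponential decay; the two-year half-life and the sample history are assumptions for illustration, and nothing here describes how Amazon actually weights purchases.

```python
def recency_weight(age_years, half_life_years=2.0):
    """Weight that halves every `half_life_years` (hypothetical value)."""
    return 0.5 ** (age_years / half_life_years)

# Made-up reading history: (description, age in years).
history = [
    ("book read this year", 0),
    ("book read 2 years ago", 2),
    ("book read 10 years ago", 10),
]

weights = {title: recency_weight(age) for title, age in history}
for title, weight in weights.items():
    print(f"{title}: {weight:.3f}")
# book read this year: 1.000
# book read 2 years ago: 0.500
# book read 10 years ago: 0.031
```

With these assumptions, a ten-year-old purchase still counts, but at about 3% of the weight of a recent one.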
Questions?
Read the book.
End.
me@zynick.com
