#bigdataMY

Volume
Velocity
Variety

Feeds and notifications
Insights                     Change detection
Recommendation & Matching    Change reaction
Security                     Audit
Monitoring & Reporting
Event logging

Get ahead of the curve

[Figure: scatter plot of points (Ø) with clusters labeled "Normal", points labeled "Noise", and, in the second build of the slide, regions labeled "Concept drift" and "New concept". Source: J Gama, University of Porto]

"Big Data is much more likely to catch the black swan as it swoops in"
- Norman Nie, Revolution Analytics

Acunu Analytics




UserID
EMEA
 UK
  London
    N1
     Female
       16-21 year old
     16-21 year old
       Female
  16-21 year old
    Female
     London

Under the hood

      21:00   all = 1345    :00 = 45      :01 = 62     ...
      22:00   all = 3221    :00 = 22      :01 = 19     ...
       ...                                             ...

      UK      all = 228     user01 = 1    user14 = 12  ...
      US      all = 354     user01 = 15   user14 = 0   ...
      MY      all = 28      user01 = 0    user02 = 0   ...
       ...

Under the hood

{
    cust_id:      user01,
    session_id:   102,
    geography:    UK,
    browser:      IE,
    time:         22:01,
}

      21:00   all = 1345       :00 = 45        :01 = 62      ...
      22:00   all = 3221 +1    :00 = 22        :01 = 19 +1   ...
       ...                                                   ...

      UK      all = 228 +1     user01 = 1 +1   user14 = 12   ...
      US      all = 354        user01 = 15     user14 = 0    ...
      MY      all = 28         user01 = 0      user02 = 0    ...
       ...

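
A minimal sketch of the update shown on this slide, assuming the counters live in an in-memory nested dictionary (the slides do not say how Acunu Analytics actually stores them). The function name `record_event` and the dictionary layout are hypothetical; only the event fields and the counter values come from the slide.

```python
from collections import defaultdict

# counters[row][column] -> count, mirroring the rows and columns on the slide.
# This in-memory layout is an assumption made for illustration only.
counters = defaultdict(lambda: defaultdict(int))

# Seed the two rows that the example event touches, using the slide's values.
counters["22:00"].update({"all": 3221, ":00": 22, ":01": 19})
counters["UK"].update({"all": 228, "user01": 1, "user14": 12})


def record_event(counters, event):
    """Increment every pre-aggregated counter that one event contributes to."""
    hour, minute = event["time"].split(":")

    # Time dimension: one row per hour, one column per minute plus a total.
    hour_row = f"{hour}:00"
    counters[hour_row]["all"] += 1
    counters[hour_row][f":{minute}"] += 1

    # Geography dimension: one row per country, one column per user plus a total.
    geo_row = event["geography"]
    counters[geo_row]["all"] += 1
    counters[geo_row][event["cust_id"]] += 1


event = {
    "cust_id": "user01",
    "session_id": 102,
    "geography": "UK",
    "browser": "IE",
    "time": "22:01",
}
record_event(counters, event)

print(counters["22:00"]["all"], counters["22:00"][":01"])  # 3222 20
print(counters["UK"]["all"], counters["UK"]["user01"])     # 229 2
```

Each event costs a handful of increments at write time, so reads never have to scan raw events.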
Under the hood

where time 21:00 - 22:00
  count(*)

Under the hood

where time 21:00 - 23:00
  count(*)

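
A rough sketch of how the two queries above could be answered from the hourly rows alone, without touching raw events. It assumes the time range is half-open (21:00 - 23:00 covers the 21:00 and 22:00 rows), which the slides do not state. The function name `count_events` and the dictionary layout are hypothetical.

```python
# Hourly rows with the values from the slide; the dict layout is an assumption.
counters = {
    "21:00": {"all": 1345, ":00": 45, ":01": 62},
    "22:00": {"all": 3221, ":00": 22, ":01": 19},
}


def count_events(counters, start_hour, end_hour):
    """count(*) for start_hour <= time < end_hour, summing per-hour totals."""
    total = 0
    for hour in range(start_hour, end_hour):
        row = counters.get(f"{hour:02d}:00", {})
        total += row.get("all", 0)  # hours with no data contribute nothing
    return total


print(count_events(counters, 21, 22))  # 1345
print(count_events(counters, 21, 23))  # 4566
```

The cost of the query grows with the number of hourly rows in the range, not with the number of events recorded.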
Little Trouble with Big Disks

COTS Journal, 2008




Streaming algorithms

        A = [a1, a2, a3, a4, a5]
mean(A) = sum it up / number of things

    now add another item a6...???
          sum = sum + a6
       inc(number of things)


        try this with median?
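
The contrast between the two cases can be sketched as follows. This is an illustration, not code from the talk: `RunningMean` keeps only the sum and the count, as the slide describes, while `RunningMedian` uses the common two-heap technique (an assumption here, not something the slides mention) and has to retain every item it has seen. Both class names are hypothetical.

```python
import heapq


class RunningMean:
    """Mean over a stream using constant state: a sum and a count."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, x):
        self.total += x   # sum = sum + a6
        self.count += 1   # inc(number of things)

    def mean(self):
        if self.count == 0:
            raise ValueError("no items seen yet")
        return self.total / self.count


class RunningMedian:
    """Exact median over a stream; state grows with every item added."""

    def __init__(self):
        self._low = []    # max-heap of the smaller half (values stored negated)
        self._high = []   # min-heap of the larger half

    def add(self, x):
        # Route x through the lower half so the largest "low" value moves up,
        # then rebalance so the lower half is never the shorter one.
        heapq.heappush(self._low, -x)
        heapq.heappush(self._high, -heapq.heappop(self._low))
        if len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        if not self._low:
            raise ValueError("no items seen yet")
        if len(self._low) > len(self._high):
            return -self._low[0]
        return (-self._low[0] + self._high[0]) / 2


mean = RunningMean()
for a in [1, 2, 3, 4, 5]:
    mean.add(a)
mean.add(6)               # adding a6 touches two numbers, nothing else
print(mean.mean())        # 3.5

median = RunningMedian()
for a in [5, 1, 3, 4]:
    median.add(a)
print(median.median())    # 3.5
```

The mean needs two numbers no matter how long the stream is. An exact median has no such fixed-size summary in general, so systems that need constant memory typically settle for an approximate median.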



Realtime tradeoffs

[Diagram: three labels arranged as a triangle: "High-velocity", "Ad-hoc", "High-volume"]

Conclusion

    Big Data is also about the Little Things, done fast.

    The devil is in the details.

    Make it accessible.

Q?
