120412 oracle big data summit

Published in: Technology, Business

  1. Big Data in Practice, and why current technology does not (always) suffice. Friso van Vollenhoven, fvanvollenhoven@xebia.com
  2. Big Data
  3. Big Data
  4. Big Data
  5. Big Data. Requirement: full table scan, 200 GB table
  6. Big Data
  7. Big Data. Egypt, 27 January 2011
  8. Big Data. Requirement: 40,000 updates per second, 24/7.
  9. Databases = + +
  10. Databases = + + (network, SAN, storage)
  11. HDFS and MapReduce. SELECT SESSION, COUNT(*) FROM WEB_CLICKS GROUP BY SESSION; (diagram labels: client, storage, network, bottleneck)
  12. HDFS and MapReduce. SELECT SESSION, COUNT(*) FROM WEB_CLICKS GROUP BY SESSION; (diagram labels: client, storage, network, bottleneck)
  13. HDFS and MapReduce
  14. HDFS and MapReduce. SELECT * FROM WEB_CLICKS; (shown three times, once per node)
  15. HDFS and MapReduce. GROUP BY SESSION
  16. HDFS and MapReduce. COUNT(*) (shown three times, once per node)
  17. HDFS and MapReduce. MAP: SELECT * FROM WEB_CLICKS; (three nodes). SORT/SHUFFLE: GROUP BY SESSION. REDUCE: COUNT(*) (three nodes)
  18. NoSQL. Index: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
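
The HDFS and MapReduce slides split the query SELECT SESSION, COUNT(*) FROM WEB_CLICKS GROUP BY SESSION into a map step (each node reads its own rows), a sort/shuffle step (group by session), and a reduce step (count). The sketch below imitates those three steps in plain Python on a single machine. It is not Hadoop code: the sample rows, the three-way partitioning, and all function names are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical WEB_CLICKS rows, pre-split over three "nodes" (an assumption;
# in HDFS the blocks of a file would be spread over the cluster).
PARTITIONS = [
    [{"session": "s1", "url": "/"}, {"session": "s2", "url": "/a"}],
    [{"session": "s1", "url": "/b"}, {"session": "s3", "url": "/"}],
    [{"session": "s1", "url": "/c"}, {"session": "s2", "url": "/d"}],
]


def map_phase(rows):
    """MAP: scan the local rows and emit one (session, 1) pair per click."""
    return [(row["session"], 1) for row in rows]


def shuffle(mapped_partitions):
    """SORT/SHUFFLE: bring all values for the same session key together."""
    groups = defaultdict(list)
    for pairs in mapped_partitions:
        for session, one in pairs:
            groups[session].append(one)
    return dict(sorted(groups.items()))


def reduce_phase(groups):
    """REDUCE: COUNT(*) per session, computed by summing the emitted ones."""
    return {session: sum(ones) for session, ones in groups.items()}


def count_clicks_per_session(partitions):
    # Each partition is mapped independently, as it would be on separate nodes.
    mapped = [map_phase(rows) for rows in partitions]
    return reduce_phase(shuffle(mapped))


RESULT = count_clicks_per_session(PARTITIONS)
print(RESULT)
```

In this toy version everything runs in one process. The point of the slides is that the map step runs where the data is stored, so only the small (session, 1) pairs cross the network rather than the full table.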