
Hadoop and its applications at the State and University Library, SCAPE Information Day, 25 June 2014


Per Møldrup-Dalum introduced how the State and University Library in Denmark has deployed Hadoop in connection with the SCAPE project. With Hadoop, the library has been able to process large amounts of data much faster than was previously possible.
The presentation was given at the ‘SCAPE Information Day at the State and University Library, Denmark’ on 25 June 2014. The information day introduced the EU-funded project SCAPE (Scalable Preservation Environments) and its tools and services to the participants. For more information about the event, see this blog post: http://bit.ly/SCAPE_SB_Demo.

Published in: Technology, Education

Transcript of "Hadoop and its applications at the State and University Library, SCAPE Information Day, 25 June 2014"

  1. Hadoop and its applications at the State and University Library. Per Møldrup-Dalum, State and University Library. SCAPE Information Day, State and University Library, Denmark, 2014-06-25.
  2. Agenda
     • A bit on Hadoop in general
     • A bit on our experience in deploying Hadoop at the library
     This work was partially supported by the SCAPE Project. The SCAPE project is co‐funded by the European Union under FP7 ICT‐2009.4.1 (Grant Agreement number 270137).
  3. Origins
     • MapReduce: Simplified Data Processing on Large Clusters, Jeffrey Dean and Sanjay Ghemawat, 2004
     • In 2005 Cutting and Cafarella created Hadoop at Yahoo!
     • Now an Apache project
     • Commercial distributions, community editions, DIY
  4. Map/Reduce (slide labels: MAP, REDUCE)
  5. Lorem ipsum
     • Count addresses that have fruits etc. in their street name
       • Kirsebærhaven
       • Jordbærvej
       • Nødde allé
     • Result
       • Kirsebær (cherry): 1203
       • Nødder (nuts): 34
       • Jordbær (strawberry): 543
  6. The Zoo
     • HDFS – data locality
     • MapReduce
     •••
  7. Hadoop at the Library
  8. Can it be done?
     • Blade servers with no local storage
     • Storage exclusively on NAS
     • We've done several experiments
     Existing infrastructure: CPU, Storage
  9. Cluster topology
     4 CPU nodes
     • Two 6-core CPUs
     • Intel® Xeon® Processor X5670 with 12M Cache, 2.93 GHz, and 6.40 GT/s Intel® QPI
     • 96GB RAM
     • 2Gbit Ethernet interface
     • CentOS
     • NFS mount point on NAS for HDFS
     • Reachable NAS storage: ~4PB
     Science Museum/Science & Society Picture Library
  10. Cloudera Hadoop Distribution
  11. Interface
  12. References
     • http://hadoop.apache.org
     • http://www.cloudera.com
     • http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
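
The fruit-counting example from slide 5 maps naturally onto the map and reduce phases shown on slide 4. The following is a minimal single-process sketch in plain Python, not the library's actual Hadoop job: the stem table `FRUIT_STEMS`, the function names, and the sample street list are all assumptions made for illustration, and a real job would run the same two phases distributed across the cluster.

```python
from collections import defaultdict

# Hypothetical lookup from a lowercase stem to the label used in the result.
FRUIT_STEMS = {
    "kirsebær": "Kirsebær",
    "jordbær": "Jordbær",
    "nødde": "Nødder",
}


def map_address(street_name):
    """Map phase: emit a (fruit, 1) pair for each fruit stem in a street name."""
    lowered = street_name.lower()
    for stem, label in FRUIT_STEMS.items():
        if stem in lowered:
            yield (label, 1)


def shuffle(pairs):
    """Group emitted values by key, as the framework does between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce phase: sum the counts emitted for one fruit."""
    return key, sum(values)


def run_job(street_names):
    """Run map, shuffle, and reduce in sequence over an iterable of street names."""
    mapped = (pair for name in street_names for pair in map_address(name))
    return dict(reduce_counts(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    # Made-up sample input; the counts on slide 5 come from the real address data.
    sample = ["Kirsebærhaven", "Jordbærvej", "Nødde allé", "Kirsebærhaven", "Hovedgaden"]
    print(run_job(sample))
```

Because each street name is mapped independently and each fruit is reduced independently, both phases can be split across nodes without changing the result.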