HBase at Mendeley

by Dan Harvey, Lead Engineer, Data Mining at Mendeley, on Nov 24, 2010

  • 11,103 views

The details behind how and why we use HBase in the data mining team at Mendeley.

Statistics

Views

Total Views
11,103
Views on SlideShare
10,351
Embed Views
752

Actions

Likes
12
Downloads
138
Comments
4

6 Embeds 752

http://danharvey.wordpress.com 523
http://lanyrd.com 214
http://www.linkedin.com 6
http://twitter.com 4
http://paper.li 3
http://webcache.googleusercontent.com 2

Upload Details

Uploaded via SlideShare as Adobe PDF

Usage Rights

© All Rights Reserved

Comments

  • danharvey (Dan Harvey, Lead Engineer, Data Mining at Mendeley), 1 year ago:
    I believe HBase has now improved to the point where you can serve directly from it. You still need to be careful with load management on your cluster, though, so that MapReduce tasks don't soak up all the I/O!
  • AlexMcLintock (Alex McLintock, CTO, Tech Guru at Openweb Analysts Ltd), 1 year ago:
    Dan: Would you still use Voldemort as an HBase front-end cache now, or something else?
  • danharvey (Dan Harvey, Lead Engineer, Data Mining at Mendeley), 3 years ago:
    I was asked this during the presentation and on Twitter: How come you are putting data in #Voldemort for serving #Mendeley data? Why not serve it directly from #HBase?

    This is mostly because, at this point in time, we don't have as much data to serve as we have to process, so we can get away with less hardware right now. We tried it out, and serving from HBase works fine as long as you are not running a lot of MapReduce jobs over it, as we do. Over time, as we grow and add more features that use HBase in more interesting ways, I'm sure we'll be using it for serving too. At that point we'll need a cluster just for serving, and we'll use the replication in 0.90 to link the two clusters together. That is far cleaner than writing your own code to do it.
  • rdmpage (Roderic Page, Professor of Taxonomy at University of Glasgow), 3 years ago:
    Great to see some technical details about the Mendeley back-end.

HBase at Mendeley: Presentation Transcript