Introduction to ElephantDB: a distributed key/value data store for exporting data from Hadoop
Soren Macbeth (@sorenmacbeth)

A quick introduction to ElephantDB, a key/value database for exporting data from Hadoop.

Published in: Technology

  1. INTRODUCTION TO ELEPHANTDB
     A distributed key/value data store for exporting data from Hadoop
     Soren Macbeth (@sorenmacbeth)
  2. ANOTHER DATABASE? OH GOD WHY?!?!
     • Hadoop is good at batch processing lots of data. Making the results of those batch calculations available to higher layers isn't straightforward.
     • This is what ElephantDB does. It is also the only thing that it does.
  3. NOTABLE FEATURES
     • Open source, originally created by Nathan Marz at BackType
     • Written in Clojure
     • Creation of the database index is completely disassociated from serving the index
     • The server is read-only
  4. BENEFITS
     • Simple.
     • Easy to use.
     • Trivially scalable.
  5. DOMAIN CREATION
     • Hadoop Input/OutputFormat
     • Provided Cascading and Cascalog taps
     • Keys and values stored as byte arrays. Serialization left as an exercise to the reader
     • Pluggable persistence engines. LevelDB and BerkeleyDB Java Edition are provided
     • Domains are versioned.
  6. DOMAIN CREATION
  7. EXAMPLE DOMAIN-SPEC.YML
     ---
     coordinator: elephantdb.persistence.LevelDB
     persistence_opts: {}
     shard_count: 60
     shard_scheme: elephantdb.partition.HashModScheme
  8. SERVING DOMAINS
     • ElephantDB servers watch DFS for new versions of domains
     • When a new version is available, servers automatically download and hot-swap in the latest version
  9. SERVING DOMAINS
  10. GETTING DATA
     • Thrift-based interface
     • Clojure and Python clients provided
     • get and multiGet
  11. SIMPLE CLIENT INTERFACE
     (with-elephant "ip.a.b.c" 3578 client
       (multi-get client "some-domain" [k1 k2 k3 k4]))
     => {k1 v1, k2 v2, k3 v3, k4 v4}
  12. IN PRODUCTION AT YIELDBOT
     • 8 m1.xlarge instance cluster
     • 500GB of data (compressed)
  13. GITHUB
     https://github.com/nathanmarz/elephantdb
  14. QUESTIONS?
  15. YIELDBOT IS HIRING!
     http://yieldbot.com/jobs
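
The example domain spec names a HashModScheme with a shard count of 60, and the client slide shows a multi-get over several keys. The slides do not say how the hash is computed, so the following is only a rough sketch of hash-mod partitioning in general, not ElephantDB's implementation. The MD5-based hash, the in-memory dicts standing in for LevelDB shards, and the names `shard_for_key`, `build_shards`, and `multi_get` are all assumptions made for illustration.

```python
import hashlib

SHARD_COUNT = 60  # shard_count from the example domain-spec.yml


def shard_for_key(key: bytes, shard_count: int = SHARD_COUNT) -> int:
    """Hash-mod partitioning: a stable hash of the key bytes, modulo shard count.

    MD5 is an assumption chosen for illustration; the slides do not state
    which hash function HashModScheme uses.
    """
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def build_shards(pairs, shard_count: int = SHARD_COUNT):
    """Batch side: split key/value pairs into per-shard dicts.

    Each dict is a hypothetical stand-in for one on-disk shard.
    """
    shards = [dict() for _ in range(shard_count)]
    for key, value in pairs:
        shards[shard_for_key(key, shard_count)][key] = value
    return shards


def multi_get(shards, keys):
    """Read side: look up each key in the shard its hash selects.

    Keys that are not present map to None.
    """
    return {
        key: shards[shard_for_key(key, len(shards))].get(key)
        for key in keys
    }


if __name__ == "__main__":
    shards = build_shards([(b"k1", b"v1"), (b"k2", b"v2")])
    print(multi_get(shards, [b"k1", b"k2", b"k3"]))
```

Keys and values are bytes here because the slides say both are stored as byte arrays. Because the writer and the reader apply the same function to the same key bytes, a reader can find the right shard without any lookup table, which is one plausible reading of the "trivially scalable" claim.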