Introduction to ElephantDB


A quick introduction to ElephantDB, a key/value database for exporting data from Hadoop.

Published in: Technology


  1. INTRODUCTION TO ELEPHANTDB
       A distributed key/value data store for exporting data from Hadoop
       Soren Macbeth (@sorenmacbeth)
  2. ANOTHER DATABASE? OH GOD WHY?!?!
       Hadoop is good at batch processing lots of data. Making the results of those batch calculations available to higher layers isn't straightforward.
       This is what ElephantDB does. It is also the only thing that it does.
  3. NOTABLE FEATURES
       • Open source, originally created by Nathan Marz at BackType
       • Written in Clojure
       • Creation of the database index is completely disassociated from serving the index
       • The server is read-only
  4. BENEFITS
       • Simple.
       • Easy to use.
       • Trivially scalable.
  5. DOMAIN CREATION
       • Hadoop Input/OutputFormat
       • Provided Cascading and Cascalog taps
       • Keys and values stored as byte arrays. Serialization left as an exercise to the reader
       • Pluggable persistence engines. LevelDB and BerkeleyDB Java Edition are provided
       • Domains are versioned.
  6. DOMAIN CREATION
  7. EXAMPLE DOMAIN-SPEC.YML
       ---
       coordinator: elephantdb.persistence.LevelDB
       persistence_opts: {}
       shard_count: 60
       shard_scheme: elephantdb.partition.HashModScheme
  8. SERVING DOMAINS
       • ElephantDB servers watch DFS for new versions of domains
       • When a new version is available, servers automatically download it and hot-swap it in
  9. SERVING DOMAINS
  10. GETTING DATA
       • Thrift-based interface
       • Clojure and Python clients provided
       • get and multiGet
  11. SIMPLE CLIENT INTERFACE
       (with-elephant "ip.a.b.c" 3578 client
         (multi-get client "some-domain" [k1 k2 k3 k4]))
       => {k1 v1, k2 v2, k3 v3, k4 v4}
  12. IN PRODUCTION AT YIELDBOT
       • Cluster of 8 m1.xlarge instances
       • 500 GB of data (compressed)
  13. GITHUB
       https://github.com/nathanmarz/elephantdb
  14. QUESTIONS?
  15. YIELDBOT IS HIRING!
       http://yieldbot.com/jobs
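
The example domain spec on slide 7 sets a shard count of 60 and names a hash-mod partitioning scheme. As a rough illustration of the general hash-mod idea (hash the serialized key bytes, then take the remainder modulo the shard count), here is a minimal Python sketch. It is not ElephantDB's actual implementation: the `shard_for_key` and `group_by_shard` helpers and the choice of MD5 as the hash are assumptions made purely for illustration.

```python
import hashlib

SHARD_COUNT = 60  # matches shard_count in the example domain spec


def shard_for_key(key: bytes, shard_count: int = SHARD_COUNT) -> int:
    """Map a serialized key to a shard index in [0, shard_count).

    Hypothetical helper: the real HashModScheme may use a different hash.
    """
    digest = hashlib.md5(key).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    hashed = int.from_bytes(digest[:8], "big")
    return hashed % shard_count


def group_by_shard(keys, shard_count: int = SHARD_COUNT):
    """Group keys by shard index, as a client might before a multi-get."""
    groups = {}
    for key in keys:
        groups.setdefault(shard_for_key(key, shard_count), []).append(key)
    return groups
```

Because the mapping depends only on the key bytes and the shard count, the batch job that builds a domain and any reader that looks keys up later agree on where each key lives, provided both use the same shard count.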
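
The serving behaviour described on slide 8 (watch for new domain versions, then download and hot-swap the latest) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not ElephantDB code: the `DomainServer` class, integer version ids, and the `load_version` callback are all hypothetical names introduced for illustration.

```python
import threading


class DomainServer:
    """Toy read-only server for one domain (hypothetical, not ElephantDB code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._version = None  # version id currently being served
        self._data = {}       # key/value pairs of that version

    def get(self, key):
        # Reads always see one complete version, never a half-loaded one.
        with self._lock:
            return self._data.get(key)

    def maybe_swap(self, available_versions, load_version):
        """Swap in the newest version if it is newer than the current one.

        available_versions: iterable of integer version ids found on the DFS.
        load_version: callable that fetches a version and returns a dict.
        Returns True if a swap happened, False otherwise.
        """
        versions = list(available_versions)
        if not versions:
            return False
        latest = max(versions)
        if self._version is not None and latest <= self._version:
            return False
        new_data = load_version(latest)  # fetch before taking the lock
        with self._lock:
            self._version, self._data = latest, new_data
        return True
```

The new version is fetched in full before the lock is taken, so lookups keep being answered from the old version until the swap, which is a single reference assignment.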
