Active Cloud DB: A RESTful Software-as-a-Service for Language Agnostic Access to Distributed Datastores
Chris Bunch, Jonathan Kupferman, Chandra Krintz
Transcript of "Active Cloud DB at CloudComp '10"

  1. Active Cloud DB: A RESTful Software-as-a-Service for Language Agnostic Access to Distributed Datastores
     Chris Bunch, Jonathan Kupferman, Chandra Krintz
     Wednesday, October 27, 2010, CloudComp 2010
  2. Who’s Using NoSQL?
     and many others!
  3. Do It Yourself!
     • Pick a datastore
     • Learn how the interfaces SHOULD work
     • Learn how the interfaces REALLY work
     • Migrate to a non-relational data model
     • Each of these is non-trivial!
  4. Trouble in Paradise
     (at least they’re honest about it)
  5. The Problem
     • No way to compare databases with real applications
     • No standard for what a real test is
     • Too many variables in the equation
        • Topology, query language, data model, APIs, consistency settings (to name a few)
  6. You Need A Better Way
     • Need a platform to:
        • Easily evaluate datastores
        • Quickly evaluate datastores
        • Evaluate datastores on similar metrics
  7. Our Contribution
     • Active Cloud DB: A Google App Engine app that exposes the DB via REST
     • Exposes a string key/value DB
     • Speeds up repeated operations via caching
     • Works on Google or AppScale
        • Free access to BigTable
  9. Realistically Speaking
     • One test takes ~2 hours
     • In one day at work you could generate a graph comparing:
        • HBase
        • Cassandra
        • Google BigTable
        • Amazon SimpleDB
  10. RESTful Interface
     • GET /resources/key ➜ get
     • POST /resources/key (with value) ➜ put
     • DELETE /resources/key ➜ delete
     • GET /resources ➜ query (get all)
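The four routes above can be sketched in Ruby as a function that builds, but does not send, the matching HTTP request. This is only an illustration: the function names, the form field name `value`, and the assumption that keys are already URL-safe are not taken from the slides, which say only that a put is a POST "with value".

```ruby
require 'net/http'

# Maps one Active Cloud DB operation to an unsent Net::HTTP request.
# Assumption: keys are URL-safe strings, and the value travels in a form
# field named 'value' (the slides do not name the field).
def build_request(operation, key = nil, value = nil)
  if operation != :query && key.nil?
    raise ArgumentError, "#{operation} needs a key"
  end

  case operation
  when :get
    Net::HTTP::Get.new("/resources/#{key}")
  when :put
    request = Net::HTTP::Post.new("/resources/#{key}")
    request.set_form_data('value' => value)
    request
  when :delete
    Net::HTTP::Delete.new("/resources/#{key}")
  when :query
    Net::HTTP::Get.new('/resources')
  else
    raise ArgumentError, "unknown operation: #{operation}"
  end
end

# Sends a prepared request; host and port are whatever your deployment uses.
def send_request(host, port, request)
  Net::HTTP.start(host, port) { |http| http.request(request) }
end
```

Keeping request construction separate from sending makes the verb-to-operation mapping easy to check without a running deployment.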
  11. Caching Support
     • Leverages Memcache API / memcached
     • Provides a Least-Recently-Used Cache
     • Write-through caching strategy - all puts / deletes are written to the cache
     • Generational caching strategy - queries use a generation number
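A minimal sketch of how the two strategies can fit together, assuming plain Ruby hashes in place of memcached and the datastore. The class name, the cache key prefixes, and the choice to evict an item's cache entry on delete are illustrative assumptions, not the Active Cloud DB implementation. LRU eviction is left out because memcached would handle it.

```ruby
# In-memory stand-ins: @store plays the datastore, @cache plays memcached.
class CachedStore
  attr_reader :store_reads

  def initialize
    @store = {}
    @cache = {}
    @generation = 0    # bumped on every write; part of the query cache key
    @store_reads = 0   # counts reads that had to go to the datastore
  end

  # Write-through: the datastore and the cache are updated together.
  def put(key, value)
    @store[key] = value
    @cache["item:#{key}"] = value
    @generation += 1
  end

  # Deletes also reach the cache (here by evicting the item's entry).
  def delete(key)
    @store.delete(key)
    @cache.delete("item:#{key}")
    @generation += 1
  end

  def get(key)
    cache_key = "item:#{key}"
    return @cache[cache_key] if @cache.key?(cache_key)

    @store_reads += 1
    value = @store[key]
    @cache[cache_key] = value unless value.nil?
    value
  end

  # Generational: results are cached under the current generation number,
  # so any write makes older cached query results unreachable.
  def query
    cache_key = "query:#{@generation}"
    return @cache[cache_key] if @cache.key?(cache_key)

    @store_reads += 1
    @cache[cache_key] = @store.dup
  end
end
```

Stale query results are never explicitly invalidated here; they simply stop being looked up, and a real LRU cache would eventually evict them.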
  12. Bookstore App
     • Four prototypes available that use Active Cloud DB:
        • Ruby on Rails
        • Ruby (through Sinatra)
        • Python (via Django)
        • Python (through web.py)
  14. The Actual Code
     • With BigTable:
        • val = `curl -X GET http://your-app.appspot.com/resources/#{key}`
     • Or in AppScale:
        • val = `curl -X GET http://128.111.55.223:8080/resources/#{key}`
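The one-liners above interpolate the key straight into a shell command, which breaks if the key contains spaces or shell metacharacters. One possible alternative, sketched here with Ruby's standard library, is to percent-encode the key and call the service without a shell. The helper names `resource_uri` and `fetch_value` are hypothetical.

```ruby
require 'net/http'
require 'uri'
require 'erb'

# Builds the resource URI for a key. The key is percent-encoded so that
# spaces, slashes, or shell metacharacters cannot change the request.
def resource_uri(base_url, key)
  URI.join(base_url, "/resources/#{ERB::Util.url_encode(key)}")
end

# Fetches the stored value and returns the response body as a string.
# The same call works for either deployment; only base_url differs.
def fetch_value(base_url, key)
  Net::HTTP.get(resource_uri(base_url, key))
end
```

For example, `fetch_value('http://your-app.appspot.com', key)` would target the Google-hosted deployment, and passing the AppScale address instead would target that one.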
  15. AppScale
     • Originally presented at CloudComp 2009
     • An open-source implementation of the Google App Engine APIs
     • Automatically configures and deploys cloud infrastructures to run your application
        • Includes database deployment
  16. Supported Datastores as of AppScale 1.4:
     • HBase, Hypertable
     • MySQL
     • Cassandra, Voldemort, Scalaris
     • MongoDB
     • MemcacheDB
     • Amazon SimpleDB
  18. Not Good Enough
     • AppScale / GAE solve the problem for Python and Java
     • But only with certain APIs
     • And with certain restrictions
     • Need something general purpose
        • All languages, no restrictions
  19. But how do we test it?
     • Cassandra 0.5.0 / MemcacheDB 1.2.1β
     • Place 1000 items in the database and time:
        • Get, put, query, delete operations
     • Nine accessor threads
     • Standard deployment model
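The batch test described above could be sketched roughly as follows. This is not the authors' harness: the in-memory `FakeStore` stands in for the datastore behind Active Cloud DB, the round of timed phases and the key and value formats are invented for illustration, and timing the query once rather than once per thread is an assumption.

```ruby
# Thread-safe in-memory stand-in for the datastore under test.
class FakeStore
  def initialize
    @data = {}
    @lock = Mutex.new
  end

  def put(key, value)
    @lock.synchronize { @data[key] = value }
  end

  def get(key)
    @lock.synchronize { @data[key] }
  end

  def delete(key)
    @lock.synchronize { @data.delete(key) }
  end

  def query
    @lock.synchronize { @data.dup }
  end

  def size
    @lock.synchronize { @data.size }
  end
end

# Returns the wall-clock seconds taken by the given block.
def elapsed_seconds
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Splits the keys across `threads` workers and times the whole phase.
def timed_phase(keys, threads, &work)
  slice_size = [(keys.size / threads.to_f).ceil, 1].max
  elapsed_seconds do
    workers = keys.each_slice(slice_size).map do |slice|
      Thread.new { slice.each { |key| work.call(key) } }
    end
    workers.each(&:join)
  end
end

# Runs put, get, query, and delete phases and returns seconds per phase.
def run_batch(store, items: 1000, threads: 9)
  keys = (1..items).map { |i| "key#{i}" }
  {
    put:    timed_phase(keys, threads) { |k| store.put(k, "value-#{k}") },
    get:    timed_phase(keys, threads) { |k| store.get(k) },
    query:  elapsed_seconds { store.query },
    delete: timed_phase(keys, threads) { |k| store.delete(k) }
  }
end
```

Swapping `FakeStore` for a thin client that issues the REST calls would turn the same loop into a test of a real deployment.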
  23. A different type of test
     • Workload model
     • 10000 random operations selected
     • 50/30/20 get/put/query ratio
     • Constrained to 16 nodes
     • Performed on initially empty database
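One way to generate such a workload is sketched below. It assumes a seeded random number generator for repeatability, a hypothetical key space of 1000 keys, and a plain hash standing in for the initially empty database. The 16-node constraint concerns the deployment and is not modeled.

```ruby
# Draws one operation according to the 50/30/20 get/put/query mix.
def pick_operation(rng)
  roll = rng.rand(100) # integer in 0..99
  if roll < 50
    :get
  elsif roll < 80
    :put
  else
    :query
  end
end

# Generates the operation sequence; a fixed seed makes the run repeatable.
def generate_workload(count: 10_000, seed: 42)
  rng = Random.new(seed)
  Array.new(count) { pick_operation(rng) }
end

# Replays the workload against an initially empty hash that stands in
# for the datastore. key_space is an illustrative assumption.
def run_workload(operations, key_space: 1000, seed: 7)
  rng = Random.new(seed)
  store = {}
  operations.each do |op|
    key = "key#{rng.rand(key_space)}"
    case op
    when :get   then store[key]
    when :put   then store[key] = 'value'
    when :query then store.values
    end
  end
  store
end
```

Because the database starts empty, early gets mostly miss; the hit rate rises as puts accumulate, which is part of what distinguishes this test from the batch test.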
  27. Future Work
     • Performance impact of:
        • Cache size
        • Millions of items in DB
     • Overhead of Active Cloud DB
     • Transaction support
  28. Related Work
     • BigTable as a Web Service
        • Not open source, HBase-like API
     • Yahoo Cloud Serving Benchmark [SOCC10]
        • Doesn’t run applications
        • No automation - you set up the DB, you set up the schemas, etc.
  29. Active Cloud DB is Open for Business
     • Open source - free to use
     • Customize your own batch test or workload test
     • Access it via any programming language
     • Bookstore applications included
  30. Thanks!
     • Download Active Cloud DB and AppScale:
        • http://appscale.cs.ucsb.edu
     • To my advisor, Chandra Krintz
     • To the AppScale team, especially co-lead Navraj Chohan