An evaluation of distributed datastores using AppScale Cloud Platform

Presentation Transcript

  • Slide 1: An Evaluation of Distributed Datastores Using the AppScale Cloud Platform
      - Presented by: Himanshu Ranjan Vaishnav, TE-42065 (Comp-I)
      - Seminar guide: Prof. Mrs. S. S. Sonawani
      - Date: 04/01/13
  • Slide 2: What is AppScale?
      - AppScale is an open-source implementation of the Google App Engine cloud platform.
      - It extends the non-scalable software development kit that Google makes available for testing and debugging applications.
      - AppScale currently supports the HBase, Hypertable, Cassandra, Voldemort, MongoDB, MemcacheDB, Scalaris, and MySQL Cluster datastores.
  • Slide 3: What Does AppScale Do?
      - AppScale is a robust, open-source implementation of the Google App Engine APIs that executes over private virtualized cluster resources and cloud infrastructures, including Amazon Web Services and Eucalyptus.
      - Users can execute their existing Google App Engine applications over AppScale without modification.
      - AppScale automates the deployment and simplifies the configuration of datastores that implement the API, and it facilitates comparing and evaluating their end-to-end performance using real programs (Google App Engine applications).
  • Slide 4: AppScale Features
      - More choices of datastores
      - MapReduce
      - App Engine portability
      - Neptune language
      - Fault tolerance
      - And more
  • Slide 5: Google App Engine
      - A software development platform
      - Platform-as-a-service (PaaS)
      - GAE Datastore
      - Bigtable
      - A master/slave relationship
  • Slide 6: Google App Engine (continued)
      - The GAE Datastore API provides the following primitives:
          - Put(k, v): Add key k and value v to the table, creating the table if needed.
          - Get(k): Return the value associated with key k.
          - Delete(k): Remove key k and its value.
          - Query(q): Perform query q, written in the Google Query Language (GQL), on a single table and return a list of values.
          - Count(t): For a given query, return the size of the list of values returned.
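
The five primitives on slide 6 can be illustrated with a minimal in-memory sketch. This is not the GAE Datastore API or AppScale code: the class name `InMemoryDatastore` is hypothetical, the explicit `table` argument is an assumption (the slide's signatures leave the table implicit), and a Python predicate stands in for a GQL query string.

```python
from typing import Any, Callable, Dict, List, Optional


class InMemoryDatastore:
    """Hypothetical stand-in for a datastore exposing the five primitives."""

    def __init__(self) -> None:
        # Maps table name -> {key -> value}.
        self._tables: Dict[str, Dict[str, Any]] = {}

    def put(self, table: str, key: str, value: Any) -> None:
        # Create the table on first use, then add or overwrite the key.
        self._tables.setdefault(table, {})[key] = value

    def get(self, table: str, key: str) -> Optional[Any]:
        # Return the stored value, or None if the table or key is missing.
        return self._tables.get(table, {}).get(key)

    def delete(self, table: str, key: str) -> None:
        # Remove the key and its value; a missing key is ignored.
        self._tables.get(table, {}).pop(key, None)

    def query(self, table: str, predicate: Callable[[Any], bool]) -> List[Any]:
        # Assumption: a predicate replaces GQL. Single table only, as on the slide.
        return [v for v in self._tables.get(table, {}).values() if predicate(v)]

    def count(self, table: str, predicate: Callable[[Any], bool]) -> int:
        # Size of the list that the same query would return.
        return len(self.query(table, predicate))


if __name__ == "__main__":
    store = InMemoryDatastore()
    store.put("users", "alice", {"age": 30})
    store.put("users", "bob", {"age": 17})
    print(store.get("users", "alice"))
    print(store.count("users", lambda v: v["age"] >= 18))
```

Note that `count` here simply runs the query and measures the result, which mirrors the slide's definition but would be expensive on a large table.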
  • Slide 7: Google App Engine APIs
      - Blobstore API
      - Users API
      - Channel API
      - URL Fetch API
      - Datastore API
      - XMPP API
      - Images API
      - MapReduce Streaming API
      - Memcache API
      - EC2 API
      - Namespace API
      - Task Queue API
  • Slide 8: AppScale Deployment
      - AS: App Server
      - ALB: App Load Balancer
      - DBS: Database Slave Peer
      - DBM: Database Master Peer
  • Slide 9: Multi-tiered approach within AppScale
  • Slide 10: Database Services
      - Protocol Buffer Server (PBServer)
      - User/App Server (UAServer)
      - Blobstore service
      - Monitoring services
      - Neptune
  • Slide 11: AppScale Distributed Database Support
      - Cassandra
      - HBase
      - Hypertable
      - MemcacheDB
      - MongoDB
      - Voldemort
      - MySQL
  • Slide 12: 1. Cassandra
      - Designed, implemented, and released by Facebook engineers
      - A hybrid approach, combining ideas from Google's Bigtable and Amazon's Dynamo
      - Eventually consistent
      - Written in Java; exposes its API through the Thrift software framework
      - Supports range queries
  • Slide 13: 2. HBase
      - Developed and released by Powerset
      - An official Hadoop subproject
      - Employs a master-slave distributed architecture
      - Provides flexible column support
      - Written primarily in Java, with a small portion of the code base in C
      - Deployed over the Hadoop Distributed File System (HDFS)
  • Slide 14: 3. Hypertable
      - Developed by Zvents
      - Aims to provide an open-source version of Google's Bigtable
      - Written in C++
      - Table data is served by RangeServer processes
  • Slide 15: 4. MemcacheDB
      - Developed by open-source developer Steve Chu
      - Employs a master-slave approach
      - Runs with a single master node and multiple replica nodes
      - Written in C; uses Berkeley DB
  • Slide 16: 5. MongoDB
      - Developed and released by 10gen
      - Aims to provide both speed and scalability
      - Written in C++
      - Queries are performed using a hashtable
  • Slide 17: 6. Voldemort
      - Developed by, and currently in use internally at, LinkedIn
      - Eventual consistency
      - More developer-friendly
      - Written in Java; exposes its API via Thrift
  • Slide 18: 7. MySQL
      - A well-known relational database
      - AppScale employs MySQL Cluster
      - Provides concurrent access to the system
      - Written in C and C++
  • Slide 19: Evaluation
      - Load tables in all databases with 1,000 items.
      - Test specifics:
          - On each database, perform put, get, delete, and no-op operations.
          - Load levels: light (one thread), medium (three concurrent threads), heavy (nine concurrent threads).
          - Repeat each experiment five times.
      - Execute this application in an AppScale cloud.
      - Each node runs with 2 virtual processors, up to 10 GB of disk, and 4 GB of memory.
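
The test plan on slide 19 can be sketched as a small harness. This is an illustration only, not the benchmark application used in the evaluation: it times operations against a lock-protected in-memory dictionary rather than a real datastore, and the function names and the `calls` parameter (requests issued per thread, which the slide does not state) are assumptions.

```python
import threading
import time
from statistics import mean
from typing import Callable, Dict

NUM_ITEMS = 1000                               # items loaded per table (from the slide)
REPEATS = 5                                    # repetitions per experiment (from the slide)
LOADS = {"light": 1, "medium": 3, "heavy": 9}  # concurrent threads per load level


def timed_run(op: Callable[[int], None], threads: int, calls: int) -> float:
    """Run op(key) `calls` times in each of `threads` threads; return seconds."""
    def worker() -> None:
        for i in range(calls):
            op(i % NUM_ITEMS)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    start = time.perf_counter()
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return time.perf_counter() - start


def benchmark(calls: int = 100) -> Dict[str, Dict[str, float]]:
    """Return mean elapsed seconds for each operation at each load level."""
    # Assumption: a dict guarded by a lock stands in for the datastore under test.
    table = {i: f"value-{i}" for i in range(NUM_ITEMS)}
    lock = threading.Lock()

    def put(k: int) -> None:
        with lock:
            table[k] = "updated"

    def get(k: int) -> None:
        with lock:
            table.get(k)

    def delete(k: int) -> None:
        with lock:
            table.pop(k, None)

    def no_op(k: int) -> None:
        pass  # Measures harness overhead only.

    ops = {"put": put, "get": get, "delete": delete, "no-op": no_op}
    return {
        name: {
            load: mean(timed_run(op, n, calls) for _ in range(REPEATS))
            for load, n in LOADS.items()
        }
        for name, op in ops.items()
    }


if __name__ == "__main__":
    for name, by_load in benchmark().items():
        print(name, {load: f"{secs:.4f}s" for load, secs in by_load.items()})
```

Including the no-op case gives a baseline for thread start-up and loop overhead, so the cost of each real operation can be read relative to it.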
  • Slide 20: Experimental Results
  • Slide 21: Limitations
      - Affected components:
          - Persistence
          - Blobstore maximum file size
          - Datastore
          - Task Queue
          - Mail
      - Other limitations:
          - Retrieving the entire table to run a query
          - The source code of the Java App Engine server has not been released
          - Follows a "deploy on all nodes" approach
          - Limited distribution supported
  • Slide 22: Future Work
      - Expand out of the web services domain:
          - Investigate opportunities in streaming
          - Integrate MapReduce support for high-performance computing (HPC)
          - Co-locate AppEngines and use shared memory
      - Additional databases:
          - MongoDB, Scalaris, CouchDB
  • Slide 23: Future Work (continued)
      - Extend AppScale with new services for:
          - Large-scale data analytics
          - Data- and computation-intensive tasks
      - Cloud-agnostic operation
      - Integration of mobile devices
  • Slide 24: Conclusion
      - The work presents an open-source implementation of the Google App Engine (GAE) Datastore API within a cloud platform called AppScale.
      - The implementation unifies access to a wide range of open-source distributed database technologies and automates their configuration and deployment. However, each database differs in the degree to which it implements the APIs.
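
The idea of unified access over backends that implement the API to different degrees can be sketched with a common interface and an optional capability. Everything named here (`DatastoreAdapter`, `HashBackend`, `OrderedBackend`, `load`) is hypothetical and chosen for illustration; it does not reflect AppScale's actual classes or any specific database.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class DatastoreAdapter(ABC):
    """Hypothetical common interface that every backend must satisfy."""

    @abstractmethod
    def put(self, key: str, value: str) -> None: ...

    @abstractmethod
    def get(self, key: str) -> Optional[str]: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...

    def range_query(self, start: str, end: str) -> List[str]:
        # Optional capability: backends without it keep this default.
        raise NotImplementedError(f"{type(self).__name__} has no range queries")


class HashBackend(DatastoreAdapter):
    """Key-value backend with no notion of key order."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


class OrderedBackend(HashBackend):
    """Backend that additionally supports range queries over sorted keys."""

    def range_query(self, start: str, end: str) -> List[str]:
        # Values whose keys fall in the half-open interval [start, end).
        return [self._data[k] for k in sorted(self._data) if start <= k < end]


def load(backend: DatastoreAdapter, items: Dict[str, str]) -> None:
    """Application code depends only on the shared interface."""
    for key, value in items.items():
        backend.put(key, value)
```

Application code such as `load` runs unchanged on either backend, while a call to `range_query` succeeds only where the backend provides it, which is one way a platform could expose differing levels of API support.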
  • Slide 25: Demo
  • Slide 26: Thank you. Any questions?