
Google's BigTable

Published in: Business, Technology

  1. Google’s BigTable
     Out of the Slipstream :: July 3, 2008
  2. The BigTable Goals
      • Wide Applicability
          • Used in more than 60 Google products
      • Scalability
      • High Performance
      • High Availability
  3. The BigTable Arena
      • Internet Scale
          • Google :: BigTable and GFS
          • Apache :: HBase and HDFS
          • Amazon :: SimpleDB and S3
          • Facebook :: Cachr and Haystacks
  4. The BigTable Features
      • Dynamic control over data layout and format
      • Data is stored as uninterpreted strings
      • “Does not support a full relational model”
      • Locality of data
      • Dynamic control over serving data from memory or disk
      • Sparse, distributed, persistent, multidimensional sorted map
      • The map is indexed by:
          • A row key
          • A column name
          • A timestamp
      • Each value in the map is an uninterpreted array of bytes
      • Column oriented
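The data model on the features slide (a sparse, sorted map indexed by row key, column name, and timestamp, with byte-array values) can be pictured with a small in-memory toy. This is only a sketch of the idea, not BigTable's API: the class name `SparseTable`, its methods `put`, `get`, and `scan`, and the example row keys and column names are all hypothetical, and nothing here is distributed or persistent.

```python
from collections import defaultdict


class SparseTable:
    """Toy model of a (row key, column name, timestamp) -> bytes map.

    Hypothetical illustration only; not the BigTable client API.
    """

    def __init__(self):
        # Only cells that were actually written are stored (sparse).
        # Maps (row, column) -> {timestamp: value}.
        self._cells = defaultdict(dict)

    def put(self, row, column, timestamp, value):
        # Values are opaque bytes; the table never interprets them.
        if not isinstance(value, bytes):
            raise TypeError("values must be bytes")
        self._cells[(row, column)][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        versions = self._cells.get((row, column), {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        if not candidates:
            return None
        return versions[max(candidates)]

    def scan(self, start_row, end_row):
        """Yield (row, column, timestamp, value) for start_row <= row < end_row.

        Rows come back in sorted key order, newest version of each cell first.
        """
        for row, column in sorted(self._cells):
            if start_row <= row < end_row:
                versions = self._cells[(row, column)]
                for ts in sorted(versions, reverse=True):
                    yield row, column, ts, versions[ts]


if __name__ == "__main__":
    table = SparseTable()
    table.put("com.example/index", "contents:html", 1, b"<html>v1</html>")
    table.put("com.example/index", "contents:html", 2, b"<html>v2</html>")
    table.put("com.example/about", "anchor:home", 1, b"About")
    print(table.get("com.example/index", "contents:html"))
    for cell in table.scan("com.example/", "com.example0"):
        print(cell)
```

Because rows are visited in sorted key order, rows whose keys share a prefix end up next to each other in a scan, which is the "locality of data" point from the same slide.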
  5. Architecture (diagram; labels: GFS, SSTables, Tables, Chubby, Clusters, Tablets, Tablet Servers)
  6. Table Structure (diagram; labels as extracted: Columns Timestamp / version Key Table Indexes Column Families Expando Columns)
  7. Google App Engine
  8. App Engine
      • BigTable + Python + App Engine SDK
      • Choice of web frameworks:
          • webapp (pre-installed)
          • Django
          • CherryPy
          • Pylons
          • Web.py
      • Google Accounts integration
      • App Engine SDK for offline development
      • Offline development environment
      • Online runtime environment
      • Free to get started
      • Priced similarly to Amazon S3
  9. Getting Started
      • Sign up for an account
      • Download Python 2.5
      • Download the App Engine SDK
          • Local version of BigTable
          • Web server
          • Google user account simulator
          • webapp framework
      • Getting started tutorial
      • Write your application
      • Upload to Google
  10. Class Definition
      • Python code to declare a datastore class:

          from google.appengine.ext import db

          class Patient(db.Model):
              # Plain text fields use StringProperty;
              # UserProperty is reserved for Google Accounts users.
              firstName = db.StringProperty()
              lastName = db.StringProperty()
              dateOfBirth = db.DateTimeProperty()
              sex = db.StringProperty()
  11. Create
      • Python code to create and store an object:

          import datetime

          patient = Patient()
          patient.firstName = "George"
          patient.lastName = "James"
          # DateTimeProperty expects a datetime value, not a string.
          patient.dateOfBirth = datetime.datetime(2008, 1, 1)
          # Stored as "Male" so that it matches the filter used in the query slides.
          patient.sex = "Male"
          patient.put()
  12. Query
      • Python code to query a class (inside a webapp request handler):

          patients = Patient.all()
          for patient in patients:
              # write() takes one string, so format the name first.
              self.response.out.write('Name %s %s.' %
                                      (patient.firstName, patient.lastName))
  13. More complex query
      • Python code to select the 100 youngest male patients:

          allPatients = Patient.all()
          # The filter string needs a space between property and operator.
          allPatients.filter('sex =', 'Male')
          # Youngest means latest date of birth, so sort descending.
          allPatients.order('-dateOfBirth')
          patients = allPatients.fetch(100)
  14. Query using GQL
      • GQL = Google Query Language
      • GQL code to select the 100 youngest male patients:

          SELECT * FROM Patient WHERE sex = 'Male' ORDER BY dateOfBirth DESC LIMIT 100

      • Cannot select specific columns
      • No joins
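What the "100 youngest male patients" query on the two slides above asks for can be spelled out in plain Python, without the App Engine SDK. This is a sketch for checking the logic only: the `Patient` namedtuple stands in for the datastore model, and the function name `youngest_male_patients` and the sample records are hypothetical. The real datastore answers such a query from an index rather than by sorting in memory.

```python
from collections import namedtuple
from datetime import date

# Stand-in for the datastore model; field names follow the slides.
Patient = namedtuple("Patient", ["firstName", "lastName", "dateOfBirth", "sex"])


def youngest_male_patients(patients, limit=100):
    """Filter on sex, order by date of birth descending, keep `limit` rows."""
    males = [p for p in patients if p.sex == "Male"]
    # The youngest patients have the latest dates of birth.
    males.sort(key=lambda p: p.dateOfBirth, reverse=True)
    return males[:limit]


if __name__ == "__main__":
    sample = [
        Patient("George", "James", date(2008, 1, 1), "Male"),
        Patient("Ann", "Lee", date(2009, 5, 5), "Female"),
        Patient("Bob", "Ray", date(2010, 3, 3), "Male"),
    ]
    for p in youngest_male_patients(sample, limit=2):
        print(p.firstName, p.lastName, p.dateOfBirth)
```

Sorting in ascending order instead would return the oldest patients, which is why the sort direction matters as much as the filter.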
  15. Indexes
      • Development SDK
          • Index definitions generated automatically based on data access within your application
          • Index definitions uploaded to the Google server

          - kind: Patient
            properties:
            - name: dateOfBirth
              direction: asc
            - name: sex
              direction: desc
  16. Indexes
  17. Data Viewer
  18. Data Viewer
  19. Data Viewer
  20. Conclusions
      • BigTable is an Internet Scale solution
      • Conventional databases are not up to the job
      • Home-grown solutions
      • Increasing demand
      • ???
      • Profit
  21. Thank you
      • Questions?
