Google's BigTable

Published in: Business, Technology

  1. Google’s BigTable
       • Out of the Slipstream :: July 3, 2008
  2. The BigTable Goals
       • Wide Applicability
           • Used in more than 60 Google products
       • Scalability
       • High Performance
       • High Availability
  3. The BigTable Arena
       • Internet Scale
           • Google :: BigTable and GFS
           • Apache :: HBase and HDFS
           • Amazon :: SimpleDB and S3
           • Facebook :: Cachr and Haystacks
  4. The BigTable Features
       • Dynamic control over data layout and format
       • Data is stored as uninterpreted strings
       • “Does not support a full relational model”
       • Locality of data
       • Dynamic control over serving data from memory or disk
       • Sparse, distributed, persistent, multidimensional sorted map
       • The map is indexed by:
           • A row key
           • A column name
           • A timestamp
       • Each value in the map is an uninterpreted array of bytes
       • Column oriented
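The map described on this slide can be sketched in plain Python. This is a toy, in-memory illustration only: the class name `ToyBigTable`, its methods, and the sample row and column names in the usage note are hypothetical and are not part of any Google API. It assumes row keys and column names are strings, timestamps are integers, and values are bytes.

```python
import bisect


class ToyBigTable:
    """Toy in-memory model of a sorted (row, column, timestamp) -> bytes map."""

    def __init__(self):
        # Sorted list of (row, column, -timestamp) keys. Negating the
        # timestamp makes the newest version of a cell sort first.
        self._keys = []
        self._values = {}

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = bytes(value)

    def get(self, row, column):
        """Return the newest value for a cell, or None if the cell is empty."""
        i = bisect.bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None

    def scan(self, start_row, end_row):
        """Yield (row, column, timestamp, value) for start_row <= row < end_row."""
        i = bisect.bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < end_row:
            row, column, neg_ts = self._keys[i]
            yield row, column, -neg_ts, self._values[self._keys[i]]
            i += 1
```

Because the keys are kept sorted by row, rows with a shared prefix sit next to each other, which is the "locality of data" point: for example, `scan("com.example/", "com.example0")` would walk only the rows whose keys start with `com.example/`. The map is sparse in the sense that cells never written take up no space.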
  5. Architecture
       • Diagram labels: GFS, SSTables, Tables, Chubby, Clusters, Tablets, Tablet Servers
  6. Table Structure
       • Diagram labels: Columns, Timestamp / version, Key, Table, Indexes, Column Families, Expando Columns
  7. Google App Engine
  8. App Engine
       • BigTable + Python + App Engine SDK
       • Choice of web frameworks:
           • webapp (pre-installed)
           • Django
           • CherryPy
           • Pylons
           • Web.py
       • Google Accounts integration
       • App Engine SDK for offline development
       • Offline development environment
       • Online runtime environment
       • Free to get started
       • Priced similarly to Amazon S3
  9. Getting Started
       • Sign up for an account
       • Download Python 2.5
       • Download the App Engine SDK
           • Local version of BigTable
           • Web server
           • Google user account simulator
           • webapp framework
       • Getting started tutorial
       • Write your application
       • Upload to Google
  10. Class Definition
       • Python code to declare a datastore class:

           from google.appengine.ext import db

           class Patient(db.Model):
               # Plain text fields use StringProperty; UserProperty holds a Google account user.
               firstName = db.StringProperty()
               lastName = db.StringProperty()
               dateOfBirth = db.DateTimeProperty()
               sex = db.StringProperty()
  11. Create
       • Python code to create and store an object:

           import datetime

           patient = Patient()
           patient.firstName = "George"
           patient.lastName = "James"
           # DateTimeProperty expects a datetime value, not a string.
           patient.dateOfBirth = datetime.datetime(2008, 1, 1)
           patient.sex = "Male"
           patient.put()  # write the entity to the datastore
  12. Query
       • Python code to query a class:

           # Runs inside a webapp request handler method, where self.response is available.
           patients = Patient.all()
           for patient in patients:
               self.response.out.write('Name %s %s.' %
                                       (patient.firstName, patient.lastName))
  13. More complex query
       • Python code to select the 100 youngest male patients:

           allPatients = Patient.all()
           allPatients.filter('sex =', 'Male')
           # A leading '-' sorts descending, so the most recent birth dates come first.
           allPatients.order('-dateOfBirth')
           patients = allPatients.fetch(100)
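To see what the filter, order, and fetch steps compute without installing the SDK, here is a plain-Python sketch of the selection this slide describes (the youngest male patients) over in-memory records. The `PatientRecord` tuple and the `youngest_male_patients` helper are hypothetical stand-ins for illustration, not App Engine APIs.

```python
import datetime
from collections import namedtuple

# Hypothetical stand-in for the Patient datastore class.
PatientRecord = namedtuple("PatientRecord", "firstName lastName dateOfBirth sex")


def youngest_male_patients(records, limit=100):
    """Filter on sex, sort most recent birth date first, keep at most `limit`."""
    males = [r for r in records if r.sex == "Male"]
    males.sort(key=lambda r: r.dateOfBirth, reverse=True)
    return males[:limit]


records = [
    PatientRecord("George", "James", datetime.date(2008, 1, 1), "Male"),
    PatientRecord("Ann", "Lee", datetime.date(2010, 5, 5), "Female"),
    PatientRecord("Bob", "Ray", datetime.date(1990, 3, 3), "Male"),
]
for r in youngest_male_patients(records):
    print(r.firstName, r.lastName, r.dateOfBirth)
```

The point of the sketch is the ordering: "youngest" means the latest birth dates, so the sort has to be descending before the limit is applied.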
  14. Query using GQL
       • GQL = Google Query Language
       • GQL code to select the 100 youngest male patients:

           SELECT * FROM Patient WHERE sex = 'Male' ORDER BY dateOfBirth DESC LIMIT 100

       • Cannot select specific columns
       • No joins
  15. Indexes
       • Development SDK
           • Index definitions generated automatically based on data access within your application
           • Index definitions uploaded to the Google server
       • Example index definition:

           - kind: Patient
             properties:
             - name: dateOfBirth
               direction: asc
             - name: sex
               direction: desc
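As a rough intuition for why a query needs a matching index, the sketch below treats an index as a sorted list of (sex, dateOfBirth, key) entries and answers an equality-plus-ordering query with one contiguous scan. This is a simplified illustration under that assumption, not the datastore's actual implementation. The names `build_index` and `scan_index` are hypothetical, and the property order is chosen for the illustration rather than copied from the slide.

```python
import bisect
import datetime


def build_index(entities):
    """Return sorted (sex, dateOfBirth, key) entries for a dict of entities."""
    return sorted((e["sex"], e["dateOfBirth"], key) for key, e in entities.items())


def scan_index(index, sex, limit):
    """Return up to `limit` keys with the given sex, oldest birth date first."""
    # Jump straight to the first entry for this sex, then read in order.
    start = bisect.bisect_left(index, (sex,))
    keys = []
    for entry_sex, _dob, key in index[start:]:
        if entry_sex != sex or len(keys) >= limit:
            break
        keys.append(key)
    return keys


entities = {
    "p1": {"sex": "Male", "dateOfBirth": datetime.date(2008, 1, 1)},
    "p2": {"sex": "Female", "dateOfBirth": datetime.date(2010, 5, 5)},
    "p3": {"sex": "Male", "dateOfBirth": datetime.date(1990, 3, 3)},
}
index = build_index(entities)
```

With the entries pre-sorted this way, the query never has to sort or examine unrelated entities: it seeks to the first matching entry and stops as soon as the sex changes or the limit is reached.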
  16. Indexes
  17. Data Viewer
  18. Data Viewer
  19. Data Viewer
  20. Conclusions
       • BigTable is an Internet-scale solution
       • Conventional databases are not up to the job
       • Home-grown solutions
       • Increasing demand
       • ???
       • Profit
  21. Thank you
       • Questions?
