Thursday, May 26, 2011
Entity Groups
Why entity groups?       • We can perform transactions within this group - but not outside       • Data locality - data ar...
Entity groups and transactions       • A hierarchical structuring of your data into Megastore’s unit of         atomicity ...
Example: Data for a blog hosting service                    User                                      Blog   Has many     ...
Structure this data as an entity group          Entity                                              User          group ro...
How are entity groups stored?              Entities table              Bigtable key                                       ...
How are entity groups stored?              Entities table                                       Entity groups have a singl...
Let’s write an entity group transactionally               DatastoreService datastore = DatastoreServiceFactory      	     ...
Step 1: Commit                         Changes to         Changes to entities       Commit                         entitie...
On read, check for the most       Step 2: Entity visible               recent timestamp on the root                       ...
Step 3: Indexes updated                         Changes to          Changes to entities       Commit                      ...
Entity group and transactions takeaways       • Structure data into hierarchical trees              – Large enough to be u...
General datastore tips       • Denormalize as much as possible              – As much as possible, treat datastore as a ke...
Questions?
Introducing the App Engine datastore

Describes the App Engine datastore. Explains how indexing and queries work.

Transcript of "Introducing the App Engine datastore"

  1. 1. Thursday, May 26, 2011
  2. 2. Hands on with the App Engine Datastore. Ikai Lan, May 9th, 2011
  3. 3. About the speaker • Ikai Lan - Developer Programs Engineer, Developer Relations • Twitter: @ikai • Google Profile: http://profiles.google.com/ikai.lan
  4. 4. Lab prerequisites • JDK 1.5+ • Apache Ant • Codelab package: http://code.google.com/p/2011-datastore-bootcamp-codelab/downloads/detail?name=2011-datastore-bootcamp-codelab.zip • Shortlink: http://tinyurl.com/datastore-bootcamp
  5. 5. Goals of this talk • Understand a bit of how the datastore works under the hood • Have a conceptual background for the persistence codelab
  6. 6. Understanding the datastore • The underlying Bigtable • Indexing and queries • Complex queries • Entity groups • Underlying infrastructure
  7. 7. Datastore layers (which layer provides which capability)
        Datastore: complex queries ✓, entity group transactions ✓, queries on properties ✓, key range scan ✓, get and set by key ✓
        Megastore: entity group transactions ✓, queries on properties ✓, key range scan ✓, get and set by key ✓
        Bigtable: key range scan ✓, get and set by key ✓
  8. 8. Datastore layers (the same table as slide 7, repeated)
  9. 9. What does a Bigtable row look like? [Figure: example Bigtable row] Source: http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/bigtable-osdi06.pdf
  10. 10. Bigtable API • “Give me the column ‘name’ at key 123” • “Set the column ‘name’ at key 123 to ‘ikai’” • “Give me all columns where the key is greater than 100 and less than 200”
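
The three Bigtable-style operations quoted above can be imitated with a sorted in-memory map. This is only a toy sketch: `ToyBigtable` and its method names are hypothetical and are not the real Bigtable API. The point it illustrates is that keeping rows sorted by key is what makes a key range scan cheap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Toy in-memory stand-in for a Bigtable-like sorted map (hypothetical, not the real API). */
class ToyBigtable {
    // Rows are kept sorted by key, so a range scan is a contiguous slice.
    private final TreeMap<Integer, Map<String, String>> rows = new TreeMap<>();

    /** "Set the column 'name' at key 123 to 'ikai'". */
    void set(int key, String column, String value) {
        rows.computeIfAbsent(key, k -> new HashMap<>()).put(column, value);
    }

    /** "Give me the column 'name' at key 123". Returns null if the row or column is absent. */
    String get(int key, String column) {
        Map<String, String> row = rows.get(key);
        return row == null ? null : row.get(column);
    }

    /** "Give me all columns where the key is greater than 100 and less than 200". */
    SortedMap<Integer, Map<String, String>> scan(int lowExclusive, int highExclusive) {
        return rows.subMap(lowExclusive, false, highExclusive, false);
    }
}
```

For example, after `set(123, "name", "ikai")`, calling `get(123, "name")` returns the stored value and `scan(100, 200)` returns a view containing row 123.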
  11. 11. Datastore layers (the same table as slide 7, repeated)
  12. 12. Megastore API • “Give me all rows where the column ‘name’ equals ‘ikai’” • “Transactionally write an update to this group of entities” • “Do a cross datacenter write of this data such that reads will be strongly consistent” (High Replication Datastore) • Megastore paper: http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
  13. 13. Datastore layers (the same table as slide 7, repeated)
  14. 14. App Engine Datastore API • “Give me all Users for my app where the name equals ‘ikai’, company equals ‘Google’, and sort them by the ‘awesome’ column, descending”
  15. 15. Thursday, May 26, 2011
  16. 16. Queries
  17. 17. Let’s save an Entity with the low-level Java API
        import com.google.appengine.api.datastore.DatastoreService;
        import com.google.appengine.api.datastore.DatastoreServiceFactory;
        import com.google.appengine.api.datastore.Entity;

        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        Entity ikai = new Entity("User", "ikai@google.com");
        ikai.setProperty("firstName", "ikai");
        ikai.setProperty("company", "google");
        ikai.setUnindexedProperty("biography",
            "Ikai is a great man, a great, great man.");
        datastore.put(ikai);
  18. 18. Get an instance of the DatastoreService: DatastoreServiceFactory.getDatastoreService() fetches a client instance.
  19. 19. Instantiate a new Entity: in new Entity("User", "ikai@google.com"), the first argument ("User") sets the Entity Kind.
  20. 20. Instantiate a new Entity: the second argument ("ikai@google.com") sets a unique key.
  21. 21. Set indexed properties: in ikai.setProperty("firstName", "ikai"), the first argument is the property name and the second argument is the property value.
  22. 22. Set unindexed properties: ikai.setUnindexedProperty("biography", ...) - this property will be saved, but we will not run queries against it.
  23. 23. Commit the entity to the datastore: datastore.put(ikai) - save the thing!
  24. 24. What happens when we save? Make the write RPC, which writes the entity and writes the indexes. Success!
  25. 25. What actually gets written?
        Entities table
          Bigtable key: AppId:User:ikai@google.com | Value: ( Protobuf serialized entity - includes firstName, company and biography values )
        Indexes table
          Bigtable key: AppId:User:firstName:ikai:ikai@google.com | Value: ( Empty )
          Bigtable key: AppId:User:company:google:ikai@google.com | Value: ( Empty )
        Read more: http://code.google.com/appengine/articles/storage_breakdown.html
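
A rough sketch of how one saved entity fans out into one entities-table row plus one index row per indexed property. The colon-delimited strings mirror the notation on the slide; the real on-disk encoding is different, and `IndexRowSketch` and its methods are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: builds keys in the slide's AppId:Kind:... notation. */
class IndexRowSketch {
    /** Entities-table key: AppId:Kind:keyName. */
    static String entityKey(String appId, String kind, String keyName) {
        return appId + ":" + kind + ":" + keyName;
    }

    /** One index row per indexed property: AppId:Kind:property:value:keyName. */
    static List<String> indexKeys(String appId, String kind, String keyName,
                                  Map<String, String> indexedProps) {
        List<String> keys = new ArrayList<>();
        // Copy into a TreeMap so the output order is deterministic (alphabetical by property).
        for (Map.Entry<String, String> p : new TreeMap<>(indexedProps).entrySet()) {
            keys.add(appId + ":" + kind + ":" + p.getKey() + ":" + p.getValue() + ":" + keyName);
        }
        return keys;
    }
}
```

Unindexed properties such as biography are simply left out of `indexedProps`, so they produce no index rows and only live inside the serialized entity value.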
  26. 26. Now let’s run a query • If we have the key, we can fetch the entity right away by key • What if we don’t? We need indexes.
  27. 27. Let’s run a query
        import com.google.appengine.api.datastore.*;
        import com.google.appengine.api.datastore.Query.FilterOperator;
        import java.util.List;

        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        Query queryByName = new Query("User");
        queryByName.addFilter("firstName", FilterOperator.EQUAL, "ikai");
        List<Entity> results = datastore.prepare(queryByName)
            .asList(FetchOptions.Builder.withDefaults());
        // Roughly equivalent to:
        // SELECT * FROM User WHERE firstName = 'ikai';
  28. 28. Step 1: Query the indexes table. Scan the indexes table for values >= AppId:User:firstName:
  29. 29. Step 2: Start extracting keys. That gets us the index row AppId:User:firstName:ikai:ikai@google.com - extract the key ikai@google.com
  30. 30. Step 3: Batch get the entities themselves. Now let’s go back to the entities table and fetch that key. Success!
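
The three steps above can be sketched against sorted in-memory collections. This is a simplified model, assuming string keys in the slide's notation; `SinglePropertyQuerySketch`, `scanIndex` and `batchGet` are hypothetical names, not datastore internals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Illustrative two-phase lookup: index scan first, then a batch get by key. */
class SinglePropertyQuerySketch {
    /** Steps 1 and 2: scan index keys starting at the prefix and extract the entity key names. */
    static List<String> scanIndex(TreeSet<String> indexTable, String prefix) {
        List<String> keyNames = new ArrayList<>();
        for (String indexKey : indexTable.tailSet(prefix, true)) {
            if (!indexKey.startsWith(prefix)) {
                break; // we have walked past the contiguous range of matching rows
            }
            keyNames.add(indexKey.substring(prefix.length()));
        }
        return keyNames;
    }

    /** Step 3: fetch each entity from the entities table by its full key. */
    static List<String> batchGet(Map<String, String> entitiesTable, String kindPrefix,
                                 List<String> keyNames) {
        List<String> entities = new ArrayList<>();
        for (String keyName : keyNames) {
            String value = entitiesTable.get(kindPrefix + keyName);
            if (value != null) {
                entities.add(value);
            }
        }
        return entities;
    }
}
```

Scanning with the prefix `AppId:User:firstName:ikai:` yields the key names of every User whose firstName is ikai, which is why a query on a property costs two steps while a get by key costs one.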
  31. 31. Key takeaways • This isn’t a relational database – There are no full table scans – Indexes MUST exist for every property we want to query – Natively, we can only query on matches or startsWith queries – Don’t index what we never need to query on • Get by key = one step. Query on property value = 2 steps
  32. 32. Let’s run a more complex query!
        DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
        Query queryByName = new Query("User");
        queryByName.addFilter("firstName", FilterOperator.EQUAL, "ikai");
        queryByName.addFilter("company", FilterOperator.EQUAL, "google");
        List<Entity> results = datastore.prepare(queryByName)
            .asList(FetchOptions.Builder.withDefaults());
        // Roughly equivalent to:
        // SELECT * FROM User WHERE firstName = 'ikai'
        //   AND company = 'google';
  33. 33. Query resolution strategies • This query can be resolved using built in indexes – Zig zag merge join - we’ll cover this example • Can be optimized using composite indexes
  34. 34. Zig zag across multiple indexes. Begin by scanning indexes >= AppId:User:company:google
        company index (Bigtable keys):
          AppId:User:company:acme:alfred@acme.com
          AppId:User:company:google:david@google.com
          AppId:User:company:google:ikai@google.com
          AppId:User:company:google:max@google.com
          AppId:User:company:megacorp:zed@megacorp.com
        firstName index (Bigtable keys):
          AppId:User:firstName:alfred:alfred@acme.com
          AppId:User:firstName:ikai:ikai@acme.com
          AppId:User:firstName:ikai:ikai@google.com
          AppId:User:firstName:ikai:ikai@megacorp.com
          AppId:User:firstName:zed:zed@megacorp.com
  35. 35. Zig zag across multiple indexes. There’s at least a partial match, so we “jump” to the next index
  36. 36. Zig zag across multiple indexes. Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:david@google.com
  37. 37. Zig zag across multiple indexes. Okay, so that’s a twist. The first value that matches has key ikai@google.com! Does this value exist in the first index?
  38. 38. Zig zag across multiple indexes. Let’s advance the original cursor to >= AppId:User:company:google:ikai@google.com
  39. 39. Zig zag across multiple indexes. Alright! We found a match. Let’s add the key to our in memory list and go back to the first index
  40. 40. Zig zag across multiple indexes. Let’s move on to see if there are any more matches. Let’s start at max@google.com
  41. 41. Zig zag across multiple indexes. Are there any keys >= AppId:User:firstName:ikai:max@google.com?
  42. 42. Zig zag across multiple indexes. No. We’re at the end of our index scans. Let’s do a batch get of our list of keys: [ ‘ikai@google.com’ ]
  43. 43. Batch get the entities themselves. Now let’s go back to the entities table and fetch that key (AppId:User:ikai@google.com). Success!
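
The walkthrough above can be approximated in a few lines. This is a minimal sketch of a zig zag merge join over two sorted index tables, assuming string keys in the slide's notation; `ZigZagSketch`, `zigZag` and `seek` are hypothetical names, and the real datastore implementation is certainly more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Illustrative zig zag merge join over two single-property indexes. */
class ZigZagSketch {
    /** Returns key names present under both prefixes, alternating seeks between the indexes. */
    static List<String> zigZag(TreeSet<String> indexA, String prefixA,
                               TreeSet<String> indexB, String prefixB) {
        List<String> matches = new ArrayList<>();
        String candidate = ""; // the empty string sorts before every key name
        while (true) {
            String a = seek(indexA, prefixA, candidate);
            if (a == null) return matches;      // first index exhausted
            String b = seek(indexB, prefixB, a);
            if (b == null) return matches;      // second index exhausted
            if (b.equals(a)) {
                matches.add(a);                 // both indexes agree on this key
                candidate = a + "\u0000";       // smallest string strictly greater than a
            } else {
                candidate = b;                  // b > a: jump the first index forward to b
            }
        }
    }

    /** Smallest key name >= fromKeyName under the prefix, or null if none remains. */
    static String seek(TreeSet<String> index, String prefix, String fromKeyName) {
        String hit = index.ceiling(prefix + fromKeyName);
        if (hit == null || !hit.startsWith(prefix)) return null;
        return hit.substring(prefix.length());
    }
}
```

Each loop iteration costs two index seeks, so the total work depends on how often the two indexes disagree, not just on how many results come back. On the example data the join returns the single key ikai@google.com.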
  44. 44. Let’s change the shape of the data • Zig zag performance is HIGHLY dependent on the shape of the data • Let’s go ahead and muck with the data a bit
  45. 45. Same query, sparsely distributed matches
        company index (Bigtable keys):
          AppId:User:company:acme:alfred@acme.com
          AppId:User:company:google:david@google.com
          AppId:User:company:google:ikai@google.com
          AppId:User:company:google:max@google.com
          AppId:User:company:megacorp:zed@megacorp.com
        firstName index (Bigtable keys):
          AppId:User:firstName:alfred:alfred@acme.com
          AppId:User:firstName:ikai:ikai@acme.com
          AppId:User:firstName:igor:ikai@google.com
          AppId:User:firstName:ikai:ikai@megacorp.com
          AppId:User:firstName:zed:zed@megacorp.com
  46. 46. Same query, sparsely distributed matches. Begin by scanning indexes >= AppId:User:company:google
  47. 47. Same query, sparsely distributed matches. Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:david@google.com
  48. 48. Same query, sparsely distributed matches. Oh ... no matches. Let’s move back to the first index and move the cursor down
  49. 49. Same query, sparsely distributed matches. Okay, we’ve got another Googler
  50. 50. Same query, sparsely distributed matches. Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:ikai@google.com
  51. 51. Same query, sparsely distributed matches. Oh ... no matches here either. Let’s go back to the first index.
  52. 52. Same query, sparsely distributed matches. Oh ... no matches here either. Let’s go back to the first index. ... if these indexes were huge, we could be here for a while!
  53. 53. What happens in this case?
      • If the scan has to traverse too many index rows, the datastore throws a DatastoreNeedIndexException (NeedIndexError in Python)
      • We’ll want to build a composite index
  54. 54. Composite index
      Bigtable key
        AppId:User:company:acme:firstName:alfred:alfred@acme.com
        AppId:User:company:google:firstName:david:david@google.com
        AppId:User:company:google:firstName:ikai:ikai@google.com
        AppId:User:company:google:firstName:max:max@google.com
        AppId:User:company:megacorp:firstName:zed:zed@megacorp.com
      Read more: http://code.google.com/appengine/articles/storage_breakdown.html
  55. 55. Composite index
      Bigtable key
        AppId:User:company:acme:firstName:alfred:alfred@acme.com
        AppId:User:company:google:firstName:david:david@google.com
        AppId:User:company:google:firstName:ikai:ikai@google.com
        AppId:User:company:google:firstName:max:max@google.com
        AppId:User:company:megacorp:firstName:zed:zed@megacorp.com
      Callout: Search for all keys >= AppId:User:company:google:firstName:ikai
      Read more: http://code.google.com/appengine/articles/storage_breakdown.html
  56. 56. Composite index
      Bigtable key
        AppId:User:company:acme:firstName:alfred:alfred@acme.com
        AppId:User:company:google:firstName:david:david@google.com
        AppId:User:company:google:firstName:ikai:ikai@google.com
        AppId:User:company:google:firstName:max:max@google.com
        AppId:User:company:megacorp:firstName:zed:zed@megacorp.com
      Callout: Well, that was much faster, wasn’t it?
      Read more: http://code.google.com/appengine/articles/storage_breakdown.html
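Why the composite index is faster can be shown with the same kind of simulation: both filter values are part of one sorted key, so a single range scan over one prefix finds every match. The sketch below is an assumption-laden illustration (the class `CompositeIndexSketch` and the method `prefixScan` are hypothetical names, and a `TreeSet` again stands in for Bigtable's sorted rows), using the rows from the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch: one contiguous scan over a composite index.
class CompositeIndexSketch {
    // Composite index rows from the slides, kept in sorted key order.
    static TreeSet<String> slideIndex() {
        return new TreeSet<String>(Arrays.asList(
            "AppId:User:company:acme:firstName:alfred:alfred@acme.com",
            "AppId:User:company:google:firstName:david:david@google.com",
            "AppId:User:company:google:firstName:ikai:ikai@google.com",
            "AppId:User:company:google:firstName:max:max@google.com",
            "AppId:User:company:megacorp:firstName:zed:zed@megacorp.com"));
    }

    // Start at the first key >= prefix and read until the prefix no longer
    // matches. All matching rows are adjacent, so nothing is skipped over.
    static List<String> prefixScan(TreeSet<String> index, String prefix) {
        List<String> rows = new ArrayList<String>();
        for (String row : index.tailSet(prefix, true)) {
            if (!row.startsWith(prefix)) {
                break;
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(prefixScan(
            slideIndex(), "AppId:User:company:google:firstName:ikai:"));
    }
}
```

The scan touches only the matching rows plus one row past the end of the range, regardless of how many users work at other companies or have other first names.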
  57. 57. Composite index tradeoffs
      • Index rows are written at entity save time - this incurs additional datastore CPU and storage quota
      • You can only create 200 composite indexes
      • You need to know the possible queries ahead of time!
  58. 58. Complex Queries takeaways
      • This isn’t a relational database
        – There are no full table scans
        – Indexes MUST exist for every property we want to query
      • Performance depends on the shape of the data
      • Worst-case scenario: your query matches are highly sparse
      • Build composite indexes when you need them
  60. 60. Entity Groups
  61. 61. Why entity groups?
      • We can perform transactions within this group - but not outside
      • Data locality - data are stored “near” each other
      • Queries within this entity group are strongly consistent when using the High Replication datastore
  62. 62. Entity groups and transactions
      • A hierarchical structuring of your data into Megastore’s unit of atomicity
      • Allows for transactional behavior - but only within a single entity group
      • Key unit of consistency when using the High Replication datastore
  63. 63. Example: Data for a blog hosting service
      Diagram: User has many Blog; Blog has many Entry; Entry has many Comment
  64. 64. Example: Data for a blog hosting service
      Diagram: User has many Blog; Blog has many Entry; Entry has many Comment
      Callout: This can be structured as an entity group (tree structure)!
  65. 65. Structure this data as an entity group
      Diagram: a tree with User as the entity group root, two Blog nodes beneath it, then three Entry nodes, then three Comment nodes
  66. 66. How are entity groups stored?
      Entities table
        Bigtable key                                                  Value
        AppId:User:ikai@google.com                                    ( Protobuf serialized User )
        AppId:User:ikai@google.com/Blog:123                           ( Protobuf serialized Blog )
        AppId:User:ikai@google.com/Blog:123/Entry:456                 ( Protobuf serialized Entry )
        AppId:User:ikai@google.com/Blog:123/Entry:789                 ( Protobuf serialized Entry )
        AppId:User:ikai@google.com/Blog:123/Entry:456/Comment:111     ( Protobuf serialized Comment )
        AppId:User:ikai@google.com/Blog:123/Entry:456/Comment:222     ( Protobuf serialized Comment )
        AppId:User:ikai@google.com/Blog:123/Entry:789/Comment:333     ( Protobuf serialized Comment )
      Read more: http://code.google.com/appengine/docs/python/datastore/entities.html
  67. 67. How are entity groups stored?
      Entities table
        Bigtable key                                                  Value
        AppId:User:ikai@google.com                                    ( Protobuf serialized User )
        AppId:User:ikai@google.com/Blog:123                           ( Protobuf serialized Blog )
        AppId:User:ikai@google.com/Blog:123/Entry:456                 ( Protobuf serialized Entry )
        AppId:User:ikai@google.com/Blog:123/Entry:789                 ( Protobuf serialized Entry )
        AppId:User:ikai@google.com/Blog:123/Entry:456/Comment:111     ( Protobuf serialized Comment )
        AppId:User:ikai@google.com/Blog:123/Entry:456/Comment:222     ( Protobuf serialized Comment )
        AppId:User:ikai@google.com/Blog:123/Entry:789/Comment:333     ( Protobuf serialized Comment )
      Callout: Entity groups have a single root entity
      Read more: http://code.google.com/appengine/docs/python/datastore/entities.html
  68. 68. How are entity groups stored?
      Entities table
        Bigtable key                                                  Value
        AppId:User:ikai@google.com                                    ( Protobuf serialized User )
        AppId:User:ikai@google.com/Blog:123                           ( Protobuf serialized Blog )
        AppId:User:ikai@google.com/Blog:123/Entry:456                 ( Protobuf serialized Entry )
        AppId:User:ikai@google.com/Blog:123/Entry:789                 ( Protobuf serialized Entry )
        AppId:User:ikai@google.com/Blog:123/Entry:456/Comment:111     ( Protobuf serialized Comment )
        AppId:User:ikai@google.com/Blog:123/Entry:456/Comment:222     ( Protobuf serialized Comment )
        AppId:User:ikai@google.com/Blog:123/Entry:789/Comment:333     ( Protobuf serialized Comment )
      Callout: Child entities embed the entire ancestry in their keys
      Read more: http://code.google.com/appengine/docs/python/datastore/entities.html
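Because each child key starts with its parent's full key, everything under an ancestor can be read with a single range scan over sorted keys. The following sketch illustrates that idea with a `TreeMap`. It uses a simplified string encoding taken from the slide rather than the real key format, and the names `EntityGroupKeysSketch`, `rootKey`, `childKey`, and `descendants` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: ancestry embedded in keys makes a subtree contiguous.
class EntityGroupKeysSketch {
    static String rootKey(String appId, String kind, String name) {
        return appId + ":" + kind + ":" + name;
    }

    // A child key is the full parent key plus its own kind and id.
    static String childKey(String parentKey, String kind, String idOrName) {
        return parentKey + "/" + kind + ":" + idOrName;
    }

    // Builds the Entities table from the slide; TreeMap keeps keys sorted.
    static TreeMap<String, String> slideTable() {
        TreeMap<String, String> table = new TreeMap<String, String>();
        String user = rootKey("AppId", "User", "ikai@google.com");
        String blog = childKey(user, "Blog", "123");
        String entry456 = childKey(blog, "Entry", "456");
        String entry789 = childKey(blog, "Entry", "789");
        table.put(user, "( Protobuf serialized User )");
        table.put(blog, "( Protobuf serialized Blog )");
        table.put(entry456, "( Protobuf serialized Entry )");
        table.put(entry789, "( Protobuf serialized Entry )");
        table.put(childKey(entry456, "Comment", "111"), "( Protobuf serialized Comment )");
        table.put(childKey(entry456, "Comment", "222"), "( Protobuf serialized Comment )");
        table.put(childKey(entry789, "Comment", "333"), "( Protobuf serialized Comment )");
        return table;
    }

    // Range scan: every descendant key begins with ancestorKey + "/".
    static List<String> descendants(TreeMap<String, String> table, String ancestorKey) {
        String prefix = ancestorKey + "/";
        List<String> keys = new ArrayList<String>();
        for (String key : table.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break;
            }
            keys.add(key);
        }
        return keys;
    }

    public static void main(String[] args) {
        String entry = "AppId:User:ikai@google.com/Blog:123/Entry:456";
        for (String key : descendants(slideTable(), entry)) {
            System.out.println(key);
        }
    }
}
```

Note that in sorted order the comments on Entry:456 sit directly after Entry:456 and before Entry:789, which is the data locality the earlier slide refers to.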
  69. 69. Let’s write an entity group transactionally
      DatastoreService datastore = DatastoreServiceFactory
          .getDatastoreService();
      Entity ikai = new Entity("User", "ikai@google.com");
      Entity blog = new Entity("Blog", "ikaisays.com",
          ikai.getKey());
      Entity entry = new Entity("Entry", "datastore-intro",
          blog.getKey());
      // Auto assign an ID
      Entity comment = new Entity("Comment", entry.getKey());
      Transaction tx = datastore.beginTransaction();
      // Helper function for clarity
      datastore.put(Arrays.asList(ikai, blog, entry, comment));
      tx.commit();
  70. 70. Let’s write an entity group transactionally
      Callout: Create the root entity
      DatastoreService datastore = DatastoreServiceFactory
          .getDatastoreService();
      Entity ikai = new Entity("User", "ikai@google.com");
      Entity blog = new Entity("Blog", "ikaisays.com",
          ikai.getKey());
      Entity entry = new Entity("Entry", "datastore-intro",
          blog.getKey());
      // Auto assign an ID
      Entity comment = new Entity("Comment", entry.getKey());
      Transaction tx = datastore.beginTransaction();
      // Helper function for clarity
      datastore.put(Arrays.asList(ikai, blog, entry, comment));
      tx.commit();
  71. 71. Let’s write an entity group transactionally
      Callout: This is the first child entity - notice the third argument, which specifies the parent entity key
      DatastoreService datastore = DatastoreServiceFactory
          .getDatastoreService();
      Entity ikai = new Entity("User", "ikai@google.com");
      Entity blog = new Entity("Blog", "ikaisays.com",
          ikai.getKey());
      Entity entry = new Entity("Entry", "datastore-intro",
          blog.getKey());
      // Auto assign an ID
      Entity comment = new Entity("Comment", entry.getKey());
      Transaction tx = datastore.beginTransaction();
      // Helper function for clarity
      datastore.put(Arrays.asList(ikai, blog, entry, comment));
      tx.commit();
  72. 72. Let’s write an entity group transactionally
      Callout: The next deeper entity sets the blog as the parent
      DatastoreService datastore = DatastoreServiceFactory
          .getDatastoreService();
      Entity ikai = new Entity("User", "ikai@google.com");
      Entity blog = new Entity("Blog", "ikaisays.com",
          ikai.getKey());
      Entity entry = new Entity("Entry", "datastore-intro",
          blog.getKey());
      // Auto assign an ID
      Entity comment = new Entity("Comment", entry.getKey());
      Transaction tx = datastore.beginTransaction();
      // Helper function for clarity
      datastore.put(Arrays.asList(ikai, blog, entry, comment));
      tx.commit();
  73. 73. Let’s write an entity group transactionally
      Callout: We can also opt to not provide a key name and just use a parent key for a new entity
      DatastoreService datastore = DatastoreServiceFactory
          .getDatastoreService();
      Entity ikai = new Entity("User", "ikai@google.com");
      Entity blog = new Entity("Blog", "ikaisays.com",
          ikai.getKey());
      Entity entry = new Entity("Entry", "datastore-intro",
          blog.getKey());
      // Auto assign an ID
      Entity comment = new Entity("Comment", entry.getKey());
      Transaction tx = datastore.beginTransaction();
      // Helper function for clarity
      datastore.put(Arrays.asList(ikai, blog, entry, comment));
      tx.commit();
  74. 74. Let’s write an entity group transactionally
      Callout: Start a new transaction
      DatastoreService datastore = DatastoreServiceFactory
          .getDatastoreService();
      Entity ikai = new Entity("User", "ikai@google.com");
      Entity blog = new Entity("Blog", "ikaisays.com",
          ikai.getKey());
      Entity entry = new Entity("Entry", "datastore-intro",
          blog.getKey());
      // Auto assign an ID
      Entity comment = new Entity("Comment", entry.getKey());
      Transaction tx = datastore.beginTransaction();
      // Helper function for clarity
      datastore.put(Arrays.asList(ikai, blog, entry, comment));
      tx.commit();
  75. 75. Let’s write an entity group transactionally
      Callout: Put the entities in parallel
      DatastoreService datastore = DatastoreServiceFactory
          .getDatastoreService();
      Entity ikai = new Entity("User", "ikai@google.com");
      Entity blog = new Entity("Blog", "ikaisays.com",
          ikai.getKey());
      Entity entry = new Entity("Entry", "datastore-intro",
          blog.getKey());
      // Auto assign an ID
      Entity comment = new Entity("Comment", entry.getKey());
      Transaction tx = datastore.beginTransaction();
      // Helper function for clarity
      datastore.put(Arrays.asList(ikai, blog, entry, comment));
      tx.commit();
  76. 76. Let’s write an entity group transactionally
      Callout: Actually commit the changes
      DatastoreService datastore = DatastoreServiceFactory
          .getDatastoreService();
      Entity ikai = new Entity("User", "ikai@google.com");
      Entity blog = new Entity("Blog", "ikaisays.com",
          ikai.getKey());
      Entity entry = new Entity("Entry", "datastore-intro",
          blog.getKey());
      // Auto assign an ID
      Entity comment = new Entity("Comment", entry.getKey());
      Transaction tx = datastore.beginTransaction();
      // Helper function for clarity
      datastore.put(Arrays.asList(ikai, blog, entry, comment));
      tx.commit();
  77. 77. Step 1: Commit
      Timeline: Commit, then changes to entities visible, then changes to entities and indexes visible
      Callout: Roll the timestamp forward on the root entity
  78. 78. Step 2: Entity visible
      Timeline: Commit, then changes to entities visible, then changes to entities and indexes visible
      Callout: On read, check for the most recent timestamp on the root entity
      Callout: This is the version we want since it represents a complete write
  79. 79. Step 3: Indexes updated
      Timeline: Commit, then changes to entities visible, then changes to entities and indexes visible
      Callout: Indexes are written - now we can query for this entity with the new properties
  80. 80. Entity group and transactions takeaways
      • Structure data into hierarchical trees
        – Large enough to be useful, small enough to maximize transactional throughput
      • Transactions need an entity group root - expect roughly 1 transaction/second per entity group
        – If you write N entities that are all part of 1 entity group, it counts as 1 write
      • Optimistic locking is used - can be expensive with a lot of contention
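The cost of optimistic locking under contention can be illustrated with a small in-memory model. This is a sketch under stated assumptions and not how Megastore is implemented: the classes `OptimisticGroupSketch`, `EntityGroup`, and `Snapshot` are hypothetical, and a single version counter stands in for the timestamp on the entity group root. The sketch signals a lost race with `java.util.ConcurrentModificationException`.

```java
import java.util.ConcurrentModificationException;

// Hypothetical sketch: optimistic concurrency on one entity group.
class OptimisticGroupSketch {
    // What a transaction saw when it began.
    static class Snapshot {
        final long version;
        final int counter;

        Snapshot(long version, int counter) {
            this.version = version;
            this.counter = counter;
        }
    }

    // One entity group: a version stamp on the root plus some data.
    static class EntityGroup {
        private long version = 0;
        private int counter = 0;

        synchronized Snapshot begin() {
            return new Snapshot(version, counter);
        }

        // The commit succeeds only if no other commit happened since the
        // snapshot was taken; otherwise the whole transaction is rejected.
        synchronized void commit(Snapshot seen, int newCounter) {
            if (seen.version != version) {
                throw new ConcurrentModificationException("entity group changed");
            }
            counter = newCounter;
            version++;
        }
    }

    // Retry loop: every lost race repeats the read and the write, which is
    // why heavy contention on one group gets expensive.
    static int incrementWithRetries(EntityGroup group, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Snapshot seen = group.begin();
            try {
                group.commit(seen, seen.counter + 1);
                return attempt;
            } catch (ConcurrentModificationException e) {
                // Another writer committed first; try again on fresh data.
            }
        }
        throw new ConcurrentModificationException(
            "gave up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        EntityGroup group = new EntityGroup();
        Snapshot first = group.begin();
        Snapshot second = group.begin();
        group.commit(first, first.counter + 1);
        try {
            group.commit(second, second.counter + 1);
        } catch (ConcurrentModificationException e) {
            System.out.println("second writer must retry: " + e.getMessage());
        }
    }
}
```

Keeping entity groups small reduces how often two writers share a version stamp, which is the reasoning behind "small enough to maximize transactional throughput".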
  81. 81. General datastore tips
      • Denormalize as much as possible
        – As much as possible, treat datastore as a key-value store (Dictionary or Map like structure)
        – Move large reporting to offline processing. This lets you avoid unnecessary indexes
      • Use entity groups for your data
      • Build composite indexes where you need them - “need” depends on shape of your data
  83. 83. Questions?