Introducing the App Engine datastore

Describes the App Engine datastore. Explains how indexing and queries work.

Transcript

  • 1. Thursday, May 26, 2011
  • 2. Hands on with the App Engine Datastore. Ikai Lan, May 9th, 2011
  • 3. About the speaker
      • Ikai Lan - Developer Programs Engineer, Developer Relations
      • Twitter: @ikai
      • Google Profile: http://profiles.google.com/ikai.lan
  • 4. Lab prerequisites
      • JDK 1.5+
      • Apache Ant
      • Codelab package: http://code.google.com/p/2011-datastore-bootcamp-codelab/downloads/detail?name=2011-datastore-bootcamp-codelab.zip
      • Shortlink: http://tinyurl.com/datastore-bootcamp
  • 5. Goals of this talk
      • Understand a bit of how the datastore works under the hood
      • Have a conceptual background for the persistence codelab
  • 6. Understanding the datastore
      • The underlying Bigtable
      • Indexing and queries
      • Complex queries
      • Entity groups
      • Underlying infrastructure
  • 7. Datastore layers
      Layer     | Get and set by key | Key range scan | Queries on properties | Entity Group Transactions | Complex queries
      Datastore | ✓                  | ✓              | ✓                     | ✓                         | ✓
      Megastore | ✓                  | ✓              | ✓                     | ✓                         |
      Bigtable  | ✓                  | ✓              |                       |                           |
  • 8. Datastore layers (same table as slide 7)
  • 9. What does a Bigtable row look like? Source: http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/bigtable-osdi06.pdf
  • 10. Bigtable API
      • “Give me the column ‘name’ at key 123”
      • “Set the column ‘name’ at key 123 to ‘ikai’”
      • “Give me all columns where the key is greater than 100 and less than 200”
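The three requests on the Bigtable API slide can be pictured with a sorted map. The following is a minimal sketch, not the Bigtable client API: the class name `SortedRowStore` and its methods are hypothetical, and a `TreeMap` stands in for a table whose rows are kept in key order.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy stand-in for a Bigtable-style table: rows sorted by key, each row a map of columns.
// Illustration only; SortedRowStore is a made-up name, not part of any Google API.
class SortedRowStore {
    private final TreeMap<Long, Map<String, String>> rows = new TreeMap<>();

    // "Set the column 'name' at key 123 to 'ikai'"
    void set(long key, String column, String value) {
        rows.computeIfAbsent(key, k -> new TreeMap<>()).put(column, value);
    }

    // "Give me the column 'name' at key 123" (returns null if the row or column is absent)
    String get(long key, String column) {
        Map<String, String> row = rows.get(key);
        return row == null ? null : row.get(column);
    }

    // "Give me all columns where the key is greater than 100 and less than 200"
    SortedMap<Long, Map<String, String>> scan(long lowExclusive, long highExclusive) {
        return rows.subMap(lowExclusive, false, highExclusive, false);
    }

    public static void main(String[] args) {
        SortedRowStore table = new SortedRowStore();
        table.set(123L, "name", "ikai");
        table.set(150L, "name", "max");
        table.set(250L, "name", "zed");
        System.out.println(table.get(123L, "name"));          // ikai
        System.out.println(table.scan(100L, 200L).keySet());  // [123, 150]
    }
}
```

Because the rows are sorted, the range scan never has to look at rows outside the requested key range, which is the property the later index slides rely on.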
  • 11. Datastore layers (same table as slide 7)
  • 12. Megastore API
      • “Give me all rows where the column ‘name’ equals ‘ikai’”
      • “Transactionally write an update to this group of entities”
      • “Do a cross datacenter write of this data such that reads will be strongly consistent” (High Replication Datastore)
      • Megastore paper: http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
  • 13. Datastore layers (same table as slide 7)
  • 14. App Engine Datastore API
      • “Give me all Users for my app where the name equals ‘ikai’, company equals ‘Google’, and sort them by the ‘awesome’ column, descending”
  • 16. Queries
  • 17. Let’s save an Entity with the low-level Java API
      DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
      Entity ikai = new Entity("User", "ikai@google.com");
      ikai.setProperty("firstName", "ikai");
      ikai.setProperty("company", "google");
      ikai.setUnindexedProperty("biography",
          "Ikai is a great man, a great, great man.");
      datastore.put(ikai);
  • 18. Get an instance of the DatastoreService
      DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
      Callout: Fetch a client instance
  • 19. Instantiate a new Entity
      Entity ikai = new Entity("User", "ikai@google.com");
      Callout on "User": Set the Entity Kind
  • 20. Instantiate a new Entity
      Entity ikai = new Entity("User", "ikai@google.com");
      Callout on "ikai@google.com": Set a unique key
  • 21. Set indexed properties
      ikai.setProperty("firstName", "ikai");
      ikai.setProperty("company", "google");
      Callouts: First argument is the property name. Second argument is the property value
  • 22. Set unindexed properties
      ikai.setUnindexedProperty("biography",
          "Ikai is a great man, a great, great man.");
      Callout: This property will be saved, but we will not run queries against it
  • 23. Commit the entity to the datastore
      datastore.put(ikai);
      Callout: Save the thing!
  • 24. What happens when we save? Diagram steps: Make the write RPC; Write the entity; Write the indexes; Success!
  • 25. What actually gets written?
      Entities table
        Bigtable key                              | Value
        AppId:User:ikai@google.com                | ( Protobuf serialized entity - includes firstName, company and biography values )
      Indexes table
        Bigtable key                              | Value
        AppId:User:firstName:ikai:ikai@google.com | ( Empty )
        AppId:User:company:google:ikai@google.com | ( Empty )
      Read more: http://code.google.com/appengine/articles/storage_breakdown.html
  • 26. Now let’s run a query
      • If we have the key, we can fetch it right away by key
      • What if we don’t? We need indexes.
  • 27. Let’s run a query
      DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
      Query queryByName = new Query("User");
      queryByName.addFilter("firstName", FilterOperator.EQUAL, "ikai");
      List<Entity> results = datastore.prepare(queryByName)
          .asList(FetchOptions.Builder.withDefaults());
      // Roughly equivalent to:
      // SELECT * from User WHERE firstName = 'ikai';
  • 28. Step 1: Query the indexes table (tables as on slide 25). Scan the indexes table for values >= AppId:User:firstName:
  • 29. Step 2: Start extracting keys. That gets us the row AppId:User:firstName:ikai:ikai@google.com - extract the key ikai@google.com
  • 30. Step 3: Batch get the entities themselves. Now let’s go back to the entities table and fetch that key (AppId:User:ikai@google.com). Success!
  • 31. Key takeaways
      • This isn’t a relational database
          – There are no full table scans
          – Indexes MUST exist for every property we want to query
          – Natively, we can only query on matches or startsWith queries
          – Don’t index what we never need to query on
      • Get by key = one step. Query on property value = 2 steps
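The write path and the three query steps above can be sketched in plain Java. This is a toy model under stated assumptions, not the datastore implementation: `IndexScanSketch` and its methods are hypothetical names, the entity value is a plain string rather than a serialized protobuf, and the row key layout simply copies the one shown on the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model of the two tables from the slides (hypothetical class, illustration only).
class IndexScanSketch {
    // Entities table: row key -> entity (a plain string here instead of a protobuf).
    private final TreeMap<String, String> entities = new TreeMap<>();
    // Indexes table: only the sorted row keys matter, the values are empty.
    private final TreeSet<String> indexes = new TreeSet<>();

    // Saving writes one entity row plus one index row per indexed property.
    void put(String kind, String keyName, String firstName, String company) {
        entities.put("AppId:" + kind + ":" + keyName,
                "firstName=" + firstName + ",company=" + company);
        indexes.add("AppId:" + kind + ":firstName:" + firstName + ":" + keyName);
        indexes.add("AppId:" + kind + ":company:" + company + ":" + keyName);
    }

    // Equality query on one property, resolved purely through the index rows.
    List<String> queryEquals(String kind, String property, String value) {
        String prefix = "AppId:" + kind + ":" + property + ":" + value + ":";
        List<String> results = new ArrayList<>();
        // Step 1: scan index rows >= prefix, stopping once the prefix no longer matches.
        for (String row : indexes.tailSet(prefix, true)) {
            if (!row.startsWith(prefix)) {
                break;
            }
            // Step 2: the entity key is the tail of the index row key.
            String keyName = row.substring(prefix.length());
            // Step 3: fetch the entity itself by key.
            results.add(entities.get("AppId:" + kind + ":" + keyName));
        }
        return results;
    }

    public static void main(String[] args) {
        IndexScanSketch store = new IndexScanSketch();
        store.put("User", "ikai@google.com", "ikai", "google");
        store.put("User", "max@google.com", "max", "google");
        System.out.println(store.queryEquals("User", "firstName", "ikai"));
    }
}
```

The sketch never iterates over the entities table, which mirrors the "no full table scans" takeaway: a property that has no index rows simply cannot be queried.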
  • 32. Let’s run a more complex query!
      DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
      Query queryByName = new Query("User");
      queryByName.addFilter("firstName", FilterOperator.EQUAL, "ikai");
      queryByName.addFilter("company", FilterOperator.EQUAL, "google");
      List<Entity> results = datastore.prepare(queryByName)
          .asList(FetchOptions.Builder.withDefaults());
      // Roughly equivalent to:
      // SELECT * from User WHERE firstName = 'ikai'
      //   AND company = 'google';
  • 33. Query resolution strategies
      • This query can be resolved using built-in indexes
          – Zig zag merge join - we’ll cover this example
      • Can be optimized using composite indexes
  • 34. Zig zag across multiple indexes
      Index on company (Bigtable key)
        AppId:User:company:acme:alfred@acme.com
        AppId:User:company:google:david@google.com
        AppId:User:company:google:ikai@google.com
        AppId:User:company:google:max@google.com
        AppId:User:company:megacorp:zed@megacorp.com
      Index on firstName (Bigtable key)
        AppId:User:firstName:alfred:alfred@acme.com
        AppId:User:firstName:ikai:ikai@acme.com
        AppId:User:firstName:ikai:ikai@google.com
        AppId:User:firstName:ikai:ikai@megacorp.com
        AppId:User:firstName:zed:zed@megacorp.com
      Callout: Begin by scanning indexes >= AppId:User:company:google
      Read more: http://code.google.com/appengine/articles/storage_breakdown.html
  • 35. Zig zag across multiple indexes (same tables). There’s at least a partial match, so we “jump” to the next index
  • 36. Zig zag across multiple indexes (same tables). Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:david@google.com
  • 37. Zig zag across multiple indexes (same tables). Okay, so that’s a twist. The first value that matches has key ikai@google.com! Does this value exist in the first index?
  • 38. Zig zag across multiple indexes (same tables). Let’s advance the original cursor to >= AppId:User:company:google:ikai@google.com
  • 39. Zig zag across multiple indexes (same tables). Alright! We found a match. Let’s add the key to our in-memory list and go back to the first index
  • 40. Zig zag across multiple indexes (same tables). Let’s move on to see if there are any more matches. Let’s start at max@google.com
  • 41. Zig zag across multiple indexes (same tables). Are there any keys >= AppId:User:firstName:ikai:max@google.com?
  • 42. Zig zag across multiple indexes (same tables). No. We’re at the end of our index scans. Let’s do a batch get of our list of keys: [ ‘ikai@google.com’ ]
  • 43. Batch get the entities themselves (entities table as on slide 25). Now let’s go back to the entities table and fetch that key. Success!
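The zig zag walk above can be approximated with two sorted sets of entity keys, one per index prefix. This is a simplified sketch of the idea rather than the datastore's actual algorithm: the class `ZigZagMergeJoin` and its `join` method are hypothetical, and each set is assumed to already hold only the keys under one prefix (for example everything under `AppId:User:company:google:`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Simplified zig zag merge join over two sorted key sets (hypothetical helper).
class ZigZagMergeJoin {
    static List<String> join(NavigableSet<String> first, NavigableSet<String> second) {
        List<String> matches = new ArrayList<>();
        if (first.isEmpty()) {
            return matches;
        }
        NavigableSet<String> current = first;   // index the candidate key came from
        NavigableSet<String> other = second;    // index we seek into next
        String candidate = first.first();
        while (candidate != null) {
            // Seek the other index to the first key >= candidate.
            String found = other.ceiling(candidate);
            if (found == null) {
                break;                          // other index is exhausted, no more matches
            }
            if (found.equals(candidate)) {
                matches.add(candidate);         // key is present in both indexes
                candidate = current.higher(candidate);
            } else {
                // "Jump": the larger key becomes the candidate and the roles swap.
                candidate = found;
                NavigableSet<String> tmp = current;
                current = other;
                other = tmp;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        NavigableSet<String> companyGoogle = new TreeSet<>(
                Arrays.asList("david@google.com", "ikai@google.com", "max@google.com"));
        NavigableSet<String> firstNameIkai = new TreeSet<>(
                Arrays.asList("ikai@acme.com", "ikai@google.com", "ikai@megacorp.com"));
        System.out.println(join(companyGoogle, firstNameIkai)); // [ikai@google.com]
    }
}
```

Every iteration either records a match or moves the candidate strictly forward, so the loop finishes, but with sparse matches it can bounce between the indexes many times before finding anything.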
  • 44. Let’s change the shape of the data
      • Zig zag performance is HIGHLY dependent on the shape of the data
      • Let’s go ahead and muck with the data a bit
  • 45. Same query, sparsely distributed matches
      Index on company (Bigtable key)
        AppId:User:company:acme:alfred@acme.com
        AppId:User:company:google:david@google.com
        AppId:User:company:google:ikai@google.com
        AppId:User:company:google:max@google.com
        AppId:User:company:megacorp:zed@megacorp.com
      Index on firstName (Bigtable key)
        AppId:User:firstName:alfred:alfred@acme.com
        AppId:User:firstName:ikai:ikai@acme.com
        AppId:User:firstName:igor:ikai@google.com
        AppId:User:firstName:ikai:ikai@megacorp.com
        AppId:User:firstName:zed:zed@megacorp.com
      Read more: http://code.google.com/appengine/articles/storage_breakdown.html
  • 46. Same query, sparsely distributed matches (same tables). Begin by scanning indexes >= AppId:User:company:google
  • 47. Same query, sparsely distributed matches (same tables). Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:david@google.com
  • 48. Same query, sparsely distributed matches (same tables). Oh ... no matches. Let’s move back to the first index and move the cursor down
  • 49. Same query, sparsely distributed matches (same tables). Okay, we’ve got another Googler
  • 50. Same query, sparsely distributed matches (same tables). Move to the next index. Start a scan for keys >= AppId:User:firstName:ikai:ikai@google.com
  • 51. Same query, sparsely distributed matches (same tables). Oh ... no matches here either. Let’s go back to the first index.
  • 52. Same query, sparsely distributed matches (same tables). Oh ... no matches here either. Let’s go back to the first index ... if these indexes were huge, we could be here for a while!
  • 53. What happens in this case?
      • If we traverse too many indexes, the datastore throws a NeedIndexException (DatastoreNeedIndexException in the Java API)
      • We’ll want to build a composite index
  • 54. Composite index
      Bigtable key
        AppId:User:company:acme:firstName:alfred:alfred@acme.com
        AppId:User:company:google:firstName:david:david@google.com
        AppId:User:company:google:firstName:ikai:ikai@google.com
        AppId:User:company:google:firstName:max:max@google.com
        AppId:User:company:megacorp:firstName:zed:zed@megacorp.com
      Read more: http://code.google.com/appengine/articles/storage_breakdown.html
  • 55. Composite index (same table). Search for all keys >= AppId:User:company:google:firstName:ikai
  • 56. Composite index (same table). Well, that was much faster, wasn’t it?
  • 57. Composite index tradeoffs
      • Created at entity save time - incurs additional datastore CPU and storage quota
      • You can only create 200 composite indexes
      • You need to know the possible queries ahead of time!
  • 58. Complex queries takeaways
      • This isn’t a relational database
          – There are no full table scans
          – Indexes MUST exist for every property we want to query
      • Performance depends on the shape of the data
      • Worst-case scenario: your query matches are highly sparse
      • Build composite indexes when you need them
  • 60. Entity Groups
  • 61. Why entity groups?
      • We can perform transactions within this group - but not outside
      • Data locality - data are stored “near” each other
      • Strongly consistent queries when using High Replication datastore within this entity group
  • 62. Entity groups and transactions
      • A hierarchical structuring of your data into Megastore’s unit of atomicity
      • Allows for transactional behavior - but only within a single entity group
      • Key unit of consistency when using High Replication datastore
  • 63. Example: Data for a blog hosting service. Diagram: User has many Blog; Blog has many Entry; Entry has many Comment
  • 64. Example: Data for a blog hosting service (same diagram). This can be structured as an entity group (tree structure)!
  • 65. Structure this data as an entity group. Diagram: a tree with User as the entity group root, Blogs beneath the User, Entries beneath a Blog, and Comments beneath the Entries
  • 66. How are entity groups stored?
      Entities table
        Bigtable key                                               | Value
        AppId:User:ikai@google.com                                 | ( Protobuf serialized User )
        AppId:User:ikai@google.com/Blog:123                        | ( Protobuf serialized Blog )
        AppId:User:ikai@google.com/Blog:123/Entry:456              | ( Protobuf serialized Entry )
        AppId:User:ikai@google.com/Blog:123/Entry:789              | ( Protobuf serialized Entry )
        AppId:User:ikai@google.com/Blog:123/Entry:456/Comment:111  | ( Protobuf serialized Comment )
        AppId:User:ikai@google.com/Blog:123/Entry:456/Comment:222  | ( Protobuf serialized Comment )
        AppId:User:ikai@google.com/Blog:123/Entry:789/Comment:333  | ( Protobuf serialized Comment )
      Read more: http://code.google.com/appengine/docs/python/datastore/entities.html
  • 67. How are entity groups stored? (same table) Entity groups have a single root entity
  • 68. How are entity groups stored? (same table) Child entities embed the entire ancestry in their keys
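Because child keys embed the full ancestry, everything under one ancestor sits in a contiguous run of sorted keys. The sketch below illustrates that with the keys from the table on slide 66. It is an illustration only: `EntityGroupKeys` and `withAncestor` are hypothetical names, and real datastore keys are not plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy model of ancestor paths as sorted string keys (hypothetical helper).
class EntityGroupKeys {
    // Returns the ancestor row and every row whose key path starts with it.
    static List<String> withAncestor(NavigableSet<String> entityKeys, String ancestor) {
        List<String> rows = new ArrayList<>();
        for (String key : entityKeys.tailSet(ancestor, true)) {
            if (!key.startsWith(ancestor)) {
                break;  // left the contiguous range that shares this prefix
            }
            // Require a path separator so that Entry:456 does not match Entry:4567.
            if (key.equals(ancestor) || key.startsWith(ancestor + "/")) {
                rows.add(key);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String root = "AppId:User:ikai@google.com";
        NavigableSet<String> keys = new TreeSet<>(Arrays.asList(
                root,
                root + "/Blog:123",
                root + "/Blog:123/Entry:456",
                root + "/Blog:123/Entry:789",
                root + "/Blog:123/Entry:456/Comment:111",
                root + "/Blog:123/Entry:456/Comment:222",
                root + "/Blog:123/Entry:789/Comment:333"));
        // Prints Entry:456 followed by its two comments.
        for (String key : withAncestor(keys, root + "/Blog:123/Entry:456")) {
            System.out.println(key);
        }
    }
}
```

Sorting puts each entry directly before its own comments, which is one way to picture the data locality mentioned on slide 61.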
  • 69. Let’s write an entity group transactionally
      DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
      Entity ikai = new Entity("User", "ikai@google.com");
      Entity blog = new Entity("Blog", "ikaisays.com", ikai.getKey());
      Entity entry = new Entity("Entry", "datastore-intro", blog.getKey());
      // Auto assign an ID
      Entity comment = new Entity("Comment", entry.getKey());
      Transaction tx = datastore.beginTransaction();
      // Helper function for clarity
      datastore.put(Arrays.asList(ikai, blog, entry, comment));
      tx.commit();
  • 70. Let’s write an entity group transactionally (same code). Callout on the User entity: Create the root entity
  • 71. Let’s write an entity group transactionally (same code). Callout on the Blog entity: This is the first child entity - notice the third argument, which specifies the parent entity key
  • 72. Let’s write an entity group transactionally (same code). Callout on the Entry entity: The next deeper entity sets the blog as the parent
  • 73. Let’s write an entity group transactionally (same code). Callout on the Comment entity: We can also opt to not provide a key name and just use a parent key for a new entity
  • 74. Let’s write an entity group transactionally (same code). Callout on beginTransaction(): Start a new transaction
  • 75. Let’s write an entity group transactionally (same code). Callout on put(): Put the entities in parallel
  • 76. Let’s write an entity group transactionally (same code). Callout on commit(): Actually commit the changes
  • 77. Step 1: Commit. Timeline: Commit; Changes to entities visible; Changes to entities and indexes visible. Roll the timestamp forward on the root entity
  • 78. Step 2: Entity visible (same timeline). On read, check for the most recent timestamp on the root entity. This is the version we want since it represents a complete write
  • 79. Step 3: Indexes updated (same timeline). Indexes are written - now we can query for this entity with the new properties
  • 80. Entity group and transactions takeaways
      • Structure data into hierarchical trees
          – Large enough to be useful, small enough to maximize transactional throughput
      • Transactions need an entity group root - roughly 1 transaction/second
          – If you write N entities that are all part of 1 entity group, it counts as 1 write
      • Optimistic locking used - can be expensive with a lot of contention
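The commit behavior described on slides 77 to 80 (a timestamp on the root entity, optimistic locking, N entities counting as one write) can be sketched as a toy model. Everything here is an assumption made for illustration: `EntityGroupSketch`, `Tx` and the integer "timestamp" are hypothetical, the code is single-threaded, and it is not how Megastore is implemented.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Toy, single-threaded model of optimistic locking on one entity group (hypothetical).
class EntityGroupSketch {
    private final Map<String, String> committed = new HashMap<>();
    private long rootTimestamp = 0;  // stands in for the timestamp on the root entity

    // A transaction remembers the root timestamp it started from and buffers its writes.
    final class Tx {
        private final long startedAt = rootTimestamp;
        private final Map<String, String> pending = new HashMap<>();

        void put(String key, String value) {
            pending.put(key, value);
        }

        // Commit succeeds only if no other transaction committed in the meantime.
        void commit() {
            if (rootTimestamp != startedAt) {
                throw new ConcurrentModificationException("entity group changed since begin");
            }
            committed.putAll(pending);  // N entities are applied as one write
            rootTimestamp++;            // roll the timestamp forward on the root
        }
    }

    Tx begin() {
        return new Tx();
    }

    String get(String key) {
        return committed.get(key);
    }

    long rootTimestamp() {
        return rootTimestamp;
    }

    public static void main(String[] args) {
        EntityGroupSketch group = new EntityGroupSketch();
        Tx first = group.begin();
        Tx second = group.begin();
        first.put("Blog:123", "v1");
        first.put("Entry:456", "v1");
        first.commit();
        second.put("Blog:123", "v2");
        try {
            second.commit();
        } catch (ConcurrentModificationException e) {
            System.out.println("second transaction must retry: " + e.getMessage());
        }
    }
}
```

In this model a losing transaction does no damage, it just has to start over, which is why heavy contention on one entity group gets expensive.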
  • 81. General datastore tips
      • Denormalize as much as possible
          – As much as possible, treat datastore as a key-value store (Dictionary or Map like structure)
          – Move large reporting to offline processing. This lets you avoid unnecessary indexes
      • Use entity groups for your data
      • Build composite indexes where you need them - “need” depends on shape of your data
  • 83. Questions?