I've (probably) been using Google App Engine for a week longer than you have


Slides from an introduction to Google App Engine presented on the 31st of May 2008 at BarCamp London 4.

Published in: Technology, News & Politics
  • Google is being run by Indians, managerially and technically. Even though Page and Schmidt are CEO and Executive Chairman of Big G, but still we can’t forget that it was Amit Singhal, an IIT Roorkey Graduate, who re-wrote the whole algorithm of Google Search Engine in 2000 which made Google the best in the industry. Then, Nikesh Arora of BHU-IT is the Chief Business Manager; Vic Goundotra is the man behind the whole Google Plus… and, many many more. Search FAMOUS INDIANS WORKING IN GOOGLE for more details.
  • This is about Google app engine
  • Very informative stuff Simon.

    I am evaluating GAE for one of my Struts-based Java projects, and I have serious concerns about storage and retrieval of data. With a traditional database you can generate reports on your data and execute aggregate queries.

    Here the situation has changed. I am still confused about how I can see the data stored in the BigTable db. I want to make sure there are no showstoppers in my work and to identify limitations early. My project's survival and success depend on it.

    Comments will be appreciated.

    Tahir Akram

  1. I’ve (probably) been using Google App Engine for a week longer than you have Simon Willison - http://simonwillison.net/ BarCamp London 4 31st May 2008
  2. Except you have to re-write your whole application If you totally rethink the way you use a database
  3. What it can do • Serve static files • Serve dynamic requests • Store data • Call web services (sort of) • Authenticate against Google’s user database • Send e-mail, process images, use memcache
  4. The dev environment is really, really nice • Download the (open source) SDK • a full simulation of the App Engine environment • dev_appserver.py myapp for a local webserver • appcfg.py update myapp to deploy to the cloud
  5. Options • You have to use Python • You can choose how you use it: • CGI-style scripts • WSGI applications • Google’s webapp framework • Django (0.96 provided, or install your own)
  6. Hello World

       # helloworld.py
       print "Content-Type: text/html"
       print
       print "Hello, world!"

       # app.yaml
       application: simonwillison-helloworld
       version: 1
       runtime: python
       api_version: 1

       handlers:
       - url: /.*
         script: helloworld.py
  7. With webapp and WSGI

       import wsgiref.handlers
       from google.appengine.ext import webapp

       class MainPage(webapp.RequestHandler):
           def get(self):
               self.response.headers['Content-Type'] = 'text/html'
               self.response.out.write('Hello, webapp World!')

       def main():
           application = webapp.WSGIApplication(
               [('/', MainPage)], debug=True)
           wsgiref.handlers.CGIHandler().run(application)

       if __name__ == "__main__":
           main()
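To show what webapp sits on top of, here is a minimal sketch of a bare WSGI application using only the standard library. It is not App Engine code: the names `application` and `call_app` are illustrative assumptions, and it is written for Python 3 rather than the Python 2 runtime used in the slides.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A bare WSGI callable: takes the request environ, returns body chunks."""
    body = b"Hello, WSGI World!"
    headers = [
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the app reports instead of writing to a socket.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app(application))
```

Any framework that produces a callable with this signature should, in principle, be usable in place of webapp.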
  8. With Django

       from django.conf.urls.defaults import *
       from django.http import HttpResponse

       def hello(request):
           return HttpResponse("Hello, World!")

       urlpatterns = patterns('',
           ('^$', hello),
       )

     (And django_dispatch.py for boilerplate)
  9. • Don't use CGI: it requires reloading for every hit • Why use Django over webapp? • Django has easy cookies and custom 500 errors • Django is less verbose • Django middleware is really handy • You can use other WSGI frameworks if you like
  10. Static files

        # in app.yaml
        handlers:
        - url: /css
          static_dir: css
        - url: /img
          static_dir: img
        - url: /favicon.ico
          static_files: img/favicon.ico
          upload: img/favicon.ico
          mime_type: image/x-icon
  11. The Datastore API
  12. “Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance.”
  13. The App Engine datastore • Apparently based on BigTable • Absolutely not a relational database • No joins (they do have “reference fields”) • No aggregate queries - not even count()! • Hierarchy affects sharding and transactions • All queries must run against an existing index
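Since there is no count(), one possible workaround is to keep a running total that is updated on every write. The sketch below illustrates the idea with an in-memory stand-in; `FakeDatastore` and its methods are hypothetical and are not part of the App Engine API. In a real application the counter update would presumably need to happen inside a transaction.

```python
class FakeDatastore:
    """In-memory stand-in for a store with no count() query.

    Hypothetical class for illustration only; not an App Engine API.
    """

    def __init__(self):
        self.entities = {}  # (kind, key_name) -> properties
        self.counters = {}  # kind -> number of stored entities

    def put(self, kind, key_name, properties):
        key = (kind, key_name)
        is_new = key not in self.entities
        self.entities[key] = dict(properties)
        if is_new:
            # Only new keys change the total; overwrites do not.
            self.counters[kind] = self.counters.get(kind, 0) + 1

    def delete(self, kind, key_name):
        del self.entities[(kind, key_name)]
        self.counters[kind] -= 1

    def count(self, kind):
        # Reads the stored total instead of scanning entities.
        return self.counters.get(kind, 0)


if __name__ == "__main__":
    store = FakeDatastore()
    store.put("Account", "a", {"slug": "a"})
    store.put("Account", "a", {"slug": "a"})  # overwrite, not a new entity
    store.put("Account", "b", {"slug": "b"})
    print(store.count("Account"))
```

The trade-off is extra work on every write in exchange for a cheap read.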
  14. Models and entities • Data is stored as entities • Entities have properties - key/value pairs • An entity has a unique key • Entities live in a hierarchy, and siblings exist in the same entity group - these are actually really important for transactions and performance • A model is kind of like a class; it lets you define a type of entity
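One way to picture the hierarchy is to treat each key as a path from a root entity down to the entity itself, with the root identifying the entity group. The sketch below is a toy model of that idea, not the datastore's actual key implementation; `make_key`, `entity_group` and `same_group` are hypothetical helper names.

```python
def make_key(kind, name, parent=None):
    """Build a key as a tuple of (kind, name) pairs, root first."""
    parent_path = parent if parent is not None else ()
    return parent_path + ((kind, name),)


def entity_group(key):
    """In this toy model the group is identified by the root of the path."""
    return key[0]


def same_group(key_a, key_b):
    """True if both keys hang off the same root entity."""
    return entity_group(key_a) == entity_group(key_b)


if __name__ == "__main__":
    account = make_key("Account", "a")
    browser = make_key("Browser", "x", parent=account)
    other_account = make_key("Account", "b")
    print(same_group(account, browser))        # shares the root Account "a"
    print(same_group(browser, other_account))  # different roots
```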
  15. AppEngine Models

        from google.appengine.ext import db

        class Account(db.Model):
            slug = db.StringProperty(required=True)
            owner = db.UserProperty()
            onlyme = db.BooleanProperty()
            referrers = db.StringListProperty()

      (There is a ReferenceProperty, but I haven’t used it yet)
  16. Inserting data

        from google.appengine.api import users
        from google.appengine.ext import db

        account = Account(
            key_name = slug,
            slug = slug,
            referrers = ['...', '...'],
            onlyme = False,
            owner = users.get_current_user()
        )
        db.put(account) # Or account.put()

        Browser.get_or_insert(key_name,
            parent = account,
            slug = browser_slug
        )
  17. Running queries

        Browser.all().ancestor(account)

        Account.gql("WHERE slug = :1", slug)

        Story.all().filter(
            'title =', 'Foo'
        ).order('-date')
  18. BUT... • All queries must run against an existing index • Filtering or sorting on a property requires that the property exists • Inequality filters are allowed on one property only • Properties in inequality filters must be sorted before other sort orders • ... and various other rules • Thankfully the dev server creates most indexes for you automatically based on usage
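Two of the rules above (inequality filters on one property only, and the inequality property sorted first) can be expressed as a small checker. This is a sketch under assumptions: `check_query`, its tuple-based query description and the set of operators treated as inequalities are all made up for illustration, and it covers only those two rules.

```python
# Operators treated as inequalities in this sketch (an assumption).
INEQUALITY_OPS = {"<", "<=", ">", ">="}


def check_query(filters, orders):
    """Return a list of rule violations for a query description.

    filters: list of (property, operator, value) tuples
    orders:  list of property names, with a leading '-' for descending
    """
    problems = []
    inequality_props = {prop for prop, op, _ in filters if op in INEQUALITY_OPS}

    # Rule: inequality filters are allowed on one property only.
    if len(inequality_props) > 1:
        problems.append(
            "inequality filters on more than one property: "
            + ", ".join(sorted(inequality_props))
        )

    # Rule: the inequality property must come first in the sort orders.
    if len(inequality_props) == 1 and orders:
        (prop,) = inequality_props
        if orders[0].lstrip("-") != prop:
            problems.append("first sort order must be on %r" % prop)

    return problems


if __name__ == "__main__":
    print(check_query([("date", ">", 1)], ["-date", "title"]))
    print(check_query([("date", ">", 1)], ["title"]))
    print(check_query([("date", ">", 1), ("score", "<", 5)], []))
```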
  19. How indexes are used 1. The datastore identifies the index that corresponds with the query’s kind, filter properties, filter operators, and sort orders. 2. The datastore starts scanning the index at the first entity that meets all of the filter conditions using the query’s filter values. 3. The datastore continues to scan the index, returning each entity, until it finds the next entity that does not meet the filter conditions, or until it reaches the end of the index.
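The three steps above can be sketched with a sorted list standing in for an index. This is a toy model for an equality filter on a single property, not how the datastore is implemented; `build_index` and `scan_equal` are hypothetical names.

```python
import bisect


def build_index(entities, prop):
    """Step 1 stand-in: a sorted list of (property value, key) rows."""
    return sorted((props[prop], key) for key, props in entities.items())


def scan_equal(index, value):
    """Return keys whose indexed value equals `value`, in index order."""
    # Step 2: jump straight to the first row that could match.
    start = bisect.bisect_left(index, (value,))
    keys = []
    # Step 3: read consecutive rows until one fails the filter
    # or the index runs out.
    for i in range(start, len(index)):
        row_value, key = index[i]
        if row_value != value:
            break
        keys.append(key)
    return keys


if __name__ == "__main__":
    stories = {
        "s1": {"title": "Foo"},
        "s2": {"title": "Bar"},
        "s3": {"title": "Foo"},
    }
    title_index = build_index(stories, "title")
    print(scan_equal(title_index, "Foo"))
```

Because matching rows are contiguous in the sorted index, the scan never looks at more than one non-matching row, which is why every query needs an index laid out for it in advance.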
  20. Further limitations • If you create a new index and push it live, you have to wait for it to be rebuilt • This can take hours, and apparently can go wrong • You can’t safely grab more than about 500 records at once - App Engine times out • You can’t delete in bulk
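Given the roughly 500-record ceiling, large result sets have to be worked through in slices. Below is a minimal sketch of that pattern; `in_batches` and the `fetch_page(limit, offset)` callable are hypothetical stand-ins for whatever query call returns a slice of results, and on App Engine each batch would probably need its own request to stay under the timeout.

```python
def in_batches(fetch_page, batch_size=500):
    """Yield lists of up to batch_size records until the source runs dry.

    fetch_page(limit, offset) is a hypothetical callable that returns
    at most `limit` records starting at position `offset`.
    """
    offset = 0
    while True:
        batch = fetch_page(batch_size, offset)
        if not batch:
            return
        yield batch
        if len(batch) < batch_size:
            return  # a short batch means nothing is left
        offset += len(batch)


if __name__ == "__main__":
    records = list(range(1234))  # made-up data for the demonstration

    def fetch_page(limit, offset):
        return records[offset:offset + limit]

    for batch in in_batches(fetch_page):
        print(len(batch))
```

The same loop could drive deletes, one batch at a time, since there is no bulk delete.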
  21. Other random notes • You have to use the URL Fetch API to do HTTP requests (e.g. for web services) - and it times out aggressively at about 5 seconds • The Google accounts Users API is ridiculously easy to use, but... • no permanent unique identifier; if the user changes their e-mail address you’re screwed • The new image and memcache APIs are neat
  22. Final thoughts • It’s really nice not to have to worry about hosting • But... the lack of aggregate queries and ad-hoc queries really hurts • Perfect for small projects you don’t want to worry about and big things which you’re sure will have to scale • Pricing is comparable to S3 - i.e. stupidly cheap
  23. Pricing • $0.10 - $0.12 per CPU core-hour • $0.15 - $0.18 per GB-month of storage • $0.11 - $0.13 per GB outgoing bandwidth • $0.09 - $0.11 per GB incoming bandwidth
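To get a feel for what these rates add up to, here is a small estimator using the price ranges from the slide. The usage figures in the example are invented, and the sketch ignores any free quota or other billing details not stated in the slides.

```python
# (low, high) prices in US dollars, copied from the pricing slide.
PRICES = {
    "cpu_core_hours": (0.10, 0.12),
    "storage_gb_months": (0.15, 0.18),
    "outgoing_gb": (0.11, 0.13),
    "incoming_gb": (0.09, 0.11),
}


def monthly_cost(usage):
    """Return (low, high) estimated monthly cost for a usage dict.

    usage maps the keys of PRICES to amounts consumed in one month.
    """
    low = sum(PRICES[name][0] * amount for name, amount in usage.items())
    high = sum(PRICES[name][1] * amount for name, amount in usage.items())
    return round(low, 2), round(high, 2)


if __name__ == "__main__":
    # Hypothetical month: 100 core-hours, 10 GB stored, 50 GB out, 20 GB in.
    example = {
        "cpu_core_hours": 100,
        "storage_gb_months": 10,
        "outgoing_gb": 50,
        "incoming_gb": 20,
    }
    print(monthly_cost(example))
```

For that made-up workload the estimate comes to roughly $18.80 at the low end and $22.50 at the high end.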
  24. Thank you!