I've (probably) been using Google App Engine for a week longer than you have
 


Slides from an introduction to Google App Engine presented on the 31st of May 2008 at BarCamp London 4.


  • Google is being run by Indians, managerially and technically. Even though Page and Schmidt are CEO and Executive Chairman of Big G, but still we can’t forget that it was Amit Singhal, an IIT Roorkey Graduate, who re-wrote the whole algorithm of Google Search Engine in 2000 which made Google the best in the industry. Then, Nikesh Arora of BHU-IT is the Chief Business Manager; Vic Goundotra is the man behind the whole Google Plus… and, many many more. Search FAMOUS INDIANS WORKING IN GOOGLE for more details.
  • This is about Google app engine
  • Very informative stuff Simon.

    I am evaluating GAE for one of my Struts based Java project. And I have serious concerns on storage and retrieval of data. On traditional database you can generate reports on your data. You can exec aggregate queries.

    But here the scene has been changed here. Still I am confused how I see stored data in BigTable db? I am seriously concerned to have no showstopper in my work and to avoid/identify limitations. My project survival and success based on it.

    Comments will be appreciated.

    Tahir Akram
    Presentation Transcript

    • I’ve (probably) been using Google App Engine for a week longer than you have Simon Willison - http://simonwillison.net/ BarCamp London 4 31st May 2008
    • Except you have to re-write your whole application If you totally rethink the way you use a database
    • What it can do • Serve static files • Serve dynamic requests • Store data • Call web services (sort of) • Authenticate against Google’s user database • Send e-mail, process images, use memcache
    • The dev environment is really, really nice • Download the (open source) SDK • a full simulation of the App Engine environment • dev_appserver.py myapp for a local webserver • appcfg.py update myapp to deploy to the cloud
    • Options • You have to use Python • You can choose how you use it: • CGI-style scripts • WSGI applications • Google’s webapp framework • Django (0.96 provided, or install your own)
    • Hello World

        # helloworld.py
        print "Content-Type: text/html"
        print
        print "Hello, world!"

        # app.yaml
        application: simonwillison-helloworld
        version: 1
        runtime: python
        api_version: 1
        handlers:
        - url: /.*
          script: helloworld.py

    • With webapp and WSGI

        import wsgiref.handlers
        from google.appengine.ext import webapp

        class MainPage(webapp.RequestHandler):
            def get(self):
                self.response.headers['Content-Type'] = 'text/html'
                self.response.out.write('Hello, webapp World!')

        def main():
            application = webapp.WSGIApplication(
                [('/', MainPage)], debug=True)
            wsgiref.handlers.CGIHandler().run(application)

        if __name__ == "__main__":
            main()

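
For comparison, here is a minimal sketch of the bare WSGI interface that webapp and Django both sit on top of. It uses only the standard library and modern Python 3 syntax rather than the Python 2 runtime shown in the slides. The names `hello_app` and `call_app` are illustrative assumptions, not part of the App Engine SDK.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A bare WSGI application: takes the request environ, returns body chunks."""
    body = b"Hello, WSGI world!"
    start_response("200 OK", [
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process, without a server (hypothetical helper)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys a WSGI server would set
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Calling `call_app(hello_app)` returns the status line, a header dict, and the response body, which is roughly the work a framework's request handler hides from you.
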
    • With Django

        from django.conf.urls.defaults import *
        from django.http import HttpResponse

        def hello(request):
            return HttpResponse("Hello, World!")

        urlpatterns = patterns('',
            ('^$', hello),
        )

      (And django_dispatch.py for boilerplate)

    • Don't use CGI: it requires reloading for every hit • Why use Django over webapp? • Django has easy cookies and custom 500 errors • Django is less verbose • Django middleware is really handy • You can use other WSGI frameworks if you like
    • Static files

        # in app.yaml
        handlers:
        - url: /css
          static_dir: css
        - url: /img
          static_dir: img
        - url: /favicon.ico
          static_files: img/favicon.ico
          upload: img/favicon.ico
          mime_type: image/x-icon

    • The Datastore API
    • “Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance.”
    • The App Engine datastore • Apparently based on BigTable • Absolutely not a relational database • No joins (they do have “reference fields”) • No aggregate queries - not even count()! • Hierarchy affects sharding and transactions • All queries must run against an existing index
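
One common way to live without count() is to maintain the count yourself at write time. The sketch below uses a plain in-memory dict as a stand-in for the datastore. `FakeStore` and its methods are hypothetical and are not the App Engine API. In a real datastore the entity write and the counter update would presumably need to happen in one transaction.

```python
class FakeStore:
    """In-memory stand-in for a datastore with no count() (hypothetical)."""

    def __init__(self):
        self.entities = {}   # (kind, key_name) -> properties
        self.counters = {}   # kind -> number of stored entities

    def put(self, kind, key_name, **props):
        key = (kind, key_name)
        if key not in self.entities:
            # Bump the counter only for genuinely new entities.
            self.counters[kind] = self.counters.get(kind, 0) + 1
        self.entities[key] = props

    def delete(self, kind, key_name):
        # Only decrement when something was actually removed.
        if self.entities.pop((kind, key_name), None) is not None:
            self.counters[kind] -= 1

    def count(self, kind):
        # A single lookup instead of a scan over every entity.
        return self.counters.get(kind, 0)
```

The trade-off is an extra write on every insert and delete in exchange for a cheap read.
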
    • Models and entities • Data is stored as entities • Entities have properties - key/value pairs • An entity has a unique key • Entities live in a hierarchy, and siblings exist in the same entity group - these are actually really important for transactions and performance • A model is kind of like a class; it lets you define a type of entity
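
A toy model of the hierarchy described above may help: each key records its parent, and the root ancestor identifies the entity group. This is a sketch in plain Python 3. The `Key` class and `same_entity_group` function are illustrative assumptions, not the SDK's `db.Key`.

```python
from typing import NamedTuple, Optional


class Key(NamedTuple):
    """A toy entity key: kind, name, and an optional parent key."""
    kind: str
    name: str
    parent: Optional["Key"] = None

    def path(self):
        """Return the ancestor path from the root down to this key."""
        prefix = self.parent.path() if self.parent is not None else ()
        return prefix + ((self.kind, self.name),)

    def root(self):
        """The root ancestor, which identifies the entity group."""
        return self if self.parent is None else self.parent.root()


def same_entity_group(a, b):
    """Two keys share an entity group when they share a root ancestor."""
    return a.root() == b.root()
```

With an `Account` key as the parent of several `Browser` keys, all of them share one root, which is why they can take part in the same transaction.
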
    • AppEngine Models

        from google.appengine.ext import db

        class Account(db.Model):
            slug = db.StringProperty(required=True)
            owner = db.UserProperty()
            onlyme = db.BooleanProperty()
            referrers = db.StringListProperty()

      (There is a ReferenceProperty, but I haven’t used it yet)

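    • Inserting data

        from google.appengine.api import users
        from google.appengine.ext import db

        # key_name makes the slug the entity's unique key
        account = Account(
            key_name = slug,
            slug = slug,
            referrers = ['...', '...'],
            onlyme = False,
            owner = users.get_current_user()
        )
        db.put(account)  # Or account.put()

        # Fetch the child entity, creating it under account if missing
        Browser.get_or_insert(key_name,
            parent = account,
            slug = browser_slug
        )
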
    • Running queries

        Browser.all().ancestor(account)

        Account.gql("WHERE slug = :1", slug)

        Story.all().filter(
            'title =', 'Foo'
        ).order('-date')

    • BUT... • All queries must run against an existing index • Filtering or sorting on a property requires that the property exists • Inequality filters are allowed on one property only • Properties in inequality filters must be sorted before other sort orders • ... and various other rules • Thankfully the dev server creates most indexes for you automatically based on usage
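
Two of the rules above are mechanical enough to check in code. The sketch below validates a query description against only those two rules: inequality filters on one property only, and the inequality property sorted first. The `check_query` function and its tuple-based query format are assumptions made for illustration, not part of the datastore API.

```python
INEQUALITY_OPS = {"<", "<=", ">", ">="}


def check_query(filters, orders):
    """Return a list of rule violations for a query description.

    filters: list of (property, operator, value) tuples
    orders:  list of property names, with a leading '-' for descending
    """
    problems = []
    inequality_props = {prop for prop, op, _ in filters if op in INEQUALITY_OPS}
    if len(inequality_props) > 1:
        problems.append("inequality filters on more than one property")
    if inequality_props and orders:
        first_sort = orders[0].lstrip("-")
        if first_sort not in inequality_props:
            problems.append("inequality property must be the first sort order")
    return problems
```

An empty list means the query passes both checks. It says nothing about the "various other rules" or about whether a matching index exists.
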
    • How indexes are used 1. The datastore identifies the index that corresponds with the query’s kind, filter properties, filter operators, and sort orders. 2. The datastore starts scanning the index at the first entity that meets all of the filter conditions using the query’s filter values. 3. The datastore continues to scan the index, returning each entity, until it finds the next entity that does not meet the filter conditions, or until it reaches the end of the index.
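
The three steps above can be sketched with a sorted list standing in for an index. This is a simplified single-property model in plain Python 3. `build_index` and `scan_equal` are hypothetical names, and a real index also covers kind, multiple properties, and sort direction.

```python
import bisect


def build_index(entities, prop):
    """Sorted list of (property value, key) pairs: a toy single-property index."""
    return sorted((props[prop], key) for key, props in entities.items())


def scan_equal(index, value):
    """Jump to the first matching row, then read until the first non-match."""
    # (value,) sorts before every (value, key) row, so this finds the first match.
    start = bisect.bisect_left(index, (value,))
    results = []
    for row_value, key in index[start:]:
        if row_value != value:
            break  # matching rows are contiguous, so the scan can stop here
        results.append(key)
    return results
```

Because matching rows sit next to each other in the index, the cost of the scan depends on the number of results rather than on the total number of entities.
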
    • Further limitations • If you create a new index and push it live, you have to wait for it to be rebuilt • This can take hours, and apparently can go wrong • You can’t safely grab more than about 500 records at once - App Engine times out • You can’t delete in bulk
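
Given the limit on fetch size and the lack of bulk delete, one plausible pattern is to delete in bounded batches. The sketch below is plain Python 3 and takes two callables as stand-ins for datastore calls. `delete_in_batches`, `fetch_keys`, and `delete_keys` are all hypothetical names. On App Engine itself the loop might also need to be spread across several requests to stay under the request time limit.

```python
def delete_in_batches(fetch_keys, delete_keys, batch_size=100):
    """Repeatedly fetch a bounded batch of keys and delete it.

    fetch_keys(limit) -> list of up to `limit` keys still present
    delete_keys(keys) -> removes those keys
    Both callables are hypothetical stand-ins for datastore calls.
    """
    deleted = 0
    while True:
        keys = fetch_keys(batch_size)
        if not keys:
            return deleted  # nothing left to delete
        delete_keys(keys)
        deleted += len(keys)
```

Keeping `batch_size` well under the roughly 500-record ceiling mentioned above leaves some headroom before a timeout.
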
    • Other random notes • You have to use the URL Fetch API to do HTTP requests (e.g. for web services) - and it times out aggressively at about 5 seconds • The Google accounts Users API is ridiculously easy to use, but... • no permanent unique identifier; if the user changes their e-mail address you’re screwed • The new image and memcache APIs are neat
    • Final thoughts • It’s really nice not to have to worry about hosting • But... the lack of aggregate queries and ad-hoc queries really hurts • Perfect for small projects you don’t want to worry about and big things which you’re sure will have to scale • Pricing is comparable to S3 - i.e. stupidly cheap
    • Pricing • $0.10 - $0.12 per CPU core-hour • $0.15 - $0.18 per GB-month of storage • $0.11 - $0.13 per GB outgoing bandwidth • $0.09 - $0.11 per GB incoming bandwidth
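
To get a feel for these rates, here is a small estimator in plain Python 3. The rates are copied from the slide above. The `estimate_cost` function, the dictionary keys, and the usage figures in the note below are made up for illustration.

```python
# (low, high) dollar rates per unit, copied from the pricing slide.
RATES = {
    "cpu_core_hours": (0.10, 0.12),
    "storage_gb_months": (0.15, 0.18),
    "outgoing_gb": (0.11, 0.13),
    "incoming_gb": (0.09, 0.11),
}


def estimate_cost(usage):
    """Return (low, high) dollar estimates for a dict of usage amounts."""
    low = sum(RATES[item][0] * amount for item, amount in usage.items())
    high = sum(RATES[item][1] * amount for item, amount in usage.items())
    return round(low, 2), round(high, 2)
```

For a hypothetical month with 100 CPU core-hours, 10 GB stored, 50 GB out, and 20 GB in, the low end works out as 10.00 + 1.50 + 5.50 + 1.80 = 18.80 dollars and the high end as 12.00 + 1.80 + 6.50 + 2.20 = 22.50 dollars.
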
    • Thank you!