Google Devfest 2009 Argentina - Intro to Appengine
Usage Rights

© All Rights Reserved

Presentation Transcript

  • From Spark Plug to Drive Train: Life of an App Engine Request
    Patrick Chanezon (@chanezon), Ignacio Blanco (@blanconet)
    11/16/09
    Post your questions for this talk on Twitter: #devfest09 #appengine
  • Designing for Scale and Reliability
  • Agenda
    Designing for Scale and Reliability
    App Engine: Design Motivations
    Life of a Request
      Request for static content
      Request for dynamic content
      Requests that use APIs
    App Engine: Design Motivations, Recap
    App Engine: The Numbers
  • Google App Engine
  • LiveJournal circa 2007 (from Brad Fitzpatrick's USENIX '07 talk: "LiveJournal: Behind the Scenes")
    Diagram components: Frontends, Application Servers, Memcache, Storage, Static File Servers
  • Basic LAMP Linux, Apache, MySQL, Programming Language Scalable? Shared machine for database and webserver Reliable? Single point of failure (SPOF)
  • Dedicated Database Database running on a separate server Requirements Another machine plus additional management Scalable? Up to one web server Reliable? Two single points of failure
  • Multiple Web Servers Benefits: Grow traffic beyond the capacity of one webserver Requirements: More machines Set up load balancing
  • Multiple Web Servers Load Balancing: DNS Round Robin Register list of IPs with DNS Statistical load balancing DNS record is cached with Time To Live (TTL) TTL may not be respected Now wait for DNS changes to propagate :-(
  • Multiple Web Servers Load Balancing: DNS Round Robin Scalable? Add more webservers as necessary Still I/O bound on one database Reliable? Cannot redirect traffic quickly Database still SPOF
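The round-robin idea on these slides can be sketched in a few lines. This is a toy model, not a DNS implementation: the IP addresses, the `round_robin` helper and the "cached answer" variable are all made up for illustration. It shows that load is only spread statistically, and that a client holding a cached answer keeps using it even after that server is removed.

```python
from collections import Counter
from itertools import cycle

# Hypothetical web server addresses registered under one DNS name.
SERVER_IPS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def round_robin(ips):
    """Return an endless iterator that hands out addresses in rotation."""
    return cycle(ips)


resolver = round_robin(SERVER_IPS)

# Nine uncached lookups: each one receives the next address in the rotation.
hits = Counter(next(resolver) for _ in range(9))

# A client caches one answer for the record's TTL.
cached_ip = next(resolver)

# The operator removes that server from the list after a failure...
ips_after_failure = [ip for ip in SERVER_IPS if ip != cached_ip]

# ...but until the cached record expires the client still targets it.
client_still_hits_dead_server = cached_ip not in ips_after_failure
```

Nothing in the rotation knows how busy or healthy a server is, which is why the slides rate this approach as scalable but slow to redirect traffic.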
  • Reverse Proxy Benefits: Custom Routing Specialization Application-level load balancing Requirements: More machines Configuration and code for reverse proxies
  • Reverse Proxy
    Scalable? Add more web servers; specialization. Bound by the routing capacity of the reverse proxy and by one database server.
    Reliable? Agile application-level routing; specialized components are more robust. Multiple reverse proxies require network-level routing: DNS Round Robin (again) or fancy network routing hardware. Database is still SPOF.
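As a rough illustration of application-level routing, the sketch below picks a backend pool by URL prefix and rotates within the pool. The pool names, host names and the `pick_backend` function are hypothetical; a real reverse proxy would also track backend health and load.

```python
from itertools import cycle

# Hypothetical backend pools, keyed by URL path prefix.
POOLS = {
    "/static/": cycle(["static-1", "static-2"]),
    "/": cycle(["app-1", "app-2", "app-3"]),
}


def pick_backend(path):
    """Route to the pool with the longest matching prefix, rotating within it."""
    for prefix in sorted(POOLS, key=len, reverse=True):
        if path.startswith(prefix):
            return next(POOLS[prefix])
    return None  # no pool matches (cannot happen for paths starting with "/")


first_static = pick_backend("/static/logo.png")
first_dynamic = pick_backend("/profile")
second_static = pick_backend("/static/site.css")
```

Because the routing decision is made per request in code, traffic can be redirected immediately, unlike a DNS change that has to wait for caches to expire.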
  • Master-Slave Database Benefits: Better read throughput Invisible to application Requirements: Even more machines Changes to MySQL
  • Master-Slave Database Scalable? Scales read rate with # of servers But not writes What happens eventually?
  • Master-Slave Database Reliable? Master is SPOF for writes Master may die before replication
  • Partitioned Database
    Benefits: Increase in both read and write throughput
    Requirements: Even more machines; lots of management; re-architect data model; rewrite queries
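One common way to partition is to hash a key and pick a shard from the result. The sketch below assumes that scheme (the slides do not say which one LiveJournal used); the shard count, the in-memory dictionaries standing in for database servers, and all function names are illustrative.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of database partitions

# Each dict stands in for one database server.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(user_id, num_shards=NUM_SHARDS):
    """Map a key to a shard index using a stable hash."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def write(user_id, row):
    """Writes touch exactly one shard, so write throughput scales with shards."""
    shards[shard_for(user_id)][user_id] = row


def read(user_id):
    """Single-key reads also touch exactly one shard."""
    return shards[shard_for(user_id)].get(user_id)


def find_all(predicate):
    """Queries that are not keyed by user_id must fan out to every shard."""
    return [row for shard in shards for row in shard.values() if predicate(row)]


write("alice", {"name": "alice", "city": "Buenos Aires"})
write("bob", {"name": "bob", "city": "Rosario"})
in_rosario = find_all(lambda row: row["city"] == "Rosario")
```

The `find_all` helper hints at why the slide lists "rewrite queries" as a requirement: anything that is not a lookup by the partition key has to be rethought.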
  • The App Engine Stack
  • App Engine: Design Motivations
  • Design Motivations Build on Existing Google Technology Provide an Integrated Environment Encourage Small Per-Request Footprints Encourage Fast Requests Maintain Isolation Between Applications Encourage Statelessness and Specialization Require Partitioned Data Model
  • Life of a Request: Request for Static Content
  • Request for Static Content on Google Network Routed to the nearest Google datacenter Travels over Google's network Same infrastructure other Google products use Lots of advantages for free
  • Request for Static Content Routing at the Front End Google App Engine Front Ends Load balancing Routing Frontends route static requests to specialized serving infrastructure
  • Request for Static Content Static Content Servers Google Static Content Serving Built on shared Google Infrastructure Static files are physically separate from code files How are static files defined?
  • Request for Static Content: What content is static?
    Java Runtime: appengine-web.xml
      ...
      <static-files>
        <include path="/**.png" />
        <exclude path="/data/**.png" />
      </static-files>
      ...
    Python Runtime: app.yaml
      ...
      - url: /images
        static_dir: static/images
      OR
      - url: /images/(.*)
        static_files: static/images/\1
        upload: static/images/(.*)
      ...
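To make the `static_files` form concrete, here is a small sketch of how a URL pattern with a capture group can be turned into a file path. It assumes the `static_files` value refers to the captured group as `\1`; the `resolve_static` helper is hypothetical and only mimics the mapping, not App Engine's actual serving code.

```python
import re


def resolve_static(url_pattern, static_files, request_path):
    """Return the file path for request_path, or None if the URL does not match.

    url_pattern  -- regular expression for the whole request path
    static_files -- path template; \\1 refers to the first captured group
    """
    match = re.fullmatch(url_pattern, request_path)
    if match is None:
        return None
    return match.expand(static_files)


image_file = resolve_static(r"/images/(.*)", r"static/images/\1", "/images/logo.png")
not_static = resolve_static(r"/images/(.*)", r"static/images/\1", "/css/site.css")
```

Requests that do not match any static pattern fall through to the dynamic handlers described in the next part of the talk.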
  • Request For Static Content Response to the user Back to the Front End and out to the user Specialized infrastructure App runtimes don't serve static content
  • Life of a Request: Request for Dynamic Content
  • Request for Dynamic Content: New Components App Servers and App Master App Servers Serve dynamic requests Where your code runs App Master Schedules applications Informs Front Ends
  • Request for Dynamic Content: Appservers What do they do? Many applications Many concurrent requests Smaller app footprint + fast requests = more apps Enforce Isolation Keeps apps safe from each other Enforce statelessness Allows for scheduling flexibility Service API requests
  • Request For Dynamic Content Routing at the Frontend Front Ends route dynamic requests to App Servers
  • Request for Dynamic Content: App Server
    1. Check for a cached runtime (if it exists, no initialization is needed)
    2. Execute the request
    3. Cache the runtime
    System is designed to maximize caching: slow first request, faster subsequent requests. Optimistically cache data in your runtime!
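"Optimistically cache data in your runtime" can be read as: keep expensive results in module-level state, and be ready to rebuild them if the runtime was not reused. A minimal sketch, assuming a hypothetical `load_config` step and a plain function standing in for a request handler:

```python
# Module-level state lives as long as the runtime stays cached, so it can be
# shared between requests. It may vanish at any time, so treat it as a cache.
_runtime_cache = {}

load_count = 0  # counts how often the expensive step actually runs


def load_config():
    """Stand-in for an expensive initialization step (hypothetical)."""
    global load_count
    load_count += 1
    return {"greeting": "hola"}


def handle_request(path):
    """Serve a request, reusing cached data when the runtime is warm."""
    if "config" not in _runtime_cache:
        _runtime_cache["config"] = load_config()  # slow first request
    config = _runtime_cache["config"]             # fast subsequent requests
    return "%s %s" % (config["greeting"], path)


first = handle_request("/a")
second = handle_request("/b")
```

The code never assumes the cache is populated, which matches the statelessness the platform enforces: any request may land on a fresh runtime.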
  • Life of a Request: Requests accessing APIs
  • API Requests: App Server
    1. App issues API call
    2. App Server accepts
    3. App Server blocks runtime
    4. App Server issues call
    5. Returns the response
    Use APIs to do things you don't want to do in your runtime, such as...
  • APIs
  • Memcacheg: a more persistent in-memory cache
    Distributed in-memory cache
    memcacheg: also written by Brad Fitzpatrick; adds set_multi, get_multi, add_multi
    Optimistically cache for optimization
    Very stable, robust and specialized
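The batch calls named on this slide fit a read-through pattern: ask the cache for many keys at once, compute only the misses, and write them back in one call. The sketch below uses a hypothetical in-process `FakeMemcache` stand-in so it runs anywhere; it only imitates the shape of `get_multi` (returns a dict of the keys found) and `set_multi` (takes a mapping), and `slow_lookup` is an invented placeholder for a datastore read.

```python
class FakeMemcache:
    """In-process stand-in for a memcache client (illustration only)."""

    def __init__(self):
        self._data = {}

    def get_multi(self, keys):
        """Return a dict containing only the keys that are cached."""
        return {k: self._data[k] for k in keys if k in self._data}

    def set_multi(self, mapping):
        """Store every key/value pair; return the keys that failed (none here)."""
        self._data.update(mapping)
        return []


slow_calls = []  # records which keys needed the expensive path


def slow_lookup(key):
    """Hypothetical expensive operation, e.g. a datastore query."""
    slow_calls.append(key)
    return key.upper()


def fetch_many(cache, keys):
    """Read through the cache: one batch get, one batch set for the misses."""
    found = cache.get_multi(keys)
    missing = [k for k in keys if k not in found]
    if missing:
        fresh = {k: slow_lookup(k) for k in missing}
        cache.set_multi(fresh)
        found.update(fresh)
    return found


cache = FakeMemcache()
first = fetch_many(cache, ["ana", "luis"])
second = fetch_many(cache, ["ana", "luis", "marta"])
```

On the second call only `marta` takes the slow path, which is the effect the slide describes as caching optimistically.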
  • The App Engine Datastore Persistent storage
  • Bigtable in one slide: scalable structured storage
    Is not: a database, a sharded database, a distributed hashtable
    Is: a sharded, sorted array
  • Datastore Under the Covers Scalability does not come for free Datastore built on top of Bigtable Queries Indexes
  • Bigtable Sort Machine #1 Machine #2 Machine #3
  • Bigtable Operations Read Write Delete Single row transaction (atomic) Operate
  • Bigtable Operations Scan: prefix b Scan: range b to d
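If Bigtable is thought of as a sorted array of row names, both scans reduce to a binary search for the start position followed by a sequential read. The sketch below models that with Python's `bisect` module; the row names are made up, and treating the range as half-open (start included, end excluded) is an assumption, since the slide does not say.

```python
import bisect

# Hypothetical row names, kept in sorted order as in a sorted array.
rows = sorted(["a1", "b1", "b2", "c1", "d1", "e1"])


def scan_range(sorted_rows, start, end):
    """Return rows with start <= name < end (half-open range, by assumption)."""
    lo = bisect.bisect_left(sorted_rows, start)
    hi = bisect.bisect_left(sorted_rows, end)
    return sorted_rows[lo:hi]


def scan_prefix(sorted_rows, prefix):
    """Return rows whose name starts with prefix, reading sequentially."""
    lo = bisect.bisect_left(sorted_rows, prefix)
    result = []
    for name in sorted_rows[lo:]:
        if not name.startswith(prefix):
            break  # sorted order guarantees no later row can match
        result.append(name)
    return result


prefix_b = scan_prefix(rows, "b")
range_b_to_d = scan_range(rows, "b", "d")
```

Both operations read a contiguous slice, so their cost depends on the number of rows returned rather than on the size of the table.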
  • Entities table Primary table Every entity in every app Generic, schemaless Row name -> entity key Column (only 1) serialized entity
  • Entity keys
    Based on parent entities: root entity, then child, child, ...
      from google.appengine.ext import db
      class Grandparent(db.Model): pass
      class Parent(db.Model): pass
      class Child(db.Model): pass
      Felipe = Grandparent()
      Carlos = Parent(parent=Felipe)
      Ignacio = Child(parent=Carlos)
    my family:
      /Grandparent:Felipe
      /Grandparent:Felipe/Parent:Carlos
      /Grandparent:Felipe/Parent:Carlos/Child:Ignacio
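The key paths on this slide can be reproduced with plain strings, which makes the parent relationship easy to see: a child's key starts with its ancestor's key. The helpers below are hypothetical and do not use the datastore API; they only model the path notation shown on the slide.

```python
def key_path(*pairs):
    """Build a path such as /Grandparent:Felipe/Parent:Carlos from (kind, name) pairs."""
    return "".join("/%s:%s" % (kind, name) for kind, name in pairs)


def is_ancestor(ancestor_path, descendant_path):
    """True if descendant_path lies strictly below ancestor_path."""
    return descendant_path.startswith(ancestor_path + "/")


felipe = key_path(("Grandparent", "Felipe"))
carlos = key_path(("Grandparent", "Felipe"), ("Parent", "Carlos"))
ignacio = key_path(("Grandparent", "Felipe"), ("Parent", "Carlos"), ("Child", "Ignacio"))
```

Since row names sort lexicographically, every descendant of a root entity sits in one contiguous run of rows, which is what makes a prefix scan over that family possible.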
  • Queries: GQL < SQL
    Restrict by kind; filter by property values; sort on properties; ancestor
    SELECT * FROM Person ...
      WHERE name = 'John';
      ORDER BY name DESC;
      WHERE city = 'Sonoma' AND state = 'CA' AND country = 'USA';
      WHERE ANCESTOR IS :Felipe ORDER BY name;
  • Indexes Dedicated Bigtable tables Map property values to entities Each row = index data + entity key Convert queries to scans No filtering in memory No sorting in memory
  • Query planning Convert query to dense index scan 1. Pick index 2. Compute prefix or range 3. Scan!
  • Kind index SELECT * FROM Grandparent 1. Kind index 2. Prefix = Grandparent 3. Scan! Row name Row name
  • Single-property index
    Queries on single property (one ASC one DESC):
      WHERE name = 'John'
      ORDER BY name DESC
      WHERE name >= 'B' AND name < 'C' ORDER BY name
    Index table columns: Row name | Key
  • Single-property index WHERE name = 'John' 1. Prefix = Parent name John (fully specified) 2. Scan!
  • Single-property index ORDER BY name DESC 1. Prefix = Parent name 2. Scan!
  • Single-property index WHERE name >= 'B' and name < 'C' ORDER BY name 1. RANGE = [Parent name B, Parent name C) 2. Scan!
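The three steps (pick index, compute prefix or range, scan) can be sketched with a sorted list of tuples standing in for the ascending index table. The row layout `(kind, property, value, entity key)` follows the slides' "index data + entity key" description; the sample rows, key names and the `scan` helper are invented for illustration.

```python
import bisect

# Hypothetical ascending index on Parent.name: (kind, property, value, entity key).
index = sorted([
    ("Parent", "name", "Alice", "key1"),
    ("Parent", "name", "Bob", "key2"),
    ("Parent", "name", "Beth", "key3"),
    ("Parent", "name", "Carlos", "key4"),
    ("Parent", "name", "John", "key5"),
])


def scan(index_rows, kind, prop, low, high):
    """Return entity keys for low <= value < high using one contiguous scan."""
    lo = bisect.bisect_left(index_rows, (kind, prop, low))
    hi = bisect.bisect_left(index_rows, (kind, prop, high))
    return [row[3] for row in index_rows[lo:hi]]


# WHERE name >= 'B' AND name < 'C' ORDER BY name
b_names = scan(index, "Parent", "name", "B", "C")

# WHERE name = 'John': a fully specified prefix, expressed here as the
# narrowest range that contains only the value 'John'.
johns = scan(index, "Parent", "name", "John", "John\x00")
```

No filtering or sorting happens after the scan: the results come back already in index order, which is the property the "Indexes" slide calls out.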
  • The App Engine Datastore Persistent storage Your data is already partitioned Use Entity Groups Explicit Indexes make for fast reads But slower writes Replicated and fault tolerant On commit: ≥3 machines Geographically distributed Bonus: Keep globally unique IDs for free
  • Other APIs: GMail, Gadget API, Google Accounts, Task Queue, Picasaweb, Google Talk
  • App Engine: Design Motivations, Recap
  • Build on Existing Google Technology creative commons licensed photograph: http://www.flickr.com/photos/cote/54408562/
  • Provide an Integrated Environment Why? Manage all apps together What it means for you: Follow best practices Some restrictions Use our tools Benefits: Use our tools Admin Console All of your logs in one place No machines to configure or manage Easy deployment
  • Encourage Small Per-Request Footprints Why? Better utilization of App Servers Fairness What it means for your app: Less Memory Usage Limited CPU Benefits: Better use of resources
  • Encourage Fast Requests Why? Better utilization of appservers Fairness between applications Routing and scheduling agility What it means for your app: Runtime caching Request deadlines Benefits: Optimistically share state between requests Better throughput Fault tolerance Better use of resources
  • Maintain Isolation Between Apps Why? Safety Predictability What it means for your app: Certain system calls unavailable Benefits: Security Performance
  • Encourage Statelessness and Specialization Why? App Server performance Load balancing Fault tolerance What this means for your app: Use API calls Benefits: Automatic load balancing Fault tolerance Less code for you to write Better use of resources
  • Require Partitioned Data Model Why? The Datastore What this means for your app: Data model + Indexes Reads are fast, writes are slower Benefits: Design your schema once No need to re-architect for scalability More efficient use of cpu and memory
  • Google App Engine: The Numbers
  • Google App Engine Currently, over 100K applications Serving over 250M pageviews per day Written by over 200K developers
  • App Engine
  • Open For Questions The White House's "Open For Questions" application accepted 100K questions and 3.6M votes in under 48 hours
  • 18 months in review
    Apr 2008: Python launch
    May 2008: Memcache, Images API
    Jul 2008: Logs export
    Aug 2008: Batch write/delete
    Oct 2008: HTTPS support
    Dec 2008: Status dashboard, quota details
    Feb 2009: Billing, larger files
    Apr 2009: Java launch, DB import, cron support, SDC
    May 2009: Key-only queries
    Jun 2009: Task queues
    Aug 2009: Kindless queries
    Sep 2009: XMPP
    Oct 2009: Incoming email
  • Roadmap Storing/serving large files Mapping operations across data sets Datastore cursors Notification alerts for application exceptions Datastore dump and restore facility
  • Wrap up
  • Always free to get started ~5M pageviews/month 6.5 CPU hrs/day 1 GB storage 650K URL Fetch calls 2,000 recipients emailed 1 GB/day bandwidth N tasks
  • Purchase additional resources*
    * Free monthly quota of ~5 million page views still in full effect
  • Thank you Read more http://code.google.com/appengine/
  • Questions? Post your questions for this talk on Twitter #devfest09 #appengine
  • Thanks To Alon Levi, Fred Sauer for most of these slides