AppEngine Performance Tuning

Explains how to optimize App Engine performance.
Presented at PyCon 2013 in Taiwan.

Published in: Technology, News & Politics


  1. Google App Engine performance tuning
     David Chen from Tagtoo
     Saturday, 1 June 2013
  2. About me
     • Tagtoo Cofounder & Senior Engineer
     • NCTU Computer Science M.S.
     • UW Computational Finance M.S.
     • Interested in:
       • Python
       • Google Cloud Platform
       • Skiing
     • Taipei.py ROCK!
  3. Google App Engine & Google Cloud Platform
  5. Introduction
     • Google App Engine, launched in 2008
     • PaaS (Platform as a Service)
     • Total Solution
     • Scalable, easy to use (really?)
     • Automatic scaling and load balancing
     Hint: "I want to fight ten!"
  6. Google Cloud Platform Family
     https://cloud.google.com/
  7. Google Cloud Platform Family
     • App Engine becomes the gateway to:
       • Google Compute Engine (like EC2)
       • Cloud SQL (like MySQL)
       • BigQuery (SQL for terabytes)
       • Cloud Storage
     • Google I/O 2013:
       • Datastore Service
       • PHP support ...
  8. Why We Use App Engine
     • Really easy to configure
     • Really easy to scale
     • Rarely fails
     • Full Python environment
     • Real reason:
       • Too lazy to learn a complex platform
       • Cheap, powerful, and easy if you use it carefully
  10. Performance Tuning
  11. Environment Configuration
      • Python 2.7
      • Thread-safe
      • High Replication Datastore (HRD)
      • webapp2 + jinja2
      https://developers.google.com/appengine/docs/adminconsole/performancesettings
  12. Setting
      • Set the Frontend Instance Class
      • Configure the Scheduler
        • Idle Instances
        • Pending Latency
      • Task queue control
        • X-AppEngine-FailFast
      • Backend Instances
      • Discounted Instance Hours (40% off)
      Super easy to scale
  13. Performance Tuning #1: Log Analysis. Find out which kinds of requests need to be optimized
  14. Log Analysis
  19. Analyzing Logs in Detail
      • Export logs
        • appcfg.py request_logs myapp/ mylogs.txt
        • options: --num_days, --end_date, --severity=[0-4]
        • appcfg.py --num_days=1 --end_date=2013-05-22 request_logs myapp/ mylogs.txt
  20. Log Parser
      • SQLite: http://google-app-engine-samples.googlecode.com/svn/trunk/logparser/logparser.py
      • BigQuery: https://code.google.com/p/log2bq/ (MapReduce -> Cloud Storage -> BigQuery)
      • MySQL: http://github/lucemia/log2sql.git
  21. Query
      • select path, count(*) as freq, avg(cpm_usd) as avg_cost from log group by path order by avg_cost desc limit 10
      • select path, avg(ms), count(*) from log group by path order by avg(ms) desc limit 10
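
The two queries on the Query slide can be tried against an in-memory SQLite table. This is only a minimal sketch: the `log` table layout (`path`, `ms`, `cpm_usd`) is assumed from the column names used in the queries, and the rows are made-up sample data rather than real request logs.

```python
import sqlite3

# Assumed schema: one row per request, with the columns the slide's queries use.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (path TEXT, ms INTEGER, cpm_usd REAL)")

# Made-up sample rows standing in for parsed request logs.
sample_rows = [
    ("/", 40, 0.00001),
    ("/", 60, 0.00001),
    ("/report", 900, 0.00050),
    ("/report", 1100, 0.00070),
    ("/api/item", 120, 0.00004),
]
conn.executemany("INSERT INTO log VALUES (?, ?, ?)", sample_rows)

# Which paths cost the most per request on average?
most_expensive = conn.execute(
    "SELECT path, COUNT(*) AS freq, AVG(cpm_usd) AS avg_cost "
    "FROM log GROUP BY path ORDER BY avg_cost DESC LIMIT 10"
).fetchall()

# Which paths are slowest on average?
slowest = conn.execute(
    "SELECT path, AVG(ms), COUNT(*) "
    "FROM log GROUP BY path ORDER BY AVG(ms) DESC LIMIT 10"
).fetchall()

for path, avg_ms, freq in slowest:
    print(path, avg_ms, freq)
```

The paths at the top of either result are the first candidates for the tuning steps that follow.
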
  23. Performance Tuning #2: Appstats. Find out how to optimize a request
  24. Appstats
      • Profiles the RPC (remote procedure call) performance of your application:
        • Datastore
        • Memcache
        • URL Fetch
        • Mail
        • ...
      • App Engine charges for RPCs
      https://developers.google.com/appengine/docs/python/tools/appstats
  26. Install Appstats
      • app.yaml
      • appengine_config.py
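
The code shown on this slide did not survive in the transcript. As a rough sketch of the usual setup described in the Appstats documentation (not necessarily what the slide showed): `app.yaml` enables the builtin with `- appstats: on` under `builtins:`, and `appengine_config.py` installs the recording middleware.

```python
# appengine_config.py
# Sketch of the standard Appstats hook. The app.yaml side is the builtin
#   builtins:
#   - appstats: on
# which serves the Appstats console under /_ah/stats/.

def webapp_add_wsgi_middleware(app):
    # Imported lazily, so this module only needs the App Engine SDK at runtime.
    from google.appengine.ext.appstats import recording
    # Wrap the WSGI app so the RPCs made during each request are recorded.
    return recording.appstats_wsgi_middleware(app)
```
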
  27. Appstats demo
  28. What Appstats can tell you
      • Is your application making unnecessary RPC calls?
      • Should it be caching data instead of making repeated RPC calls to get the same data?
      • Will your application perform better if multiple requests are executed in parallel rather than serially?
  29. Performance Tuning #3: Optimize RPCs
  30. Method 1: Batch RPC (diagram: batch version)
  31. Example: Batch RPC
      Hint: Batch as much as you can (fewer than 500)
  32. Example: Batch RPC
      Hint: Some functions support batch mode
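
The example code from the batch slides is missing from the transcript, so here is a sketch of the idea that runs without the App Engine SDK. `FakeDatastore` is a hypothetical stand-in that only counts round trips, and `chunked` is a helper name introduced here; the cap of 500 comes from the hint above. On App Engine the batched call would be something like `db.get(list_of_keys)` in place of one `db.get(key)` per key.

```python
class FakeDatastore(object):
    """Hypothetical stand-in for an RPC-backed store; counts round trips."""

    def __init__(self, data):
        self.data = data
        self.rpc_count = 0

    def get(self, key):
        # One round trip per key.
        self.rpc_count += 1
        return self.data.get(key)

    def get_multi(self, keys):
        # One round trip for the whole batch.
        self.rpc_count += 1
        return [self.data.get(k) for k in keys]


def chunked(items, size=500):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


keys = ["item-%d" % i for i in range(1200)]
store = FakeDatastore(dict((k, len(k)) for k in keys))

# Unbatched: one round trip per key.
one_by_one = [store.get(k) for k in keys]
unbatched_rpcs = store.rpc_count

# Batched: one round trip per chunk (500 + 500 + 200 keys).
store.rpc_count = 0
batched = []
for batch in chunked(keys, 500):
    batched.extend(store.get_multi(batch))
batched_rpcs = store.rpc_count

print(unbatched_rpcs, batched_rpcs)
```

Both versions return the same values; only the number of round trips differs.
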
  33. Method 2: Async RPC
  34. Example: Async RPC
      Hint: Async reads need careful flow control. Fire the RPC as soon as possible and get the result as late as possible.
  35. Example: Async RPC
      Hint: For writes, always try async, or try batch + async
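
The async examples are also missing from the transcript. The "fire early, collect late" pattern from the hint can be sketched with the standard library: here `concurrent.futures` stands in for App Engine's async RPCs, and `slow_fetch` and `handle_request` are hypothetical names. On App Engine the same shape would use calls such as `db.get_async(...)`, with `get_result()` on the returned object playing the role of `result()` below.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_fetch(name, delay=0.05):
    """Hypothetical RPC: sleeps to simulate network latency, then returns."""
    time.sleep(delay)
    return "result-for-" + name


def handle_request(executor):
    # Fire both RPCs as early as possible; neither call blocks here.
    user_future = executor.submit(slow_fetch, "user")
    items_future = executor.submit(slow_fetch, "items")

    # Do unrelated local work while the RPCs are in flight.
    page_title = "Dashboard".upper()

    # Collect the results as late as possible; block only when they are needed.
    return page_title, user_future.result(), items_future.result()


with ThreadPoolExecutor(max_workers=2) as executor:
    response = handle_request(executor)

print(response)
```

Because both fetches overlap, the handler waits for roughly one simulated RPC instead of two.
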
  36. Method 3: Cache RPC result (diagram: plain RPC vs. cached version: read cache, return if cache exists, otherwise RPC and write to cache)
  37. Example: Cache RPC
      Hint: Cache repeated RPCs; check the hit ratio in the Memcache Viewer
  38. Example: Cache RPC
      Hint: memcache accepts both async and batch operations
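
The cached-RPC flow from the diagram (read cache, return on a hit, otherwise make the RPC and write the result back) might look like the sketch below. `CountingCache` is a hypothetical in-process stand-in for memcache that also tracks the hit ratio, and `expensive_rpc` is a made-up slow call.

```python
class CountingCache(object):
    """Hypothetical in-process stand-in for memcache that tracks hit ratio."""

    def __init__(self):
        self.values = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.values:
            self.hits += 1
            return self.values[key]
        self.misses += 1
        return None

    def set(self, key, value):
        self.values[key] = value


rpc_calls = []


def expensive_rpc(key):
    """Hypothetical slow RPC; records each call so repeats are visible."""
    rpc_calls.append(key)
    return key.upper()


def cached_rpc(cache, key):
    value = cache.get(key)          # read cache
    if value is not None:
        return value                # return if cache exists
    value = expensive_rpc(key)      # otherwise make the RPC
    cache.set(key, value)           # write to cache
    return value


cache = CountingCache()
results = [cached_rpc(cache, k) for k in ["a", "b", "a", "a", "b"]]
hit_ratio = cache.hits / float(cache.hits + cache.misses)
print(results, rpc_calls, hit_ratio)
```

Five lookups trigger only two RPCs here; a low hit ratio in practice suggests the cached keys are not the ones being requested repeatedly.
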
  39. Performance Tuning #4: Optimize the datastore
  40. Google App Engine Datastore
      • Schemaless (NoSQL) for large scale
      • Indexes must be configured manually
      • Supports a limited set of SQL-like commands, with different behavior:
        • offset
        • No inequality filters on more than one property
        • ...
      • Supports MapReduce (but expensive..)
  41. Datastore Index
      • Too long to cover here.. maybe another time
  42. Datastore Pricing
      • Write: 50000 free, $0.1 / 100k
      • Read: 50000 free, $0.07 / 100k
      • Small: 50000 free, $0.01 / 100k
      • Looks fine?
      • In fact:
        • Writes can be much more expensive than you expect
  47. SAMPLE
      • Get: 1 read op = $0.07 / 100k
      • Insert: 2 + 2 x 12 = 26 write ops = $2.6 / 100k (37x)
      Hint: Each indexed property costs money
  48. How to optimize it?
  49. Redefined Model with “indexed=False”
      • Get: 1 read op = $0.07 / 100k
      • Insert: 2 write ops = $0.2 / 100k (2x)
      • Cannot query on properties with indexed=False
      Hint: Know how you want to query your data before defining the model
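
The arithmetic behind the two insert costs can be checked with a few lines. This sketch assumes, from the slide's `2 + 2 x 12` formula, that the sample model has 12 indexed properties and no composite indexes; the function names are introduced here for illustration, and the prices are the ones on the pricing slide.

```python
WRITE_PRICE_PER_100K = 0.10   # USD per 100k write ops, from the pricing slide
READ_PRICE_PER_100K = 0.07    # USD per 100k read ops, from the pricing slide


def insert_write_ops(indexed_properties):
    """Write ops for inserting one entity: 2 for the entity itself,
    plus 2 per indexed property (the slide's 2 + 2 x 12 formula)."""
    return 2 + 2 * indexed_properties


def insert_cost_per_100k(indexed_properties):
    """USD cost of 100k inserts at the slide's write price."""
    return insert_write_ops(indexed_properties) * WRITE_PRICE_PER_100K


all_indexed = insert_cost_per_100k(12)   # every property indexed
none_indexed = insert_cost_per_100k(0)   # every property set to indexed=False

print(round(all_indexed, 2), round(none_indexed, 2))
print(round(all_indexed / READ_PRICE_PER_100K))  # insert vs. get cost ratio
```

Each property left indexed adds two write ops to every insert, so the saving scales with the number of properties that are never queried.
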
  50. ListProperty
      Hint: ListProperty is useful, but can also be dangerous
  51. ListProperty + MapReduce
      Hint: MapReduce + ListProperty can be more than dangerous...
  52. Entity Size
      Hint: Entity size won’t affect the cost (and won’t affect the performance)
  53. Datastore Hints
      • Make tables BIG!
      • But index only when necessary
      • Find alternative solutions:
        • Cloud SQL + cache
        • Query index and tree in memory
  54. Build index
      https://code.google.com/p/google-app-engine-ranklist/
      Hint: Binary tree, trie, etc. Build complex data structures
  55. More Design Principles
      • Denormalizing is better than normalizing
      • Think about real use cases
        • MxM or Mxn or nxn?
        • More reads or more writes?
        • Immutable data?
        • Relation or duplicate?
        • Get or Query?
        • ...
      Zen of datastore
  56. Google App Engine - NDB
      • Automatic caching
        • In-context cache
        • Write-through memcache cache
      • The StructuredProperty class, which allows entities to have nested structure
      • Asynchronous APIs which allow concurrent actions (and "synchronous" APIs if you don't need that)
      • Watch out: different async behavior
      Hint: use ndb
  57. Performance Tuning #5: Optimize control (diagram: Datastore, Logic, Render)
  58. Use server-side cache (diagram: Datastore, Logic, Render; Update Cache; Return if cache exists)
  61. Use Tasks for non-request-bound functionality (diagram: Datastore, Logic, Render)
      • Offline update
  62. Host dynamic content as static
      • Last-Modified, ETag
      • Expires, max-age
      • Expires
      • Edge-Control
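
To illustrate how ETag and max-age let a dynamic page behave like a static one, here is a framework-free sketch of a conditional GET. `make_etag` and `respond` are hypothetical helpers, the 300-second lifetime is an arbitrary example value, and the `If-None-Match` check is simplified to an exact match (the real header may carry several tags or `*`). In a webapp2 handler the same headers would be set on the response object.

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag value from the response body bytes."""
    return '"%s"' % hashlib.sha256(body).hexdigest()


def respond(body, request_headers, max_age=300):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        "Cache-Control": "public, max-age=%d" % max_age,
    }
    if request_headers.get("If-None-Match") == etag:
        # The client's copy is still current: send no body.
        return 304, headers, b""
    return 200, headers, body


page = b"<html>rendered dynamic page</html>"
first = respond(page, {})
second = respond(page, {"If-None-Match": first[1]["ETag"]})
print(first[0], second[0])
```

The second request is answered with an empty 304, and within the max-age window a client or intermediate cache need not contact the application at all.
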
