
HPPG - high performance photo gallery

New version of the presentation; a lot has changed since last time.

HPPG - high performance photo gallery Presentation Transcript

  • Introducing high performance photo gallery
    • Remigijus Kiminas
    • 2010-11-29
    • v5
  • Who am I?
    • Author of
      • http://livehelperchat.com/
      • http://redmine.remdex.info (my projects :)
    • Currently working at
      • http://www.coralsolutions.com/
      • Freelancing and building open-source software in my free time
  • Purpose of the presentation
    • Present some architecture decisions which were applied while building the image gallery
  • What's new since last presentation
    • Mobile devices are now supported
    • Image gallery can be used as a shopping CMS
      • Credit-based buying
      • Checkout using the PayPal service
    • Uncached pages got a speed improvement after a bug in the paginator was found and fixed.
    • Official nginx support
  • What's new since last presentation 2
    • Extensions
    • Kernel modules override
    • Kernel classes override
    • CSS compile
    • Most popular images in 24 hours
    • Photo approval functionality
    • Image filtering by resolution
  • What's new since last presentation 3
    • Thumbnail recreation script
    • 100% duplicates management accuracy
    • More configurable system aspects, such as:
      • Max upload photo size
      • Max archive size
      • Max file queue size
    • Animated GIF support
  • What's new since last presentation 4
    • Animated GIF support
    • Completely fixed AJAX navigation usability; no more confusion about which images are available to the left or to the right.
    • Front-end design remake, thanks to http://pauliusc.lt
    • HTML output compression
    • HTML5 front-end changes, which save bandwidth
  • What's new since last presentation 5
    • Some performance improvements regarding user permission settings
    • More things moved to the Memcached service
  • What's new since last presentation 5 V4
    • Sort by relevance was introduced
    • AddQuery usage implemented in search
    • Refactored search page. One query fewer now.
    • Paginator updates
    • Sphinx wildcard support
    • Deletion script for images without an original
    • SEO enhancement related to resolution and the user's current page
  • What's new since last presentation 5 V5
    • Refactored captcha: it's now AJAX/JavaScript based, performs well, and saves one request on the image preview window
    • Image preview full window cache!!! A cached window is as fast as cached pagination, around 5 ms
    • Image counter from the log file, which avoids an insert on each image preview window
  • What's new since last presentation 5 V5
    • Last rated functionality
    • Cache status window
    • Recently top rated, in 24 hours
    • APC support as cache engine.
    • HTML5, SWF, FLV files support
    • Search suggest feature
  • What's new since last presentation 5 V5
    • MySQL query hint for album pagination; the MySQL planner chose the wrong indexes
    • Smart selects in image preview window
    • Full multilanguage support, including translatable module URLs!!! None of the galleries/CMSs I know has this feature. E.g. gallery/search (English) or gallerie/recherche (French)
    • Full InnoDB support. Performs as well as MyISAM. Top process is PHP, not MySQL :)
  • Future works
    • Pagination sharding with an index filter shard table. It should speed up pagination over large sets by around 100% or more and keep constant speed with millions of photos.
    • http://remdex.info/Optimising-mysql-limit-performance-99a.html
    • Backend redesign
  • Issues I had with previous image galleries
    • A lot of users = a lot of problems
      • No caching support
      • Unoptimized SQL queries
      • Resource hungry
      • No framework used (well, perhaps this is not a problem, but most of the time they just duplicate framework functionality, reinventing the wheel...)
      • No ETag-based caching, a bandwidth saver...
  • Requirements
    • Optimized SQL queries
    • Full-text search engine
    • ETag-based caching
    • SQL query caching
    • Full-page caching
    • Low resource requirements
  • Adopted software
    • APC – opcode cache for PHP
    • Sphinx – free open-source SQL full-text search engine (http://sphinxsearch.com/)
    • Memcached – free & open source, high-performance, distributed memory object caching system (http://memcached.org/)
    • eZ Components – an enterprise-ready, general-purpose PHP library of components used independently or together for PHP application development. (http://ez.no/ezcomponents)
    • jQuery – a fast and concise JavaScript library that simplifies HTML document traversing, event handling, animating, and Ajax interactions for rapid web development. (http://jquery.com/)
    • Lighttpd – lightweight open-source web server. (http://www.lighttpd.net/)
    • MySQL – database engine (http://www.mysql.com)
  • Adopted software
    • nginx – an HTTP and mail proxy server licensed under a 2-clause BSD-like license. (http://nginx.org/)
    • Fully working nginx configs provided, for both e-shop and standard requirements
  • Building process – core
    • Gallery core is based on eZ Components. Used components:
      • Authentication
      • Configuration
      • Database
      • Feed
      • ImageAnalysis
      • ImageConversion
      • PersistentObject
      • Translation
      • Cache
      • Url
      • UserInput
  • Full-text search implementation
    • Why Sphinx?
      • Very very fast :)
    • Used features of 0.9.9
      • SetSelect – this feature was introduced in version 0.9.9 and allows fancy filtering.
      • Example in next slide
  • Image full mode problem with previous and next image
    • The search condition, put literally: I need to find the 2 previous images based on the current image's position, taking the search keyword and sorting mode into account.
    • URL consists of
      • Current image ID (16679)
      • Keyword (haposai)
      • Sort mode (popular)
    • How do I find out what to display in the first two thumbnails (the middle image is our current image)?
  • Solution
    • Use a SetSelect query:
        $cl->SetSelect('*, (hits > ' . $Image->hits . ' OR (hits = ' . $Image->hits . ' AND pid > ' . $Image->pid . ')) AS myfilter');
        $cl->SetFilter('myfilter', array(1));
    • A thing I did not know how to do until now: if sorting is based on relevance, how to find the previous two images.
    • I know now. But:
      • SetSelect does not work with @weight attributes in it.
      • Had to use two queries. SetFilter() works with @weight
      • AddQuery comes to help here for performance. Much more relevant images now.
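
The filter expression from this slide can be sketched as a small helper. This is only an illustration, not code from HPPG: the function name `buildPreviousFilter`, the sample hit count 120 and the index name `images` are assumptions. The commented-out part shows how the string would presumably be passed to the Sphinx PHP API (`sphinxapi.php`); sorting ascending there is a guess at how to get the two closest "previous" images first.

```php
<?php
// Hypothetical helper (not part of HPPG): builds the Sphinx select expression
// that flags images ranked before the current one when sorting by hits.
function buildPreviousFilter(int $hits, int $pid): string
{
    // The int type hints keep arbitrary user input out of the expression.
    return sprintf(
        '*, (hits > %d OR (hits = %d AND pid > %d)) AS myfilter',
        $hits,
        $hits,
        $pid
    );
}

$select = buildPreviousFilter(120, 16679);
echo $select, PHP_EOL;

// With the Sphinx PHP API loaded, usage would look roughly like this
// (index name 'images' is an assumption):
// $cl = new SphinxClient();
// $cl->SetSelect($select);
// $cl->SetFilter('myfilter', array(1));
// $cl->SetSortMode(SPH_SORT_EXTENDED, 'hits ASC, pid ASC');
// $cl->SetLimits(0, 2);
// $result = $cl->Query('haposai', 'images');
```

Building the expression with `sprintf` and integer placeholders avoids the quoting mistakes that are easy to make when concatenating values into the string by hand.
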
  • Some search statistics
    • Around 190 K queries each day. There would be more if the search result page were not cached :)
  • MySQL performance tweaking
    • Just optimise queries (EXPLAIN is your friend)
    • Not a single slow query
    • Some tips:
      • With large data sets use
          SELECT * FROM `lh_gallery_images`
          INNER JOIN ( SELECT pid FROM lh_gallery_images ORDER BY comtime DESC, pid DESC LIMIT 20 OFFSET 20 ) AS items
          ON lh_gallery_images.pid = items.pid
      • This query is at least 5x faster than a normal select (tested with 150 K records).
      • See http://www.mysqlperformanceblog.com
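
A minimal sketch of how such a query could be generated per page. The helper name `buildPageQuery` and its parameters are assumptions, not HPPG code. The idea is that the inner query touches only `pid` and the sort columns, so full rows are fetched just for the 20 matching ids. The outer `ORDER BY` is an addition to the slide's query, since a join by itself does not guarantee row order.

```php
<?php
// Hypothetical helper: builds the deferred-join pagination query for a page.
function buildPageQuery(int $page, int $perPage = 20): string
{
    // Clamp inputs so the offset can never be negative.
    $page = max(1, $page);
    $perPage = max(1, $perPage);
    $offset = ($page - 1) * $perPage;

    return 'SELECT lh_gallery_images.* FROM lh_gallery_images'
        . ' INNER JOIN ('
        . ' SELECT pid FROM lh_gallery_images'
        . ' ORDER BY comtime DESC, pid DESC'
        . sprintf(' LIMIT %d OFFSET %d', $perPage, $offset)
        . ' ) AS items ON lh_gallery_images.pid = items.pid'
        // Repeated here because the join alone does not guarantee row order.
        . ' ORDER BY lh_gallery_images.comtime DESC, lh_gallery_images.pid DESC';
}

// Page 2 with 20 images per page reproduces the LIMIT/OFFSET from the slide.
echo buildPageQuery(2), PHP_EOL;
```
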
  • Supported HTTP servers
    • Lighttpd
    • Apache
    • nginx
      • With nginx I managed to produce around 1200 requests per second on a cached page. That's 30% more than with Lighttpd.
  • Caching objects
    • Version caching
      • http://www.bestechvideos.com/2009/03/21/railslab-scaling-rails-episode-8-memcached
      • http://www.infoq.com/presentations/lutke-rockstar-memcaching
      • Version cache was used in
        • Album pages
        • Last uploaded
        • Last hits
        • Popular images and so on.
        • The most popular images in 24 hours
      • When is the cache cleared?
        • It's not; only the version number is increased. The new cache key does not exist yet, so the page is rebuilt, and the old entries expire on their own.
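
The version-bump idea can be sketched with an in-memory stand-in for Memcached/APC. Everything here (class name `VersionedCache`, method names, the sample namespace `album_16`) is hypothetical and only illustrates the mechanism; a real setup would store the version counter in the cache service itself.

```php
<?php
// Minimal in-memory stand-in for Memcached/APC; all names here are hypothetical.
final class VersionedCache
{
    /** @var array<string, mixed> */
    private array $store = [];

    // Returns the current version of a namespace, starting at 1.
    public function getVersion(string $namespace): int
    {
        return $this->store['version_' . $namespace] ?? 1;
    }

    // "Clears" a namespace by bumping its version; old entries are left to expire.
    public function bumpVersion(string $namespace): void
    {
        $this->store['version_' . $namespace] = $this->getVersion($namespace) + 1;
    }

    // The key embeds the version, so a bump makes every old key unreachable.
    public function key(string $namespace, string $suffix): string
    {
        return md5('version_' . $this->getVersion($namespace) . $namespace . $suffix);
    }

    public function get(string $key)
    {
        return $this->store[$key] ?? null;
    }

    public function set(string $key, $value): void
    {
        $this->store[$key] = $value;
    }
}

$cache = new VersionedCache();
$key = $cache->key('album_16', 'popular_page_1');
$cache->set($key, '<html>cached album page</html>');

$cache->bumpVersion('album_16');           // e.g. after an image upload
$newKey = $cache->key('album_16', 'popular_page_1');
var_dump($cache->get($newKey));            // NULL: the page must be rebuilt
```

One write to the version counter invalidates every page of the album at once, without having to know which page keys exist.
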
  • Some code with version cache
    • Cache key calculation in Album
        $cache = CSCacheAPC::getMem();
        $cacheKey = md5('version_'.$cache->getCacheVersion('album_'.(int)$Params['user_parameters']['album_id']).$mode.'album_view_url'.(int)$Params['user_parameters']['album_id'].'_page_'.$Params['user_parameters_unordered']['page']);
      • Includes:
        • Album version
        • $mode – sorting mode (e.g. popular)
        • Page
      • This combination gives a unique cache key for each page.
    • The same logic applies to all listing pages
  • Some benchmarks
      [root@ks310613 ~]# ab -n 500 -c 10 http://animeonly.org/Fantasy/Mix-16a.html
      This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
      Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
      Copyright 2006 The Apache Software Foundation, http://www.apache.org/

      Benchmarking animeonly.org (be patient)
      Completed 100 requests
      Completed 200 requests
      Completed 300 requests
      Completed 400 requests
      Finished 500 requests

      Server Software:        lighttpd
      Server Hostname:        animeonly.org
      Server Port:            80

      Document Path:          /Fantasy/Mix-16a.html
      Document Length:        26883 bytes

      Concurrency Level:      10
      Time taken for tests:   0.545137 seconds
      Complete requests:      500
      Failed requests:        0
      Write errors:           0
      Total transferred:      13593092 bytes
      HTML transferred:       13441500 bytes
      Requests per second:    917.20 [#/sec] (mean)
      Time per request:       10.903 [ms] (mean)
      Time per request:       1.090 [ms] (mean, across all concurrent requests)
      Transfer rate:          24349.84 [Kbytes/sec] received

      Connection Times (ms)
                    min  mean[+/-sd] median   max
      Connect:        0    0   0.0      0       0
      Processing:     5   10   2.9      9      23
      Waiting:        4    9   3.1      9      23
      Total:          5   10   2.9      9      23

      Percentage of the requests served within a certain time (ms)
        50%      9
        66%     12
        75%     13
        80%     13
        90%     13
        95%     13
        98%     20
        99%     22
       100%     23 (longest request)
  • ETag-based caching
    • What is it?
      • An ETag (entity tag) is part of HTTP, the protocol for the World Wide Web. It is a response header that may be returned by an HTTP/1.1 compliant web server and is used to determine change in content at a given URL (http://en.wikipedia.org/wiki/HTTP_ETag)
  • How to use it?
        $ExpireTime = 3600;
        $currentKeyEtag = md5($cacheKey.'user_id_'.erLhcoreClassUser::instance()->getUserID());
        header('Cache-Control: max-age=' . $ExpireTime); // must-revalidate
        header('Expires: '.gmdate('D, d M Y H:i:s', time()+$ExpireTime).' GMT');
        header('ETag: ' . $currentKeyEtag);
        $iftag = isset($_SERVER['HTTP_IF_NONE_MATCH']) ? $_SERVER['HTTP_IF_NONE_MATCH'] == $currentKeyEtag : null;
        if ($iftag === true) {
            header("HTTP/1.0 304 Not Modified");
            header('Content-Length: 0');
            exit;
        }
    • $cacheKey – the cache key from the previous example
    • The user ID is needed if the user is logged in.
    • Can be used for custom pages that do not change
    • When an image is uploaded or deleted, we just increase the cache version, and the ETag expires automatically as well.
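
The comparison step can be made a little more tolerant. HTTP defines an entity tag as a quoted string, and clients may send several tags or a weak tag (`W/"..."`) in `If-None-Match`. The sketch below is not HPPG code; the function name `etagMatches` and the sample key and user ID are assumptions for illustration.

```php
<?php
// Hypothetical helper: true when the If-None-Match request header contains
// the given ETag. Handles "*", comma-separated lists, quotes and weak (W/) tags.
function etagMatches(?string $ifNoneMatch, string $etag): bool
{
    if ($ifNoneMatch === null || trim($ifNoneMatch) === '') {
        return false;
    }
    if (trim($ifNoneMatch) === '*') {
        return true;
    }
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        // Drop the weak-validator prefix before comparing.
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = substr($candidate, 2);
        }
        if (trim($candidate, '"') === $etag) {
            return true;
        }
    }
    return false;
}

$etag = md5('some_cache_key' . 'user_id_' . 42);   // example values only
$header = $_SERVER['HTTP_IF_NONE_MATCH'] ?? null;

if (etagMatches($header, $etag)) {
    header('HTTP/1.1 304 Not Modified');
    exit;
}
header('ETag: "' . $etag . '"');                   // quoted, as HTTP expects
header('Cache-Control: max-age=3600');
```

Keeping the matching logic in a pure function makes it easy to check without a web server, while the header calls stay at the edge of the script.
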
  • Some MRTG screenshots 1
    • Hits per hour
    • MySQL queries
  • Some MRTG screenshots 2
    • Memcached status
    • Traffic stats
  • Conclusions
    • A single server with Sphinx, Memcached, MySQL and nginx handles around 180 K pageviews per day.
    • No performance issues at this time.
    • Gallery home page: http://code.google.com/p/hppg/
  • Thank you for your attention :)
    • Questions etc:
      • [email_address]