Scalability in Mind 
When the Old Software Drupal Meets Large-Scale Architecture
2014-10-18 PHPConf 
Jimmy Huang 黃雋
Drupal and me 
Jimmy Major Versions 
Hi Mom, here I am
Why not Drupal?
It’s just a CMS 
for the damned cat, not for me
Image from: https://flic.kr/p/hv9xDs
Learning Curve 
Image from http://www.codem0nk3y.com/2012/04/what-bugs-me-about-modx-and-why/cms-learning-curve/
Slower... 
than my own fastest code 
Image from: https://flic.kr/p/9CWhYu
Too many reasons to say no...
Not OOP 
No ORM 
Made by PHP 
Hard to make theme 
Hard to do staging and continuous deployment
No love
1. Flexibility 
For Drupal Beginners
Drupal can be a: 
● Personal blog 
● Company official site 
● Community forum 
● Online commerce shopping mall 
● Company intranet portal 
● Heavy media site 
● Video portal 
● Mobile backend CMS
CMS? 
Development-oriented CMS 
● Not (only) a framework 
● Configure many things in the UI 
● Abstracted data layer 
● Needs third-party modules
Content Type 
Sample for phpconf 2014
View 1
View 2
View 3
Query generator in a few clicks 
same query, different layout
Modules work together 
hook API design
Modules work together
module_invoke('AMAZINGMY_captcha', 'captcha',
  'generate', $captcha_type_challenge);

/**
 * Implementation of hook_captcha().
 */
function AMAZINGMY_captcha_captcha($op, $captcha_type = '') {
  switch ($op) {
    case 'list':
      // Names of the challenge types this module provides.
      return array('AMAZINGMY CAPTCHA');
    case 'generate':
      if ($captcha_type == 'AMAZINGMY CAPTCHA') {
        $captcha = array();
        $captcha['solution'] = 'AMAZINGMY';
        $captcha['form']['captcha_response'] = array(
          '#type' => 'textfield',
          '#title' => t('Enter "Amazing"'),
          '#required' => TRUE,
        );
        return $captcha;
      }
      break;
  }
}
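How the hook naming convention lets modules cooperate can be sketched in plain PHP. This is only an illustration of the idea, not Drupal core's implementation: `sketch_module_invoke()`, `sketch_module_invoke_all()`, the `alpha` and `beta` modules and `hook_greeting()` are all hypothetical names invented for this example.

```php
<?php
/**
 * Hypothetical stand-in for module_invoke(): call {module}_{hook}() if it exists.
 */
function sketch_module_invoke($module, $hook, ...$args) {
  $function = $module . '_' . $hook;
  if (!function_exists($function)) {
    // The module does not implement this hook.
    return NULL;
  }
  return call_user_func_array($function, $args);
}

/**
 * Hypothetical stand-in for module_invoke_all(): collect results from every module.
 */
function sketch_module_invoke_all(array $modules, $hook, ...$args) {
  $results = array();
  foreach ($modules as $module) {
    $result = sketch_module_invoke($module, $hook, ...$args);
    if (is_array($result)) {
      $results = array_merge($results, $result);
    }
    elseif ($result !== NULL) {
      $results[] = $result;
    }
  }
  return $results;
}

// Two hypothetical modules implementing a hypothetical hook_greeting().
function alpha_greeting($name) {
  return array('Hello, ' . $name);
}

function beta_greeting($name) {
  return 'Welcome, ' . $name;
}
```

Calling `sketch_module_invoke_all(array('alpha', 'beta'), 'greeting', 'Jimmy')` gathers the answers of both modules, and neither module needs to know the other exists.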
2. Scalability
Horizontal
Hosting Architecture 
Image from https://groups.drupal.org/node/24412 
Scale out 
for a purely dynamic site
Prepare to scale 
● Reverse proxy or hardware load balancer? 
● Where are your sessions? 
● File storage? 
● Separate read/write queries?
Reverse Proxy 
Client IP comes from $_SERVER['HTTP_X_FORWARDED_FOR'] 
https://api.drupal.org/api/drupal/includes%21bootstrap.inc/function/ip_address/7
Reverse Proxy 
REMOTE_ADDR holds the proxy's IP 
The real client IP must be forwarded by the proxy
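A minimal sketch of how an application could recover the client IP behind a proxy, loosely following the approach described above. The function name `sketch_client_ip()` and the example addresses are assumptions for illustration; this is not the code of Drupal's `ip_address()`.

```php
<?php
/**
 * Hypothetical helper: resolve the client IP behind trusted reverse proxies.
 *
 * @param array $server           A $_SERVER-like array.
 * @param array $trusted_proxies  IPs of proxies we operate ourselves.
 * @param string $header          The $_SERVER key carrying the forwarded chain.
 */
function sketch_client_ip(array $server, array $trusted_proxies, $header = 'HTTP_X_FORWARDED_FOR') {
  $remote = isset($server['REMOTE_ADDR']) ? $server['REMOTE_ADDR'] : '';
  // Only believe the forwarded header when the direct peer is a known proxy;
  // otherwise any client could spoof its address.
  if (empty($server[$header]) || !in_array($remote, $trusted_proxies, TRUE)) {
    return $remote;
  }
  // The header is a comma-separated chain: client, proxy1, proxy2, ...
  $chain = array_map('trim', explode(',', $server[$header]));
  $chain[] = $remote;
  // Drop our own proxies; the right-most remaining address is the client.
  $untrusted = array_diff($chain, $trusted_proxies);
  return $untrusted ? array_pop($untrusted) : $remote;
}
```

Taking the right-most untrusted address, rather than the first entry in the header, means a client cannot fake its IP simply by sending its own X-Forwarded-For value.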
Reverse Proxy 
Example Nginx settings 
location / { 
    # proxy_pass to the backend is omitted here 
    proxy_set_header Host $host; 
    proxy_set_header X-Real-IP $remote_addr; 
    proxy_set_header X-Forwarded-Host $host; 
    proxy_set_header X-Forwarded-Server $host; 
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; 
}
Session Storage 
● Pluggable 
● Centralized 
● Fast
Session Storage 
Before 2008: Drupal 6 saves sessions to the DB 
https://api.drupal.org/api/drupal/includes%21bootstrap.inc/function/_drupal_bootstrap/6
Session Storage 
After 2011: Drupal 7 has pluggable session configuration in core 
https://api.drupal.org/api/drupal/includes%21bootstrap.inc/function/drupal_bootstrap/7
Session Storage 
After 2014: Drupal 8 includes a better handler from Symfony 2 
Drupal 8 API: http://goo.gl/VVQ2Ua
Session Storage 
PHP 5.4 also has a better SessionHandler class 
http://php.net/manual/en/class.sessionhandler.php
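A rough sketch of what "pluggable and centralized" can look like with PHP's SessionHandlerInterface. The class name `SketchSharedSessionHandler` is hypothetical, and the `ArrayObject` only stands in for a real shared service (for example Memcached or Redis) that every web server would reach over the network. Return type declarations are used, so this assumes PHP 7 or later.

```php
<?php
/**
 * Hypothetical session handler backed by a shared key-value store.
 */
class SketchSharedSessionHandler implements SessionHandlerInterface {
  /** @var ArrayObject Stand-in for a central store shared by all web servers. */
  private $store;

  public function __construct(ArrayObject $store) {
    $this->store = $store;
  }

  public function open($path, $name): bool {
    // Nothing to open: the store connection is injected.
    return TRUE;
  }

  public function close(): bool {
    return TRUE;
  }

  public function read($id): string {
    // PHP expects an empty string when the session does not exist yet.
    return isset($this->store[$id]) ? $this->store[$id]['data'] : '';
  }

  public function write($id, $data): bool {
    $this->store[$id] = array('data' => $data, 'time' => time());
    return TRUE;
  }

  public function destroy($id): bool {
    unset($this->store[$id]);
    return TRUE;
  }

  public function gc($max_lifetime): int {
    $deleted = 0;
    // Iterate over a copy so entries can be removed safely.
    foreach ($this->store->getArrayCopy() as $id => $entry) {
      if (time() - $entry['time'] > $max_lifetime) {
        unset($this->store[$id]);
        $deleted++;
      }
    }
    return $deleted;
  }
}
```

In a real request the handler would be registered with `session_set_save_handler($handler, TRUE)` before `session_start()`. Because the state lives in the shared store, any web server behind the load balancer can serve the next request of the same user.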
File Storage 
● After upload, 
can other instances see the files?
File Storage 
Drupal 6 – only one hook, nothing to help scaling
File Storage 
Drupal 7 – complete file handling API (hook_file_*)
File Storage 
● After upload, send to AWS S3 or FTP? 
– Yes! by hook_file_copy 
● Before display, alter URL for CDN support? 
– Yes! by hook_file_url_alter 
● When loading a file, stream it from another host? 
– Yes! by hook_file_load
File Storage 
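function hook_file_url_alter(&$uri) {
  $cdn1 = 'http://cdn1.example.com';
  $cdn2 = 'http://cdn2.example.com';
  $cdn_extensions = array('css', 'js');
  // Simplification: treat $uri as a relative path to a shipped file.
  $path = $uri;
  // Serve files with a listed extension from CDN 1, all others from CDN 2.
  $extension = pathinfo($path, PATHINFO_EXTENSION);
  if (in_array($extension, $cdn_extensions)) {
    $uri = $cdn1 . '/' . $path;
  }
  else {
    $uri = $cdn2 . '/' . $path;
  }
}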
File Storage 
Third-party module - Storage API 
● Save to FTP / HTTP 
● Save to Database 
● Save to S3 
● Save to Rackspace
Database Scaling 
MongoDB? PostgreSQL?
Database Scaling 
Drupal 6 – happy querying, tragic scaling 
function statistics_get($nid) {
  if ($nid > 0) {
    // Retrieve an array with both totalcount and daycount.
    $statistics = db_fetch_array(db_query('SELECT totalcount, daycount, timestamp FROM {node_counter} WHERE nid = %d', $nid));
  }
  return $statistics;
}
Database Scaling 
Drupal 7 – DB abstraction layer 
$statistics = db_select('node_counter', 'n') 
->fields('n', array( 
'totalcount', 
'daycount', 
'timestamp')) 
->condition('nid', $nid,'=') 
->execute() 
->fetchAssoc(); 
● Supports other databases (not only MySQL) 
● Separates read/write queries easily
Database Scaling 
A random slave is picked each time the DB bootstraps 
# default master (read / write query) 
$databases['default']['default'] = $info_array; 
# multiple slaves (read only query) 
$databases['default']['slave'][] = $info_array; 
$databases['default']['slave'][] = $info_array;
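The selection step can be sketched as follows, under the assumption that a target holding several servers is a plain list and a single server is an array with a 'driver' key. `sketch_pick_connection()` and the host names are hypothetical, and this is not Drupal core's code.

```php
<?php
/**
 * Hypothetical helper: choose connection info for a target such as 'slave'.
 */
function sketch_pick_connection(array $databases, $key = 'default', $target = 'default') {
  // An undefined target (for example 'slave' on a single-server site)
  // falls back to the master, so the same code runs on any setup.
  if (!isset($databases[$key][$target])) {
    $target = 'default';
  }
  $info = $databases[$key][$target];
  // No 'driver' key means this is a list of candidate servers: pick one at random.
  if (!isset($info['driver'])) {
    $info = $info[array_rand($info)];
  }
  return $info;
}

// Example configuration with hypothetical hosts.
$databases = array();
$databases['default']['default'] = array('driver' => 'mysql', 'host' => 'db-master');
$databases['default']['slave'][] = array('driver' => 'mysql', 'host' => 'db-slave-1');
$databases['default']['slave'][] = array('driver' => 'mysql', 'host' => 'db-slave-2');
```

In Drupal 7 itself, a read query is routed to a slave by passing `array('target' => 'slave')` as the options argument of `db_select()` or `db_query()`.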
Database Scaling 
Send page-specific queries to a slave with one click
Database Scaling
3. Why does scalability matter?
I'm not fat, just a little swollen
Not the fastest solution 
But an available solution
Why is a CMS designed like this? 
● Pros 
– Quick and easy to get started with Drupal (even for non-engineers) 
– Can stack special requirements onto Drupal 
– With more features or more users, scale out 
● Cons 
– Not so easy (if you want to develop with Drupal) 
– Not as flexible as a framework (because it isn't one) 
– Can definitely scale if well planned, but often it isn't
If you have never loved deeply, 
how would you know the good from the bad? 
I know you're fat, but I still love you
You may be interested in: 
● 2.4 million page views per day in Drupal 
http://sf2010.drupal.org/conference/sessions/24-million-page-views-day-60-m-month-one-server.html 
● Auto Scale Drupal setup in AWS 
http://www.slideshare.net/burgerboydaddy/scaling-drupal-horizontally-and-in-cloud 
● Drupal vs Django 
http://birdhouse.org/blog/2009/11/11/drupal-or-django/ 
● Drupal with nodejs 
https://www.drupal.org/project/nodejs 
● Drupal with Docker 
https://github.com/ricardoamaro/docker-drupal 
● Drupal with MongoDB 
https://www.drupal.org/project/mongodb
Thank You! 
You can also find Drupalers here: 
1. DrupalTaiwan.org 
2. goo.gl/PxuhqQ 
Online Hangout meetup every Wednesday at 8:00 PM 
3. FB/groups/drupaltaiwan/ 
DrupalTaiwan Facebook group
