Running a scalable and reliable
Symfony2 application
22nd Nov 2013
Symfony Sweden November Camp
@vjom = Ville Mattila
github.com/vmattila/ | villescorner.com | fi.linkedin.com/in/villemattila

• CTO & President of the Board at Eventio Oy
– We provide IT tools and services for event organizers
– Office(s) in Kaarina & Turku (Åbo), Finland

• Our flagship product eventio.com runs on Symfony 2.3
Being reliable means that you accept and prepare for the fact that

ANYTHING CAN FAIL, ANY TIME
(scalability comes free on top)
Our Infrastructure
• Region: eu-west-1
• Multiple AZs
• Auto-scaling

Load Balancer (ELB)
PHP/HTTP Servers (EC2)
PHP/CLI Background Workers (EC2)
Database (RDS Multi-AZ)
CDN (CloudFront)
S3 Storage for "in-app" files
S3 Storage for assets
Session Handling

Store Sessions in a Database
• Replace NativeFileSessionHandler with an existing database handler or create your own

config.yml:
    services:
        my.distributed.session_handler:
            class: MyDistributedSessionHandler
            arguments: [ . . . ]
    framework:
        session:
            handler_id: my.distributed.session_handler

Symfony2 offers handlers for PDO, MongoDB and Memcached by default!
Session Handling

Beware of Race Conditions
• Default SessionHandler implementations (or PHP itself) do not implement any kind of locking
• Only the session snapshot from the request that finishes last is stored (other changes vanish)
Timeline (three overlapping requests on the same session):

Request #1
    $session->set("email", "ville@eventio.fi");
    // Some time consuming operation that takes seconds...
    // like email validation.
    exit;

Request #2
    $session->set("twitter", "vjom");
    // And other operations
    exit;

Request #3
    $session->set("name", "Ville");
    // And other operations
    exit;
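The lost update can be reproduced without any session backend. The following is a minimal sketch that assumes each request loads a full snapshot when it starts and writes the whole snapshot back when it exits; the `$store` array is a hypothetical stand-in for the shared session store, not a real handler.

```php
<?php
// Hypothetical in-memory stand-in for a session store without locking.
$store = array();

// All three requests start before any of them has finished,
// so each one loads the same (empty) snapshot.
$request1 = $store;
$request2 = $store;
$request3 = $store;

// Each request changes only its own copy.
$request1['email'] = 'ville@eventio.fi';
$request2['twitter'] = 'vjom';
$request3['name'] = 'Ville';

// Requests #2 and #3 finish quickly and write their snapshots back.
$store = $request2;
$store = $request3;

// Request #1 finishes last (slow email validation) and overwrites both.
$store = $request1;

print_r($store);
```

Only the `email` key survives here; the `twitter` and `name` values written by the faster requests are gone.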
Session Handling in Eventio.com:

• Symfony2 ("browser") sessions in Redis Cluster, without locking
– The "business session" identifier
– Flash Messages
– Other temporary and/or cached data

PredisSessionHandler is in our GitHub

• "Business sessions" are stored in MongoDB
– Common identifier for the user's or customer's session throughout the system
– Business sessions are never purged for accounting and auditing reasons
– Updates to session parameters are done atomically (by Doctrine ODM)
Background Processing

return new Response(); // ASAP!
Background Processing Flow

On PHP/HTTP Servers
• POST /get_report is handled by ReportController::generateAction()
– Controller validates the request.
– The actual task is passed to the queue (Queue Abstraction Library → Queue Implementation).
– Response: {"poll": "/is_ready?handle=1234"}
• GET /is_ready?handle=1234 is handled by ReportController::pollAction()
– Response: {"status": "wait"}

On PHP/CLI Worker Servers
• Queue Handler (QH), a long running CLI process
– Queue Handler asks the queue for new jobs
– … and forwards each job to another process that does the actual job.
• Task Processor, a QH invoked CLI process
– Reports status to a shared store.

EventioBBQ is in our GitHub
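The flow above can be sketched end to end in a single script. This is an illustration only: the `SplQueue` and the `$statusStore` array are in-memory stand-ins for a real queue and shared store, and the function names are hypothetical rather than taken from the application or from EventioBBQ.

```php
<?php
// In-memory stand-ins (assumptions): a real setup would use a queue
// service and a store reachable from both HTTP and worker servers.
$queue = new SplQueue();
$statusStore = array();

// HTTP side: validate, enqueue, and return a poll URL right away.
function generateAction(SplQueue $queue, array &$statusStore, array $params)
{
    if (empty($params['report'])) {
        return array('error' => 'missing report name');
    }
    $handle = uniqid();
    $statusStore[$handle] = 'wait';
    $queue->enqueue(array('handle' => $handle, 'report' => $params['report']));
    return array('poll' => '/is_ready?handle=' . $handle);
}

// HTTP side: report whatever status the worker has stored so far.
function pollAction(array $statusStore, $handle)
{
    $status = isset($statusStore[$handle]) ? $statusStore[$handle] : 'unknown';
    return array('status' => $status);
}

// Worker side: take one job, do the work, report the status.
function processNextJob(SplQueue $queue, array &$statusStore)
{
    if ($queue->isEmpty()) {
        return false;
    }
    $job = $queue->dequeue();
    // ... the actual long-running work would happen here ...
    $statusStore[$job['handle']] = 'completed';
    return true;
}

$response = generateAction($queue, $statusStore, array('report' => 'sales'));
parse_str(parse_url($response['poll'], PHP_URL_QUERY), $query);
$handle = $query['handle'];

$before = pollAction($statusStore, $handle);   // polled before the worker ran
processNextJob($queue, $statusStore);
$after = pollAction($statusStore, $handle);    // polled after the worker ran

echo json_encode($before), "\n", json_encode($after), "\n";
```

The controller never does the heavy work itself, so the first response can be returned as soon as the job is queued.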
Background Processing

Reporting Status
• Pattern 1: Send status reports to a database (or a key-value store)

    // Set status in a background worker
    $predis->set('bg_task_status:' . $handle, 'completed');
    // Get status in the poll handling controller
    $predis->get('bg_task_status:' . $handle);

• Pattern 2: Pass result to a (temporary) queue and poll the queue for status reports
Remember with CLI Workers
• Service Container's request scope is not available
– Use service_container and/or abstract away the requested feature.

• Session information is not directly available in worker processes
– We pass Session ID (and Request ID) with the job and recreate the environment for the background process
– service_container is a great help here
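Passing the session and request identifiers along with the job could look roughly like this. The payload layout, field names and function names are assumptions made for illustration; the slides do not show the actual message format.

```php
<?php
// HTTP side: wrap the task together with the context the worker needs.
// Field names are hypothetical.
function buildJobPayload($task, array $args, $sessionId, $requestId)
{
    return json_encode(array(
        'task' => $task,
        'args' => $args,
        'context' => array(
            'session_id' => $sessionId,
            'request_id' => $requestId,
        ),
    ));
}

// Worker side: decode the message and refuse jobs without context,
// so the environment can be recreated before the task runs.
function restoreJob($payload)
{
    $job = json_decode($payload, true);
    if (!is_array($job) || !isset($job['context']['session_id'])) {
        throw new InvalidArgumentException('Job payload has no session context');
    }
    return $job;
}

$payload = buildJobPayload('report.generate', array('year' => 2013), 'sess-abc', 'req-42');
$job = restoreJob($payload);

echo $job['context']['session_id'], ' ', $job['context']['request_id'], "\n";
```

With the identifiers in hand, the worker can look up the session data itself instead of relying on anything request-scoped.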
S3 as a Common File Store
• Send files to S3 with a simple HTTP call
• Files are accessible over HTTP by the application (private) or directly by external users (public)
• URL signing lets you grant temporary access to private files as well
• Object Expiration for a bucket works as a /tmp for the cloud

    // composer.json
    "require": {
        "aws/aws-sdk-php": "2.*"
    }

    // Upload a file; $s3 is an Aws\S3\S3Client instance
    use Aws\S3\Enum\CannedAcl;

    $objectParams = array(
        'Bucket' => '...',
        'Key'    => 'my/files/pngfile.png',
        'Body'   => $fileContents,
        'ACL'    => CannedAcl::PRIVATE_ACCESS,
    );
    $s3->putObject($objectParams);

@see http://aws.amazon.com/sdkforphp2/

    // Download a file; SaveAs writes the object body to a local path
    $objectParams = array(
        'Bucket' => '...',
        'Key'    => 'my/files/pngfile.png',
        'SaveAs' => '/run/shm/pngfile.png',
    );
    $s3->getObject($objectParams);
Reliable Cron
How to ensure that
1) a cron-initiated process is run only once (to avoid double actions)?
2) cron processes are not our single point of failure?
Reliable Cron
instance-1 | instance-2 | instance-3

• eventio:cron is run on every instance, every minute
• Every eventio:cron process tries to acquire a global lock, but only one gets it

    SETNX cron:lock:2013-11-22T11:30:00Z instance-1

From Redis documentation:
SETNX: Set key to hold string value if key does not exist. In that case, it is equal to SET. When key already holds a value, no operation is performed.
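The lock race can be illustrated without a Redis server. In this sketch, `FakeRedis` is a hypothetical in-memory class that only imitates the SETNX semantics quoted above; a real deployment would send the command to a shared Redis instance, and the fixed timestamp is used only to reproduce the key from the slide.

```php
<?php
// Hypothetical in-memory imitation of SETNX: set the key only if it
// does not exist yet. Real Redis replies with 1 or 0 instead of a bool.
class FakeRedis
{
    private $data = array();

    public function setnx($key, $value)
    {
        if (array_key_exists($key, $this->data)) {
            return false; // key already held, nothing is changed
        }
        $this->data[$key] = $value;
        return true;
    }

    public function get($key)
    {
        return isset($this->data[$key]) ? $this->data[$key] : null;
    }
}

$redis = new FakeRedis();

// Every instance derives the same lock key from the current minute (UTC).
// 1385119800 is 2013-11-22 11:30:00 UTC.
$lockKey = 'cron:lock:' . gmdate('Y-m-d\TH:i:00\Z', 1385119800);

$winners = array();
foreach (array('instance-1', 'instance-2', 'instance-3') as $instance) {
    if ($redis->setnx($lockKey, $instance)) {
        $winners[] = $instance; // this instance continues the cron run
    }
}

echo $lockKey, ' held by ', implode(',', $winners), "\n";
```

Because the key contains the minute, the next minute produces a new key and a fresh race, so a crashed instance cannot block later runs.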
Reliable Cron
instance-1 | instance-2 | instance-3

CronBundle is in our GitHub

The instance that receives the lock continues the cron process and triggers the cron.tick event in the application. The current timestamp is persisted and used in future processing.

    $cronTime = new DateTime(date("Y-m-d H:i:00"));
    $cronEvent = new CronEvent($cronTime);
    $this->get('event_dispatcher')->dispatch('cron.tick', $cronEvent);

Job classes listen for the triggered cron.tick event. Each job decides if it should do something now, using $cronTime as the current time:

    class CronJob
    {
        public function run(CronEvent $event)
        {
            $cronTime = $event->getCronTime();
            // Run only on every fifth minute; skip all other ticks.
            if ($cronTime->getTimestamp() % (5 * 60) !== 0) {
                return;
            }
            // Do the task but consider current time as $cronTime!
        }
    }

Consider background workers.
ASSET & APPLICATION
DEPLOYMENTS
Going Live!
Asset Deployment
<commithash> → Deployment master server

• The master server creates a hash of all asset files and their (original) contents with the help of assetic:list.
• If the asset S3 bucket does not have a /<assethash>.txt file, the deployment master does a full assetic:dump and uploads the dumped asset files to the S3 bucket under the /<assethash>/ directory. This ensures that HTTP caches are busted.
• The asset hash is stored in a simple <commithash>.txt file in our asset S3 bucket.
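One way to derive such a hash is sketched below. The slides do not say which algorithm is used, so the choice of SHA-1, the `computeAssetHash` name and the path-to-contents array (standing in for whatever assetic:list reports) are all assumptions.

```php
<?php
// Compute one hash over all asset paths and contents. Sorting by path
// makes the result independent of the order the files are listed in.
function computeAssetHash(array $assets)
{
    ksort($assets);
    $ctx = hash_init('sha1');
    foreach ($assets as $path => $contents) {
        // Include path and length so file boundaries cannot blur together.
        hash_update($ctx, $path . "\0" . strlen($contents) . "\0" . $contents);
    }
    return hash_final($ctx);
}

// Hypothetical asset list: path => original file contents.
$assets = array(
    'css/main.css' => 'body { color: #333; }',
    'js/app.js'    => 'console.log("hi");',
);

$hashA = computeAssetHash($assets);
$hashB = computeAssetHash(array_reverse($assets, true)); // same files, other order
$assets['css/main.css'] = 'body { color: #000; }';
$hashC = computeAssetHash($assets);                      // one file changed

var_dump($hashA === $hashB);
var_dump($hashA === $hashC);
```

An unchanged asset set keeps its hash, so the upload can be skipped, while any content change yields a new directory name and therefore new URLs.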
Application deployments
<commithash> → Deployment master server

• Instance is unregistered from the Load Balancer
• S3 bucket is queried for the <assethash> matching the current <commithash> … and added to the parameters.yml:

    asset_hash: <assethash>

• cache:clear
• Quick local & connectivity tests
• (Instance registration back to ELB)

config.yml:
    framework:
        templating:
            assets_base_urls: ["https://.../%asset_hash%"]
https://eventio.com/

Thanks!
https://github.com/eventio/
