Slides from my half-hour talk at Symfony Sweden November Camp, held at Hilton Slussen on Friday, 22 November 2013. The slides give a general overview of the infrastructure behind Eventio.com on Amazon Web Services and point out a few details to consider when designing and running a distributed, reliable and scalable PHP application.
Running a Scalable And Reliable Symfony2 Application in Cloud (Symfony Sweden November Camp 22 Nov 2013)
1. Running a scalable and reliable Symfony2 application
22nd Nov 2013
Symfony Sweden November Camp
2. @vjom = Ville Mattila
github.com/vmattila/ | villescorner.com | fi.linkedin.com/in/villemattila
• CTO & President of the Board at Eventio Oy
– We provide IT tools and services for event organizers
– Office(s) in Kaarina & Turku (Åbo), Finland
• Our flagship product eventio.com runs on Symfony 2.3
3. Being reliable means that you accept, and prepare for, the fact that
ANYTHING CAN FAIL, ANY TIME
(scalability comes free on top)
5. Session Handling
Store Sessions in Database
• Replace NativeFileSessionHandler with an existing database handler or create your own
config.yml:
services:
    my.distributed.session_handler:
        class: MyDistributedSessionHandler
        arguments: [ . . . ]
framework:
    session:
        handler_id: my.distributed.session_handler
Symfony2 offers handlers for PDO, MongoDB and Memcached by default!
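As a rough illustration of the "create your own" option, here is a minimal sketch of a handler that implements PHP's SessionHandlerInterface. The class name ArraySessionHandler and its in-memory array are hypothetical stand-ins: a real distributed handler would read and write a shared database instead. The return type declarations assume PHP 7 or newer.

```php
<?php
// Hypothetical stand-in for MyDistributedSessionHandler. A real handler
// would talk to a shared database instead of a PHP array.
class ArraySessionHandler implements SessionHandlerInterface
{
    // session id => array('data' => serialized session, 'time' => last write)
    private $rows = array();

    public function open($path, $name): bool
    {
        return true; // nothing to connect to in this sketch
    }

    public function close(): bool
    {
        return true;
    }

    public function read($id): string
    {
        // An unknown session must be returned as an empty string.
        return isset($this->rows[$id]) ? $this->rows[$id]['data'] : '';
    }

    public function write($id, $data): bool
    {
        $this->rows[$id] = array('data' => $data, 'time' => time());
        return true;
    }

    public function destroy($id): bool
    {
        unset($this->rows[$id]);
        return true;
    }

    public function gc($max_lifetime): int
    {
        // Drop sessions that were last written more than $max_lifetime seconds ago.
        $removed = 0;
        foreach ($this->rows as $id => $row) {
            if ($row['time'] < time() - $max_lifetime) {
                unset($this->rows[$id]);
                $removed++;
            }
        }
        return $removed;
    }
}
```

In a Symfony2 application such a class would be registered as a service and referenced through handler_id, as in the config.yml above.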
6. Session Handling
Beware of Race Conditions
• Default SessionHandler implementations (or PHP itself)
do not implement any kind of locking
• Only session snapshot from the request that finishes
last is stored (other changes vanish)
Three overlapping requests in the same session (TIME runs downwards):
Request #1
$session->set("email", "ville@eventio.fi");
// Some time-consuming operation that takes seconds... like email validation.
exit;
Request #2
$session->set("twitter", "vjom");
// And other operations
exit;
Request #3
// And other operations
$session->set("name", "Ville");
exit;
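The lost update can be reproduced without any session machinery. The sketch below is only a simulation under simple assumptions: a plain array plays the role of the shared session store, and each "request" works on the snapshot it loaded at start and writes the whole snapshot back when it finishes.

```php
<?php
// Shared store standing in for the session backend (no locking).
$store = array('sess1' => array());

// Both requests load their snapshot before either one has finished.
$request1 = $store['sess1'];
$request2 = $store['sess1'];

// Each request changes a different session parameter.
$request1['email'] = 'ville@eventio.fi';
$request2['twitter'] = 'vjom';

// Request #2 finishes first and writes its whole snapshot back.
$store['sess1'] = $request2;

// Request #1 finishes last. Its snapshot never saw "twitter",
// so writing it back silently discards that change.
$store['sess1'] = $request1;

var_dump(array_keys($store['sess1']));
```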
7. Session Handling in Eventio.com:
• Symfony2 ("browser") sessions in Redis Cluster, without locking
– The "business session" identifier
– Flash Messages
– Other temporary and/or cached data
– PredisSessionHandler is in our GitHub
• "Business sessions" are stored in MongoDB
– Common identifier for the user's or customer's session throughout the system
– Business sessions are never purged, for accounting and auditing reasons
– Updates to session parameters are done atomically (by Doctrine ODM)
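To show why per-parameter updates avoid the race described earlier, here is a small sketch. BusinessSessionStore is a hypothetical in-memory stand-in, not Eventio's code; the slides only state that Doctrine ODM performs the updates atomically. The assumption here is that each update touches a single field in the stored document (similar in spirit to MongoDB's $set operator) instead of writing back a whole snapshot.

```php
<?php
// Hypothetical in-memory stand-in for the MongoDB collection.
class BusinessSessionStore
{
    private $documents = array();

    // Touches only one field of the stored document, so updates to
    // other fields made by concurrent requests are left alone.
    public function setField($sessionId, $field, $value)
    {
        $this->documents[$sessionId][$field] = $value;
    }

    public function find($sessionId)
    {
        return isset($this->documents[$sessionId])
            ? $this->documents[$sessionId]
            : array();
    }
}

$store = new BusinessSessionStore();

// Two overlapping requests each change a different parameter.
$store->setField('bs-1', 'email', 'ville@eventio.fi');
$store->setField('bs-1', 'twitter', 'vjom');
```

Both parameters survive, regardless of which request finishes last.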
9. Background Processing Flow
On PHP/HTTP Servers
• POST /get_report → ReportController::generateAction()
– Controller validates the request.
– The actual task is passed to the queue.
– Response: {"poll": "/is_ready?handle=1234"}
• GET /is_ready?handle=1234 → ReportController::pollAction()
– Response: {"status": "wait"}
Queue
• Queue Abstraction Library (EventioBBQ is in our GitHub)
• Queue Implementation
On PHP/CLI Worker Servers
• Queue Handler (QH): long-running CLI process
– Queue Handler asks for new jobs from the queue
– … and forwards each job to another process that does the actual job.
• Task Processor: CLI process invoked by QH
– Status reports to a shared store.
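The flow above can be sketched in a single process. Everything here is a simplified assumption: the class ReportFlow is hypothetical, an SplQueue stands in for the real queue implementation, an array stands in for the shared status store, and the 'unknown' status is invented for handles that were never issued.

```php
<?php
class ReportFlow
{
    private $queue;            // stands in for the queue implementation
    private $status = array(); // stands in for the shared status store

    public function __construct()
    {
        $this->queue = new SplQueue();
    }

    // POST /get_report: validate, enqueue the task, answer with a poll URL.
    public function generateAction(array $params)
    {
        $handle = uniqid();
        $this->status[$handle] = 'wait';
        $this->queue->enqueue(array('handle' => $handle, 'params' => $params));

        return array('poll' => '/is_ready?handle=' . $handle);
    }

    // GET /is_ready?handle=...: report what the worker has stored so far.
    public function pollAction($handle)
    {
        $known = isset($this->status[$handle]);

        return array('status' => $known ? $this->status[$handle] : 'unknown');
    }

    // One iteration of the worker side: take a job, do it, report status.
    public function workOnce()
    {
        if ($this->queue->isEmpty()) {
            return false;
        }
        $job = $this->queue->dequeue();
        // ... the time-consuming work would happen here ...
        $this->status[$job['handle']] = 'completed';

        return true;
    }
}
```

In the real setup the three methods run on different machines, which is why the queue and the status store have to be shared services.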
10. Session Handling
Reporting Status
• Pattern 1: Send status reports to a database (or a key-value store)
// Set status in a background worker
$predis->set('bg_task_status:' . $handle, 'completed');
// Get status in the poll handling controller
$predis->get('bg_task_status:' . $handle);
• Pattern 2: Pass the result to a (temporary) queue and poll the queue for status reports
11. Remember with CLI Workers
• Service Container's request scope is not available
– Use service_container and/or abstract away the requested feature.
• Session information is not directly available in worker processes
– We pass Session ID (and Request ID) with the job and recreate the environment for the background process
– service_container is a great help here
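One way to carry the session context along with a job is to put the identifiers into the job payload itself. The sketch below is an assumption about how that could look, not Eventio's actual format: the helper names and the field names (task, arguments, session_id, request_id) are all hypothetical.

```php
<?php
// Hypothetical helper used on the HTTP side when a job is queued.
function buildJobPayload($task, array $arguments, $sessionId, $requestId)
{
    return json_encode(array(
        'task'       => $task,
        'arguments'  => $arguments,
        'session_id' => $sessionId,
        'request_id' => $requestId,
    ));
}

// Hypothetical helper used by the worker before it runs the task.
function restoreJobContext($payload)
{
    $job = json_decode($payload, true);
    if (!is_array($job) || !isset($job['session_id'], $job['request_id'])) {
        throw new InvalidArgumentException('Job payload has no session context');
    }

    // The worker can now load the session by $job['session_id'] and
    // tag its log lines with $job['request_id'].
    return $job;
}
```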
12. S3 as a Common File Store
• Send files to S3 with a simple HTTP Call
• Files are accessible over HTTP by the application (private) or directly by external users (public)
• URL signing enables you to grant temporary access also to private files
• Object Expiration for a bucket works as a /tmp for cloud
// composer.json
"require": {
    "aws/aws-sdk-php": "2.*"
}
use Aws\S3\Enum\CannedAcl;

// Upload a private object
$objectParams = array(
    'Bucket' => '...',
    'Key'    => 'my/files/pngfile.png',
    'Body'   => $fileContents,
    'ACL'    => CannedAcl::PRIVATE_ACCESS,
);
$s3->putObject($objectParams);
@see http://aws.amazon.com/sdkforphp2/
// Download the object to a local file
$objectParams = array(
    'Bucket' => '...',
    'Key'    => 'my/files/pngfile.png',
    'SaveAs' => '/run/shm/pngfile.png',
);
$s3->getObject($objectParams);
13. Reliable Cron
How to ensure that
1) a cron-initiated process is run only once (to avoid double actions)?
2) cron processes are not our single point of failure?
14. Reliable Cron
instance-1, instance-2, instance-3
eventio:cron is run on every instance, every minute.
Every eventio:cron process tries to acquire a global lock, but only one gets it:
SETNX cron:lock:2013-11-22T11:30:00Z instance-1
From Redis documentation:
SETNX: Set key to hold string value if key does not exist. In that case, it is equal to SET. When key already holds a value, no operation is performed.
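The locking step can be sketched as follows. FakeRedis is a hypothetical in-memory stand-in that models only SETNX and GET, and the helper function names are invented for this example. A real Redis client may return an integer (1 or 0) from SETNX instead of a boolean, which is why the result is cast.

```php
<?php
// Hypothetical in-memory stand-in for a Redis client.
class FakeRedis
{
    private $keys = array();

    // Sets the key only if it does not exist yet; reports whether it was set.
    public function setnx($key, $value)
    {
        if (array_key_exists($key, $this->keys)) {
            return false;
        }
        $this->keys[$key] = $value;

        return true;
    }

    public function get($key)
    {
        return array_key_exists($key, $this->keys) ? $this->keys[$key] : null;
    }
}

// One lock key per minute, in UTC, with the seconds forced to 00.
function cronLockKey($timestamp)
{
    return 'cron:lock:' . gmdate('Y-m-d\TH:i:00\Z', $timestamp);
}

// Returns true only for the first instance that asks within that minute.
function tryAcquireCronLock($redis, $instanceId, $timestamp)
{
    return (bool) $redis->setnx(cronLockKey($timestamp), $instanceId);
}
```

Because the key is derived from the minute, every instance computes the same key as long as the server clocks agree to within the minute.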
15. Reliable Cron
CronBundle is in our GitHub
instance-1, instance-2, instance-3
The instance that receives the lock continues the cron process and triggers the cron.tick event in the application. The current timestamp is persisted and used in future processing.
$cronTime = new DateTime(date("Y-m-d H:i:00"));
$cronEvent = new CronEvent($cronTime);
$this->get('event_dispatcher')->dispatch('cron.tick', $cronEvent);
Job classes listen for the triggered cron.tick event:
class CronJob {
    public function run(CronEvent $event)
    {
        // Decide if we should do something now.
        // Use $cronTime as the current time.
        $cronTime = $event->getCronTime();
        if ($cronTime->getTimestamp() % (5 * 60)) {
            // Not on a 5-minute boundary: nothing to do on this tick.
            return;
        }
        // Do the task but consider current time as $cronTime!
        // Consider background workers.
    }
}
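The "should we do something now?" decision can be pulled into a small helper so that every job expresses its schedule the same way. The function name isDue is hypothetical; the logic is the same modulo check as in the CronJob example, generalized to any interval in minutes. It assumes $cronTime has its seconds set to 00, as the dispatcher does.

```php
<?php
// Returns true when $cronTime falls on a multiple of $everyMinutes,
// counted from the Unix epoch.
function isDue(DateTime $cronTime, $everyMinutes)
{
    return $cronTime->getTimestamp() % ($everyMinutes * 60) === 0;
}

$utc = new DateTimeZone('UTC');
$onBoundary  = new DateTime('2013-11-22 11:30:00', $utc);
$offBoundary = new DateTime('2013-11-22 11:31:00', $utc);

var_dump(isDue($onBoundary, 5));  // on a 5-minute boundary
var_dump(isDue($offBoundary, 5)); // one minute past the boundary
```

Since the check uses the persisted $cronTime and not the wall clock, a job that starts a few seconds late still makes the same decision.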
17. Asset Deployment
Deployment master server: <commithash>
Asset hash is stored in a simple <commithash>.txt file in our asset S3 bucket.
The master server creates a hash of all asset files and their
(original) contents with the help of assetic:list
If the asset S3 bucket does not have /<assethash>.txt file,
deployment master does a full assetic:dump and uploads the
dumped asset files to the S3 bucket under /<assethash>/
directory. This ensures that HTTP caches are busted.
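A hash over "all asset files and their contents" could be computed roughly like this. The slides do not say which algorithm or input format is used, so SHA-1, the function name computeAssetHash and the name-to-contents array are all assumptions made for this sketch.

```php
<?php
// $assets maps an asset file name to its (original) contents.
function computeAssetHash(array $assets)
{
    // Sort by file name so the result does not depend on listing order.
    ksort($assets);

    $context = hash_init('sha1');
    foreach ($assets as $name => $contents) {
        // Hash the name together with a digest of the contents.
        hash_update($context, $name . "\0" . sha1($contents) . "\n");
    }

    return hash_final($context);
}
```

Any change to a file name or to a file's contents yields a different hash, and therefore a new /<assethash>/ directory in the bucket.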
18. Application deployments
Deployment master server: <commithash>
• Instance is unregistered from the Load Balancer
• S3 bucket is queried for the <assethash> matching the current <commithash> … and added to the parameters.yml:
asset_hash: <assethash>
• cache:clear
• Quick local & connectivity tests
• (Instance registration back to ELB)
config.yml:
framework:
    templating:
        assets_base_urls: ["https://.../%asset_hash%"]