Do you queue?
By Kevin Schroeder, Technology Evangelist, Zend Technologies
About Kevin
Past: Programming/Sys Admin
Current: Technology Evangelist/Author/Composer
@kpschrade
Could your PHP apps benefit from being able to process data or execute asynchronously?
(Twtpoll results)
Why would you want to queue?
Performance: execute logic apart from the main request (asynchronicity)
Scalability: the ability to handle non-immediate logic as resources are available
Performance & Scalability
People often say that performance and scalability are separate things. This is inaccurate (consider Google).
Performance is the speed at which a request is executed
Scalability is the ability of that request to maintain its performance as load/infrastructure increases
So how do you scale to Facebook size?
DON’T!!
You are not Facebook
You probably won’t be
Don’t overcomplicate your problems by trying to be
Typical anatomy of a PHP Application
[Diagram: interleaved blocks labeled Presentation, Application Control, Database Access, and Business Logic]
Bad for scalability!
What helps make software scalable?
Defined tasks
Loose coupling
Resource discovery
The Golden Rule of Scalability “It can probably wait”
Asynchronous execution uses
Pre-caching data
Data analysis
  Pre-analysis (predicting where your users will go)
Data processing
  Pre-calculating (preparing data for the next request)
Data is “out of date” once it leaves the web server
Immediacy is seldom necessary
Characteristics
A sledgehammer can hit a machine
Scalability and High Availability are yin and yang
  A site that can’t keep running is not scalable
  A site that can’t scale will fail (if it gets really popular)
Machines can be added and removed at will
  Not “cloudy” necessarily
No single point of failure
Data exists in at least two, preferably at least three, places
Considerations
Waste disk space
Control usage (don’t let users do anything they want)
Pre-calculate as much as possible
  Calculate and cache/store
Don’t scan large amounts of data
Keep data processing off the front end servers
Don’t just cache
  Don’t let it substitute for thought
  Cache hit rates can be meaningless if you have hundreds of cache hits for a request
Build a deployment mechanism with NO hardcoded values like directory or resource locations
Make as much as possible configurable/discoverable
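One way to keep directory and resource locations out of the code is to resolve them from the environment at runtime. This is only a minimal sketch: the setting() helper and the APP_QUEUE_URL / APP_DATA_DIR variable names are made up for illustration and are not part of any Zend API.

```php
<?php
// Hypothetical helper: read a setting from the environment and fall back
// to a default, so nothing deployment-specific is hardcoded.
function setting(string $name, string $default): string
{
    $value = getenv($name);
    return ($value === false || $value === '') ? $default : $value;
}

// Variable names below are illustrative assumptions, not a convention.
$config = [
    'queue_url' => setting('APP_QUEUE_URL', 'http://localhost/queue'),
    'data_dir'  => setting('APP_DATA_DIR', sys_get_temp_dir()),
];

echo 'Queue endpoint: ', $config['queue_url'], PHP_EOL;
```

The same code can then move between a single box and a load-balanced setup by changing the environment, not the application.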
Considerations
Decouple/Partition
  Don’t tie everything (relationships and such) into the database
  Use queues/messaging
    Stomp interfaces are really good for PHP; can also use Java Bridge
    Zend_Queue has several interfaces
  Try to use stateless interfaces
    Polling is more scalable than idle connections, but it introduces lag
Options
Use cron w/ PHP CLI (probably shouldn’t do this)
Use Gearman
Use home-grown (don’t do this)
Use pcntl_fork() (NEVER do this)
Use Zend Server Job Queue
Your only real options
Very cloud friendly
* I am not an expert on Gearman. Corrections will be taken in the spirit in which they are given.
For obvious reasons, I will focus on Zend Server
Using the Zend Server Job Queue
Schedule jobs in the future
Set recurring jobs
Execute immediately, as resources are available (my fav)
Utilize ZendJobQueue()
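The three scheduling modes above might look roughly like this with ZendJobQueue::createHttpJob(). Treat it as a sketch under assumptions: the jobOptions() helper, the job URL, and the parameters are hypothetical, and the option keys ('name', 'schedule_time', 'schedule') are written from memory of the Zend Server Job Queue API, so check them against the documentation for your Zend Server version. The queue calls only run when the ZendJobQueue extension is loaded.

```php
<?php
// Hypothetical helper that builds the options array for createHttpJob().
function jobOptions(string $name, ?int $runAt = null, ?string $cron = null): array
{
    $options = ['name' => $name];
    if ($runAt !== null) {
        // Run once at a future time (assumed key: schedule_time).
        $options['schedule_time'] = date('Y-m-d H:i:s', $runAt);
    }
    if ($cron !== null) {
        // Recurring job using a cron-like expression (assumed key: schedule).
        $options['schedule'] = $cron;
    }
    return $options;
}

if (class_exists('ZendJobQueue')) {
    $queue  = new ZendJobQueue();
    $url    = 'http://localhost/jobs/report.php'; // hypothetical job script
    $params = ['reportId' => 42];                 // hypothetical job data

    // Execute immediately, as resources are available.
    $queue->createHttpJob($url, $params, jobOptions('report now'));
    // Schedule a job in the future.
    $queue->createHttpJob($url, $params, jobOptions('report later', time() + 3600));
    // Set a recurring job.
    $queue->createHttpJob($url, $params, jobOptions('nightly report', null, '0 2 * * *'));
} else {
    echo "ZendJobQueue is not available; nothing was queued.\n";
}
```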
Job Queue Architecture – Elastic Backend
[Diagram: Users, a Load Balancer, three "Web Server w/ JQ" nodes, and three "Web Server" nodes]
Pros
  Scale the backend as necessary
  Default (easy) mechanism
Cons
  Getting the job status requires using a DB
Job Queue Architecture – Elastic Frontend
[Diagram: Users, a Load Balancer, three "Web Server" nodes, and three "Web Server w/ JQ" nodes]
Pros
  Easy to communicate with the Job Queue server handling the job
Cons
  Requires you to build your own interface
Kevin’s favorite way to implement it
Create a task-handling controller
Create an abstract task class
  Understands the Job Queue
  Self contained
  If Elastic Backend: connects to localhost
  If Elastic Frontend: connects to load balancer (my preferred); load balanced JQ servers manage themselves
Execute the task, have it serialize itself and send itself to the task handler
Classes involved in the demo
com\zend\jobqueue\Manager: handles connecting to the queue and passing results back and forth
com\zend\jobqueue\JobAbstract: abstract class that a job would be based on
com\zend\jobqueue\Response: the response from the manager when a job is queued; contains the server name and job number
org\eschrade\jobs\Scandir: the actual job that scans the directory
org\eschrade\jobs\ScandirResult: an object that represents the data found
Execution Flow
Create job and set data
Execute job
Job passes itself to the queue manager
Manager serializes job
Manager uses HTTP call through a load balancer to queue the job
The queue on the other end returns the job ID and server name
Job ID and server name are passed to the client
Client polls the manager to get a completed job
When the job is returned, the serialized version of the executed job is passed back
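The flow above can be sketched without a real queue. Everything here is a stand-in and not the demo's actual code: JobAbstract, Response, InMemoryManager, and ScandirJob are hypothetical classes, the "worker-1" server name is made up, and an in-memory array replaces the HTTP call through the load balancer. The point is only the shape: the job hands itself to a manager, gets serialized, runs elsewhere, and comes back as a serialized, executed copy that the client polls for.

```php
<?php
// Hypothetical base class: a job knows how to hand itself to a manager.
abstract class JobAbstract
{
    abstract protected function run(): void;   // the real work lives here

    public function runOnWorker(): void        // called on the backend
    {
        $this->run();
    }

    public function execute(InMemoryManager $manager): Response
    {
        return $manager->queue($this);         // job passes itself along
    }
}

// What the queue hands back: which server took the job, and its ID.
final class Response
{
    public string $server;
    public int $jobId;

    public function __construct(string $server, int $jobId)
    {
        $this->server = $server;
        $this->jobId  = $jobId;
    }
}

// Stand-in for the manager plus the queue on the other end.
final class InMemoryManager
{
    private array $pending = [];    // jobId => serialized job
    private array $completed = [];  // jobId => serialized executed job
    private int $nextId = 1;

    public function queue(JobAbstract $job): Response
    {
        $id = $this->nextId++;
        $this->pending[$id] = serialize($job);
        return new Response('worker-1', $id);  // made-up server name
    }

    public function work(): void               // simulates the backend
    {
        foreach ($this->pending as $id => $payload) {
            $job = unserialize($payload);
            $job->runOnWorker();
            $this->completed[$id] = serialize($job);
            unset($this->pending[$id]);
        }
    }

    public function getCompletedJob(Response $response): ?JobAbstract
    {
        $id = $response->jobId;
        return isset($this->completed[$id]) ? unserialize($this->completed[$id]) : null;
    }
}

final class ScandirJob extends JobAbstract
{
    public string $path;
    public array $files = [];

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    protected function run(): void
    {
        $this->files = array_values(array_diff(scandir($this->path), ['.', '..']));
    }
}

$manager  = new InMemoryManager();
$job      = new ScandirJob(sys_get_temp_dir());  // create job and set data
$response = $job->execute($manager);             // queue it, get ID + server
$first    = $manager->getCompletedJob($response); // first poll: not done yet
$manager->work();                                // backend runs the job
$done     = $manager->getCompletedJob($response); // next poll: executed job
echo get_class($done), ' found ', count($done->files), " entries\n";
```

In a real setup the client would repeat the poll on later requests rather than calling work() itself; the lag between queueing and a successful poll is the price of not holding a connection open.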
Let’s write some code (no I’m not copping out with slides. We’re all told to show our work in grade skool)