6. Scheduling, Dependencies, Operations
- Crawls run once a week
- Customer site transient outages
- Multiple campaigns, same domain
- Empower help team to help
8. Language Agnostic
github.com/seomoz/qless
github.com/seomoz/qless-py
bitbucket.org/nuclon/qless-perl
github.com/seomoz/qless-core
Releasing Soon!
We’ve also played with Node.js and C++ bindings
9. Job Anatomy
- Job ID and Type
- Priority, Flagging, Move
- Searchable Tags
- JSON Blob Data
- History
Straight out of the web app
14. Python Client
# In gnomes.py
class GnomesJob(object):
    # This would be invoked when a GnomesJob is popped off the 'underpants' queue
    @staticmethod
    def underpants(job):
        # 1) Collect Underpants
        job['foo']  # Reference job data
        # Complete and advance to the next step, 'unknown'
        job.complete('unknown')

    @staticmethod
    def unknown(job):
        # 2) ?
        ...
        # Complete and advance to the next step, 'profit'
        job.complete('profit')

    @staticmethod
    def profit(job):
        # 3) Profit
        ...
        # Complete the job
        job.complete()
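The slide's comments say each static method is invoked when the job is popped off the queue of the same name, and that job.complete('next') advances the job to the next queue. A minimal sketch of that dispatch pattern, runnable without any queue server, might look like the following. FakeJob and run are hypothetical stand-ins written for illustration and are not part of qless-py; the 'collected' field is likewise invented.

```python
class FakeJob(dict):
    """Hypothetical stand-in for a qless job: a data dict plus a current queue."""

    def __init__(self, queue, data):
        super().__init__(data)
        self.queue = queue   # name of the queue the job currently sits in
        self.done = False
        self.visited = []    # queues the job has passed through, in order

    def complete(self, next_queue=None):
        # Finish the current step; move to next_queue if one is given.
        self.visited.append(self.queue)
        if next_queue is None:
            self.done = True
        else:
            self.queue = next_queue


class GnomesJob(object):
    @staticmethod
    def underpants(job):
        job['collected'] = job['foo'] * 2  # use the job data
        job.complete('unknown')

    @staticmethod
    def unknown(job):
        job.complete('profit')

    @staticmethod
    def profit(job):
        job.complete()


def run(klass, job):
    """Call the method named after the job's queue until the job completes."""
    while not job.done:
        getattr(klass, job.queue)(job)
    return job


job = run(GnomesJob, FakeJob('underpants', {'foo': 21}))
print(job.visited, job['collected'])
```

A real worker would pop jobs from the queue server instead of looping in-process, but the lookup by queue name is the same idea.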
18. Where My Images At?
1) “Angry Guy” from http://dinkerson.wordpress.com/
http://dinkerson.files.wordpress.com/2012/02/angry-guy.jpg
2) “Underpants Gnomes” from
http://www.tumblr.com/tagged/underpants-gnomes
3) “Switchboards” from
http://markc1.typepad.com/.a/6a00d83451bb2969e201310f8060a1970c-800wi
Thanks for the images!
Editor's Notes
Hi, I’m Dan Lecocq from SEOmoz, and I’m going to talk quickly about our queueing system, qless.
Real quickly, we’re up in Seattle, and though we don’t do SEO ourselves, we sell data and services to support SEO consultants. We offer a number of services, but the one that drove our need for qless was one in which we crawl customer sites and report general SEO bad practices.
We’re in the cloud, so in addition to possible code flakiness, workers sometimes disappear. Jobs are long-running (sometimes even days) and represent a lot of work, so making sure that jobs complete is really important. Jobs are sparse, so when jobs get lost, we hear about it.
We have great customers, but we’re a B2B company, and most of those customers are independent SEO consultants. When things go wrong, we hear about it quickly. Getting back to customers who have taken the time to write in is important. Beyond that, we’d like to have the information to find hosts that are misbehaving, as well as automatically get stats about how long jobs take.
Allows our Help Team to do stuff on their own:
- Retry jobs
- Change job properties
- Figure out what went wrong
Saves engineer time (we used to have to do all that for them)
Improves customer response times
Custom Crawl Qless
Show:
- Queue stats
- Failed jobs (and retry a few)
- Search tags, add, remove, change priority
- Notifications with Growl
Because it’s completely client-managed and built on a core set of Lua scripts, it’s easy to support new languages. The atomicity guarantee of the Lua scripts lets us make strong correctness guarantees. Clients detect jobs dropped by other workers; a client gets an exclusive lock on a job and must complete it or heartbeat to keep the lease.
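The lock-and-heartbeat behavior described in these notes can be modeled in a few lines. The sketch below is a hypothetical in-memory illustration, not qless code: LeaseQueue, its method names, and the explicit `now` arguments are all assumptions made so the example runs deterministically. In qless each of these steps would run as an atomic Lua script; here a single-threaded Python object stands in for that guarantee.

```python
class LeaseQueue:
    """Hypothetical in-memory model of lease-based job locking (not qless itself)."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.waiting = []   # job ids not held by any worker
        self.locks = {}     # job id -> (worker, lease expiry time)

    def put(self, jid):
        self.waiting.append(jid)

    def _holds(self, jid, worker, now):
        # True only if this worker holds a lock that has not yet expired.
        holder = self.locks.get(jid)
        return holder is not None and holder[0] == worker and holder[1] > now

    def pop(self, worker, now):
        # Reclaim jobs whose lease has expired, i.e. whose worker went silent.
        for jid, (_, expires) in list(self.locks.items()):
            if expires <= now:
                del self.locks[jid]
                self.waiting.append(jid)
        if not self.waiting:
            return None
        jid = self.waiting.pop(0)
        self.locks[jid] = (worker, now + self.lease_seconds)
        return jid

    def heartbeat(self, jid, worker, now):
        # Extend the lease, but only for the current lock holder.
        if not self._holds(jid, worker, now):
            return False
        self.locks[jid] = (worker, now + self.lease_seconds)
        return True

    def complete(self, jid, worker, now):
        # Only the current lock holder may complete the job.
        if not self._holds(jid, worker, now):
            return False
        del self.locks[jid]
        return True


q = LeaseQueue(lease_seconds=60)
q.put('crawl-1')
first = q.pop('worker-a', now=0)                    # worker-a takes the lock
kept = q.heartbeat('crawl-1', 'worker-a', now=30)   # lease extended to t=90
second = q.pop('worker-b', now=100)                 # lease expired; worker-b takes over
late = q.complete('crawl-1', 'worker-a', now=101)   # worker-a no longer holds the lock
done = q.complete('crawl-1', 'worker-b', now=110)
```

The point of the design is that a worker that disappears simply stops heartbeating, so its job becomes available again without any central supervisor noticing the failure.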