Data Reads Through Services

It means creating systems that perform data reads through services. Data reads typically have to be synchronous because a user is waiting on the operation, so they have to occur inside the request/response life-cycle.
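As a sketch of that idea (all the names here are made up for illustration), a read goes through a service call that blocks until the data comes back, because the response can't be built without it:

```ruby
# Hypothetical service object: a read is synchronous because the
# user is waiting on the result inside the request/response cycle.
class PostService
  def initialize(store)
    @store = store
  end

  # A read blocks until the data comes back.
  def fetch(id)
    @store[id]
  end
end

def handle_request(service, id)
  post = service.fetch(id) # synchronous: we can't respond without it
  { status: 200, body: post }
end

service  = PostService.new(1 => "hello world")
response = handle_request(service, 1)
```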
data writes through a messaging or queuing system

Often, a user doesn't have to wait for data to be written to receive a response. So writes can be done asynchronously, outside of the request/response life-cycle, which means you can put them straight into a queue.
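Here's a toy sketch of that write path, using an in-memory queue as a stand-in for a real queuing system: the handler enqueues the write and responds immediately, and a worker drains the queue later, outside the request/response cycle.

```ruby
require "json"

WRITE_QUEUE = Queue.new
DATASTORE   = {}

def handle_write(payload)
  WRITE_QUEUE << payload            # enqueue; don't wait for the write
  { status: 202, body: "accepted" } # respond right away
end

# Later, a worker process drains the queue and does the actual write.
def drain_writes
  until WRITE_QUEUE.empty?
    entry = JSON.parse(WRITE_QUEUE.pop)
    DATASTORE[entry["id"]] = entry
  end
end

response = handle_write({ id: 1, title: "a post" }.to_json)
drain_writes
```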
Monolithic Applications Don't Scale

Also, having your entire application in one code base and system doesn't scale. This leads to test suites that take more than 30 minutes to run, and deploys that push your entire application just for a simple update.
Joel Spolsky calls people who exhibit this behavior "Architecture Astronauts":

"Sometimes smart thinkers just don't know when to stop, and they create these absurd, all-encompassing, high-level pictures of the universe that are all good and fine, but don't actually mean anything at all. These are the people I call Architecture Astronauts. It's very hard to get them to write code or design programs, because they won't stop thinking about Architecture."
and then you add some background processing...

And then you realize that you can't do everything inside the request/response cycle, so you add in a background process. For now we'll assume you're using a database-backed queue like dj, bj, or some other kind of "j".
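A toy model of a database-backed job queue in the delayed_job style (the job "table" here is just an in-memory array; real dj stores job rows in your database and a worker process polls them):

```ruby
# Each enqueued job records what to run and with which arguments.
Job = Struct.new(:handler, :args)

JOBS = []

# Enqueuing is cheap: in real dj this is just a row insert,
# so the request/response cycle stays fast.
def enqueue(handler, *args)
  JOBS << Job.new(handler, args)
end

# The background worker polls the "table" and runs jobs
# outside the request/response cycle.
def work_off(results)
  until JOBS.empty?
    job = JOBS.shift
    results << job.handler.call(*job.args)
  end
end

resize_image = ->(name) { "resized #{name}" }
enqueue(resize_image, "avatar.png")
done = []
work_off(done)
```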
...and complex business logic

Every time something enters our system, we have to perform many different tasks that are interdependent. Here's just a taste of it: our feed fetcher pulls in a new blog post from somewhere.
push writes through a messaging system

Data writes go through a messaging system with built-in routing. It also helps if it's optimized for processing thousands of messages per second and supports the pub/sub style.
first_request.on_complete do |response|
  post = Post.new(JSON.parse(response.body))
  # get the first url in the post
  third_request = Typhoeus::Request.new(post.links.first)
  third_request.on_complete do |response|
    # do something with that
  end
  hydra.queue third_request
  post
end
Event Based System

These features enable you to build an event-based system, which is exactly what we needed. When certain updates happen, it should kick off calculations elsewhere in the system. I'll get into that in a bit, but first some rabbit specifics.
So we have a fanout exchange called entry.write. Every queue bound to this exchange will get the messages published to it. Here we have the three things we want to do. First, index it for searching. Second, store it in our key-value store. Third, index it in a completely separate index used for data research. The search is Solr/Lucene and the research is Hadoop. Completely decoupled systems.
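A plain-Ruby sketch of that entry.write fanout (a real setup would use an AMQP client like Bunny against RabbitMQ; this just models the routing behavior): a fanout exchange copies every published message to every bound queue, so the three consumers stay completely decoupled.

```ruby
# Toy fanout exchange: no routing keys, every bound queue
# receives a copy of every message.
class FanoutExchange
  def initialize
    @queues = []
  end

  def bind(queue)
    @queues << queue
  end

  def publish(message)
    @queues.each { |q| q << message }
  end
end

search_index   = [] # fed to Solr/Lucene
kv_store       = [] # fed to the key-value store
research_index = [] # fed to Hadoop

entry_write = FanoutExchange.new
[search_index, kv_store, research_index].each { |q| entry_write.bind(q) }

entry_write.publish("entry 42")
```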
That's how we write entries. Here's how we do event-based processing on those writes. In this example we have a topic exchange named 'entry.notify'. Queues can be bound to exchanges, so we have these three queues.
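Topic-exchange routing can be sketched in plain Ruby too (real RabbitMQ does this matching server-side, and its binding keys are word-based, where "*" matches exactly one word and "#" matches zero or more words; this toy version is a simplification):

```ruby
# Toy topic exchange: each binding pairs a queue with a pattern,
# and a published message is delivered to every queue whose
# pattern matches the message's routing key.
class TopicExchange
  def initialize
    @bindings = []
  end

  def bind(queue, pattern)
    @bindings << [queue, pattern]
  end

  def publish(routing_key, message)
    @bindings.each do |queue, pattern|
      queue << message if match?(pattern, routing_key)
    end
  end

  private

  # Translate the binding pattern into a regex word by word.
  def match?(pattern, key)
    regex = pattern.split(".").map { |word|
      case word
      when "*" then "[^.]+" # exactly one word
      when "#" then ".*"    # the rest of the key (approximate)
      else Regexp.escape(word)
      end
    }.join("\\.")
    key =~ /\A#{regex}\z/
  end
end

entry_notify = TopicExchange.new
comments = []
all      = []
entry_notify.bind(comments, "entry.notify.comment")
entry_notify.bind(all,      "entry.notify.#")

entry_notify.publish("entry.notify.comment", "new comment")
entry_notify.publish("entry.notify.post",    "new post")
```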
partition tolerance

When you replicate data across multiple systems, you create the possibility of forming a partition. This happens when one or more systems lose connectivity to other systems. Partition tolerance is defined formally as "no set of failures less than total network failure is allowed to cause the system to respond incorrectly."