Transitioning your architecture to scale
Transitioning your architecture to scale Presentation Transcript

  • 1. Transitioning your architecture to scale
    David Tinker - CTO BrandsEye, @david_tinker, #scaleconf, 15 April 2013
  • 2. What is BrandsEye?
    • We monitor online conversation (Twitter, Facebook, G+, websites, blogs, etc.) about brands (e.g. ABSA, Gautrain) and derive insights from the data
    • We process between 1m and 3m tweets and other brand mentions per day
    • Relevancy, sentiment analysis, country and language and other variables all automated and crowdsourced
    • Clients use our web app to see the results and explore the data
  • 3. Simplified BrandsEye Schema
    • Account
    • Mention: uri, title, extract, country, language, etc., matched phrases, brand sentiments
    • Brand: name, parent
    • Search Phrase: query, brand
  • 4. BrandsEye in June 2011
    • Single Java application (ear file) for everything
    • Mention collection, mention processing & client web app
    • All accessing MySQL directly using Hibernate & JDBC
    • Single MySQL server at Rackspace
    • Separate database per client account
    • 2 app servers at Rackspace (Apache & JBoss)
    • Client web app load balanced
    • Single instance for mention collection and processing
    • Appropriate architecture when BE started (2006)
    • It was website mentions and not tweets back then and 20/day was a lot!
    • Quick to get to market
  • 5. Problems in 2011
    • Fragile
      • Any problem with mention processing or MySQL would stop mention collection and lead to missed tweets (and grumpy clients)
      • Re-deploying the app (e.g. for client web app change) would interrupt mention collection and processing
      • Hard to recover from mention processing problems as mentions were not stored prior to processing
      • Any change to any part of the application risked breaking some other part of it (no tests)
      • Hard to change MySQL schema as all parts of the application needed to be changed and manually tested
      • If our MySQL server dies we are down for a while and will lose data
    • Slow
      • Everything used MySQL and our server (16 cores, 72G RAM) was taking strain
      • Poor latency - long time for a mention to appear in a client account
  • 6. Solution & Challenges
    • Separate out and decouple mention collection, mention processing, the client app and the database
    • Had to keep the business running at the same time
    • Small team (1.5 - 4 developers) and limited budget so big bang new architecture not an option
    => Incremental approach which is still in progress!
    The major components and technologies are described on the following slides with details on what worked and what didn’t work for us
  • 7. Simplified Architecture
    [Diagram. Components: Feedproxy (mention collector; Redis, Mongo, RabbitMQ); Pork (mention processing pipeline; Redis); Chicken (mention store, analytics, public API; account mention data in PostgreSQL master+slave pairs; mentions in MySQL); Mash (account meta data; PgSQL); Beef (BrandsEye Javascript app); Gravy (the BrandsEye crowd); 3rd party apps. Connections labelled JSON/HTTP, AMQP and JDBC.]
  • 8. Sunday Lunch
  • 9. Mention Flow
    [Diagram. Components: Feedproxy (mention collector); RabbitMQ; Pork (mention processing pipeline); Chicken (mention store, analytics, public API; account mention data in PostgreSQL master+slave); Gravy (the BrandsEye crowd); Rater (starving student).]
  • 10. Feedproxy (Mention Collector)
    • Java app deployed on 2 virtual servers
    • Collects mentions from many different sources using Redis sorted sets for efficient polling and de-duping
    • Buffers mentions in a MongoDB capped collection as JSON messages
    • Writes them to a RabbitMQ queue on a remote server for processing
    • Can replay mentions from a point in the past to recover from processing failures
    • Has a web UI to display status and stats
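The de-duping idea on this slide can be sketched without a Redis server. The example below is only an illustration under assumptions: the slides do not say how Feedproxy keys or scores its sorted sets, so the class `SeenMentions`, the id format and the one-hour retention are all hypothetical. A plain dict stands in for the sorted set (member = mention id, score = time first seen); against a real Redis the two operations would map roughly onto `ZADD` and `ZREMRANGEBYSCORE`.

```python
class SeenMentions:
    """In-memory stand-in for a Redis sorted set (hypothetical helper).

    Member = mention id, score = timestamp at which it was first seen.
    """

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds
        self.first_seen = {}  # mention id -> timestamp (the "score")

    def add_if_new(self, mention_id, now):
        """Return True if the mention is new, False if it is a duplicate."""
        if mention_id in self.first_seen:
            return False
        self.first_seen[mention_id] = now
        return True

    def trim(self, now):
        """Drop ids older than max_age_seconds so the set cannot grow forever."""
        cutoff = now - self.max_age_seconds
        stale = [m for m, ts in self.first_seen.items() if ts < cutoff]
        for m in stale:
            del self.first_seen[m]
        return len(stale)


seen = SeenMentions(max_age_seconds=3600)
polled = [("tweet:1", 0), ("tweet:2", 10), ("tweet:1", 20)]  # third item is a repeat
fresh = [mid for mid, ts in polled if seen.add_if_new(mid, ts)]
removed = seen.trim(now=3605)
```

The `trim` step is what guards against the kind of unbounded growth the next slide calls “leaks”: without it every id ever seen stays in memory.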
  • 11. Feedproxy (2)
    • Redis
      • Redis is a semi-persistent key/value store with sets, lists etc.
      • Perfect for this application
      • Uses clever fork trick with copy-on-write to save to disk periodically
      • The data fits in memory easily and it's not terribly bad if we lose the most recent 2 minutes or so
      • Lightning fast and uses hardly any CPU
      • As easy as using in-memory data structures but the app continues where it left off after a re-deploy
      • Have to watch out for “leaks” in Redis data structures
      • Clustered version not available yet but you can do replication
  • 12. Feedproxy (3)
    • MongoDB
      • JSON data store with indexing, querying and so on
      • Uses memory mapped files for everything and relies on the OS
      • Has capped collections (ring buffer) with fixed size
      • Capped collection uses a lot of IO once it fills up even though we only use one index
      • Gets “swapped out” so occasional retrieval of old mentions takes a long time - Mongo not so good on a machine doing other stuff as well
      • “Expensive” way of buffering our mentions
      • Have written a replacement but it's not in production yet
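A capped collection behaves like a fixed-size ring buffer: once it is full, each new entry overwrites the oldest one. The sketch below shows that behaviour, plus replay from a point in time, using only the Python standard library. It is not the replacement buffer mentioned on the slide (the slides give no details of it); `MentionBuffer` and its methods are hypothetical names chosen for the illustration.

```python
from collections import deque


class MentionBuffer:
    """Fixed-size buffer of (timestamp, mention) pairs (hypothetical helper)."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently discards the oldest entry when full,
        # which is the ring-buffer behaviour of a capped collection.
        self.entries = deque(maxlen=capacity)

    def append(self, timestamp, mention):
        self.entries.append((timestamp, mention))

    def replay_from(self, since):
        """Return all buffered mentions with timestamp >= since, oldest first."""
        return [m for ts, m in self.entries if ts >= since]


buf = MentionBuffer(capacity=3)
for ts in range(1, 6):          # five mentions into a buffer of three
    buf.append(ts, {"id": ts})

replayed = buf.replay_from(since=4)
oldest_retained = buf.entries[0][0]
```

Replay can only reach back as far as the oldest retained entry, so the buffer size determines how long a processing outage can last before mentions are lost.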
  • 13. Feedproxy (4)
    • RabbitMQ
      • RabbitMQ is a message broker
      • You set up exchanges which distribute messages to queues for consumers to process
      • Initially used it to buffer mentions on the mention collector server and a RabbitMQ shovel to get those to the RabbitMQ instance on the processing machine
      • The shovel got “stuck” sometimes (every couple of weeks)
      • Rabbit’s memory usage climbs linearly with the number of messages in its queues regardless of queue durability settings + it stops accepting messages when a “high watermark” of memory usage is reached
      • Cannot “replay” mentions from a point in time in the past
      • Not good as a durable store for messages
  • 14. Chicken API & Mention Store
    [Architecture diagram from slide 7 shown again.]
  • 15. Chicken API & Mention Store
    • Provides a REST API to access mentions and analytics
    • Written using Grails, a Groovy+Java Ruby on Rails clone for Spring stack
    • The only app with access to the account mention databases
    • Translates our high level mention filter language into SQL
    • Has good set of functional tests
    • BrandsEye customers can use the API to build their own apps
    • Supports multiple different mention stores with a single API
    • We are busy transitioning from a single MySQL server to several PostgreSQL clusters
    • Keeps stats in Redis
    • Uses Apache SOLR for full text search (likely to be replaced with PostgreSQL)
    • Stateless (will be load balanced soon)
  • 16. Chicken API & Mention Store
    • Online documentation (“The Book of Chicken”: “Welcome to the BrandsEye API, v1.17.10”)
    • We use semantic versioning
      • Major: Increment on breaking API changes
      • Minor: Increment when new functionality added
      • Patch: Increment for bug fixes
    • Example request: https://api.brandseye.com/rest/accounts/BESC27AA/mentions?filter=Published in the last month and Language isnt en
    • Resulting SQL: select id, title, extract ... from mention join ... where published_date >= ? and language <> ?
    • Can also do groupby etc.
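The translation from filter text to parameterised SQL shown on this slide can be illustrated with a toy translator. This is a sketch under heavy assumptions: the real filter grammar is not described in the slides, so the function `translate_filter` handles only the two clause shapes from the example, splits on the literal word "and", and approximates "the last month" as 30 days. Only the column names `published_date` and `language` come from the slide.

```python
from datetime import date, timedelta


def translate_filter(filter_text, today):
    """Translate a tiny, hypothetical subset of the filter language.

    Returns (where_clause, params) with '?' placeholders so that values are
    never concatenated into the SQL string.
    """
    conditions, params = [], []
    for clause in filter_text.split(" and "):
        words = clause.strip().split()
        lowered = [w.lower() for w in words]
        if lowered == ["published", "in", "the", "last", "month"]:
            conditions.append("published_date >= ?")
            params.append(today - timedelta(days=30))  # assumption: month = 30 days
        elif len(words) == 3 and lowered[0] == "language" and lowered[1] in ("is", "isnt"):
            operator = "=" if lowered[1] == "is" else "<>"
            conditions.append(f"language {operator} ?")
            params.append(words[2])
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return " and ".join(conditions), params


where, params = translate_filter(
    "Published in the last month and Language isnt en", date(2013, 4, 15)
)
```

Keeping the values in a separate parameter list, as the `?` placeholders on the slide suggest, means that user-supplied filter text cannot inject SQL.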
  • 17. Why PostgreSQL?
    • Synchronous replication since 9.1
      • Transaction on master only commits when slave has received the log records
      • Keeps pair of servers exactly in sync
      • Replication done over dedicated gigabit network link
      • Read-only queries can go to master or slave, writes only to master
      • We wrote an app (running on 3rd machine) to monitor the pair and promote the slave / disable replication as needed (uses Hetzner failover IP addresses to make the dead machine inaccessible)
      • MySQL now also has something similar but it all seems a bit ropey
    • Other reasons
      • PostgreSQL has good full text search and arrays
      • MySQL query planner doesn’t handle some simple subqueries + inexplicably fails to use indexes for others
      • We need to stick with RDBMS as we do lots of ad-hoc queries
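The decision the monitoring app has to make can be written down as a small function. This is a guess at the logic from the slide's description ("promote the slave / disable replication as needed"), not the actual BrandsEye monitor; the function name and the action strings are hypothetical, and health checking, fencing via the failover IP and the commands that carry out each action are left out.

```python
def decide_action(master_alive, slave_alive):
    """Pick an action for a synchronously replicated master/slave pair.

    Hypothetical decision table:
    - both up: nothing to do
    - slave down: the master would block on commit while waiting for the
      slave, so synchronous replication has to be switched off
    - master down: promote the slave so writes can continue
    - both down: nothing automatic can help
    """
    if master_alive and slave_alive:
        return "none"
    if master_alive:
        return "disable_sync_replication"
    if slave_alive:
        return "promote_slave"
    return "alert_operator"


actions = {
    (m, s): decide_action(m, s)
    for m in (True, False)
    for s in (True, False)
}
```

The "slave down" case is the less obvious one: with synchronous replication a commit waits for the standby, so losing the slave stalls writes on an otherwise healthy master until replication is disabled.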
  • 18. Why Grails?
    • Quick to build apps
      • Groovy has syntax similar to Ruby and Python but runs on the JVM and integrates seamlessly with Java code
      • Most Java code is also valid Groovy code so great for a team with Java background
      • Grails is very much like Rails but using familiar Java stuff (Spring, Hibernate etc.)
      • Lots of plugins and they are easy to write
    • Performance is available
      • Need something fast? Just write that little bit in Java
      • We won’t have to switch stacks (e.g. from Ruby to Java) at some point for performance reasons
  • 19. Mash Account Meta Data
    [Architecture diagram from slide 7 shown again.]
  • 20. Mash Account Meta Data
    • Provides a REST API for account meta data
    • Grails app using MySQL (moving to PostgreSQL)
    • Brands, search phrases, processing rules etc.
    • Notifies client applications of changes via RabbitMQ topic exchange
    • Client apps typically cache the data until it changes or a timeout expires
    • Most client apps use a Java library which handles the caching and maps the JSON to a data model
    • Simple way to distribute the information to many de-coupled apps
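The "cache until it changes or a timeout expires" pattern can be sketched as follows. The real client library is written in Java and is not shown in the slides, so this is only an illustration of the pattern: `MetaDataCache`, the key format and the 60-second timeout are hypothetical, `fake_fetch` stands in for an HTTP call to Mash, and `on_change` stands in for the handler that a RabbitMQ change notification would trigger.

```python
class MetaDataCache:
    """Cache entries until invalidated or until ttl_seconds have passed."""

    def __init__(self, fetch, ttl_seconds):
        self.fetch = fetch              # callable that loads a value by key
        self.ttl_seconds = ttl_seconds
        self.entries = {}               # key -> (value, fetched_at)

    def get(self, key, now):
        cached = self.entries.get(key)
        if cached is not None and now - cached[1] < self.ttl_seconds:
            return cached[0]            # still fresh, no remote call
        value = self.fetch(key)
        self.entries[key] = (value, now)
        return value

    def on_change(self, key):
        """Called when a change notification for `key` arrives."""
        self.entries.pop(key, None)


calls = []


def fake_fetch(key):
    # Stand-in for a REST request; records each call so they can be counted.
    calls.append(key)
    return {"key": key, "version": len(calls)}


cache = MetaDataCache(fake_fetch, ttl_seconds=60)
first = cache.get("brand:1", now=0)
second = cache.get("brand:1", now=30)    # served from cache
cache.on_change("brand:1")               # change notification arrives
third = cache.get("brand:1", now=31)     # fetched again
fourth = cache.get("brand:1", now=200)   # timeout expired, fetched again
```

The timeout acts as a safety net: if a notification is ever missed, stale data is still refreshed after at most `ttl_seconds`.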
  • 21. Pork Mention Processor
    [Architecture diagram from slide 7 shown again.]
  • 22. Pork Mention Processor
    • Consumes mentions (JSON messages) from RabbitMQ queues
    • Grails app with basic UI for monitoring
    • Annotates mentions with extra information (relevancy, sentiment, country, language etc.) using machine learning and other techniques
    • Applies automated processing rules
    • Sends & receives mentions from the BrandsEye crowd
    • Writes mentions to Chicken (Mention Store)
    • ACK mention if all good, otherwise NACK and re-process
    • Lots of batching and shared models for performance and rate limiting reasons
    • Groovy closures and other features result in compact maintainable code
    • Small amount of performance centric code in Java + machine learning libraries
    • Can process 1m+ mentions/hour, mostly waiting for Chicken
    • Keeps stats in Redis
  • 23. RabbitMQ
    • Good
      • ACK/NACK model is very convenient for development
      • Can limit number of un-ACKed messages allowed to control app memory usage
      • Exchange types and routing keys allow for flexible setup
      • Easy to duplicate messages on the fly (e.g. for debugging in live)
      • Nice admin console
    • Limitations
      • Cannot cluster queues so even for clustered rabbit losing a machine loses everything in its queues
      • Memory usage climbs rapidly if you aren’t consuming messages
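The ACK/NACK model and the limit on un-ACKed messages can be simulated without a broker. The sketch below is an in-memory model, not RabbitMQ client code: a `deque` stands in for the queue, the `prefetch` argument plays the role of the consumer prefetch count, and appending a failed message back onto the queue models a NACK with requeue. The `max_attempts` cap and the dead-letter list are assumptions added so the loop always terminates; the slides only say that a NACKed mention is re-processed.

```python
from collections import deque


def consume(queue, handler, prefetch, max_attempts=3):
    """Process (body, attempts) messages with at most `prefetch` in flight."""
    processed, dead = [], []
    max_in_flight = 0
    while queue:
        # The broker hands out no more than `prefetch` un-ACKed messages.
        in_flight = [queue.popleft() for _ in range(min(prefetch, len(queue)))]
        max_in_flight = max(max_in_flight, len(in_flight))
        for body, attempts in in_flight:
            try:
                handler(body)
                processed.append(body)                  # ACK
            except Exception:
                if attempts + 1 < max_attempts:
                    queue.append((body, attempts + 1))  # NACK, requeue
                else:
                    dead.append(body)                   # give up (assumption)
    return processed, dead, max_in_flight


failures = {"m2": 1}  # m2 fails once, e.g. because the mention store was down


def handler(body):
    if failures.get(body, 0) > 0:
        failures[body] -= 1
        raise RuntimeError("store unavailable")


queue = deque((f"m{i}", 0) for i in range(1, 5))
processed, dead, max_in_flight = consume(queue, handler, prefetch=2)
```

Because the consumer never holds more than `prefetch` messages at once, its memory use stays bounded no matter how deep the queue gets, which is the benefit the slide describes.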
  • 24. Beef Client Web App
    • Grails + Javascript using Backbone, Handlebars etc.
    • Communicates with Chicken using the same API we offer to clients
    • Maintains brands, phrases etc. by communicating with Mash
    • All REST using JSON
    • Makes it possible for us to refactor all of the backend apps (e.g. change database schemas) so long as we keep the API the same
  • 25. Monitoring
    • We use Wormly for server and app alerts
    • Each app has a simple web console that can be checked for “ERROR”
    • Logs are aggregated using Graylog2
  • 26. SCM and Packaging
    • We use Git & Bitbucket
    • Same as Github but much cheaper for a small team with many repos
    • Master must always be deployable, we don’t use branches much
    • Apps are packaged as executable war files
    • Servlet container (Jetty) embedded in war
    • java -jar crackling.war and it will come up listening on its port
    • Built on the target machine (no CI yet ...)
    • Apache or Nginx in front
    • Good documentation
    • Each app has a README.md describing its purpose, how to build and run/publish it, API endpoints, dependencies (and how to install them) etc.
  • 27. Hetzner & AWS vs Rackspace
    • Price
      • Rackspace is at least 6x more expensive than Hetzner for a similar (supposedly better quality) machine
      • Rackspace cloud servers are also much more expensive than Amazon servers (6x or more) and they charge full price even when your server isn’t running
      • So we are moving towards “more machines that can fail” instead of “a few really reliable machines”
    • Other factors
      • Hetzner give failover IPs that you can change with an API for implementing HA stuff
      • Servers have 2 NICs so you can create private nets for replication
      • Traffic is free (great for backups) up to 10000G/month
      • We need physical hardware for at least the database servers for performance
  • 28. Tech Summary
    • Redis
      • Fast solid software, lots of use cases for data sets that fit in memory
      • Watch out for leaks
    • MongoDB
      • Capped collections slower than you expect
      • Really wants to be the only thing installed on the server
    • MySQL
      • Not very good at optimizing queries or using indexes, dodgy replication
    • PostgreSQL
      • Synchronous replication is cool
      • Advanced query optimizer
    • RabbitMQ
      • Great for short term routing of messages, watch out for memory usage
  • 29. Conclusion
    • We can now easily scale BrandsEye to handle any number of clients and volume of mentions
    • All of this is still in progress
    • We have lots of other little apps not described here interacting using the same tech
    • We have done a lot of work with chef for our new servers but it's not handling everything yet
  • 30. Questions?
    $100 discount to Scaleconf people!