2. Agenda
● Introduction to Ranker
● Performance Challenges In Ranker
● Performance Tuning Strategies
● Conclusion
3. Introduction To Ranker
Ranker is a social site and platform that is, in essence, an operating system for Lists.
Ranker makes it easy, fun, and social for users to rank things (anything at all) via a
Netflix-style drag-and-drop interface and a huge backend database. Everything in the
system is an object, so that we can aggregate individual lists and answer the “wisdom
of crowds” question “what is the best ___”.
Ranker is fully distributable. For example, a travel blog can embed Ranker on its
site so that its users can easily rank their favorite golf destinations. This gives
the blog a sticky interactive tool, as well as valuable content showcasing a continually
updated ranking of its community’s consensus picks for golf spots.
The Ranker platform is flexible enough to be used for publishing, social networking,
shopping, polling, even organization.
4. Performance Challenges In Ranker
The Ranker application has to deal with two main performance issues:
1. Data Volume - One of the biggest challenges in building Ranker, compared to
regular web applications, is the volume of data that has to be managed. Ranker
deals with close to 10 million topics, most of which have been obtained from Freebase.
Freebase uses a custom RDF store to persist and retrieve its data. Ranker, however,
needs to achieve the same performance levels using a relational database.
2. Traffic - Ranker deals with a lot of social and entertaining content. This results in
traffic spikes, where it's not uncommon to get a huge number of visitors in a very short
span of time. We have often seen around 40,000 visits to a single page within
one hour, before traffic subsides to a more reasonable volume.
5. Performance Tuning Strategies
The performance challenges explained in the previous slide are
handled using the following methods:
● Caching
● Database De-normalization
● Hardware
● Delayed Calculation/Aggregation
● Event Based Post-Processing
● Search Indexing
6. Caching
Ranker implements caching using the open source Ehcache framework. Caching is
applied at various depths within the application. Some parts of the
application only cache backend objects that are time-consuming to load, while
other parts cache the entire request by storing the generated HTML in
the cache.
Caching in Ranker is also well integrated with the custom CMS that is used to configure
various pages in the application. The CMS allows us to specify different cache
expiration times for each block of each configured page in Ranker.
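
The per-block expiration idea can be sketched as follows. This is a minimal Python illustration, not Ehcache's API or Ranker's code: the BlockCache class, the BLOCK_TTL_SECONDS table, and the block names are all hypothetical stand-ins for whatever the CMS actually configures.

```python
import time


class BlockCache:
    """Tiny in-memory cache where every entry carries its own expiry time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so expiry can be exercised in tests
        self._entries = {}    # key -> (expires_at, value)

    def get_or_render(self, key, ttl_seconds, render):
        """Return the cached value for key, calling render() on miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = render()
        self._entries[key] = (now + ttl_seconds, value)
        return value


# Hypothetical per-block expiration times, as a CMS might configure them.
BLOCK_TTL_SECONDS = {"top_lists": 300, "recent_comments": 30}


def render_block(cache, page, block, render):
    """Cache the generated HTML of one block of one page under its own TTL."""
    key = (page, block)
    return cache.get_or_render(key, BLOCK_TTL_SECONDS[block], render)
```

Keying on (page, block) lets a slowly changing block stay cached for minutes while a fast-changing block on the same page is regenerated far more often.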
7. Database De-normalization
Since the Ranker traffic patterns indicate that a huge percentage of activity on the site
is for "reads" and a much smaller percentage for "writes", de-normalizing the database
provides huge performance benefits for the application.
Database de-normalization often involves duplicating data across tables in order to
avoid expensive joins in the SQL queries. This adds considerable overhead when
editing or deleting the de-normalized entities: a single user action might require the
application to update multiple locations. It is also prone to
causing bugs when programmers are not aware of all the places where the
data is duplicated. Hence de-normalization has been used very cautiously, and only
when none of the other approaches are applicable.
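
The trade-off can be illustrated with a small sketch using Python's built-in sqlite3 module. The schema is hypothetical (the slides do not describe Ranker's tables): the author's name is copied into the lists table so that reads need no join, which means a rename has to touch both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- author_name duplicates users.name so list pages need no join.
    CREATE TABLE lists (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES users(id),
        author_name TEXT NOT NULL
    );
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO lists VALUES (10, 'Best golf destinations', 1, 'alice');
""")


def read_list(conn, list_id):
    """Read path: a single-table lookup, no join against users."""
    return conn.execute(
        "SELECT title, author_name FROM lists WHERE id = ?", (list_id,)
    ).fetchone()


def rename_user(conn, user_id, new_name):
    """Write path: every duplicated copy must change in the same transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute("UPDATE users SET name = ? WHERE id = ?", (new_name, user_id))
        conn.execute(
            "UPDATE lists SET author_name = ? WHERE author_id = ?",
            (new_name, user_id),
        )
```

If a programmer forgets the second UPDATE in rename_user, list pages silently show the old name, which is exactly the kind of bug described above.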
8. Hardware
Using better hardware is often a much simpler and cheaper option than investing a lot
of time in improving the performance of some parts of the application. We have made
sure that we have the most suitable hardware for the systems that are being built,
based on the amount of memory and processing power needed.
Ranker also uses hardware load balancers to distribute the load across multiple web
servers. This makes a huge difference when there is a spike in traffic, as mentioned in
the “Performance Challenges In Ranker” section.
Here is one situation where coding for better hardware made a huge difference to
performance: one of the background processes in Ranker needed to make around 9
million queries to complete its job. Later we realized that by loading all the
data into memory in one shot, we could reduce the number of queries to a few
thousand. However, this would require us to store around 3GB of data in memory. Hence it
made more sense to get systems with more memory. This change made
the process about 20 times faster.
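
The general pattern, trading memory for fewer queries, might look like the sketch below. It uses Python's sqlite3 module with a hypothetical topics table and is only an illustration of the idea, not the actual background process.

```python
import sqlite3


def build_db(n):
    """Create an in-memory table with n hypothetical topic rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE topics (id INTEGER PRIMARY KEY, score INTEGER)")
    conn.executemany(
        "INSERT INTO topics VALUES (?, ?)", ((i, i * 2) for i in range(n))
    )
    return conn


def total_per_item(conn, ids):
    """One query per id: simple, but the query count grows with the input."""
    total = 0
    for topic_id in ids:
        row = conn.execute(
            "SELECT score FROM topics WHERE id = ?", (topic_id,)
        ).fetchone()
        total += row[0]
    return total


def total_bulk(conn, ids):
    """One query loads the table into a dict; lookups then stay in memory."""
    scores = dict(conn.execute("SELECT id, score FROM topics"))
    return sum(scores[topic_id] for topic_id in ids)
```

Both functions return the same result; the bulk version issues a single query at the cost of holding the whole table in memory, which is why the approach depends on having enough RAM.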
9. Delayed Calculation/Aggregation
Ranker uses a large number of small batch programs that perform
calculations using complex algorithms at regular intervals. This allows
us to pre-calculate scores for lists and items, and hence avoid performing
the calculations every time data is retrieved or stored. The tricky part in
using this technique is choosing the right amount of pre-calculation.
Too much pre-calculation produces a large number of results to
store, whereas too little means doing a lot of calculation while
loading the data.
For example, this technique is used to calculate the most interesting lists
in each domain in Ranker. The algorithm that identifies the interesting lists
uses many factors, such as the number of views and the number of votes, and is
executed once a day. In this case, instead of determining the most
interesting lists in each domain, we only determine a universal score for
each list. This score can then be used to get the most interesting lists in any
domain.
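
A rough sketch of the universal-score idea follows. The scoring formula, the weights, and the RankedList fields are assumptions made for illustration; the real algorithm uses more factors and is not described in these slides.

```python
from dataclasses import dataclass


@dataclass
class RankedList:
    title: str
    domain: str
    views: int
    votes: int
    score: float = 0.0  # filled in by the batch job


def batch_compute_scores(lists, view_weight=1.0, vote_weight=5.0):
    """Batch step (e.g. daily): store one domain-independent score per list."""
    for item in lists:
        item.score = view_weight * item.views + vote_weight * item.votes


def most_interesting(lists, domain, limit=10):
    """Read step: filter by domain and sort by the stored score."""
    in_domain = [item for item in lists if item.domain == domain]
    return sorted(in_domain, key=lambda item: item.score, reverse=True)[:limit]
```

Only one number per list is stored, rather than one ranking per domain, and the per-domain ranking falls out of a cheap filter and sort at read time.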
10. Event Based Post-Processing
This is another form of Delayed Calculation. Some user actions in
Ranker require the system to perform complex or time-consuming
operations. For cases like these, Ranker uses an event-based
post-processing framework to perform the operations asynchronously.
This allows us to give the user a quick response and still perform the
time-consuming operations within a few seconds of the action. The only
disadvantage of this approach is that it makes it difficult to report
errors and failures to the user.
For example, when someone comments on a list, we need to notify the list
author by email. Although the list author should be notified
promptly, a delay of a few seconds is acceptable. Performing
this asynchronously lets us respond to the user quickly,
without making the user wait for the email to be sent.
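
The comment-notification flow could be sketched with a queue and a background worker thread, as below. This is a simplified, single-process illustration using Python's standard library; PostProcessor, notify_author, and add_comment are hypothetical names, and the email is replaced by appending to a list.

```python
import queue
import threading


class PostProcessor:
    """Runs queued event handlers on a background thread."""

    def __init__(self):
        self._events = queue.Queue()
        self.failures = []  # errors cannot reach the user; record them instead
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def publish(self, handler, payload):
        """Called from the request path; returns immediately."""
        self._events.put((handler, payload))

    def wait_until_idle(self):
        """Block until every queued event has been handled (useful in tests)."""
        self._events.join()

    def _run(self):
        while True:
            handler, payload = self._events.get()
            try:
                handler(payload)
            except Exception as exc:
                self.failures.append((payload, exc))
            finally:
                self._events.task_done()


sent_emails = []


def notify_author(comment):
    # Stand-in for sending a real email.
    sent_emails.append(
        "To %s: new comment on %s" % (comment["author"], comment["list"])
    )


def add_comment(processor, comment):
    """Save the comment (omitted), queue the email, respond right away."""
    processor.publish(notify_author, comment)
    return "comment saved"
```

Because the handler runs after the response has been returned, a failure can only be recorded (here in failures) rather than shown to the user, which mirrors the disadvantage noted above.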
11. Search Indexing
Ranker uses popular indexing tools like Lucene and Solr to index all
searchable data. Using a search index provides huge performance benefits
when performing text-based searches in the application.
Different strategies are used to add data to the index. Entities that are
frequently created or changed in the system are added to the index through an
automated SQL query, which runs every 5 minutes. Other entities, like the
data obtained from Freebase, are updated through a program that is triggered
manually.
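
One common way to implement such a periodic update is to select only the rows changed since the previous run. The sketch below assumes a hypothetical lists table with an integer updated_at column and uses a plain dict in place of the search index; scheduling the job every 5 minutes is left to an external scheduler.

```python
import sqlite3


def fetch_changed(conn, last_run):
    """Return rows modified after the previous indexing run."""
    return conn.execute(
        "SELECT id, title, updated_at FROM lists "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_run,),
    ).fetchall()


def run_incremental_index(conn, index, last_run):
    """Push changed rows into the index and return the new high-water mark."""
    for row_id, title, updated_at in fetch_changed(conn, last_run):
        index[row_id] = title  # stand-in for a search-index update
        last_run = max(last_run, updated_at)
    return last_run
```

The returned high-water mark is passed into the next run, so each run only touches rows created or changed since the last one.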
Solr allows us to search across a number of fields and also do so using
different weights for each type of field, without compromising on the speed of
the search.
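
As an illustration of weighted multi-field search, the sketch below builds query parameters for Solr's Extended DisMax parser, where the qf parameter lists the fields to search with an optional boost after "^". The field names and weights are hypothetical, and whether Ranker uses this particular parser is an assumption.

```python
from urllib.parse import urlencode


def build_weighted_query(text, field_weights, rows=10):
    """Build query-string parameters for a weighted multi-field search."""
    # qf looks like "title^3 description^1": field name, then its boost.
    qf = " ".join(
        "%s^%s" % (field, weight) for field, weight in field_weights.items()
    )
    return urlencode({"q": text, "defType": "edismax", "qf": qf, "rows": rows})
```

For instance, build_weighted_query("golf", {"title": 3, "description": 1}) asks for matches in both fields while counting a match in the title more heavily than one in the description.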
12. Conclusion
Any change made to improve the speed and performance
of the application involves certain trade-offs. In Ranker, we have made
sure that we only make changes once they have been analyzed well and we are ready
to handle all the side effects of the change. Changes often involve additional
effort in maintenance and environment setup. Some of them even require us to
acquire and maintain new servers, as in the case of search indexing and
background processes.
By choosing a variety of techniques to handle the different performance
problems in the application, Ranker has been able to deliver and to scale with its
traffic as the site becomes more popular.