These days it is not uncommon to have 100s of gigabytes of data that must be sliced and diced, then delivered fast and rendered quickly. Typically solutions involve lots of caching and expensive hardware with lots of memory. And, while those solutions certainly can work, they aren’t always cost effective, or feasible in certain environments (like in the cloud). This talk seeks to cover some strategies for caching large data sets without tons of expensive hardware, but through software and data design.
It’s common wisdom that serving your data from memory dramatically improves application performance and is a key to scaling. However, caching large datasets brings its own challenges: distribution, consistency, dealing with memory limits, and optimizing data loading just to name a few. This talk will go through some of the challenges, and solutions, to achieve fast data queries in the cloud. The audience will come away armed with a number of practical techniques for organizing and building caches for non-transactional datasets that can be applied to scale existing systems, or design new systems.