
Getafix is a data replication scheme that cuts memory costs in interactive analytics engines. It tracks the popularity of data segments across a cluster and runs a modified best-fit algorithm to find a memory-efficient allocation. The scheme builds on our optimal solution to the static version of the scheduling problem, which minimizes both makespan and replication factor. In practice, it can cut cluster memory costs by 2x without sacrificing query performance, saving firms as much as $10 million annually. It also automatically tiers popular data onto more powerful nodes in a heterogeneous cluster, which greatly simplifies system administration.
Website: https://medium.com/@iwonderhow/getafixusingpopularitytocurtailmemorycostsininteractiveanalyticsengines1471cf5dab84
Paper: https://dl.acm.org/citation.cfm?id=3190542&dl=ACM&coll=DL
Source Code: http://dprg.cs.uiuc.edu/downloads#l_getafix
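
To make the idea of popularity-driven best-fit placement concrete, here is a minimal Python sketch. It is not the Getafix implementation and is not taken from the paper or the source code linked above. It assumes that popularity is an integer count of expected queries per segment, that all nodes are identical, and that each node can absorb an equal share of the total load. The function names `best_fit_allocate` and `replication_factors` are hypothetical.

```python
import math
from collections import defaultdict


def best_fit_allocate(popularity, num_nodes):
    """Place segments on nodes with a best-fit heuristic (illustrative only).

    popularity: dict mapping segment name -> positive integer query load.
    Returns a dict mapping node index -> {segment: load served on that node}.
    """
    total = sum(popularity.values())
    # Assumption: every node gets the same load budget.
    capacity = math.ceil(total / num_nodes)
    remaining = [capacity] * num_nodes
    placement = defaultdict(dict)

    # Handle the most popular segments first; ties broken by name.
    ordered = sorted(popularity.items(), key=lambda kv: (-kv[1], kv[0]))
    for segment, load in ordered:
        while load > 0:
            fits = [n for n in range(num_nodes) if remaining[n] >= load]
            if fits:
                # Best fit: the tightest node that still holds the whole load.
                node = min(fits, key=lambda n: (remaining[n], n))
                share = load
            else:
                # No node fits: fill the emptiest node and replicate the
                # segment elsewhere to serve the rest of its load.
                node = max(range(num_nodes), key=lambda n: (remaining[n], -n))
                share = remaining[node]
            placement[node][segment] = placement[node].get(segment, 0) + share
            remaining[node] -= share
            load -= share
    return dict(placement)


def replication_factors(placement):
    """Count how many nodes hold a copy of each segment."""
    counts = defaultdict(int)
    for segments in placement.values():
        for segment in segments:
            counts[segment] += 1
    return dict(counts)


if __name__ == "__main__":
    demo = {"A": 6, "B": 3, "C": 2, "D": 1}
    plan = best_fit_allocate(demo, num_nodes=3)
    print(plan)
    print(replication_factors(plan))
```

In this toy run only the most popular segment, A, is replicated, and every node ends up serving the same load. That mirrors the goal stated above of keeping both makespan and replication factor low, though a real system would also need to handle changing popularity and heterogeneous nodes.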