My recent talk at the Birmingham R User Meeting (BRUM) was on Big Data in R. Different people have different definitions of big data. For this talk, my definition of big data is:
“Data collections big enough to require you to change the way you store and process them.” - Andy Pryke
I discuss the factors which can limit the size of data analysed using R and a variety of ways to address these, including moving data structures out of RAM and onto disk; using in database processing / analytics and harnessing the power of Hadoop to allow massively parallel R.
Clipping is a handy way to collect important slides you want to go back to later.