More data usually beats better algorithms
by Anand Rajaraman
• Big Data is the term for a collection of data sets so
large and complex that it becomes difficult to
process using on-hand database management
tools or traditional data processing applications.
• Popularization of big data - SNS, smartphones,
sensors, open data.
• Hardware - reduced costs, increased efficiency.
• Cloud computing.
3 Vs of Big Data
TDWI Research 2011 Big Data Analytic Report
Big Data technology
• Big Data technology stacks make it possible to
effectively capture, store, select and process data of
big volume, variety and velocity. These technologies
were invented by internet giants such as Yahoo,
Google and Facebook because they were the first to
deal with unstructured data on a large scale. Several
key terms and principles are the backbone of Big
Data technology:
• NoSQL Systems
• Key-Value Storages (including in-memory caches)
• Map Reduce
• Horizontal Scaling
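To make the Map Reduce item above more concrete, here is a minimal in-process sketch of the pattern as a word count. It is only an illustration in plain Ruby: the method names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) are hypothetical and no real Big Data framework is involved. In a real stack the map and reduce steps would run in parallel on many machines.

```ruby
# Minimal sketch of the MapReduce pattern (word count), in plain Ruby.
# All method names are illustrative assumptions, not a framework API.

# Map phase: turn one input record into a list of [key, value] pairs.
def map_words(line)
  line.downcase.scan(/[a-z0-9']+/).map { |word| [word, 1] }
end

# Shuffle phase: group every emitted value under its key.
def shuffle(pairs)
  pairs.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(key, value), groups|
    groups[key] << value
  end
end

# Reduce phase: collapse all values of one key into a single result.
def reduce_counts(_key, values)
  values.sum
end

# Driver: run map over every record, shuffle, then reduce per key.
def word_count(lines)
  pairs = lines.flat_map { |line| map_words(line) }
  shuffle(pairs).map { |key, values| [key, reduce_counts(key, values)] }.to_h
end

counts = word_count(["big data", "Big Data technology", "data"])
puts counts.inspect
```

Because each call to `map_words` looks at one record only and each call to `reduce_counts` looks at one key only, the work can be split across machines, which is what makes the pattern a good fit for horizontal scaling.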
Big data landscape
What is MongoDB
Definition & history
• MongoDB (from "humongous") is an open source document-oriented
database system developed and supported by 10gen. It is part of the
NoSQL family of database systems. Instead of storing data in tables as is
done in a "classical" relational database, MongoDB stores structured data
as JSON-like documents with dynamic schemas (MongoDB calls the
format BSON), making the integration of data in certain types of
applications easier and faster.
• 10gen began development of MongoDB in October 2007. The database
is used by eBay, MetLife, Telefónica, Foursquare, MTV Networks and the
UK Government. MongoDB is the most popular NoSQL database system.
• Binaries are available for Windows, Linux, OS X, and Solaris.
• 10gen started the project while building a platform as a service similar
to Windows Azure or Google App Engine. In 2009, MongoDB was open
sourced as a stand-alone product under the AGPL license.
• Since version 1.4, released in March 2010, MongoDB has been considered
production ready.
• The latest stable version, 2.4.6, was released on August 20, 2013.
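The idea of JSON-like documents with dynamic schemas, described in the definition above, can be sketched as follows. This is a rough illustration only: it uses plain Ruby hashes and the standard `json` library, not the MongoDB driver or real BSON, and the `posts` collection and its fields are made-up examples.

```ruby
require "json"

# Two documents in the same hypothetical "posts" collection.
# They share some fields but not all: no table schema forces them
# to have identical columns.
posts = [
  { "_id" => 1, "title" => "Hello", "tags" => ["intro", "mongodb"] },
  { "_id" => 2, "title" => "Second post",
    "author" => { "name" => "Ann" },          # nested document
    "comments" => [{ "text" => "Nice" }] }    # array of nested documents
]

# Select documents by a field that only some of them have.
with_author = posts.select { |doc| doc.key?("author") }

# Read a nested value; dig returns nil when the path is missing.
author_names = posts.map { |doc| doc.dig("author", "name") }

puts JSON.pretty_generate(posts)
puts with_author.map { |doc| doc["_id"] }.inspect
puts author_names.inspect
```

In a relational table the second post would need extra columns or extra tables for the author and the comments. Here each document simply carries the fields it needs.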
Practice : Creating a blog in 15 minutes
• $ rails new blog --skip-active-record
• $ cd blog
• $ vi Gemfile
o gem 'mongoid', github: 'mongoid/mongoid'
o gem 'bson_ext', '~> 1.8.6'
o gem 'execjs'
o gem 'therubyracer'
• $ bundle update
• $ rails g mongoid:config
• $ rails generate scaffold Post title:string content:text
• $ rails s