An analogy with water to help
understand Big Data.
Data seems to be everywhere. It has
emerged as the world’s vital resource for
decision making and can be called the elixir
of the digital world.
The sheer volume of data is colossal.
Every day, we create around 2.5
quintillion bytes of data.
Data, Data, Everywhere…..
Water is the elixir of life. It is the most
essential, pervasive, and vital element for
all known forms of life.
Water is everywhere:
above the Earth in the air and clouds,
on the surface of the Earth in rivers,
oceans, ice, plants, and living organisms,
and inside the Earth in the top few miles
of the ground.
Water, Water, Everywhere....
Data is generated from anything and everything.
Some sources feed data unceasingly in real
time: sensors, posts to social media sites,
digital pictures and videos, e-Commerce
transactions, cell phone GPS signals, just to
name a few. However, only a negligible
percentage of that data is processed into
meaningful information.
There is a shortage of relevant data when
and where we need it, in the reliable form
that we need.
Some 97% of the water on earth is salt water.
Two percent of the water on earth is
glacier ice at the North and South Poles.
Less than 1% of all the water on earth is
fresh water that we can actually use.
The majority of the population struggles
to get usable water.
Big data comes in two distinct forms:
DATA AT REST and DATA IN MOTION.
Essentially there are two types: static
water (lentic) and flowing water (lotic).
The characteristics of water can be
determined based on whether the water is at
rest (hydrostatics) or in motion.
Data at Rest is the term used for persisted data.
This refers to data that has been collected from various
sources and is then analyzed after the event occurs.
The point where the data is analyzed and the point where
action is taken on it occur at two separate times.
Similar to a lentic system, the typical characteristics of data
at rest are VOLUME and VARIETY.
VOLUME is the characteristic of data at rest that is most
associated with Big Data.
The second characteristic of data at rest is VARIETY,
meaning the data represents a number of data domains and
a number of data types, such as structured data or
unstructured data like text, images, video, or any other raw data.
Lentic systems (meaning ‘to make
calm’) are still or slow-moving
water bodies like lakes and ponds.
The characteristics of a
water body are highly
dependent on the size
of the water body and
on the climatic
conditions.
Typically they are
characterized by water
times and fluxes.
Data in Motion is the term used for data as it is in transit.
Data in motion is processed and analyzed in real time, or
near-real time, and has to be handled in a very different way
than data at rest. Data in motion tends to resemble event-
processing architectures, and focuses on real-time or
operational intelligence applications.
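The contrast between the two forms can be illustrated with a minimal sketch. It assumes a hypothetical list of sensor readings and uses a simple average as the "analysis"; the function names are illustrative only and do not come from any particular big data product.

```python
from typing import Iterable, Iterator, Sequence


def analyze_at_rest(stored_readings: Sequence[float]) -> float:
    """Data at rest: all readings are already persisted, so the
    analysis runs once, after the events have occurred."""
    return sum(stored_readings) / len(stored_readings)


def analyze_in_motion(reading_stream: Iterable[float]) -> Iterator[float]:
    """Data in motion: the result is updated as each reading arrives,
    so an answer is available in (near) real time."""
    count = 0
    total = 0.0
    for reading in reading_stream:
        count += 1
        total += reading
        yield total / count  # running average after this reading


readings = [20.0, 22.0, 21.0, 25.0]  # hypothetical sensor values

print(analyze_at_rest(readings))                # 22.0
print(list(analyze_in_motion(iter(readings))))  # [20.0, 21.0, 21.0, 22.0]
```

In the first function, collection and analysis happen at two separate times. In the second, each arriving value immediately produces an updated result, which is the event-processing style described above.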
As observed in a lotic system, the typical characteristics of data in
motion are VELOCITY and VARIABILITY.
The VELOCITY is the rate of flow at which the data is created,
stored, analyzed, and visualized.
Big Data VELOCITY means a large quantity of data is being
processed in a short amount of time.
The second characteristic for data in motion is VARIABILITY,
which refers to any change in data over time, including the
flow rate, the format, or the composition.
Lotic systems (meaning ‘to wash’)
are water in motion.
They are running
water, where the
entire body of water
moves in a definite
direction.
These may comprise
rivers and springs.
Big Data is typically measured by VOLUME (key attribute of
data at rest) and VELOCITY (key attribute of data in motion).
VOLUME: As mentioned earlier, for big data, volume
rules, and so do the units of measurement.
We have kilobyte, megabyte, gigabyte, terabyte, petabyte,
exabyte, zettabyte, and yottabyte
(1,000,000,000,000,000,000,000,000 bytes) to express the
volume.
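To make the scale of these units concrete, here is a small sketch that converts a raw byte count into the largest fitting unit. It assumes decimal (SI) units, where each step is a factor of 1,000; the function name humanize_bytes is hypothetical.

```python
# Decimal (SI) byte units; each is 1,000 times larger than the previous one.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def humanize_bytes(num_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value >= 1."""
    index = 0
    scale = 1
    # Integer arithmetic avoids rounding drift for very large counts.
    while num_bytes >= scale * 1000 and index < len(UNITS) - 1:
        scale *= 1000
        index += 1
    return f"{num_bytes / scale:g} {UNITS[index]}"


daily_bytes = 2_500_000_000_000_000_000  # 2.5 quintillion bytes per day

print(humanize_bytes(daily_bytes))  # 2.5 EB
print(humanize_bytes(10**24))       # 1 YB
```

By this measure, the 2.5 quintillion bytes created every day amount to 2.5 exabytes.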
VELOCITY: ‘Data in motion’ is widely used to represent the
speed at which large volumes of data are processed. Big data
can hit fast. Just imagine dealing with petabytes of data
transactions per second. Big data solutions should handle
and process such rapidly arriving data.
Water is measured
either as water at rest
or as water in motion.
Water at rest is
measured in units of
Water in motion is
measured in units of