– “large data sets so big that commonly-used software tools are unable to capture,
curate, manage, and process the data within a tolerable elapsed time.”
Hadoop Dominates the Big Data Market
– Used widely by some of the world's largest websites,
such as Facebook, eBay, Amazon and Yahoo
– Moving into the enterprise
– Created by Doug Cutting and Mike Cafarella, with much of its early development at Yahoo!
What is Big Data?
Characteristics of Big Data
Big Data is facilitated by Data Science
Data Science is facilitated by Machine Learning
Machine Learning is a confluence of disciplines: computer science,
mathematical statistics, probability theory, visualization, etc.
What is the “New” Part of Big Data?
“Big” is new: there is more data to manage than ever before
Traditional data content is now coupled with internal and external sources of
unstructured data via social media
New forms of analysis such as sentiment and credibility analysis
Recall the Internet bubble of circa 2000. Will it occur again?
A bubble may occur, but not because of Big Data
Applications for Big Data
Fraud and Risk
“Big Data is the definitive source of
competitive advantage across all
industries. For those organizations
that understand and embrace the new
reality of Big Data, the possibilities
for new innovation, improved agility,
and increased profitability are nearly
Source: Wikibon 2012