Why Big Data is Really About Small Data: The Big Data Paradox
Judith Hurwitz
President & CEO, Hurwitz & Associates
This presentation explains how big data is transforming the way data is managed and provides a context on why it is essential to get to the data that matters.

Published in: Technology

Why Big Data is Really about Small Data

  1. Why Big Data is Really About Small Data: The Big Data Paradox. Judith Hurwitz, President & CEO, Hurwitz & Associates
  2. Agenda § What is so big about Big Data? § What is a data scientist? § Data at rest, data in motion § Is Big Analytics more important? § Rethinking data modeling in a big data world § A couple of examples § What you should think about § Questions?
  3. Meet the Speaker § Judith Hurwitz § President and CEO of Hurwitz & Associates, Inc., a strategy consulting and research firm focused on distributed computing technologies. A pioneer in anticipating technology innovation and adoption, Judith advocates for a pragmatic adoption of an architectural and business approach to the emerging market for cloud computing, service orientation, and service management. She has served as a trusted advisor to many industry leaders over the years. Judith has helped these companies make the transition to a new business model focused on the business value of emerging platforms. Judith is an accomplished author and most recently co-author of Big Data for Dummies.
  4. Our Team’s Latest Book
  5. What is so big about big data? § Definition of Big Data: Volume – How much data; Variety – Various types of data (structured, unstructured); Velocity – Speed that data moves from one location to another; Veracity – Accuracy (Do the results of a big data analysis make sense?) § Big Data is not new § So, why now? § Impacting the way you collect, store, manage, analyze, and visualize data
  6. What is the Purpose of Big Data? § Gather, store, manage, and manipulate vast amounts of data at the right speed, at the right time to get the right results § Gather enough data so that you can find patterns § Put those patterns to work to gain insights in context
  7. Examples of Big Data § Analyze multiple data sources to detect and protect against insider trading, money laundering, credit card theft § Monitoring market feeds § Managing risk models § Log files § Spatial data from sensors § Medical device data – data from sensors connected to medical equipment § GPS data § Unstructured data in emails, text messages, call center notes
  8. Why do we need to think about Big Data? § What big data means to business § More data for better decision making § Integration of data across business units and silos § Detecting risks in real time § Focus on putting information in context with supporting business decisions § Improving the customer experience by leveraging customer feedback from many different sources
  9. From Big to Small • Big data is only the first step in the journey • Big data requires that you reduce the amount of data to a subset so that your organization can take a deeper look • Once this subset of data is cleansed and verified, it can help analyze, predict, and prepare to address the future
  10. The Role of a Data Scientist? § Combining computing science, math, statistics, and business (domain) knowledge § Looking for answers when you don’t know the question you want to ask § Asking new types of questions: finding nuggets of actionable information in huge volumes of data § Making analytics consumable: real-time analysis to help the business take the right action at the right time § Predictive analytics: What is the next best action?
  11. Representation Technology Stack [Diagram: Interfaces and feeds from/to internal applications; Interfaces and feeds from/to the Internet; Big Data Tech Stack: Big Data Applications; Reporting & Visualization; Analytics (Traditional and Advanced); Analytical Data Warehouses and Data Marts; “Organizing” Databases and Tools; Operational Databases (Structured, Unstructured, Semi-structured); Security Infrastructure; Redundant Physical Infrastructure]
  12. Where Most of This Began [Diagram: Data Warehouse, Data Mart, Transactional System (Production Data)]
  13. Then It Got “Better” [Diagram: two Data Warehouses, two Data Marts, two Transactional Systems (Production Data)]
  14. Then It Got “More Better” [Diagram: two Operational Systems, Data Warehouse, three LOB Data Marts, Transactional System(s)]
  15. And Better Still [Diagram: two Operational Systems, Staging Area, Data Warehouse, three LOB Data Marts, Transactional System(s)]
  16. Oops. Data at rest vs. data in motion [Diagram: two Operational Systems, Staging Area, “????”, Transactional System(s)]
  17. Data At Rest, Data In Motion § Data in motion is no longer a bad thing § Trend is combining “traditional” with streaming § Instant analysis isn’t fast enough § It’s all about real-time § What data to keep?
  18. Is Big Analytics More Important? § In a word: YES § We are looking for answers to questions we haven’t asked yet § Patterns, patterns, patterns § But… § Current generation analytics engines can be overwhelmed § Results may be too difficult to understand even with visualization § You may be looking in the wrong place or at the wrong things
  19. Is Hadoop the New EDW? § No one type of Big Data platform is optimal for all requirements § Hadoop is changing the economics of storing and analyzing large volumes and varieties of data § Results of Hadoop analytics need to be understood in context § Increasing importance of hybrid big data architectures – combine Hadoop with your systems of record § Hadoop for specific roles § Exploratory data-science sandboxes § Staging platform for unstructured data
  20. Rethinking Data Modeling § Traditional data models assume: § Relational data § Clean data § A few clearly identifiable data sources § Next generation data model – the rules have changed § Some relational data, some NoSQL § Some of the data is dirty § Lots of data sources coming from many different places § Some of the data you will keep and some you will not § Design your data model to account for new world of large and varied data sources
  21. Big Data Use Cases § “Voice of the Customer”, 360-degree view of customer § Strengthen brand and increase customer loyalty § Improve operational analytics § Target and reduce fraud and improve security § Use sensors to provide real-time information about rivers and oceans to predict impact of environmental changes
  22. Correlating Varied Data Sources in Finance § Financial services is highly competitive and highly regulated. Financial services needs to create innovative customer experience while protecting IP. Companies need to anticipate the next best action. § What type of data is needed? § Transaction data § Threat data § Log data § Customer survey data § Customer support data § Customer social media data § Partner data § News and event data, … § Need to be able to correlate all types of structured and unstructured data to predict the future and provide opportunities for growth and expansion
  23. Advanced Security Analytics to Predict and Protect § Government agency needed more visibility into all system traffic § Concern about the unknown – needed to look for and protect from malicious activity § Used advanced security analytics to correlate data across seemingly unrelated events § Real-time § Analyze a variety of data sources: emails, documents, social media data, business process data, DNS transactions § Analyze massive amounts of structured and unstructured data
  24. Matching Capabilities to Business Problems § Text Analytics § Next Best Action § Data in Motion § Adding business process and rules § Anomaly Detection § Data Visualization § Correlation between customer service, comments in the market, customer management § Putting a lot of data types together to determine best actions § Detecting Fraud
  25. How Do You Manage Big Data? § Big data is not clean – it is massive and much is unstructured § Resulting patterns from big data analytics need to be culled, cleaned, and matched to enterprise data § Culled data now must be analyzed in context with your systems of record § Apply data visualization and best practices to determine how to apply data to actions
  26. You need to think about the following: § Where are the sources of the data that could be important? § How often do you need access to particular types of data? § How much data do you need to keep, and for how long? § Can you trust the data and its sources? § Use Big Data analytics to overcome conventional wisdom and conventional thinking. § If you already know the questions to ask, you aren’t moving forward.
  27. Q&A § Thank you! § Contact info: § Judith Hurwitz: judith.hurwitz@hurwitz.com
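
The "From Big to Small" slide describes three steps: reduce the data to a subset, cleanse it, and verify it before analysis. A minimal Python sketch of that idea follows. Everything in it is an assumption made for illustration and is not taken from the deck: the `Reading` record, its field names, the sample sensor IDs, and the rule that a "verified" record is one from a trusted source.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Reading:
    """Hypothetical sensor record; the field names are illustrative only."""
    sensor_id: str
    value: Optional[float]
    source_trusted: bool


def reduce_to_subset(readings: Iterable[Reading], sensor_ids: set[str]) -> Iterator[Reading]:
    """Step 1: keep only the records relevant to the question at hand."""
    return (r for r in readings if r.sensor_id in sensor_ids)


def cleanse(readings: Iterable[Reading]) -> Iterator[Reading]:
    """Step 2: drop records whose value is missing."""
    return (r for r in readings if r.value is not None)


def verify(readings: Iterable[Reading]) -> Iterator[Reading]:
    """Step 3: keep only records from sources marked as trusted (an assumed rule)."""
    return (r for r in readings if r.source_trusted)


def small_data(readings: Iterable[Reading], sensor_ids: set[str]) -> list[Reading]:
    """Chain the three steps: subset, then cleanse, then verify."""
    return list(verify(cleanse(reduce_to_subset(readings, sensor_ids))))


# Made-up sample data standing in for a much larger feed.
raw = [
    Reading("river-1", 2.4, True),
    Reading("river-1", None, True),   # missing value, removed by cleanse
    Reading("ocean-7", 9.1, True),    # outside the subset of interest
    Reading("river-2", 3.0, False),   # untrusted source, removed by verify
    Reading("river-2", 2.8, True),
]

subset = small_data(raw, {"river-1", "river-2"})
print(len(raw), "->", len(subset))
```

Each step is a lazy generator, so in principle the same chain could run over a stream of records rather than a list in memory. What counts as relevant, clean, or trustworthy depends on the business question and would replace the simple predicates used here.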
