This presentation explains how big data is transforming the way data is managed and provides a context on why it is essential to get to the data that matters.
1. Why Big Data is Really About Small Data: The Big Data Paradox
Judith Hurwitz
President & CEO, Hurwitz & Associates
2. Agenda
§ What is so big about Big Data?
§ What is a data scientist
§ Data at rest, data in motion
§ Is Big Analytics more important?
§ Rethinking data modeling in a big data world
§ A couple of examples
§ What you should think about
§ Questions?
3. Meet the Speaker
§ Judith Hurwitz
§ President and CEO of Hurwitz & Associates, Inc., a strategy consulting and research firm
focused on distributed computing technologies. A pioneer in anticipating technology
innovation and adoption, Judith advocates for a pragmatic adoption of an architectural and
business approach to the emerging market for cloud computing, service orientation, and
service management. She has served as a trusted advisor to many industry leaders over the
years. Judith has helped these companies make the transition to a new business model
focused on the business value of emerging platforms. Judith is an accomplished author and
most recently co-author of Big Data for Dummies.
5. What is so big about big data?
§ Definition of Big Data
§ Volume – How much data
§ Variety – Various types of data (structured, unstructured)
§ Velocity – Speed that data moves from one location to another
§ Veracity – Accuracy (Do the results of a big data analysis make
sense?)
§ Big Data is not new
§ So, why now?
§ Impacting the way you collect, store, manage, analyze,
and visualize data
6. What is the Purpose of Big Data?
§ Gather, store, manage, and manipulate
vast amounts of data at the right speed, at
the right time to get the right results
§ Gather enough data so that you can find
patterns
§ Put those patterns to work to gain insights
in context
7. Examples of Big Data
§ Analyze multiple data sources to detect and protect
against insider trading, money laundering, credit card
theft
§ Monitoring market feeds
§ Managing risk models
§ Log files
§ Spatial data from sensors
§ Medical device data – data from sensors connected to
medical equipment
§ GPS data
§ Unstructured data in emails, text messages, call center
notes
8. Why do we need to think about Big Data?
§ What big data means
to business
§ More data for better
decision making
§ Integration of data
across business units
and silos
§ Detecting risks in real
time
§ Focus on putting
information in context
to support
business decisions
§ Improving the
customer experience
by leveraging
customer feedback
from many different
sources
9. From Big to Small
• Big data is only the first
step in the journey
• Big data requires that
you reduce the amount
of data to a subset so
that your organization
can take a deeper look
• Once this subset of
data is cleansed and
verified, it can help
analyze, predict, and
prepare to address the
future
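The reduction step above can be sketched in a few lines of Python. This is a minimal illustration, not a method from the presentation: the records, the field names (customer_id, channel, amount), and the helper functions are all hypothetical. It filters a raw feed down to one channel, drops rows that cannot be verified, and removes exact duplicates.

```python
# Hypothetical raw feed; field names and values are illustrative assumptions.
raw_events = [
    {"customer_id": "c1", "channel": "email", "amount": "120.50"},
    {"customer_id": "c2", "channel": "web", "amount": "n/a"},
    {"customer_id": None, "channel": "email", "amount": "15"},
    {"customer_id": "c3", "channel": "email", "amount": " 42 "},
    {"customer_id": "c1", "channel": "email", "amount": "120.50"},  # duplicate
]


def parse_amount(value):
    """Return the amount as a float, or None if it cannot be parsed."""
    try:
        return float(str(value).strip())
    except ValueError:
        return None


def reduce_and_cleanse(events, channel):
    """Keep one channel, drop unverifiable rows, and remove exact duplicates."""
    seen = set()
    subset = []
    for event in events:
        if event["channel"] != channel:
            continue  # reduce: outside the subset of interest
        amount = parse_amount(event["amount"])
        if event["customer_id"] is None or amount is None:
            continue  # cleanse: missing key or unparseable value
        key = (event["customer_id"], amount)
        if key in seen:
            continue  # cleanse: exact duplicate
        seen.add(key)
        subset.append({"customer_id": event["customer_id"], "amount": amount})
    return subset


small_data = reduce_and_cleanse(raw_events, channel="email")
```

Here five raw records shrink to two verified ones, which is the "small data" that later analysis would work on.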
10. The Role of a Data Scientist?
§ Combining computing science, math, statistics, and
business (domain) knowledge
§ Looking for answers when you don’t know the question
you want to ask
§ Asking new types of questions: finding nuggets of
actionable information in huge volumes of data
§ Making analytics consumable: real-time analysis to help
the business take the right action at the right time
§ Predictive analytics: What is the next best action?
12. Where Most of This Began
[Diagram: Data Warehouse, Data Mart, Transactional System (Production Data)]
13. Then It Got “Better”
[Diagram: two Data Warehouses, two Data Marts, two Transactional Systems (Production Data)]
14. Then It Got “More Better”
[Diagram: two Operational Systems, Transactional System(s), Data Warehouse, three LOB Data Marts]
15. And Better Still
[Diagram: two Operational Systems, Transactional System(s), Staging Area, Data Warehouse, three LOB Data Marts]
16. Oops. Data at rest vs. data in motion
[Diagram: two Operational Systems, Transactional System(s), Staging Area, ????]
17. Data At Rest, Data In Motion
§ Data in motion is no longer a bad thing
§ Trend is combining “traditional” with
streaming
§ Instant analysis isn’t fast enough
§ It’s all about real-time
§ What data to keep?
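To make the contrast concrete, here is a minimal Python sketch of analyzing data in motion. It is an illustration under stated assumptions, not something taken from the slides: the SlidingWindowAverage class, the alerts function, the window size, the threshold, and the sensor readings are all hypothetical. Each reading is evaluated as it arrives, and only the most recent few readings are kept, which is one possible answer to "what data to keep?".

```python
from collections import deque


class SlidingWindowAverage:
    """Running mean over the most recent `size` readings of a stream."""

    def __init__(self, size):
        self.window = deque(maxlen=size)
        self.total = 0.0

    def update(self, reading):
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]  # oldest value is about to be evicted
        self.window.append(reading)
        self.total += reading
        return self.total / len(self.window)


def alerts(stream, size, threshold):
    """Yield (index, average) whenever the windowed average exceeds threshold."""
    monitor = SlidingWindowAverage(size)
    for index, reading in enumerate(stream):
        average = monitor.update(reading)
        if average > threshold:
            yield index, average


# Hypothetical sensor feed; in practice this would be an unbounded stream.
sensor_feed = [10, 12, 11, 30, 35, 40, 12, 11]
flagged = list(alerts(sensor_feed, size=3, threshold=25))
```

A batch job over data at rest would compute the same averages only after the whole feed had been stored; the streaming version raises its first alert at the fifth reading and never holds more than three values in memory.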
18. Is Big Analytics More Important?
§ In a word
YES
§ We are looking for answers to questions we haven’t
asked yet
§ Patterns, patterns, patterns
§ But…
§ Current generation analytics engines can be overwhelmed
§ Results may be too difficult to understand even with visualization
§ You may be looking in the wrong place or at the wrong things
19. Is Hadoop the New EDW?
§ No one type of Big Data platform is optimal for all
requirements
§ Hadoop is changing the economics of storing and
analyzing large volumes and variety of data
§ Results of Hadoop analytics need to be understood in
context
§ Increasing importance of hybrid big data architectures –
combine Hadoop with your systems of record
§ Hadoop for specific roles
§ Exploratory data-science sandboxes
§ Staging platform for unstructured data
20. Rethinking Data Modeling
§ Traditional data models assume:
§ Relational data
§ Clean data
§ A few clearly identifiable data sources
§ Next generation data model – the rules have changed
§ Some relational data, some NoSQL
§ Some of the data is dirty
§ Lots of data sources coming from many different places
§ Some of the data you will keep and some you will not
§ Design your data model to account for new world of
large and varied data sources
21. Big Data Use Cases
§ “Voice of the Customer”, 360-degree view of customer
§ Strengthen brand and increase customer loyalty
§ Improve operational analytics
§ Target and reduce fraud and improve security
§ Use sensors to provide real-time information about rivers
and oceans to predict impact of environmental changes
22. Correlating Varied Data Sources in Finance
§ Financial services is highly competitive and highly regulated.
Financial services firms need to create innovative customer experiences
while protecting IP. Companies need to anticipate the next best
action.
§ What type of data is needed?
§ Transaction data
§ Threat data
§ Log data
§ Customer survey data
§ Customer support data
§ Customer social media data
§ Partner data
§ News and event data, ……
§ Need to be able to correlate all types of structured and unstructured
data to predict the future and provide opportunities for growth and
expansion
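As a rough sketch of what correlating these sources can look like, the Python below combines structured transaction data with unstructured support and social media text into one view per customer. Everything in it is an assumption made for illustration: the sample records, the field names, the shared customer_id key, and the build_customer_view function. Real sources rarely share a clean key, so treat this as the simplest possible case.

```python
from collections import defaultdict

# Hypothetical sample records; sources and field names are illustrative.
transactions = [
    {"customer_id": "c1", "amount": 250.0},
    {"customer_id": "c2", "amount": 40.0},
    {"customer_id": "c1", "amount": 75.0},
]
support_notes = [
    {"customer_id": "c1", "text": "Card was declined twice, very frustrated"},
    {"customer_id": "c3", "text": "Asked about new account options"},
]
social_posts = [
    {"customer_id": "c1", "text": "Thinking of switching banks"},
    {"customer_id": "c2", "text": "Great mobile app"},
]


def build_customer_view(transactions, support_notes, social_posts):
    """Group structured and unstructured records under one customer key."""
    view = defaultdict(
        lambda: {"tx_count": 0, "total_spend": 0.0, "support": [], "social": []}
    )
    for tx in transactions:
        view[tx["customer_id"]]["tx_count"] += 1
        view[tx["customer_id"]]["total_spend"] += tx["amount"]
    for note in support_notes:
        view[note["customer_id"]]["support"].append(note["text"])
    for post in social_posts:
        view[post["customer_id"]]["social"].append(post["text"])
    return dict(view)


customer_view = build_customer_view(transactions, support_notes, social_posts)

# Customers who show up in every source are candidates for a closer look.
in_all_sources = sorted(
    customer_id
    for customer_id, record in customer_view.items()
    if record["tx_count"] and record["support"] and record["social"]
)
```

In this sample only customer c1 appears in all three sources, with spending, a support complaint, and a social post side by side, which is the kind of combined context a next-best-action decision would draw on.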
23. Advanced Security Analytics to Predict and Protect
§ Government agency needed more visibility into all
system traffic
§ Concern about the unknown – needed to look for and
protect from malicious activity
§ Used advanced security analytics to correlate data
across seemingly unrelated events
§ Real-time
§ Analyze a variety of data sources: emails, documents, social
media data, business process data, DNS transactions
§ Analyze massive amounts of structured and unstructured
data
24. Matching Capabilities to Business Problems
§ Text Analytics
§ Next Best Action
§ Data in Motion
§ Adding business
process and rules
§ Anomaly Detection
§ Data Visualization
§ Correlation between
customer service,
comments in the
market, customer
management
§ Putting a lot of data
types together to
determine best
actions
§ Detecting Fraud
25. How Do You Manage Big Data?
§ Big data is not clean –
it is massive and
much is unstructured
§ Resulting patterns
from big data
analytics need to be
culled, cleaned and
matched to enterprise
data
§ Culled data now must
be analyzed in
context with your
systems of record
§ Apply data
visualization and best
practices to determine
how to apply data to
actions
26. You need to think about the following:
§ Where are the sources of
the data that could be
important?
§ How often do you need
access to particular types
of data?
§ How long and how much
data do you need to
keep?
§ Can you trust the data
and its sources?
§ Use Big Data analytics to
overcome conventional
wisdom and conventional
thinking.
§ If you already know the
questions to ask, you
aren’t moving forward.