The document discusses the transition from systems of record (SoR) to systems of intelligence (SoI) in enterprise applications over the next 10-15 years. Some key points:
- SoI build on SoR but are the biggest change in enterprise applications in five decades, and they depend on data quality: better data sourcing, preparation, analysis, and modeling.
- Speed of integrating analytics with operational apps is critical for SoI, but doesn't necessarily require streaming-only analytics.
- SoI require a new technology stack and enterprises must choose platforms balancing optimized functionality with simplicity.
- The journey to SoI depends on combinations of enterprise capabilities and technology maturity.
2. AGENDA: Enterprises Must Manage Journey From Systems of Record (SoR) to Systems of Intelligence (SoI) By Balancing Skills And Tech Maturity
• SoI build on SoR but are biggest change in enterprise apps in five decades
• Enterprises need deep focus on sourcing, preparing, analyzing, modeling data: SoI pivot on data quality, fail otherwise
• Speed of integrating increasingly sophisticated analytics with operational apps ever more critical, but doesn't *necessarily* require streaming-only analytics
• SoI require new stack: enterprises must choose their platform by balancing need for optimized functionality vs. need for simplicity
3. Systems of Record Automate Business Processes: Five Decades From Airline Reservations to ERP and Data Warehouses
Improve business process efficiency
• On time-shared mainframes, GUI client-server, or in cloud via SaaS: SoRs automate business processes
• Standardized processes and business transactions enable performance reporting and business intelligence
• Limitations of historical performance reporting: like steering a ship while looking backwards at its wake
4. Systems of Intelligence Build on Systems of Record
Touch points (diagram): Retail Sales Associate, Consumer Mobile, Retail, Call Center, TV Ads, E-Mail, Social Media, eCommerce
• Modern Systems of Intelligence optimize loyalty by anticipating consumer “conversation”
• Omni channel: comprehensive, real-time integration of all touch points, channels via common data
• Intelligence: predictive and real-time to influence consumer interaction
• Loyalty and profitability are higher
Build on SoR: still run core processes, varying degrees of “real-time” integration to SoI
• Modern SoR: are cloud, mobile, social, most critically: supports fast data integration with other apps
• Data from SoR can be approximate: can be modestly stale if apps can’t support RT query from consumer-facing apps; SoI can then run in cloud more easily
• Omni channel: still needs access to pricing info, inventory, billing process
• Intelligence: still needs master customer data and transaction history
2 New Elements Based Solely on Forward-Looking Analytics & Data: Machine Learning Predictive Model; Data Platform
5. Systems of Intelligence Will Cross Functions And Industries
Transformation from SoR to SoI
SoI will progressively remake existing application categories via use of machine learning
HR Talent Management example
SoR: track recruit-to-retire processes including source, attract, develop, motivate, retain…
SoI: for retention, predict who is at risk for proactive intervention
6. Systems of Intelligence Are Prototype for IoT: Example of Systems Management Becoming Autonomic
Systems Management | Traditional Management | Autonomic Service Management
Objects | Servers, storage, networks, databases, web servers | Physical infrastructure and services are like IoT “devices”
Analytics | Real-time dashboard of | Predictive model of behavior from real-time streaming data
Alerts | Pre-set performance thresholds | Anomalous behavior
Action | Send alerts to administrators | Suggest or auto remediate behavior
Analytics = “Lights out” via real-time, predictive + prescriptive Auto Pilot
Real-time Dashboard = Backward looking
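The contrast between pre-set thresholds and learned "normal" behavior can be sketched in a few lines. This is a minimal illustration, not how any particular product works: the `BaselineMonitor` class, the rolling z-score rule, the sample latencies, and the `remediate` hook are all assumptions made for the example.

```python
from collections import deque
from math import sqrt


class BaselineMonitor:
    """Learns a rolling baseline for one metric and flags outliers (illustrative)."""

    def __init__(self, window=30, threshold=3.0, min_samples=5):
        self.samples = deque(maxlen=window)
        self.threshold = threshold      # how many standard deviations count as anomalous
        self.min_samples = min_samples  # readings needed before judging anything

    def observe(self, value):
        """Return True if value is anomalous relative to the learned baseline."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            mean = sum(self.samples) / len(self.samples)
            var = sum((x - mean) ** 2 for x in self.samples) / len(self.samples)
            std = sqrt(var)
            if std > 0:
                anomalous = abs(value - mean) / std > self.threshold
            else:
                # Perfectly flat baseline: any change is a deviation.
                anomalous = value != mean
        # Only normal readings update the baseline, so a single spike
        # does not shift what the monitor considers "normal".
        if not anomalous:
            self.samples.append(value)
        return anomalous


def remediate(metric, value):
    """Hypothetical action hook; a real system might restart or scale a service."""
    return f"suggest remediation for {metric} (observed {value})"


monitor = BaselineMonitor()
latencies_ms = [100, 102, 98, 101, 99, 100, 103, 97, 450, 101]
flags = [monitor.observe(v) for v in latencies_ms]
actions = [remediate("latency_ms", v) for v, f in zip(latencies_ms, flags) if f]
```

No threshold such as "alert above 200 ms" is configured anywhere; the 450 ms reading is flagged only because it departs from what the monitor has seen so far.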
7. The Journey To Systems Of Intelligence: Determined By Combination of Enterprise Capabilities, Tech Maturity
Chart axes: Time; Technology Maturity, Enterprise Capabilities
Applications:
• Adjunct Data Warehouse: Data Lake with some production analytics offload from Data Warehouse
• Customer 360: Enough internal and external customer data in a pipeline to start predictive modeling
• Real-time loyalty omni-channel multi-touchpoint: Predictive model learns from and anticipates consumer in near real-time
• Smart Grid: Continuously updated prediction of energy supply, demand tunes end-point consumption
• Autonomic systems management: System learns “normal” behavior of apps and infrastructure and flags or fixes anomalies
9. Collecting *Usable* Data About Customer Interactions Requires New Sourcing, Prep’ing, Analytic Techniques
10. Journey to SoI Requires Skills, Technology to Start Iteratively Prep’ing Data and Building Predictive Models
SoR: Traditional Data Warehouse Challenge
• Time-to-analysis bottlenecked by need to decide questions before building/designing DW
• Design of DW limits available data and then development cycle for ETL severely limits ability to ask new questions
SoI Analytics: Data Lake = Training Wheels
• Time-to-analysis becomes short enough to be iterative by providing self-service access to all data before building the analytic pipeline
• Analysis open to interoperation with any data processing engine that writes to HDFS
• New production pipelines can stay on the production Hadoop cluster or go back to DW
Diagram labels (DW): ETL + Database Design (Bottleneck); Mostly Hardwired Questions; Available Data
Diagram labels (data lake): HDFS; Self-service iterative and incremental database design; Data provisioning; New Questions
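The difference between deciding questions up front and asking them iteratively can be illustrated with a small schema-on-read sketch. The sample events, field names, and helper functions below are hypothetical; the point is only that raw records are kept whole, so a second question can use fields the first one ignored without redesigning a schema or an ETL job.

```python
import json
from collections import Counter

# Raw events kept as-is, e.g. as JSON lines in a data lake (hypothetical sample data).
RAW_EVENTS = """\
{"customer": "c1", "channel": "mobile", "action": "view", "sku": "A"}
{"customer": "c2", "channel": "web", "action": "purchase", "sku": "B", "amount": 40}
{"customer": "c1", "channel": "mobile", "action": "purchase", "sku": "A", "amount": 25}
{"customer": "c3", "channel": "call_center", "action": "complaint"}
"""


def read_events(raw):
    """Parse the raw lines only when a question is asked (schema-on-read)."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def revenue_by_channel(events):
    """First question: purchase revenue per channel."""
    totals = Counter()
    for event in events:
        if event["action"] == "purchase":
            totals[event["channel"]] += event.get("amount", 0)
    return dict(totals)


def actions_by_customer(events):
    """A later question that relies on fields the first question ignored."""
    journeys = {}
    for event in events:
        journeys.setdefault(event["customer"], []).append(event["action"])
    return journeys


events = read_events(RAW_EVENTS)
revenue = revenue_by_channel(events)
journeys = actions_by_customer(events)
```

Once a question proves valuable, the corresponding logic can be hardened into a production pipeline, which is where the slide's choice between the Hadoop cluster and the DW comes in.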
11. Systems of Intelligence Always Need More Sources of Customer Data – Including Externally Syndicated
The internal customer master is no longer the last word about the customer
Source: Oracle BlueKai
12. Preparing Hundreds of Raw Data Sources for Analytics Often Requires Techniques as Advanced as Machine Learning on the Data Sources Themselves
Raw data from one source: logs
Prep’ing hundreds of sources requires SoI technology such as machine learning to inform data scientists’ decisions
Source: Tamr
14. Real-Time is a Matter of Degree: Choices Depend on Usage Scenario, Accessibility of Applications That Need to be Integrated – Including Legacy and Modern Systems of Record
Range of “Real-Time” Interactions
• REAL RT: high frequency algorithmic securities trading on one end of the spectrum
• Updates every couple hours: inventory levels accessed by ecommerce, mobile apps at other end of spectrum
Modern SoR makes it easier to get to fastest part of spectrum
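One way to make "real-time is a matter of degree" concrete is to let each usage scenario state how stale a system-of-record value it can tolerate. The sketch below assumes a hypothetical `FreshnessCache` wrapper and a stand-in `fetch_inventory` lookup; the staleness budgets (two hours for browsing, one second for checkout) are illustrative values, not recommendations from the deck.

```python
import time


class FreshnessCache:
    """Serves cached SoR lookups while they are within the caller's staleness budget."""

    def __init__(self, fetch, clock=time.monotonic):
        self.fetch = fetch      # callable that queries the system of record
        self.clock = clock      # injectable clock, so the example is deterministic
        self.entries = {}       # key -> (value, fetched_at)

    def get(self, key, max_staleness_s):
        now = self.clock()
        cached = self.entries.get(key)
        if cached is not None and now - cached[1] <= max_staleness_s:
            return cached[0]    # fresh enough for this usage scenario
        value = self.fetch(key) # missing or too stale: go back to the SoR
        self.entries[key] = (value, now)
        return value


# Hypothetical SoR lookup: each call returns a slightly lower stock level.
calls = []


def fetch_inventory(sku):
    calls.append(sku)
    return 100 - len(calls)


fake_now = [0.0]
cache = FreshnessCache(fetch_inventory, clock=lambda: fake_now[0])

first = cache.get("sku-1", max_staleness_s=7200)    # cache miss, queries the SoR
fake_now[0] = 3600.0                                # one hour later
browse = cache.get("sku-1", max_staleness_s=7200)   # browsing tolerates hour-old data
checkout = cache.get("sku-1", max_staleness_s=1)    # checkout needs a fresh read
```

The same value is served from cache for browsing and re-read for checkout, so only the scenarios that need the fast end of the spectrum pay for it.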
15. *Legacy* Systems of Record Need Completely New Analytic Data Pipelines Built for Speed
Diagram labels: Operational Applications; Network Operations-Facing Data; Customer-Facing Data; Call Detail Records; Billing; CRM; Batch ETL; Data Warehouse
Key: Scale-Up RDBMS (Oracle, IBM, Microsoft)
Legacy SoR Analytics: Historical reporting
Legacy SoR Analytic Data Pipeline Limitations
• Batch ETL: Too slow to build closed loop analytics
• Database Scale + Cost: Limit amount + use of data
16. *Modern* Systems of Record: Addition of Streaming Data More Easily Supports Real-Time Data for Predictive Models of Systems of Intelligence
Diagram labels: Consumer; Mobile; Retail; eCommerce; Call Detail Records; ERP; CRM; Batch ETL; Streaming Data; Customer- AND Network-Facing Data
Key: Modern SoR Built On Scale-Out Data Platform
Fast Data: Machine Learning on MOST RECENT call data for anticipating and influencing customer interaction
Big Data: Machine Learning on HISTORICAL data provides context for building customer profiles and model of network utilization over time
Real-Time Interactions: Loyalty offers based on historical and most recent data; Connection prioritization
17. Streaming is Newest Religious War: Use It For *All* Analytic Workloads? Processing Lots of Data vs. Analyzing Each Event = Inherent Conflict
Chart axes: Batch Processing (GB, TB, PB); Streaming - Velocity (Min, Sec, MS, µS)
Big Data: Maximum throughput of data; Exploratory analysis of historical data
Fast Data: Fastest speed to make a decision on each event
“Streams can do it all” school: Big Data Apps are Just Fast Data Apps Scaled-Out
• If it can handle fast data, just scale it out to handle big data
• Big win: only one application needed
Wikibon recommendation (elaborated on next page): Streaming and batch *will always* coexist
• Even batch programs on streaming platform will still have different application logic…
• High volume machine learning vs. incremental update
• Historical performance analysis vs. looking up a profile
18. Even When Streaming Engines Support More Sophisticated Analytic Workloads The Applications Are Likely to Differ Between Event-at-a-Time vs. Batch
Chart axes: Latency (Higher is Slower); Analytic Sophistication
Analytic sophistication levels:
• Basic: Counting (What Happened)
• Streaming SQL: Exploration, OLAP or Dashboard (What Happened)
• Machine Learning: Prediction or Prescription (Anticipate or Act Automatically)
Batch-oriented: Historical analysis; Explore large, new data
Per Event-Oriented: Profile lookup; Incremental model update
IMPLICATION: Converging on one application engine not critical
Stream processors: Spark, Flink, InfoStreams, Samza, DataTorrent, (DB): VoltDB / MemSQL
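The claim that batch and per-event applications carry different logic even for the same analytic can be seen with a statistic as simple as an average. In this sketch the `BatchModel` and `IncrementalModel` names and the sample call durations are made up for illustration; a real model would be far richer, but the structural difference is the same.

```python
class BatchModel:
    """Batch-oriented: recompute the statistic from the full history."""

    @staticmethod
    def fit(history):
        return sum(history) / len(history)


class IncrementalModel:
    """Per-event-oriented: fold in one event at a time, keeping only small state."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        # Running-mean update: earlier events never need to be revisited.
        self.mean += (value - self.mean) / self.count
        return self.mean


call_minutes = [12.0, 7.5, 30.0, 3.0, 18.5]  # hypothetical call durations

# Batch: needs the whole data set at once, answers after the scan completes.
batch_mean = BatchModel.fit(call_minutes)

# Per event: an up-to-date answer is available after every single call.
online = IncrementalModel()
for minutes in call_minutes:
    latest = online.update(minutes)
```

Both paths arrive at the same number, but one is written around scanning a large data set and the other around constant-time state updates, which is why running both on one engine does not collapse them into one application.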
19. Key Takeaway: Coexistence of Batch and Streaming Means One Application Engine Doesn’t Have to Rule All - Spark and Hadoop Can Live Together
Diagram 1 (separate engines): Streaming (Storm, Flink, Samza, Data Torrent); SQL (Impala, Drill, Hive, HAWQ…); Machine Learning (Mahout…); YARN – Cluster Resource Management; HDFS or operational database
Pro: Mix and match pipeline comprised of specialized processing *optimized* for each workload
Con: Batch-only - hand-off between processing engines via storage is slow. Each processing engine is standalone and can’t leverage the others’ functionality
Diagram 2 (Spark): Spark Streaming (Streaming Ingest); Spark SQL (Join, filter, aggregate); Spark MLlib (Machine Learning); Spark Core; YARN or Mesos or other Workload Mgr; HDFS or operational database
Pro: Fast and simple - pipeline comprised of one in-memory engine with streaming, SQL, machine learning, graph personalities (libraries)
Con: still immature – performance an issue; haven’t fully delivered integration – But Tungsten perf boost, IBM projects could add huge new value
20. How Systems of Intelligence Get Smarter: Big Data vs. Streaming Data -&- Learning vs. Predicting
 | Big Data | Streaming Data
Operational Prediction | Predictions informed by historical context: But model operates on old data | Predictions informed by most recent data: But model lacks historical context
Machine Learning | Model with historical context: But model drifts when put into operation | Model with most recent data: Learns from recent or streaming data but lacks historical context
Future: Real Time + Historical Context; Learn + Predict
Netflix Movie library example
• Big Data + machine learning: At first sign-in, customer clicks through favorite genres, favorite movies; offline that’s compared with customers with similar tastes to generate individual recommendations (operational prediction)
• Fast Data + ML: As the user browses for next movie, streaming data feeds machine learning, which updates the recommendations in real-time (operational prediction)
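The "Real Time + Historical Context" combination can be sketched as a blend of offline scores with in-session signals. Everything specific here is an assumption for illustration: the genre scores, the `SessionRecommender` class, and the simple weighted average with `recent_weight=0.5`. It is not a description of how Netflix or any other service computes recommendations.

```python
from collections import defaultdict

# Hypothetical offline scores per genre, standing in for the output of a
# batch job over historical viewing data.
HISTORICAL_SCORES = {"drama": 0.6, "comedy": 0.3, "documentary": 0.1}


class SessionRecommender:
    """Blends historical context with clicks observed in the current session."""

    def __init__(self, historical, recent_weight=0.5):
        self.historical = historical
        self.recent_weight = recent_weight  # assumed blend factor, 0..1
        self.clicks = defaultdict(int)      # per-genre clicks in this session
        self.total_clicks = 0

    def observe_click(self, genre):
        """Per-event update as streaming data arrives."""
        self.clicks[genre] += 1
        self.total_clicks += 1

    def scores(self):
        if self.total_clicks == 0:
            # No recent data yet: fall back to historical context only.
            return dict(self.historical)
        blended = {}
        for genre in set(self.historical) | set(self.clicks):
            recent = self.clicks.get(genre, 0) / self.total_clicks
            past = self.historical.get(genre, 0.0)
            blended[genre] = (1 - self.recent_weight) * past + self.recent_weight * recent
        return blended

    def top(self):
        current = self.scores()
        return max(current, key=current.get)


rec = SessionRecommender(HISTORICAL_SCORES)
before = rec.top()                 # driven by historical context only
for _ in range(3):
    rec.observe_click("comedy")    # user keeps browsing comedies
after = rec.top()                  # recent behavior now outweighs history
final_scores = rec.scores()
```

With no session data the recommender behaves like the Big Data quadrant; after a few clicks the recent signal shifts the ranking while the historical scores still contribute.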
22. Systems of Intelligence Require New Technology at Every Level of Stack Compared to Systems of Record
 | Systems of Record | Systems of Intelligence
Data | Business transactions | Big Data: User interactions, contextual observations, machine data measurements
Data preparation | Batch ETL | “All” raw data collected for data scientists to either build predictive models or to prep for business analysts; results of both put into continually evolving production analytic data pipeline
Analytic data pipeline | Historical reporting from data warehouse | Predictive models developed via machine learning from Big Data and Fast Data
Platforms | Oracle 12c, SQL Server, DB2, Teradata, Informatica | Hadoop, AWS, Azure, Google Cloud Platform, best-of-breed specialized databases, Oracle, Spark
Data platform components | OLTP SQL DBMS, MPP SQL DBMS | OLTP, MPP analytic, key-value, Bigtable-type, doc store, streaming, machine learning, graph processing (elaborated on next slide)
23. Systems of Intelligence Data Platform Components
Data platform component | Functionality | Role | Examples
Key value store | Cache or session store | Serve content like offers, profiles - fast | Aerospike, Redis, Couchbase
Document store | Manage JSON data | Serve Web, mobile UI | MongoDB
Graph processor | Manage extremely inter-related data | Understand relationships such as a user’s product preferences | Neo4j, Titan, Giraph
Event log | Deliver data from any source(s) to any destination(s) | Ensure exactly once delivery | Kafka, RabbitMQ
Stream processor | Analyze fast data | Analytics without lag of first storing data | Spark Streaming, Data Torrent, Flink
Machine learning | Create predictive model | Intelligence for anticipating and influencing outcomes | Azure ML, Spark MLlib, Mahout
BigTable DB | Operational database (scalable, lite OLTP) | Manage millions of columns by trillions of rows | HBase, Cassandra
OLTP SQL DBMS | Operational database | Heavy duty OLTP | Oracle, SQL Server, DB2
Analytic SQL DBMS | Business Intelligence, sometimes machine learning | High performance analysis on Big Data | Teradata, Vertica, Greenplum
Orchestration | Build, run, and manage an analytic data pipeline | Developer focuses on end-to-end application rather than each service | Google Cloud Dataflow, Azure Data Factory
24. Enterprises Must Choose Their Platform By Balancing Ability to Handle Optimized but Complex vs. General Purpose Simplicity but Slower Evolving
Spectrum: Optimized + More Complex (Faster Innovation) to General Purpose + Less Complex (Slower Innovation)
• Many optimized data managers (Cassandra, Aerospike, MongoDB, Neo4j…)
• Hadoop ecosystem (Cloudera, Hortonworks, MapR)
• Single vendor data platform (Azure, AWS, Google Cloud Platform, Bluemix, Pivotal)
• Single multi-purpose engine (Oracle, Spark)
25. Many Optimized Data Managers: “Wild West” of the Ecosystem - Best for Internet-Centric Companies Needing Optimized Functionality, Fastest Innovation
Pro: Greatest innovation and choice of products with optimal functionality
Con: Complexity - customers have to build, integrate, test, operate multi-vendor, mostly open source databases
(chart source: 451 Research)
Customer sweet spot
• Leading-edge Internet-centric companies
• Facebook, LinkedIn, Netflix, Uber, ad-tech, gaming, ecommerce
26. Hadoop Ecosystem is Best for Those Who Need Fast Innovation Simplified By Curated Ecosystem
Pro: Widest and deepest ecosystem that is curated
Con: It’s still more of an ecosystem than a product and that means operational and development complexity
Customer sweet spot:
• Internet-centric and sophisticated IT enterprises
• Ad-tech, gaming, ecommerce, telcos, banks, retailers
27. Hadoop is Moving Toward Becoming an Integral Platform But “Seams” Between Individual Components Still Visible
28. Single Vendor Data Platform-as-a-Service Delivers More Simplicity via an Integral Offering Balanced With Some Optimization
Cloud platforms: Built, integrated, tested, delivered, and operated as a service
o Microsoft: HDInsight, SQL Azure, Azure ML, Streaming, Data Factory, Cortana Analytics
o AWS: Kinesis, S3, DynamoDB, EMR, Redshift
o Google Cloud Dataflow, BigQuery, BigTable
Pro: Single-vendor simplicity combined with optimized functionality
Con: Potential for lock-in; leading-edge innovation will likely exist outside platform
Customer sweet spot
• Mainstream enterprises that need a mix of optimized functionality and the simplicity of a single platform
• Less effort on admin, development
29. Single Multi-Purpose Engine Can Have Wide Appeal if It Stays Close to Innovation, Performance Frontier With Open Source Economics
Integrated analytic data processing engine: Oracle, Spark
o (OLTP - Oracle)
o SQL query
o Event processing
o Machine learning
o Graph processing
Pro: Simplicity
• Single interface for developers, admins
• Deep integration greatly reinforces value of each component of functionality – e.g. high volume event streams, queried to feed continual iteration of machine learning, which updates predictive model, which drives transaction in real-time
Con: Really hard to evolve at pace of ecosystem innovation
• Spark immaturity
• Web-scale issues, Oracle=EXPENSIVE
Customer sweet spot
Oracle: Mainstream enterprises that want to build on their existing data platform and leverage its low-latency analytics
Spark: enterprises at leading edge and ISVs that want deeply integrated processing capabilities
30. Recap: Pace and Place in Enterprise Journey and Choice of Platform Requires Assessment of Skills, Use Cases
• Most mainstream enterprises are very early in the journey
• Critical new data and analytic skills are required: sourcing, preparing, analyzing, modeling
• Modernizing SoR can accelerate the journey: from after-the-fact analytics to predictive models that inform transactions and interactions in real-time
• Choice of new platform: depends on need for simplicity vs. optimized functionality and latest innovation