Welcome
Remember when it looked like this?
They were all pretty much alike.
It used to be easy…
Now it’s quite
c0nfuZ1nG!
But it’s also quite
  exciting!
We Lose: Joe Hellerstein (Berkeley) 2001

“Databases are commoditised and cornered to slow-moving, evolving, structure-intensive applications that require schema evolution.” …
“The internet companies are lost and we will remain in the doldrums of the enterprise space.” …
“As databases are black boxes which require a lot of coaxing to get maximum performance.”
What Happened?
The Web
With a lot of users.
we changed scale
we changed tack
New Approach to Data Access
•  Simple
•  Pragmatic
•  Solved an insoluble problem
•  Unencumbered by tradition (good & bad)
With this came a Different Focus

Tradition                              NoSQL
•  Global consistency                  •  Local consistency
•  Schema driven                       •  Schemaless
•  Reliable Network                    •  Unreliable Network
•  Highly Structured                   •  Semi-structured / Unstructured

NoSQL / Big Data technologies really focus on load and volume problems by avoiding the complexities associated with traditional transactional storage
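To make the “schemaless” column concrete, here is a minimal sketch (in Java, which is not from the deck): a document-style store modelled as a plain map of maps, where structure is the application’s concern rather than the store’s. The store shape and field names are illustrative assumptions, not any particular product’s API.

```java
import java.util.HashMap;
import java.util.Map;

// A minimal sketch of "schemaless": the store accepts any shape of
// document, and structure is the application's concern. The store
// layout and field names are illustrative, not a real product's API.
public class SchemalessSketch {
    public static void main(String[] args) {
        Map<String, Map<String, Object>> store = new HashMap<>();

        // No CREATE TABLE, no DDL: two documents with different shapes
        // live side by side in the same "collection".
        store.put("user:1", Map.of("name", "Ada", "email", "ada@example.com"));
        store.put("user:2", Map.of("name", "Lin", "followers", 12_000));

        // Consistency is local: each put touches one key (one shard),
        // not a globally coordinated transaction.
        System.out.println(store.get("user:2").get("followers"));
    }
}
```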
The ‘Relational Camp’ had been busy too
Realisation that the traditional architecture was insufficient for various modern workloads
“The End of an Architectural Era” Paper – 2007
“Because RDBMSs can be beaten by more
than an order of magnitude on the standard
OLTP benchmark, then there is no market
where they are competitive. As such, they
should be considered as legacy technology
more than a quarter of a century in age, for
which a complete redesign and re-architecting
is the appropriate next step.” – Michael
Stonebraker
No Longer One-Size-Fits-All
There is a new and impressive breed
•  Products ~ 5 years old
•  Shared nothing (sharded)
•  Designed for SSDs & 10GbE
•  Large address spaces (256GB+)
•  No indexes (column oriented) – see the sketch below
•  Dropping traditional tenets (referential integrity etc.)
•  Surprisingly quick for big queries when compared with incumbent technologies
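To illustrate the “no indexes (column oriented)” bullet, a minimal sketch with made-up data: when each field is stored as its own dense array, a big aggregate query becomes a sequential scan of just the columns it touches, which is why these products can be quick without indexes.

```java
// A minimal sketch of a column-oriented scan, with made-up data: each
// field is a dense array, so an aggregate reads only the columns it
// needs, sequentially, with no index required.
public class ColumnScanSketch {
    public static void main(String[] args) {
        // Column-oriented layout: one contiguous array per field,
        // rather than rows that carry every field.
        double[] priceColumn = {101.5, 99.2, 103.8, 98.4};
        long[]   qtyColumn   = {100, 250, 75, 500};

        // Equivalent of SELECT sum(price * qty): a cache-friendly
        // sequential pass over two arrays; untouched columns cost nothing.
        double total = 0;
        for (int i = 0; i < priceColumn.length; i++)
            total += priceColumn[i] * qtyColumn[i];

        System.out.println("total notional = " + total);
    }
}
```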
Both types of solution have clear value

…and it’s not really a question of size

[Chart: database sizes from 0 to 10,000 TB, with the two camps overlapping]

The majority of us live in the overlap region
More a question of utility…
…which tends to lead to composite offerings
Compose Solutions
So what does this mean for the enterprise?
80% of Enterprise Databases are < 1TB
(This reference is getting pretty old now, sorry – 2009)
Yet we often have a lot of them
Communication is Store & Forward
The outside world
Sometimes we’re a bit more organized!
But most of our data is not that accessible

[Diagram: Core Operational Data, with only a small portion Exposed]

…and sharing is often an afterthought

[Diagram: Core Operational Data, with only a small portion Exposed]
Services can help
But as data is getting bigger and heavier…
…it can make it hard to join data together
So we often turn to some form of Enterprise Data Warehouse
(or maybe data virtualization)
Big data tech sometimes provides a
    composite solution (or ETL)
“Ability to model data is much more of a gating factor than raw size”
– Dave Campbell (Microsoft, VLDB Keynote 2012)
Importing data into a standard model is a slow and painful process
An alternative is to use a Late Bound Schema
Combining structured & unstructured approaches in a layered fashion makes the process more nimble

[Diagram: Raw Data at the base, a Structured Standardisation Layer above it, and Late Bound Schema access alongside]
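As a minimal sketch of what “late bound” means in practice (assuming raw records are kept as untyped maps; the Trade type and field names are hypothetical): the typed view is bound when a consumer reads, not when data is loaded, so ingestion never blocks on standardisation.

```java
import java.util.List;
import java.util.Map;

// A minimal sketch of a late-bound schema: raw records are stored as
// untyped maps and a typed view is bound at read time. The Trade type
// and field names are hypothetical.
public class LateBoundSchemaSketch {

    // A typed view defined by the reader, not by the store.
    record Trade(String id, String counterparty, double notional) {}

    // Binding happens on read: missing or extra fields in the raw
    // record never break ingestion, only this particular view.
    static Trade bindTrade(Map<String, Object> raw) {
        return new Trade(
            (String) raw.getOrDefault("id", "unknown"),
            (String) raw.getOrDefault("counterparty", "unknown"),
            ((Number) raw.getOrDefault("notional", 0)).doubleValue());
    }

    public static void main(String[] args) {
        // Raw facts land as-is; no upfront standardisation step.
        List<Map<String, Object>> rawStore = List.of(
            Map.of("id", "T1", "counterparty", "ACME", "notional", 1_000_000),
            Map.of("id", "T2", "counterparty", "GLOBEX")); // schema drift tolerated

        // The schema is applied late, when a consumer asks for it.
        rawStore.stream()
                .map(LateBoundSchemaSketch::bindTrade)
                .forEach(System.out::println);
    }
}
```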
We take this kind of approach
•  Grid of machines
•  Late bound schema
•  Sharded, immutable data – see the sketch after this list
•  Low latency (real time) and high throughput (grid) use cases
•  All data is observable (an event)
•  Interfaces: Standardised (safe) or Raw (use at your own risk)
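A minimal sketch of the “sharded, immutable data” point (the Event fields and shard count are hypothetical): every write is an append-only event, routed to a shard by hashing its key, so a grid of machines can serve it without coordination, and updates arrive as new events rather than overwrites.

```java
import java.util.List;

// A minimal sketch of sharded, immutable data: writes are append-only
// events, routed to a shard by hashing their key. The Event fields and
// shard count are hypothetical.
public class ShardedEventsSketch {

    record Event(String key, long sequence, String payload) {} // immutable by construction

    static final int SHARDS = 4;

    // Deterministic routing: the same key always lands on the same
    // shard, so reads need no cross-node coordination.
    static int shardFor(String key) {
        return Math.floorMod(key.hashCode(), SHARDS);
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
            new Event("trade-42", 1L, "NEW"),
            new Event("trade-42", 2L, "AMENDED")); // an update is a new event, not an overwrite

        for (Event e : events)
            System.out.println("shard " + shardFor(e.key()) + " <- " + e);
    }
}
```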
Both Raw & Standardised data is available

[Diagram: Raw Data at the base, an Object/SQL Standardisation layer above it, serving both Operational (real time / MR) and Relational Analytics consumers]
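A minimal sketch of offering both interfaces over the same data (all names are illustrative): a standardised, “safe” accessor that only exposes fields the standard model vouches for, and a raw, use-at-your-own-risk view that exposes everything with no guarantees.

```java
import java.util.Map;

// A minimal sketch of dual interfaces over one record: a standardised
// (safe) accessor and a raw (use-at-your-own-risk) view. All names
// here are illustrative.
public class DualInterfaceSketch {
    private final Map<String, Object> raw;

    DualInterfaceSketch(Map<String, Object> raw) { this.raw = raw; }

    // Standardised (safe): only fields the standard model vouches for.
    public String counterparty() {
        Object v = raw.get("counterparty");
        if (v == null) throw new IllegalStateException("not standardised yet");
        return v.toString();
    }

    // Raw (use at your own risk): everything, with no guarantees.
    public Map<String, Object> rawView() { return raw; }

    public static void main(String[] args) {
        var rec = new DualInterfaceSketch(
            Map.of("counterparty", "ACME", "srcSysFlag7", "Y"));
        System.out.println(rec.counterparty()); // the safe path
        System.out.println(rec.rawView());      // everything, unvalidated
    }
}
```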
This helps to loosen the grip of the single schema, whilst also providing a more iterative approach to standardisation
Support for both one standardised and many bespoke models in the same technology

[Diagram: Raw Facts from different systems feeding a Standardised Model]
Next step: to centralise common processing tasks

[Diagram: Standardised Model feeding a Risk Calculation]
Are we back to the mainframe?
Thanks



http://www.benstopford.com

The return of big iron?
