ODSC West TidalScale Keynote Slides

Software-Defined Servers
A Game Changer for Data Scientists
Gary Smerdon
CEO
TidalScale, Inc.

The Problem
In-Memory Computing is Key
Operation                Processing Latency    In Human Terms
L1-L3 Cache              1-13 ns               $1
DRAM                     50 ns                 $50
NVMe Flash               150 μs                $150,000
Flash Array              1 ms                  $1,500,000
Hard Drive               5 ms                  $7,500,000
TCP packet retransmit    2 s                   $2,000,000,000
Variety & Volume: Data growing at 62% CAGR
[Chart: Data Volumes over Time, rising through GB, TB, PB, EB, ZB]
Velocity: Data value declines with age
[Chart: Business Value vs. Age of Data (seconds)]
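The "In Human Terms" column of the latency table can be approximated with a short sketch. It assumes a scale of $1 per nanosecond, which is inferred from the DRAM, NVMe Flash and TCP rows and is not stated on the slide; the slide's Flash Array and Hard Drive figures are larger than this scale produces. The names `DOLLARS_PER_NS`, `latency_in_dollars` and `slowdown_vs_dram` are hypothetical.

```python
# Assumed scale (inferred, not stated on the slide): 1 ns of latency = $1.
DOLLARS_PER_NS = 1

# Nanoseconds per unit of time.
UNIT_TO_NS = {"ns": 1, "us": 1_000, "ms": 1_000_000, "s": 1_000_000_000}


def latency_in_dollars(value, unit):
    """Convert a latency into a dollar figure under the assumed scale."""
    return value * UNIT_TO_NS[unit] * DOLLARS_PER_NS


def slowdown_vs_dram(value, unit, dram_ns=50):
    """How many times slower an operation is than a 50 ns DRAM access."""
    return value * UNIT_TO_NS[unit] / dram_ns


# Latencies taken from the table; dollar values are computed, not copied.
for name, value, unit in [
    ("DRAM", 50, "ns"),
    ("NVMe Flash", 150, "us"),
    ("TCP packet retransmit", 2, "s"),
]:
    dollars = latency_in_dollars(value, unit)
    ratio = slowdown_vs_dram(value, unit)
    print(f"{name}: ${dollars:,} ({ratio:,.0f}x DRAM)")
```

Expressing every latency as a multiple of a DRAM access makes the slide's point concrete: anything that leaves memory is thousands to millions of times slower.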

Current Approaches
- Sub Sample: lower prediction accuracy
- Shard (Shard 1 to Shard 4): time & money
[Diagram: a Sub Sample, or Shards 1 to 4, of the data feeding an Algorithm]

Seeing All the Data Uncovers Relationships
Single Huge Server: easy to discover relationships
Sharded: hard to discover relationships
[Diagram: Departments A to G linked to Products A to D]
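A minimal sketch of why sharding can hide relationships, using made-up purchase records. The data, the choice to shard by department, and the helper names `co_purchases` and `shard_by_department` are all assumptions for illustration and do not come from the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records: (customer, department, product).
PURCHASES = [
    ("c1", "Department A", "Product A"),
    ("c1", "Department F", "Product C"),
    ("c2", "Department A", "Product A"),
    ("c2", "Department F", "Product C"),
    ("c3", "Department B", "Product B"),
    ("c3", "Department B", "Product D"),
]


def co_purchases(records):
    """Count product pairs bought by the same customer within `records`."""
    baskets = {}
    for customer, _department, product in records:
        baskets.setdefault(customer, set()).add(product)
    pairs = Counter()
    for products in baskets.values():
        pairs.update(combinations(sorted(products), 2))
    return pairs


def shard_by_department(records):
    """Split records into one shard per department."""
    shards = {}
    for record in records:
        shards.setdefault(record[1], []).append(record)
    return shards


# All data in one place: cross-department pairs are visible.
whole = co_purchases(PURCHASES)

# Each shard analysed on its own: only pairs inside one department survive.
sharded = Counter()
for shard in shard_by_department(PURCHASES).values():
    sharded.update(co_purchases(shard))

print("whole:  ", dict(whole))
print("sharded:", dict(sharded))
```

In this toy data the Product A / Product C pairing spans two departments, so it appears only when the whole data set is analysed together.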

Recent Data is more Predictive
2005-2007 mortgage data would have predicted the 2008 mortgage crisis, but analysts used data only through 2004
[Figure: Five-year Modeled Default Frequency Rate by Deal Vintage Year. x-axis: Year (2002 to 2007); y-axis: Five-year Cumulative Default Frequency Rate (0% to 35%). Series: Actual Default Frequency Rate; Model with data through 2004; Model with data through 2007.]

Model Accuracy Needs RAM: Decision Trees
• Decision tree model error rates decline with data size & tree depth
• Learning time increases with tree depth
• More data & greater tree depth consume more RAM
[Figure: Prediction Error Rate by Data Set Size & Decision Tree Depth; one series per Number of Observations]
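A back-of-the-envelope sketch of why more observations and deeper trees need more RAM. It assumes a dense table of 8-byte values and a hypothetical 64 bytes per tree node; neither figure comes from the slides. The only exact fact used is that a binary tree of depth d has at most 2^(d+1) - 1 nodes.

```python
def dataset_bytes(n_observations, n_features, bytes_per_value=8):
    """RAM for a dense table of 8-byte values (an assumed layout)."""
    return n_observations * n_features * bytes_per_value


def max_tree_nodes(depth):
    """Upper bound on nodes in a binary tree of this depth (root = depth 0)."""
    return 2 ** (depth + 1) - 1


def tree_bytes(depth, bytes_per_node=64):
    """Upper bound on tree size; 64 bytes per node is an assumption."""
    return max_tree_nodes(depth) * bytes_per_node


# Data footprint grows linearly with observations ...
for n in (1_000_000, 100_000_000):
    print(f"{n:,} rows x 100 features: {dataset_bytes(n, 100) / 1e9:.1f} GB")

# ... while the worst-case tree size doubles with every extra level.
for depth in (10, 20, 30):
    print(f"depth {depth}: up to {tree_bytes(depth) / 1e9:.4f} GB")
```

This is an upper bound only: a real tree stops splitting when it runs out of observations, so its size is also limited by the data set.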

What If Servers were Software-Defined?
• In-memory performance at scale
• As many cores as needed
• Self-optimizing
• Everything just works
• Uses standard hardware
Software-Defined Servers

Traditional Virtualization
Multiple virtual machines share a single physical server
[Diagram: Virtual vs. Physical. Three Virtual Machines, each running a 100%, bit-for-bit unmodified Application and Operating System.]

TidalScale: Software-Defined Servers
Single virtual machine spans multiple physical servers
[Diagram: one TidalScale Virtual Machine running a 100%, bit-for-bit unmodified Application and Operating System across HyperKernel instances (HyperKernel, HyperKernel, HyperKernel, …)]

Seamless Scalability
Flexible: scales up or down quickly
[Diagram: TidalScale Software-Defined Server with Application and Operating System on HyperKernel instances (HyperKernel, HyperKernel, HyperKernel, HyperKernel, HyperKernel, …)]
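The idea of one virtual machine spanning several physical servers, and growing or shrinking with them, can be pictured with a toy resource model. This is an illustrative sketch only: the `Node` and `SoftwareDefinedServer` classes are hypothetical and are not a TidalScale API, the 16-core figure is invented, and the simple sums ignore any overhead.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """One physical server (hypothetical model)."""
    name: str
    cores: int
    ram_gb: int


@dataclass
class SoftwareDefinedServer:
    """Toy model: a single guest sees the sum of its nodes' resources."""
    nodes: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes.append(node)

    def remove_node(self, name):
        self.nodes = [n for n in self.nodes if n.name != name]

    @property
    def total_cores(self):
        return sum(n.cores for n in self.nodes)

    @property
    def total_ram_gb(self):
        return sum(n.ram_gb for n in self.nodes)


# Five 128 GB nodes; the core count per node is an assumption.
server = SoftwareDefinedServer()
for i in range(5):
    server.add_node(Node(f"node{i}", cores=16, ram_gb=128))

print(server.total_cores, "cores,", server.total_ram_gb, "GB RAM")
```

Scaling down is the reverse operation: calling `remove_node` on the model shrinks the totals that the single guest would see.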

Machine Learning Driven Self-Optimization
Uses patented machine learning to transparently align resources
[Diagram: Application and Operating System on HyperKernel instances (HyperKernel, …)]

100% Compatible
If it runs, it runs on a TidalScale Software-Defined Server
[Diagram: Applications, Containers, Operating Systems and Virtual Machine on HyperKernel instances (HyperKernel, …)]

Use Case: Retail Analytics on TidalScale
[Figure: Performance Comparisons (TPC-H “Powertest” in Minutes). x-axis: Workload Size in GB (0 to 1,000); y-axis: Minutes to Process (0 to 100). Labels on the chart: Amazon EC2; 69.1; TEST FAILS; 22.0; 33.7.]

Benchmark: Open Source R on TidalScale
• Version: Revolution R Open 8.0.3 with pryr, dplyr, mgcv, rpart, randomForest, FNN, Matrix, doParallel & foreach
• Data: CMS Public Use Dataset
• In-memory footprints: 32GB-680GB
• Operations timed:
  • Load
  • Join
  • GAM linear regression
  • GLM linear regression
  • Decision Tree
  • Random Forest (fixed seed)
  • K Nearest Neighbors
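Timing a list of named operations, as in the benchmark description above, can be sketched as follows. This is not the benchmark's own code, which is written in R; the `time_operations` helper and the placeholder `load` and `join` workloads are hypothetical stand-ins.

```python
import time


def time_operations(operations):
    """Run each (name, callable) pair once and return {name: seconds}."""
    timings = {}
    for name, func in operations:
        start = time.perf_counter()
        func()
        timings[name] = time.perf_counter() - start
    return timings


# Placeholder workloads standing in for the real Load and Join steps.
rows = []


def load():
    """Fill the in-memory table with (id, group key) rows."""
    rows.extend((i, i % 10) for i in range(100_000))


def join():
    """Attach a group label to every row via a small lookup table."""
    lookup = {key: f"group{key}" for key in range(10)}
    return [(row_id, lookup[key]) for row_id, key in rows]


timings = time_operations([("Load", load), ("Join", join)])
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
print(f"Total: {sum(timings.values()):.4f} s")
```

Running the operations in a fixed order and summing the per-step times gives a total execution time comparable to the one plotted for each workload size.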
[Figure: Open Source R Performance Comparisons. x-axis: Workload Size in GB (0 to 600); y-axis: Total Execution Time (Minutes) (0 to 700,000). Series: Bare Metal Server (128GB); TidalScale Software-Defined Server (5 x 128GB nodes). Data labels: 158 days; 1,325; 3,787.]
https://github.com/TidalScale/R_benchmark_test

Tomorrow’s Servers Today: A Game-Changer
“Software-defined Servers make it easy to run memory-intensive applications like data mining, machine learning and simulation.”
Marc Jones, Director & Distinguished Engineer, IBM

“This is the way all servers will be built in the future.”
Gordon Bell
Industry legend & 1st outside investor in TidalScale
