Making the Most of In-Memory: More than Speed

The Briefing Room
Welcome

Host:
Eric Kavanagh
eric.kavanagh@bloorgroup.com

Twitter Tag: #briefr

Mission

•  Reveal the essential characteristics of enterprise software, good and bad
•  Provide a forum for detailed analysis of today's innovative technologies
•  Give vendors a chance to explain their product to savvy analysts
•  Allow audience members to pose serious questions... and get answers!

Topics

This Month: DATA PROCESSING
November: DATA DISCOVERY & VISUALIZATION
December: INNOVATORS

Data Processing

“Efficiency is doing things right; effectiveness is doing the right things.”
~Peter Drucker

Analyst: Robin Bloor

Robin Bloor is Chief Analyst at The Bloor Group

robin.bloor@bloorgroup.com

Kognitio
•  Founded in 1989, Kognitio is both an in-memory database and an analytical engine
•  The Kognitio Analytical Platform can be deployed as software, as an appliance, or in the cloud
•  The platform enables flexible, ad hoc queries on complex data sets, including data from Hadoop, and it offers scale-up and scale-out capabilities

Guest: Roger Gaskell

 
Roger Gaskell is the Chief Technology Officer and one of the founding members
of the Kognitio Development Team. He has overall responsibility for all product
development, strategic direction and roadmap of new innovation for the
Kognitio Analytical Platform. Roger has been instrumental in all generations of
the product to date. Over this time, it has evolved from an appliance-based
system in the original beta offering in 1989, to a hardware-independent
software for x86 processing, then to a cloud-based Platform-as-a-Service
offering in the mid-1990s. Prior to Kognitio, Roger was test and development
manager at AB Electronics. During this time his primary responsibility was for
the famous BBC Micro Computer and the development and testing of the first
mass production of personal computers for IBM.

Making the most of in-memory platforms
October 2013
What is an “In-memory” analytical platform?

A database where queries are run from data held in
computer memory (RAM) rather than mechanical disk

Memory = Fast / Disk = Slow
Analytics go much quicker – SIMPLE?

Unfortunately, it’s not as simple as that….
Why in-memory: RAM is faster than disk (really!)
Actually, this is only part of the story:

Analytics completely change the workload characteristics on the database
Simple reporting & transactional processing is all about “filtering” the data of interest
Analytics is all about complex “crunching” of the data once it is filtered

Crunching needs processing power & consumes CPU cycles
Storing data on physical disks severely limits the rate at which data can be provided to the CPUs
Accessing data directly from RAM allows much more CPU power to be deployed
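
The difference between "filtering" and "crunching" can be sketched in a few lines of Python. This is an illustration only: the `SALES` records and the `filter_rows` and `crunch` helpers are hypothetical names invented for the example, not part of any Kognitio interface. Filtering simply picks out the rows of interest; crunching groups, aggregates and sorts them, and that second step is where the CPU cycles go.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sales records: (region, product, amount)
SALES = [
    ("north", "widget", 120.0),
    ("north", "gadget", 80.0),
    ("south", "widget", 200.0),
    ("south", "widget", 100.0),
    ("north", "widget", 60.0),
]


def filter_rows(rows, product):
    """Filtering: keep only the rows of interest (one cheap test per row)."""
    return [row for row in rows if row[1] == product]


def crunch(rows):
    """Crunching: group, aggregate and sort the filtered rows (CPU work)."""
    by_region = defaultdict(list)
    for region, _product, amount in rows:
        by_region[region].append(amount)

    summary = {
        region: {
            "total": sum(amounts),
            "average": mean(amounts),
            "count": len(amounts),
        }
        for region, amounts in by_region.items()
    }
    # Sort regions by total amount, largest first.
    return sorted(summary.items(), key=lambda item: item[1]["total"], reverse=True)


widgets = filter_rows(SALES, "widget")
report = crunch(widgets)
```

Here `report` lists each region with its total, average and row count for the filtered product. On a real data set the grouping and sorting would dominate the run time, not the row selection.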
Analytics is about crunching through data
•  To understand what is happening in the data

“CRUNCHING”: Joins, Aggregations, Sorts, Grouping, Analytical Functions
CPU cycle-intensive & CPU-bound

More complex analytics = more pronounced this becomes

•  In-memory analytical platforms are therefore CPU-bound
–  Assume disk I/O speeds not a bottleneck
–  In-memory removes the disk I/O bottleneck
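
The bottleneck argument can be put as back-of-envelope arithmetic: a query runs at the slower of the rate at which data is fed to the CPUs and the rate at which the CPUs can crunch it. The sketch below assumes a simple two-stage model. Every throughput figure in it is a made-up placeholder chosen only to show the shape of the reasoning, not a measurement of any disk, memory system or product, and `effective_rate` is a hypothetical helper.

```python
def effective_rate(feed_gb_per_s, crunch_gb_per_s_per_core, cores):
    """Return (throughput in GB/s, bottleneck) for a feed-then-crunch pipeline."""
    cpu_rate = crunch_gb_per_s_per_core * cores
    if feed_gb_per_s < cpu_rate:
        # The CPUs could go faster but are starved of data.
        return feed_gb_per_s, "io"
    # Data arrives fast enough; the CPUs set the pace.
    return cpu_rate, "cpu"


# Hypothetical figures, for illustration only.
DISK_FEED = 0.5   # GB/s delivered by a disk subsystem (assumed)
RAM_FEED = 20.0   # GB/s delivered from memory (assumed)
PER_CORE = 0.25   # GB/s one core can crunch (assumed)
CORES = 16

disk_case = effective_rate(DISK_FEED, PER_CORE, CORES)
ram_case = effective_rate(RAM_FEED, PER_CORE, CORES)
```

With these assumed numbers the disk-fed system is I/O-bound, so adding cores changes nothing, while the memory-fed system is CPU-bound, so more cores or more efficient code translate directly into throughput.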
For analytics, the CPU is king
Being CPU-bound fundamentally changes
a system’s design philosophy

Disk IO Bound
–  CPUs wait for data from disk
–  No need for efficient coding
–  Parallelisation ineffective

CPU Bound
–  Every CPU cycle is precious – efficient coding
–  Parallelization = scalable performance
–  Advanced techniques minimize CPU cycles

Interactive / ad hoc analytics:
THINK data to core ratios ≈ <10GB data per CPU core
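
The data-to-core rule of thumb turns into simple sizing arithmetic. A minimal sketch, assuming the figure above is read as "at most 10 GB of data per core"; the function names and the 32-core server used in the note below are illustrative assumptions, not vendor guidance.

```python
import math


def cores_needed(data_gb, gb_per_core=10.0):
    """Minimum number of cores that keeps the ratio at or below gb_per_core."""
    if data_gb < 0 or gb_per_core <= 0:
        raise ValueError("data_gb must be >= 0 and gb_per_core must be > 0")
    return math.ceil(data_gb / gb_per_core)


def servers_needed(data_gb, cores_per_server, gb_per_core=10.0):
    """Minimum number of servers, given a fixed core count per server."""
    if cores_per_server <= 0:
        raise ValueError("cores_per_server must be > 0")
    return math.ceil(cores_needed(data_gb, gb_per_core) / cores_per_server)
```

For example, `cores_needed(2000)` gives 200 cores for 2 TB of in-memory data, and `servers_needed(2000, 32)` gives 7 servers if each hypothetical server has 32 cores.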
Why now?

[Figure: price of RAM (logarithmic scale, base 10) and interest in in-memory, 1987 to 2010]
Mature BI being overtaken
Numbers, tables, charts, indicators
Historical information, latency
…accessed with ease and simplicity

Decision Support
But BI and BI tools have plateaued!
Progression into advanced analytics & data science

It’s now all about doing more math
…a lot more math
Thus more complex methods – real-time
[Figure: Analytical Complexity plotted against Technology/Automation. Labels: Reporting & BPM, Campaign Management, Fraud detection, Dynamic Interaction, Clustering, Dynamic Simulation, Statistical Analysis, Behaviour modelling, Machine learning algorithms]
How to efficiently exploit RAM
•  A large cache is not in-memory
–  In-memory platforms hold data in structures that take advantage of the
properties of RAM
–  Caches are copies of frequently used disk blocks

•  Platform designed to specifically exploit the random
access nature of memory
–  Different algorithms
–  CPU cycles are precious – code efficiency paramount
–  Advanced techniques used to reduce code path length
•  Dynamic Machine Code Generation
•  Extended CPU instruction sets

•  Parallelize everything
–  Scale-out and Scale-up
–  Fully and efficiently use every CPU
core, in every CPU, in every server
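
One way to picture "parallelize everything" is a partitioned aggregation: split the data across workers, let each worker total its own slice, then merge the partial results. The sketch below is generic Python using the standard library's process pool, not Kognitio's implementation. All function names are hypothetical, and the `pool_map` parameter is an assumption added so the same logic can be run serially.

```python
import os
from collections import Counter
from concurrent.futures import ProcessPoolExecutor


def partition(rows, parts):
    """Split rows into `parts` roughly equal slices (round-robin)."""
    return [rows[i::parts] for i in range(parts)]


def partial_totals(rows):
    """Work done independently on each slice: total amount per key."""
    totals = Counter()
    for key, amount in rows:
        totals[key] += amount
    return totals


def merge(partials):
    """Combine the per-slice totals into one result."""
    combined = Counter()
    for partial in partials:
        combined.update(partial)  # Counter.update adds values per key
    return dict(combined)


def total_by_key(rows, workers=None, pool_map=None):
    """Total amount per key, one slice per worker.

    pool_map: optional map-like callable (for example the built-in `map`)
    used instead of a process pool, so the logic can run serially.
    """
    workers = workers or os.cpu_count() or 1
    slices = partition(rows, workers)
    if pool_map is not None:
        return merge(pool_map(partial_totals, slices))
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return merge(pool.map(partial_totals, slices))
```

Calling `total_by_key(rows)` with no `pool_map` uses one process per core; on platforms that start workers by spawning, that call belongs under an `if __name__ == "__main__":` guard. The same split, compute, merge shape applies whether the workers are cores in one server (scale-up) or servers in a cluster (scale-out).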
Analytical Platform Reference Architecture
Application & Client Layer: All BI Tools, All OLAP Clients, Excel, Reporting
Analytical Platform Layer
Near-line Storage (optional)
Persistence Layer: Kognitio Storage, Hadoop Clusters, Cloud Storage, Enterprise Data Warehouses, Legacy Systems
Perceptions & Questions

Analyst:
Robin Bloor

Big Data, Maybe — Big Parallelism, Yes
Many latency-reducing changes are afoot:
•  Hadoop is a data lake – It’s about latency
•  CPU and memory rule – The old database is dying
•  Grids, not clusters – A server is now a cluster
•  Scaling Up AND Scaling Out – “Only scaling out” is last year’s story
•  SSD will replace spinning disk – But it will never compete with RAM
Why the Excitement?
What are the “new” applications?
BIG DATA capture and staging
BIG DATA ANALYTICS
LITTLE DATA ANALYTICS
OPERATIONAL INTELLIGENCE
A “Modern” Workload

Query Light & Math Heavy
Where the Rubber Meets the Road
It isn’t really about application latency any more, it’s
about business process latency (business time!). This
can have many aspects:
•  The collapse of data flows – take the processing to the data
•  Data warehouse offload
•  Full process automation
•  Lower latency = NEW BUSINESS PROCESSES
The Question
The question for most organizations is:

Exactly how do we take advantage of these changes?
This is a BUSINESS question AND a TECHNICAL question.
•  Low latency is exciting, but where do you see the clear business opportunities?
•  There seems to be a conundrum about where to store “slow” data:
–  Hadoop?
–  Traditional data warehouse?
–  New data warehouse?
•  Is the split between the application and the data real any more?
•  In your opinion, does the Enterprise need a new architecture?
•  How is it possible to define and monitor service levels with in-memory applications?
•  Whither data governance?

Upcoming Topics

This Month: DATA PROCESSING
November: DATA DISCOVERY & VISUALIZATION
December: INNOVATORS

www.insideanalysis.com

Thank You for Your Attention

Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 

Making the Most of In-Memory: More than Speed

  • 1. Making the Most of In-Memory: More than Speed The Briefing Room
  • 3. Mission: Reveal the essential characteristics of enterprise software, good and bad. Provide a forum for detailed analysis of today’s innovative technologies. Give vendors a chance to explain their product to savvy analysts. Allow audience members to pose serious questions... and get answers! Twitter Tag: #briefr The Briefing Room
  • 4. Topics. This Month: DATA PROCESSING. November: DATA DISCOVERY & VISUALIZATION. December: INNOVATORS.
  • 5. Data Processing. “Efficiency is doing things right; effectiveness is doing the right things.” ~Peter Drucker
  • 6. Analyst: Robin Bloor. Robin Bloor is Chief Analyst at The Bloor Group. robin.bloor@bloorgroup.com
  • 7. Kognitio. Founded in 1989, Kognitio is both an in-memory database and an analytical engine. The Kognitio Analytical Platform can be deployed as software, as an appliance, or in the cloud. The platform enables flexible, ad hoc queries on complex data sets, including data from Hadoop, and it offers scale-up and scale-out capabilities.
  • 8. Guest: Roger Gaskell. Roger Gaskell is the Chief Technology Officer and one of the founding members of the Kognitio Development Team. He has overall responsibility for all product development, strategic direction and roadmap of new innovation for the Kognitio Analytical Platform. Roger has been instrumental in all generations of the product to date. Over this time, it has evolved from an appliance-based system in the original beta offering in 1989, to hardware-independent software for x86 processing, then to a cloud-based Platform-as-a-Service offering in the mid-1990s. Prior to Kognitio, Roger was test and development manager at AB Electronics. During this time his primary responsibility was for the famous BBC Micro Computer and the development and testing of the first mass production of personal computers for IBM.
  • 9. Making the most of in-memory platforms October 2013
  • 10. What is an “in-memory” analytical platform? A database where queries are run from data held in computer memory (RAM) rather than on mechanical disk. Memory = Fast / Disk = Slow, so analytics go much quicker. SIMPLE? Unfortunately, it’s not as simple as that…
  • 11. Why in-memory: RAM is faster than disk (really!) Actually, this is only part of the story. Workload: analytics completely changes the workload characteristics on the database. Filtering: simple reporting and transactional processing is all about “filtering” the data of interest. Crunching: analytics is all about complex “crunching” of the data once it is filtered. CPU cycles: crunching needs processing power and consumes CPU cycles. Storing: storing data on physical disks severely limits the rate at which data can be provided to the CPUs. Access: accessing data directly from RAM allows much more CPU power to be deployed.
  • 12. Analytics is about crunching through data. “CRUNCHING” is CPU cycle-intensive and CPU-bound: analytical functions, joins, aggregations, sorts, grouping. Its purpose is to understand what is happening in the data. The more complex the analytics, the more pronounced this becomes. In-memory analytical platforms are therefore CPU-bound: they assume disk I/O speed is not a bottleneck, because in-memory removes the disk I/O bottleneck.
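The distinction between “filtering” and “crunching” can be sketched in a few lines. This is a minimal illustration, not anything shown in the presentation: the `SALES` rows and the function names `filter_rows` and `crunch` are hypothetical, chosen only to show that filtering touches each row lightly while crunching groups, aggregates and sorts.

```python
from collections import defaultdict

# Hypothetical in-memory rows: (region, product, amount)
SALES = [
    ("north", "widget", 120.0),
    ("south", "widget", 80.0),
    ("north", "gadget", 200.0),
    ("south", "gadget", 50.0),
    ("north", "widget", 30.0),
]


def filter_rows(rows, region):
    """Filtering: select the rows of interest; very little work per row."""
    return [row for row in rows if row[0] == region]


def crunch(rows):
    """Crunching: group by product, sum the amounts, sort by total (descending)."""
    totals = defaultdict(float)
    for _region, product, amount in rows:
        totals[product] += amount
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    print(crunch(filter_rows(SALES, "north")))
```

Once the rows are already in RAM, the time goes into the grouping, summing and sorting in `crunch`, which is the CPU-bound part the slide describes.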
  • 13. For analytics, the CPU is king. Being CPU-bound fundamentally changes a system’s design philosophy. Disk I/O bound: CPUs wait for data from disk; no need for efficient coding; parallelization ineffective. CPU bound: every CPU cycle is precious, so efficient coding matters; parallelization = scalable performance; advanced techniques minimize CPU cycles. Interactive / ad hoc analytics: THINK data-to-core ratios of roughly under 10 GB of data per CPU core.
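The data-to-core rule of thumb can be turned into a quick sizing calculation. The sketch below is illustrative only: the function names and the 16-cores-per-server figure are assumptions for the example, and only the ceiling of about 10 GB per core comes from the slide.

```python
import math

GB_PER_CORE_LIMIT = 10  # slide's rule of thumb: roughly under 10 GB of data per CPU core


def min_cores_for(data_gb, gb_per_core=GB_PER_CORE_LIMIT):
    """Smallest whole number of cores for which data per core does not exceed the limit."""
    if data_gb < 0 or gb_per_core <= 0:
        raise ValueError("data_gb must be >= 0 and gb_per_core must be > 0")
    return math.ceil(data_gb / gb_per_core)


def min_servers_for(data_gb, cores_per_server):
    """Servers needed, assuming every server contributes the same core count (hypothetical)."""
    return math.ceil(min_cores_for(data_gb) / cores_per_server)


if __name__ == "__main__":
    # 2 TB of in-memory data on hypothetical 16-core servers
    print(min_cores_for(2000))        # 200
    print(min_servers_for(2000, 16))  # 13
```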
  • 14. Why now? Interest in in-memory. [Chart: price of RAM, logarithmic scale (base 10), 1987 to 2010]
  • 15. Mature BI being overtaken. Decision support: numbers, tables, charts, indicators; historical information, latency; all accessed with ease and simplicity. But BI and BI tools have plateaued! The progression is into advanced analytics & data science. It’s now all about doing more math… a lot more math.
  • 16. Thus more complex methods, in real time. [Chart: analytical complexity plotted against technology/automation. Labels: machine learning algorithms, behaviour modelling, statistical analysis, dynamic simulation, clustering, dynamic interaction, fraud detection, reporting & BPM, campaign management]
  • 17. How to efficiently exploit RAM. A large cache is not in-memory: in-memory platforms hold data in structures that take advantage of the properties of RAM, whereas caches are copies of frequently used disk blocks. The platform is designed specifically to exploit the random-access nature of memory: different algorithms; CPU cycles are precious, so code efficiency is paramount; advanced techniques are used to reduce code path length (dynamic machine code generation, extended CPU instruction sets). Parallelize everything: scale out and scale up; fully and efficiently use every CPU core, in every CPU, in every server.
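The “parallelize everything” idea is commonly implemented as partition, aggregate, merge. The sketch below shows that shape under stated assumptions: the function names are hypothetical, and a thread pool stands in for the workers purely for illustration. In CPython, threads do not give real CPU parallelism for this kind of work, so the code demonstrates the structure rather than the speed-up; an engine of the kind described would run one native worker per core.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_sum(rows):
    """Aggregate one partition: sum of amounts per key."""
    totals = Counter()
    for key, amount in rows:
        totals[key] += amount
    return totals


def parallel_group_sum(rows, workers=4):
    """Split rows into partitions, aggregate each one concurrently, then merge the partials."""
    partitions = [rows[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum, partitions))
    merged = Counter()
    for part in partials:
        merged.update(part)  # Counter.update adds values for matching keys
    return dict(merged)


if __name__ == "__main__":
    rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
    print(parallel_group_sum(rows))
```

Because each partial result is itself a small aggregate, the merge step stays cheap however many workers take part, which is what lets the same pattern apply within one server and across several.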
  • 18. Analytical Platform Reference Architecture. Application & Client Layer: all BI tools, all OLAP clients, Excel, reporting. Analytical Platform Layer, with near-line storage (optional). Persistence Layer: Kognitio storage, Hadoop clusters, cloud storage, enterprise data warehouses, legacy systems.
  • 19. Perceptions & Questions. Analyst: Robin Bloor
  • 21. Big Data, Maybe. Big Parallelism, Yes. Many latency-reducing changes are afoot: Hadoop is a data lake (it’s about latency). CPU and memory rule (the old database is dying). Grids, not clusters (a server is now a cluster). Scaling up AND scaling out (“only scaling out” is last year’s story). SSD will replace spinning disk (but it will never compete with RAM).
  • 22. Why the Excitement? What are the “new” applications? BIG DATA capture and staging; BIG DATA ANALYTICS; LITTLE DATA ANALYTICS; OPERATIONAL INTELLIGENCE.
  • 23. A “Modern” Workload: Query Light & Math Heavy
  • 24. Where the Rubber Meets the Road. It isn’t really about application latency any more, it’s about business process latency (business time!). This can have many aspects: the collapse of data flows (take the processing to the data); data warehouse offload; full process automation; lower latency = NEW BUSINESS PROCESSES.
  • 25. The Question The question for most organizations is: Exactly how do we take advantage of these changes? This is a BUSINESS question AND a TECHNICAL question.
  • 26. Low latency is exciting, but where do you see the clear business opportunities? There seems to be a conundrum about where to store “slow” data: Hadoop? Traditional data warehouse? New data warehouse? Is the split between the application and the data real any more?
  • 27. In your opinion, does the Enterprise need a new architecture? How is it possible to define and monitor service levels with in-memory applications? Whither data governance?
  • 29. Upcoming Topics. This Month: DATA PROCESSING. November: DATA DISCOVERY & VISUALIZATION. December: INNOVATORS. www.insideanalysis.com
  • 30. Thank You for Your Attention