This document describes the IBM Data Engine for NoSQL, which combines DRAM with flash memory attached via CAPI to provide a new memory tier of up to 40TB for NoSQL databases such as Redis. The solution lowers deployment and operational costs compared with traditional deployments while maintaining a high ratio of performance to cost. It can reduce the number of nodes required by up to 24 times, and a 12TB database runs at one-third the cost of a traditional deployment.
The IBM Data Engine for NoSQL on IBM Power Systems™
1.
2. The use of NoSQL has exploded
in recent years to meet user
expectations for real-time
response at scale.
The massive size and growth of mobile and social
applications built around cloud architectures have
driven the adoption of NoSQL databases for their
speed, capacity, resiliency and simplicity.
Many industries, including banking, defense,
biotech, web, telecom and others, have adopted
NoSQL database capabilities.
NoSQL databases can fall into one of the
following categories:
• Key value store (Redis, Memcached)
• Column store (Cassandra, Bigtable)
• Document store (MongoDB, CouchDB)
• Graph (Neo4j, Titan)
3. Many high-performance
databases run in-memory
to meet the demands of
analytics, web, mobile and
social applications that
need lightning-fast response;
therefore, memory capacity
defines the size of the data
set that can be processed.
• NoSQL databases in particular run
entirely in-memory or rely heavily on
memory as a cache to meet application
performance requirements.
• These solutions can get expensive and
hard to scale, and the latency associated
with traditional I/O attached storage can
degrade application performance.
4. 40 TB
IBM has a solution:
The IBM Data Engine for NoSQL
The IBM Data Engine for NoSQL is an integrated
platform for a large and fast-growing key value
store NoSQL database (Redis). By using a
combination of DRAM and Coherent Accelerator
Processor Interface (CAPI)–attached flash
memory, this integrated platform creates a new
tier of memory of up to 40TB capacity. The IBM
Data Engine for NoSQL offers significantly lower
deployment and operational costs and improved
computing performance for super-scalable,
high-performing KVS memory databases
(Redis: Provided by Redis Labs) on a
scale-out infrastructure.
5. Figure: Relative Performance and Cost as a Function of Memory/Flash Ratio. Axis: Flash, from All Memory to All Flash. Series: Performance (typical), Cost (typical). Callouts: 10%, 3x, 80%.
With the IBM Data Engine for NoSQL, large databases
are faster and cheaper to run.
By reducing the number of nodes required for the solution by up to 24 times, there is a
dramatic reduction in the total cost of operation (TCO) for networking, floor space, energy,
cooling and operations overhead.* A 12TB database is one-third the cost of a traditional
deployment, while maintaining a very high ratio of performance to cost.
*For KVS workloads only.
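The node-count reduction can be illustrated with a little arithmetic. The following is a minimal sketch, not IBM's sizing method: the 512GB of DRAM per node is an assumption chosen for illustration (the text does not state the configuration behind "up to 24 times"), and the function name `nodes_needed` is hypothetical. Replication, overhead and headroom are ignored.

```python
import math


def nodes_needed(dataset_tb: float, capacity_per_node_tb: float) -> int:
    """Smallest number of nodes whose combined capacity holds the data set."""
    return math.ceil(dataset_tb / capacity_per_node_tb)


DATASET_TB = 12.0               # the 12TB database mentioned in the text
DRAM_PER_NODE_TB = 0.5          # ASSUMPTION: 512GB of usable DRAM per all-DRAM node
FLASH_TIER_PER_NODE_TB = 40.0   # up to 40TB of CAPI-attached flash, per the text

dram_nodes = nodes_needed(DATASET_TB, DRAM_PER_NODE_TB)
flash_nodes = nodes_needed(DATASET_TB, FLASH_TIER_PER_NODE_TB)
reduction = dram_nodes / flash_nodes

print(f"all-DRAM nodes: {dram_nodes}, DRAM+flash nodes: {flash_nodes}, "
      f"reduction: {reduction:.0f}x")
```

Under these assumed figures the all-DRAM deployment needs 24 nodes and the DRAM-plus-flash deployment needs one. A different per-node DRAM size would change the ratio.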
6. What is the Coherent Accelerator Processor Interface (CAPI)?
A key innovation in the IBM POWER8® architecture, CAPI is a method of adding a processing
engine to a POWER8 system.
• A CAPI accelerator acts as a peer to POWER8 cores, sharing the same memory space and greatly
reducing device communication overhead.
• CAPI devices can accelerate applications beyond the capabilities of a general-purpose processor.
• CAPI accelerators can participate like POWER8 processors, with direct access to memory,
greatly reducing overhead.
• Simplified addressing makes CAPI easy to use and easy to program.
• Monte Carlo algorithms, key value stores, and financial and medical algorithms are ideal for CAPI.
• CAPI can also be used as a foundation for flash memory expansion.
• A wide variety of application domains can take advantage of CAPI, including database acceleration
and fast storage, data analytics and pattern recognition, visual/biometric analysis, and high-
performance computing applications in healthcare, weather, finance and insurance, oil and gas
and manufacturing.
7. What is Redis?
Redis (REmote DIctionary Server) is an in-memory, key value store NoSQL database that
offers high performance, scalability and persistent storage on disk.
Redis supports several kinds of values, from simple string values to more complex
data structures. These include binary-safe strings, lists, sets, sorted sets, hashes, bit arrays
(bitmaps) and HyperLogLogs. It also supports a lightweight and easy-to-use publish/subscribe
mechanism for broadcasting messages, and client libraries are available for
all major languages.
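To make the value types concrete, here is a rough sketch of plain-Python analogues. It does not use a Redis client or server. The names (`set_bit`, `PubSub`) are hypothetical stand-ins, and HyperLogLogs, which are approximate cardinality counters, are not modeled.

```python
from collections import defaultdict, deque

# Plain-Python stand-ins for the Redis value types listed above.
string_value = b"\x00binary-safe\xff"            # string: arbitrary bytes
list_value = deque(["a", "b"])                    # list: cheap push/pop at both ends
list_value.appendleft("z")                        # comparable to pushing on the left
set_value = {"red", "green"}                      # set: unique, unordered members
hash_value = {"name": "alice", "visits": "3"}     # hash: field -> value map
sorted_set = {"alice": 42.0, "bob": 17.5}         # sorted set: member -> score
ranking = sorted(sorted_set, key=sorted_set.get)  # members ordered by score

bitmap = bytearray(2)                             # bitmap: 16 addressable bits


def set_bit(bits: bytearray, offset: int) -> None:
    """Set one bit; offset 0 is the most significant bit of byte 0."""
    bits[offset // 8] |= 0x80 >> (offset % 8)


set_bit(bitmap, 7)


class PubSub:
    """In-process sketch of publish/subscribe: channel -> list of callbacks."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, channel: str, callback) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver to every subscriber and return how many received it."""
        receivers = self._subscribers.get(channel, [])
        for callback in receivers:
            callback(message)
        return len(receivers)


received = []
bus = PubSub()
bus.subscribe("news", received.append)
delivered = bus.publish("news", "hello")
```

With a real deployment, a client library would send the equivalent commands to the Redis server instead of operating on local objects.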
Redis is used by a number of organizations, including Twitter, Instagram, Pinterest, GitHub,
Craigslist and Stack Overflow.
8. About Redis Labs
Redis Labs is the leading commercial provider of
open-source Redis. Redis Labs Enterprise Cluster
(RLEC) is the only on-premises, enterprise-grade
deployment environment for Redis OSS, enabling
super-fast performance, seamless scalability,
true high availability, reliability and best-in-class
expertise.
4,200 customers, 40 countries, 24,000 free
trial customers, over 80,000 DBs, 24/7 support.
HQ in Mountain View, CA; R&D in Tel Aviv.
9. Figure: Hardware Components of the IBM Data Engine for NoSQL. Labels: Application, Flash APIs, POWER8, DRAM, Flash Array, PSL, Flash AFU.
What are the hardware components of the
IBM Data Engine for NoSQL?
The design enables the processor main memory to provide the fast response times
that applications require by using main memory to cache or hold the most frequently
accessed data, while leveraging the flash storage attached via CAPI to store the
remaining in-memory data*.
• IBM FlashSystem® 840 storage solution, firmware version 1.1.3.0 or later
• FlashSystem storage array
• CAPI adapter card
• FPGA chip
• Fibre Channel I/O ports
*Providing the POWER8 processors with direct access to both DRAM and flash enables application software to adjust memory and flash usage ratios to optimize performance and cost.
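The DRAM-plus-flash arrangement described above can be sketched as a two-tier store. This is an illustrative model only: the class `TieredStore` is hypothetical, the least-recently-used eviction policy is an assumption (the text does not say how hot data is selected), and an ordinary dictionary stands in for the flash array.

```python
from collections import OrderedDict


class TieredStore:
    """Keeps recently used keys in a small 'DRAM' tier and the rest in 'flash'."""

    def __init__(self, dram_capacity: int) -> None:
        if dram_capacity < 1:
            raise ValueError("dram_capacity must be at least 1")
        self.dram_capacity = dram_capacity
        self.dram = OrderedDict()  # hot tier, ordered least- to most-recently used
        self.flash = {}            # stand-in for the larger, slower flash tier

    def put(self, key, value) -> None:
        self.flash.pop(key, None)  # a key lives in exactly one tier
        self.dram[key] = value
        self.dram.move_to_end(key)
        self._evict()

    def get(self, key):
        if key in self.dram:       # fast path: served from the hot tier
            self.dram.move_to_end(key)
            return self.dram[key]
        value = self.flash.pop(key)  # raises KeyError if the key is unknown
        self.dram[key] = value       # promote the key on access
        self._evict()
        return value

    def _evict(self) -> None:
        # Demote the coldest keys to flash until the hot tier fits again.
        while len(self.dram) > self.dram_capacity:
            cold_key, cold_value = self.dram.popitem(last=False)
            self.flash[cold_key] = cold_value


store = TieredStore(dram_capacity=2)
for name, value in [("a", 1), ("b", 2), ("c", 3)]:
    store.put(name, value)
print(sorted(store.flash), list(store.dram))
```

In this model `dram_capacity` plays the role of the memory-to-flash ratio that the footnote says application software can adjust: a larger hot tier serves more reads from memory at a higher cost.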
10. Figure: Software Components of the IBM Data Engine for NoSQL. Labels: Redis Configuration/Setup/Provisioning, Redis Instance, KV Fcn, Block Fcn, Disk Utility, Linux Kernel, Firmware, PSL, AFU, Master Context, Adapter STUB, Up to 40 TB - Fiber Attached. Legend: Data Flows, Configuration Paths, Error Flows.
What are the software components of the IBM Data Engine for NoSQL?
This software arrangement gives the application direct access to the flash memory through a set
of developer APIs that provide key value and raw block I/O interfaces to manage and access the data in
flash memory.
Management Layer: Consists of the initialization scripts invoked at system boot and shutdown.
Master Context: Daemon that initializes the adapter, completes logical unit number (LUN) discovery
and mapping, does error recovery and health checking, addresses uncorrectable errors and manages
link events on behalf of client application software.
Block I/O APIs: Handle read/write requests for specific blocks and issue commands directly to the
accelerator function unit (AFU) to read/write data on a logical address in flash memory.
Key Value Storage APIs: Provide a generic key value database that forms the bridge between Redis
and the block I/O APIs.
Redis Instance: A commercial grade Redis implementation provided by Redis Labs.
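The layering of the key value storage APIs on top of the block I/O APIs can be sketched as follows. This is not the IBM API: the class names, method names and the 4096-byte block size are all hypothetical, a `bytearray` stands in for flash, and the key index is held in ordinary process memory for simplicity.

```python
BLOCK_SIZE = 4096  # ASSUMPTION: block size chosen for illustration


class BlockDevice:
    """Raw block layer: reads and writes whole blocks by logical address."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self._data = bytearray(num_blocks * BLOCK_SIZE)

    def _check(self, lba: int) -> int:
        if not 0 <= lba < self.num_blocks:
            raise IndexError(f"block {lba} out of range")
        return lba * BLOCK_SIZE

    def write_block(self, lba: int, payload: bytes) -> None:
        if len(payload) != BLOCK_SIZE:
            raise ValueError("payload must be exactly one block")
        start = self._check(lba)
        self._data[start:start + BLOCK_SIZE] = payload

    def read_block(self, lba: int) -> bytes:
        start = self._check(lba)
        return bytes(self._data[start:start + BLOCK_SIZE])


class KeyValueStore:
    """Key value layer: maps each key to the blocks that hold its value."""

    def __init__(self, device: BlockDevice) -> None:
        self._device = device
        self._index = {}  # key -> (list of block addresses, value length)
        self._free = list(range(device.num_blocks))

    def set(self, key: str, value: bytes) -> None:
        needed = max(1, -(-len(value) // BLOCK_SIZE))  # ceiling division
        old_blocks = self._index.get(key, ([], 0))[0]
        if needed > len(self._free) + len(old_blocks):
            raise MemoryError("not enough free blocks")
        self.delete(key)  # release the blocks of any previous value
        blocks = [self._free.pop() for _ in range(needed)]
        for i, lba in enumerate(blocks):
            chunk = value[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
            self._device.write_block(lba, chunk.ljust(BLOCK_SIZE, b"\0"))
        self._index[key] = (blocks, len(value))

    def get(self, key: str) -> bytes:
        blocks, length = self._index[key]  # raises KeyError if absent
        raw = b"".join(self._device.read_block(lba) for lba in blocks)
        return raw[:length]  # strip the zero padding of the last block

    def delete(self, key: str) -> None:
        blocks, _ = self._index.pop(key, ([], 0))
        self._free.extend(blocks)


kv = KeyValueStore(BlockDevice(num_blocks=4))
kv.set("user:1", b"x" * 5000)  # spans two blocks
value = kv.get("user:1")
```

The upper layer never touches byte offsets; it only asks the block layer to read or write a block at a logical address, which mirrors the division of labor described above.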
11. Built for Linux
IBM has introduced a line of Linux®-only scale-out
servers built on POWER8 processors
optimized for Linux. This allows nearly
seamless swapping of POWER8 into any
infrastructure built on Linux. Specifically:
• Hardware-agnostic applications written
in scripting or interpreted languages
(Java, Perl, Python, PHP) run on
IBM Power Systems™ as is, the same
as on x86.
• Most x86/Linux applications written in
C/C++ require only a recompile.
12. Callouts: 10x bandwidth; 7x reduction in latency; 140 msec processing.
The POWER8 Difference
Building on the collaboration with the
OpenPOWER Foundation, IBM is uniquely
positioned to deliver a higher-performing stack by
working with key component providers while still
allowing interchangeability of the components.*
Here are the indisputable facts for POWER8 vs. x86:
• 10x increase in bandwidth
• 7x reduction in latency
• From 1-second processing to
140 milliseconds
• CAPI, SMT and NVIDIA GPU
accelerators
• OpenPOWER Foundation
*Based on a POWER8 S824 with 24 cores, 256 GB Memory, 3.52 GHz, RHEL 7.0, WAS
8.5.5.2, DB2 9.7, JDK 7.0 FP1 compared to an Ivy Bridge EP 24 cores, 256 GB
Memory, 2.7 GHz, RHEL 6.5, WAS 8.5.5.1, DB2 9.7, JDK 7.0 FP1.