Assessing New Database Capabilities – Multi-Model

Presented by: William McKnight
President, McKnight Consulting Group
williammcknight
www.mcknightcg.com
(214) 514-1444

Confidential and Proprietary. Do not distribute without Couchbase consent. © Couchbase 2021. All rights reserved.
Rick Jacobs, Technical Marketing Manager
October 10th, 2022
Enterprise Level
Advanced Analytics

Agenda
1. Why Couchbase
2. Couchbase Analytics
3. Use Cases & Customer Stories

1. Why Couchbase

How is Couchbase Different?
Mobile/Edge Apps
Applications and Microservices
Fast
• Memory-first design
• Cloud-native scale
• Geo-replication via XDCR
• HA, DR & backup
• Low latency Cloud to Edge
Familiar
• SQL++ query language
• Dynamic Schema
• ACID SQL Transactions
• Cost-based optimizer
• SDKs for 12+ languages
Affordable
• Elastic scaling, sharding & rebalancing
• Multidimensional scaling
• High-density storage
• Incredible price/performance
Flexible
• JSON document
• Multimodel services
• Cloud deploy anywhere
• Mobile & Edge ready
[Diagram: Couchbase capabilities: SQL, Integrated Cache, JSON Documents, SQL Query, Full Text Search, Operational Analytics, Eventing, Key-Value Access, Geo-Replication & Sync, Mobile Database, Relational Capabilities]

Flexible Cloud and Edge Options: Delivering Consistency
Capella (Database-as-a-Service)
• Maximize convenience
• Easy to start, manage, and scale
• Industry leading price-performance
• Highly available and secure
Server (Self-Managed Cloud)
• Maximize control & customizability
• Leverage DBA & Ops team skills
• Choose management strategy & tools
• Deploy via Kubernetes if you choose
“We wanted a solution that seamlessly works across server and mobile, without lots of
retraining. No other solutions came even close to Couchbase.”
Aviram Agmon
Chief Technical Officer
Maccabi
• Offline first design for max uptime
• Extreme speed and reliability
• Data integrity: secure, automated sync
• Broad SQL and device support
Edge & IoT
Mobile

2. Couchbase Analytics

Analytics fundamentals
• Fast ingestion
• Near real-time data availability (using DCP)
• No ETL (simple, no paradigm shift)
• Same data model and query language
• MPP processing
• Uses best-of-breed DW algorithms (join,
aggregation, sorting)
• Memory-conscious operators (DGM)
• Workload isolation
• MDS – has its own sub-cluster
• Each query uses all resources
[Diagram: Couchbase Data Platform. Labels: Business Application, Ops Data Node, Analytics Node, Analytics Tool, Operations Data, Real-time Analytics]

Requirements for an Agile Analytics Platform
• Timely: Operational data is readily available for analytics when created and as current as possible
• Flexible: Schema changes on operational side don’t impact analyses
• Speedy: Analysis queries run quickly without impacting operational performance
• Scalable: Scale to speed up queries and scale up data

Couchbase Analytics Architecture

3. Customer Stories

Key Use Cases
Personalized Ordering (eCommerce Food Delivery): Real-time marketing campaigns
• Need: Perform data exploration on operational data in near-real time with agile data science modeling
• Outcome: Enabled new customer attributes to enable data science focused consumer segment strategies → faster time to insights for consumer marketing responses from weeks/months to hours
Risk Scoring (Finance): Investments Modeling
• Need: Perform complex analytical queries, computations, and aggregations on JSON data enriched with 3rd party data without data movement
• Outcome: Analytics Service powered regression calculations to compute 2M+ prices to further improve query performance by 100% for 200GB+ data. No need for ETL
BI & Data Scale (Healthcare): Hospital/Clinics Customer Revenue
• Need: Scale data platform to meet increased analytics and reporting needs
• Outcome: Executives able to answer key business revenue impact questions → “Show detailed effects of COVID-19 on hospitals cancelling elective procedures to identify underpaid or unidentified revenue”

Outcomes
• Reduced time to deliver targeted consumer
offers from weeks/months → hours, with
data analyzed in near real time
• Enabled agile data mining models
focused on order behaviors, propensity
scoring and enabled flexible attribute
creation
• Removed need to ETL for data
science experiments
Requirements
• Track average transaction size,
annual purchase frequency and
loyalty to determine customer lifetime
value (CLV)
• Deliver personalized marketing
campaigns, segments and reduce
time to perform data science
experiments
• Ability to perform data exploration on
operational data in near-real time
SOLUTION:
Customer Data Management
APPLICATION:
Commerce Data Hub
Data science experimentation
USE CASE(S):
Real time marketing
campaigns and personalized
ordering experience
ABOUT:
World leader in pizza delivery
operating a network of
company-owned and
franchise-owned stores
globally. 3M pizzas a day,
16.5K stores in 85 countries

Requirements
• Action on near real-time data flow without
transformation
• Enable better fan experience at concession
stands during games and IoT functionality
for ticket scans
• Easy to use SQL-like interface as their
resources are lean and skilled in SQL
Outcomes
• Continuous data sync for real-time
visitor and customer concessionaire
analytics
• Increased customer engagement via
interactive scoreboards, fan kiosks, and
more
• Easy integration with Knowi and Tableau
for real-time executive reporting
SOLUTION:
Customer 360
APPLICATION:
Ticket scan
VIP loyalty program
USE CASE(S):
Real time analytics for
fan interactions
ABOUT:
Professional baseball
franchise valued at
$600M+ with 1.8M+
fan base

Delivering Business Outcomes by Solving Technology Problems
Technology problems:
• Scaling legacy DB
• Mainframe access
• NoSQL sprawl
• Scaling other NoSQL DB
• Managing multiple DBs
• Dedicated DB per use case
• Slow dev. cycles
• Mission-critical new features
• Ever-changing requirements
• Mobile apps take too long
• Modern DB tech. required
• Need to consolidate tech.
• Personalization + performance
• Fully featured mobile apps
• Single view of customer
• Legacy = more time, $$, effort
• Integrate disparate data
Business outcomes:
• Improving customer experience & engagement
• Faster innovation & time to market
• Reducing infrastructure & operations costs
• Predictable performance

Try Couchbase Capella free:
No credit card required
https://www.couchbase.com/products/capella/get-started
THANK YOU

William McKnight
President, McKnight Consulting Group
• Frequent keynote speaker and trainer internationally
• Consulted to Pfizer, Scotiabank, Fidelity, TD
Ameritrade, Teva Pharmaceuticals, Verizon, and many
other Global 1000 companies
• Hundreds of articles, blogs and white papers in
publication
• Focused on delivering business value and solving
business problems utilizing proven, streamlined
approaches to information management
• Former Database Engineer, Fortune 50 Information
Technology executive and Ernst & Young Entrepreneur
of the Year Finalist
• Owner/consultant: Research, Data Strategy and
Implementation consulting firm
Book: William McKnight, Information Management: Strategies for Gaining a Competitive Advantage with Data (The Savvy Manager’s Guide)

McKnight Consulting Group Offerings
Strategy
§ Trusted Advisor
§ Action Plans
§ Roadmaps
§ Tool Selections
§ Program Management
Training
§ Classes
§ Workshops
Implementation
§ Data/Data Warehousing/Business Intelligence/Analytics
§ Big Data
§ Master Data Management
§ Governance/Quality

McKnight Consulting Group Client Portfolio

Decisions, Decisions, Decisions
• Enterprises face an unprecedented variety of data store
choices to meet the needs of their varied workloads
• Enterprises have many needs for databases, including
cache, operational, data warehouse, master data, ERP,
analytical, graph data, data lake, and time series data
• While vendor offerings have exploded in recent
years, in due time frameworks will integrate
components into what amounts to a single offering
for multiple workloads, perhaps even for the
enterprise
• But what if price-performant offerings for adjacent
workloads in an enterprise have materialized?

Many Data Types
• Web Crawlers
• Open Linked Data
• JSON
• XML
• Documents
• Binary
• Graph
• Log Files

Why NoSQL for Operational Big Data
More data model flexibility
– Web Services as a data model
– No "schema first" requirement; load first
Faster time to insight from data acquisition
Relaxed ACID
– Eventual consistency
– Willing to trade consistency for availability
– ACID would crush things like storing clicks on Google
Low upfront software and development costs
Programmers love the freedoms
Fault-tolerant redundancy
Linear Scaling to “webscale”

DFS Block Placement
Objectives: load balancing, fast access, fault tolerance
• Placement policy:
– A copy is written to the node creating the file (write affinity)
– A second copy is written to a data node within the same rack (to minimize cross-rack network traffic)
– A third copy is written to a data node in a different rack (to tolerate switch failures)
[Diagram: three copies each of Blocks 1, 2 and 3 spread across Nodes 1 to 5]
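The three-copy placement policy on this slide can be sketched in a few lines of Python. This is a minimal illustration, not any particular file system's implementation: the function name place_replicas, the rack names and the assignment of nodes to racks are all assumptions made for the example, since the slide only shows five nodes.

```python
import random


def place_replicas(writer, racks, rng=random):
    """Pick three nodes for one block, following the slide's policy.

    racks maps a rack name to the list of data nodes in that rack
    (a hypothetical layout; the slide does not show rack membership).
    """
    local_rack = next(r for r, nodes in racks.items() if writer in nodes)
    same_rack = [n for n in racks[local_rack] if n != writer]
    other_racks = [n for r, nodes in racks.items() if r != local_rack for n in nodes]
    if not same_rack or not other_racks:
        raise ValueError("need another node in the writer's rack and a second rack")

    first = writer                    # write affinity: the node creating the file
    second = rng.choice(same_rack)    # same rack: avoids cross-rack traffic
    third = rng.choice(other_racks)   # different rack: survives a switch failure
    return [first, second, third]


# Assumed layout: five nodes split over two racks.
racks = {"rack-a": ["node1", "node2", "node3"], "rack-b": ["node4", "node5"]}
replicas = place_replicas("node1", racks, random.Random(0))
print(replicas)
```

The second and third choices are random here; a real system would also weigh free space and load, which is where the "load balancing" objective comes in.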

Property Graph Model Components
Nodes
• The objects in the graph
• Can have name-value properties
• Can be labeled
Relationships
• Relate nodes by type and direction
• Can have name-value properties
[Diagram: two PERSON nodes (name: “Dan”, born: May 29, 1970, twitter: “@dan”; name: “Ann”, born: Dec 5, 1975) and one CAR node (brand: “Volvo”, model: “V70”), connected by friends, LIVES WITH, DRIVES and OWNS relationships; one relationship carries the property since: Jan 10, 2011]
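The components above (labeled nodes, typed and directed relationships, name-value properties on both) map onto two small data classes. The sketch below is illustrative only: the class names, the neighbors helper, the relationship directions and the choice of DRIVES as the edge that carries "since" are assumptions, because the flattened diagram does not preserve them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    rel_type: str
    end: Node
    properties: dict = field(default_factory=dict)


dan = Node("PERSON", {"name": "Dan", "born": "May 29, 1970", "twitter": "@dan"})
ann = Node("PERSON", {"name": "Ann", "born": "Dec 5, 1975"})
car = Node("CAR", {"brand": "Volvo", "model": "V70"})

# Assumed directions and property placement (not recoverable from the slide).
relationships = [
    Relationship(dan, "FRIENDS", ann),
    Relationship(ann, "FRIENDS", dan),
    Relationship(ann, "LIVES_WITH", dan),
    Relationship(dan, "DRIVES", car, {"since": "Jan 10, 2011"}),
    Relationship(ann, "OWNS", car),
]


def neighbors(node, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return [r.end for r in relationships if r.start is node and r.rel_type == rel_type]


print([n.properties["brand"] for n in neighbors(dan, "DRIVES")])
```

Because direction is part of the model, neighbors(car, "DRIVES") returns nothing: the car is the end of that relationship, not the start.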

Semantic Graph
• RDF Triple Store
– Semantic databases only work with RDF
• Target market is users of third-party
data in RDF (all Linked open data)
– Working across data sets
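As a rough illustration of a triple store, the sketch below keeps (subject, predicate, object) tuples in a set and answers pattern queries with wildcards. The ex: and foaf: identifiers, the sample triples and the match helper are invented for the example; a real RDF store would use full IRIs and a query language such as SPARQL.

```python
# Two hypothetical data sets expressed as (subject, predicate, object) triples.
local_data = {
    ("ex:Dan", "ex:drives", "ex:Volvo_V70"),
    ("ex:Dan", "foaf:knows", "ex:Ann"),
}
linked_data = {
    ("ex:Ann", "ex:owns", "ex:Volvo_V70"),
    ("ex:Volvo_V70", "ex:brand", "Volvo"),
}


def match(pattern, store):
    """Return the triples that fit the pattern; None is a wildcard."""
    return sorted(
        t for t in store
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


# Working across data sets is a set union, since both share one triple shape.
merged = local_data | linked_data
about_the_car = match((None, None, "ex:Volvo_V70"), merged)
print(about_the_car)
```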

Databases are Multi-Model when they can
be either (for example):

Data Types and NoSQL Data Models
Data Type | Data Model
CSV, TSV or web logs | Column, Document
Documents | Document
JSON | Document
Metadata catalog | Column, Document
Keyed images and documents | Key-Value
RDF, Linked data | Graph

Key-Value Stores
What are they?
• NoSQL’s OLTP equivalent
• Extremely simple
• Key-“blob” pairs, that’s it
• Associative array data model
• Retrieve value given a key
– All access is by a key
(key,value)
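The associative-array model can be shown with a tiny in-memory store. This is a sketch under assumptions, not a real product's API: the KeyValueStore class, its method names and the session key are made up for illustration.

```python
import json


class KeyValueStore:
    """Minimal in-memory key-value store: every operation goes through a key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # The value is an opaque blob; the store never looks inside it.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
session = {"user": "dan", "cart": ["pizza-lg", "soda"]}

# The whole session is serialized into one blob and written with a single put.
store.put("session:42", json.dumps(session).encode("utf-8"))

# A single get returns the blob; decoding it is the application's job.
restored = json.loads(store.get("session:42").decode("utf-8"))
print(restored["cart"])
```

There is no way to ask this store for "all sessions with a soda in the cart" without reading every value, which is the limitation document stores address.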

Key-Value Stores
Technical Characteristics:
• Horizontally scalable
• Fast (did I mention fast)
• Resiliency to cluster failures
• Simplicity
• All nodes equal
(key,value)

Key-Value Stores
Good for:
• Any single object of unstructured data
• Storing BLOBs
• Fast writes
• Web app cache
• Session Information – get all session information in a
single put/get
• User profile data
• Massive multi-player on-line gaming
• Shopping carts (up until the payment transaction)
• Geo-localized processing
• Speed when you can’t be down
(key,value)

A multi-model database is a single, integrated
database that can store, manage and query data in
multiple models such as relational, document,
graph, key-value, column-store, and cache. It is the
opposite approach to Polyglot Persistence – the
use of multiple databases in a workload.
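To make the contrast with polyglot persistence concrete, here is a toy sketch in which one copy of the data is reached through three access styles. TinyMultiModelStore, its method names and the customer records are hypothetical and stand in for what a real multi-model engine does with indexes and a query optimizer.

```python
class TinyMultiModelStore:
    """One copy of the data, three ways to reach it (illustrative only)."""

    def __init__(self):
        self._docs = {}  # the single copy of every record

    def put(self, key, doc):
        self._docs[key] = doc

    def get(self, key):
        # Key-value access: fetch by key, no inspection of the value.
        return self._docs.get(key)

    def find(self, predicate):
        # Document access: filter on fields inside the value.
        return [d for d in self._docs.values() if predicate(d)]

    def select(self, columns):
        # Relational-style access: project named fields into rows.
        return [tuple(d.get(c) for c in columns) for d in self._docs.values()]


store = TinyMultiModelStore()
store.put("cust:1", {"name": "Dan", "city": "Dallas",
                     "orders": [{"sku": "pizza-lg", "qty": 2}]})
store.put("cust:2", {"name": "Ann", "city": "Austin", "orders": []})

by_key = store.get("cust:1")
pizza_buyers = store.find(lambda d: any(o["sku"] == "pizza-lg" for o in d["orders"]))
rows = store.select(["name", "city"])
print(rows)
```

With polyglot persistence, each of these three access paths would typically live in a separate database holding its own copy of the customer data.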

Document-oriented Databases
What are they?
• Key-Value Stores with added capabilities
– Ability to nest sub-documents
• JSON/XML data models
• With Tree-Like Structure
• Encapsulated document objects
• Groups data together more naturally and
logically

Technical Characteristics:
• Store all data together
– Example: Order document contains all line items
• Documents are self-describing hierarchical tree
structures
• Unlike Key-Value Stores, the value part of the field
can be queried
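A short sketch of these characteristics, using plain Python dictionaries as stand-ins for stored documents: each order carries its own line items, and a query can reach into the value. The field names, sample orders and helper functions are assumptions for illustration.

```python
orders = [
    {
        "order_id": "o-1",
        "customer": "dan",
        "line_items": [
            {"sku": "pizza-lg", "qty": 2, "price": 12.5},
            {"sku": "soda", "qty": 1, "price": 2.0},
        ],
    },
    {
        "order_id": "o-2",
        "customer": "ann",
        "line_items": [{"sku": "salad", "qty": 1, "price": 7.0}],
    },
]


def orders_containing(sku, docs):
    """Query on a nested field, which a pure key-value store cannot do."""
    return [d["order_id"] for d in docs
            if any(item["sku"] == sku for item in d["line_items"])]


def order_total(doc):
    """Everything needed for the total is inside the one document."""
    return sum(item["qty"] * item["price"] for item in doc["line_items"])


print(orders_containing("soda", orders))
print(order_total(orders[0]))
```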

Good for:
• Semi-structured data
• Web pages
• Web traffic/E-Commerce
• Web analytics
• Log files
• User actions/behaviors
• Content Management Systems
• Full text
• Uncertain data
• Extending object-oriented approaches
• Event logging
• JSON/XML data

Document Example
{
  "type": "BakingRecipe",
  "name": "Mama’s Cornbread",
  "ingredients": [
    { "name": "cornmeal", "amount": "1c" },
    { "name": "flour", "amount": "3/4c" },
    { "name": "baking powder", "amount": "1-1/2t" },
    { "name": "eggs", "amount": "2 large" },
    { "name": "butter", "amount": "6T" },
    { "name": "buttermilk", "amount": "1-1/2c", "brand": "ABC Brand" }
  ],
  "ovenTemperature": "425 deg F",
  "bakeTime": "20 min"
}

Multiple NoSQL Solutions Working Together
You could use
• Key-Value Store for Shopping Cart and
Session Data
• Document or Column Store for Consuming
Completed Orders
• RDBMS for inventory (small, not served real-
time), financials
• Graph Store for Customer Relationships for
Marketing

Column Stores
What are they?
• Data model:
– A big table, with column families
– Map-reduce for querying/processing
• Schema-lite
• No single point of failure
• Operational simplicity
• Closest NoSQL implementation to RDBMS

Column Stores
Good for:
• Large amounts of data
• Data that needs compression
• Event logging
• Content Management Systems
• Data model supports semi-structured
data
• Naturally indexed (columns)
• Good at scaling out horizontally
• Time Series data
– Weather data
– Location data
– Sensor data

What to Look for in Multi-Model 1/2
• Excellent implementation of multiple
models
• Single copy of data
• Model change propagation
• Works in microservices world
• Submillisecond response time

What to Look for in Multi-Model 2/2
• Globally distributed multi-region
deployments
• Cross-model data processing language
and optimizer
• Edge-capable database
• JSON flattening without data explosion
• Universal indices
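One reading of "JSON flattening without data explosion" is sketched below: turning a document with several arrays into flat rows by cross product multiplies the row count, while one child table per array keeps it additive. Both helper functions and the sample order are assumptions for illustration; the slide does not say how any product implements this.

```python
from itertools import product


def flatten_cross_product(doc, array_fields):
    """Naive flattening: one row per combination of array elements."""
    scalars = {k: v for k, v in doc.items() if k not in array_fields}
    rows = []
    for combo in product(*(doc[f] for f in array_fields)):
        row = dict(scalars)
        row.update(zip(array_fields, combo))
        rows.append(row)
    return rows


def flatten_per_array(doc, key_field, array_fields):
    """One child table per array, each row keyed by the parent document."""
    return {
        f: [{key_field: doc[key_field], f: value} for value in doc[f]]
        for f in array_fields
    }


order = {
    "order_id": "o-1",
    "items": ["pizza", "soda", "salad"],
    "coupons": ["WELCOME", "FREESHIP"],
}

naive = flatten_cross_product(order, ["items", "coupons"])
split = flatten_per_array(order, "order_id", ["items", "coupons"])

print(len(naive))                              # rows grow as 3 * 2
print(sum(len(t) for t in split.values()))     # rows grow as 3 + 2
```

With more arrays or longer arrays the gap widens quickly, which is why the flattening strategy is worth asking a vendor about.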

Emerging Technologies
• Use of artificial
intelligence (AI)
• Integration with data
catalog platforms
• Robust user
experience
• Multi-cloud/native
application
