data brewery

Pluggable Model
Cubes Analytical Workspace Redesign

Stefan Urbanek – @Stiivi

February 2014
Original Cubes
Cubes before 1.0
Model
■ single JSON or a model bundle
■ contains all model objects
■ full description required
one file or one directory bundle

[Diagram: modules: model, browser, formatters, http, backends. One per serving: server and workspace.]

[workspace]
backend=sql
url=postgresql://localhost/database
Browser
[Diagram: multiple backends available. Aggregation Browser variants: SQL Snowflake Browser, SQL Denormalized Browser, MongoDB Browser, Some HTTP Data Service Browser, and possibly others (?).]
Backend
■ implemented as python module
with an entry point create_workspace()

■ provides Workspace and Browser
workspace represents data storage

■ only one Workspace per serving
only one kind of storage per serving
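A minimal sketch of the pre-1.0 arrangement described above: the backend is a Python module whose single entry point, create_workspace(), returns the one Workspace for the serving. Only the name create_workspace() comes from the slide. Its arguments, the Workspace class body, and the helpers make_backend_module() and open_workspace() are assumptions made for illustration.

```python
import types


def make_backend_module(name):
    """Build a stand-in for a backend module (hypothetical helper).

    A real backend would be an importable Python module; one is built at
    run time here only so that the example is self-contained.
    """
    module = types.ModuleType(name)

    class Workspace:
        """Represents the data storage for one serving."""

        def __init__(self, model, **options):
            self.model = model
            self.options = options

        def browser(self, cube):
            # Placeholder: a real workspace would return a Browser object.
            return "browser for {} via {}".format(cube, name)

    def create_workspace(model, **options):
        # The entry point named on the slide; the signature is assumed.
        return Workspace(model, **options)

    module.create_workspace = create_workspace
    return module


def open_workspace(backend_module, model, **options):
    """Call the single entry point the backend module exposes."""
    return backend_module.create_workspace(model, **options)


backend = make_backend_module("sql")
workspace = open_workspace(
    backend,
    model={"cubes": ["sales"]},
    url="postgresql://localhost/database",
)
```

Because the module hands back exactly one Workspace, a serving built this way is tied to one kind of storage, which is the limitation the redesign addresses.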
Requirements
Model
■ composed of multiple parts
■ external model definition
provided from external source, such as analytical service

■ shared dimension descriptions
only one dimension description is necessary per composed model
Backend
■ heterogeneous storage
multiple data stores, different types of data stores

■ different schemas in same store
■ multiple environments
dev, test, production, ...
Redesign
Backend
■ a “backend” is now multiple objects:
Provider, Browser, Store

■ better plug-in system
instead of Python module

■ more flexible composition
Backend Objects
■ Browser – performs aggregated browsing
■ Store – maintains database connection
■ Model Provider – provides model

Note: not every kind has to be implemented
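One way to picture a plug-in system built from separate backend objects is a registry keyed by kind and name. This is a sketch under assumptions, not the Cubes implementation: REGISTRY, register(), lookup(), and the "memory" classes are all hypothetical names.

```python
# Hypothetical registry: one namespace per kind of backend object.
REGISTRY = {"browser": {}, "store": {}, "provider": {}}


def register(kind, name):
    """Class decorator that records a plug-in under (kind, name)."""
    def decorator(cls):
        REGISTRY[kind][name] = cls
        return cls
    return decorator


def lookup(kind, name):
    """Return the registered class or raise LookupError."""
    try:
        return REGISTRY[kind][name]
    except KeyError:
        raise LookupError("no {} plug-in named {!r}".format(kind, name))


@register("store", "memory")
class MemoryStore:
    """Maintains the 'connection' (here just a list of rows)."""

    def __init__(self, rows):
        self.rows = rows


@register("browser", "memory")
class MemoryBrowser:
    """Performs aggregated browsing over the associated store."""

    def __init__(self, store):
        self.store = store

# No provider is registered for "memory": not every kind has to be
# implemented by a backend.
```

A backend that ships only a store and a browser can then rely on a model provider from elsewhere, for example a static model file.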
[Diagram: backend objects. Logical side: the Provider creates the model (cubes and dimensions) and the Browser aggregates (∑) over it. Physical side: the Store connects to the physical data store (database or API).]
Browser
■ depends on the logical model
■ implements aggregation
aggregate(), values(), …

■ gets data from associated store
[Diagram: browser. The Browser aggregates (∑) over the logical model and gets its data through the Store from the physical data store (database or API).]
Browser Methods
■ features()
■ aggregate()
■ members()
■ facts()
■ fact()
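The five method names above can be illustrated with a toy browser over a list of dictionaries. Only the method names come from the slide. The class ListBrowser, its arguments, and the return shapes are assumptions, and the signatures are simplified compared with a real Cubes browser.

```python
class ListBrowser:
    """Toy browser over a list of fact dictionaries (hypothetical)."""

    def __init__(self, rows, key="id"):
        self.rows = list(rows)
        self.key = key

    def features(self):
        # Advertise what this browser is able to do.
        return {"actions": ["aggregate", "members", "facts", "fact"]}

    def aggregate(self, measure, drilldown=None):
        # Sum a measure, optionally grouped by one dimension.
        if drilldown is None:
            return {"sum": sum(row[measure] for row in self.rows)}
        totals = {}
        for row in self.rows:
            group = row[drilldown]
            totals[group] = totals.get(group, 0) + row[measure]
        return totals

    def members(self, dimension):
        # Distinct values of a dimension, sorted.
        return sorted({row[dimension] for row in self.rows})

    def facts(self):
        # All facts.
        return list(self.rows)

    def fact(self, key):
        # A single fact by key, or None when it does not exist.
        for row in self.rows:
            if row[self.key] == key:
                return row
        return None


rows = [
    {"id": 1, "region": "east", "amount": 10},
    {"id": 2, "region": "west", "amount": 5},
    {"id": 3, "region": "east", "amount": 7},
]
browser = ListBrowser(rows)
```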
Store*
■ provides database or API connection
■ might provide a model
■ slicer tool actions
physical mapping validation, model from schema generation,
schema from model generation, schema conversions and
optimization, ...

*former backend’s “Workspace” object
[Diagram: store. The Store connects the Browser to the physical data store (database or API).]
Store Methods
Store is not required to implement any
methods at this time. Future:

■ validate(cube) – does logical map to physical?
■ create(object) – create physical structure
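As a sketch of what a future validate(cube) might check, the store below compares a cube's logical attributes with the columns of a physical table. The class DictStore, the cube dictionary layout (fact_table, attributes), and the list-of-problems return value are assumptions, not part of Cubes.

```python
class DictStore:
    """Toy store whose physical schema is a dict: table -> set of columns."""

    def __init__(self, tables):
        self.tables = tables

    def validate(self, cube):
        """Return a list of problems; an empty list means the logical
        cube maps onto the physical structure."""
        columns = self.tables.get(cube["fact_table"])
        if columns is None:
            return ["missing table: " + cube["fact_table"]]
        problems = []
        for attribute in cube["attributes"]:
            if attribute not in columns:
                problems.append("missing column: " + attribute)
        return problems


store = DictStore({"sales": {"id", "region", "amount"}})
good_cube = {"fact_table": "sales", "attributes": ["region", "amount"]}
bad_cube = {"fact_table": "sales", "attributes": ["region", "discount"]}
lost_cube = {"fact_table": "churn", "attributes": ["month"]}
```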
Model Provider
■ creates model from external source
■ might suggest store to be used
[Diagram: model provider. The Provider creates the logical model: cubes and dimensions.]
Provider Methods
■ dimension_metadata(name,temps,locale)
■ cube_metadata(name,locale)
or

■ dimension(name,temps,locale)
■ cube(name,locale)
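A provider that answers the two metadata calls from an in-memory dictionary might look like the sketch below. The method names and parameter names follow the slide. The class DictModelProvider, the shape of the source dictionary, and the error handling are assumptions. The temps and locale parameters are accepted but unused here, since the slide does not describe their behavior.

```python
class DictModelProvider:
    """Toy provider that serves model metadata from a dictionary."""

    def __init__(self, source):
        # source is assumed to look like
        # {"cubes": {name: {...}}, "dimensions": {name: {...}}}
        self.source = source

    def cube_metadata(self, name, locale=None):
        try:
            return dict(self.source["cubes"][name])
        except KeyError:
            raise LookupError("unknown cube: " + name)

    def dimension_metadata(self, name, temps=None, locale=None):
        try:
            return dict(self.source["dimensions"][name])
        except KeyError:
            raise LookupError("unknown dimension: " + name)


provider = DictModelProvider({
    "cubes": {"sales": {"name": "sales", "dimensions": ["region"]}},
    "dimensions": {"region": {"name": "region", "levels": ["region"]}},
})
```

Returning copies keeps callers from modifying the provider's source by accident.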
[Diagram: example backends]
SQL Backend: Snowflake Browser, SQL Store
Mongo Backend: Mongo Browser, Mongo Store
Google Analytics Backend: GA Browser, GA Store, GA Model Provider
from cubes import AggregateBrowser, Store


class SQLStore(Store):
    default_browser_name = "sql_snowflake"

    def __init__(self, **options):
        # initialize the store here; options come from slicer.ini
        self.options = options

    def validate_cube(self, cube):
        return True  # if valid


class SQLSnowflakeBrowser(AggregateBrowser):
    def __init__(self, model, locale):
        # initialize the browser
        self.model = model
        self.locale = locale

    def features(self):
        # return list of browser features
        return []

    def aggregate(self, cell, **options):
        # return aggregation of the cell
        # (further arguments are elided on the slide)
        raise NotImplementedError
New Workspace*
■ global object at library level
■ provides appropriate browser
■ contains run-time configuration
■ might have state persistence
*former backend Workspace is now Store
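How a workspace might "provide the appropriate browser" can be sketched as a lookup from cube to store to browser class. The attribute default_browser_name appears in the SQLStore slide. Everything else here (the Workspace methods, the BROWSERS table, the class names other than SQLStore) is a hypothetical illustration rather than the Cubes API.

```python
class Store:
    default_browser_name = None

    def __init__(self, **options):
        self.options = options


class SQLStore(Store):
    default_browser_name = "sql_snowflake"


class MongoStore(Store):
    default_browser_name = "mongo"


class Browser:
    def __init__(self, cube, store):
        self.cube = cube
        self.store = store


class SnowflakeBrowser(Browser):
    pass


class MongoBrowser(Browser):
    pass


# Hypothetical table of known browser classes by name.
BROWSERS = {"sql_snowflake": SnowflakeBrowser, "mongo": MongoBrowser}


class Workspace:
    """Holds run-time configuration: stores, and which cube lives where."""

    def __init__(self):
        self.stores = {}
        self.cube_stores = {}

    def register_store(self, name, store):
        self.stores[name] = store

    def register_cube(self, cube, store_name):
        self.cube_stores[cube] = store_name

    def browser(self, cube):
        # Pick the cube's store, then the browser that store prefers.
        store = self.stores[self.cube_stores[cube]]
        browser_class = BROWSERS[store.default_browser_name]
        return browser_class(cube, store)


workspace = Workspace()
workspace.register_store("bidata", SQLStore(url="postgresql://localhost/crm"))
workspace.register_store("bidata2", MongoStore(host="localhost"))
workspace.register_cube("sales", "bidata")
workspace.register_cube("events", "bidata2")
```

With this arrangement, cubes from different kinds of storage can be served side by side from one workspace.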
Future Workspace
■ caching
■ cube composition
■ …?
Workspace Example
[Diagram: heterogeneous environment. A Workspace with model providers (API Model Provider, Static Model Provider), cubes (sales, churn, activations, events), and stores (BI Data (Postgres), BI Data 2 (Mongo), Events (API)).]
Workspace
[Diagram: the workspace annotated with its configuration. Static Model Provider models: crm, sales, events. Cubes: sales, churn, activations, events. Stores: BI Data (Postgres), BI Data 2 (Mongo).]

[workspace]
models_path: /var/lib/cubes/models

[models]
crm: crm.cubesmodel
sales: sales.cubesmodel
events: events.cubesmodel

[datastore_bidata]
type: sql
url: postgresql://localhost/crm

[datastore_bidata2]
type: mongo
host: localhost
collection: events
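The configuration sections above are plain INI, so they can be read with Python's standard configparser. The sketch below shows one possible way to collect the stores and models. Treating the part after "datastore_" as the store name is an assumption, and read_stores() is a hypothetical helper, not a Cubes function.

```python
import configparser

SLICER_INI = """
[workspace]
models_path: /var/lib/cubes/models

[models]
crm: crm.cubesmodel
sales: sales.cubesmodel
events: events.cubesmodel

[datastore_bidata]
type: sql
url: postgresql://localhost/crm

[datastore_bidata2]
type: mongo
host: localhost
collection: events
"""


def read_stores(config):
    """Collect every [datastore_*] section as store name -> options."""
    prefix = "datastore_"
    stores = {}
    for section in config.sections():
        if section.startswith(prefix):
            stores[section[len(prefix):]] = dict(config[section])
    return stores


config = configparser.ConfigParser()
config.read_string(SLICER_INI)

stores = read_stores(config)
models = dict(config["models"])
models_path = config["workspace"]["models_path"]
```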
Conclusion
■ heterogeneous pluggable environment
■ externally provided models
■ easier backend implementation
Cubes Home

cubes.databrewery.org
github

github.com/Stiivi/cubes
Development Documentation

cubes.databrewery.org/dev/doc/
for github master HEAD
