Cubes – pluggable model explained

Description of the new pluggable model in Cubes, a lightweight Python OLAP framework.

Published in: Technology

  1. Pluggable Model: Cubes Analytical Workspace Redesign. Stefan Urbanek (@Stiivi), Data Brewery, February 2014
  2. Original Cubes: Cubes before 1.0
  3. Model ■ single JSON file or a model bundle ■ contains all model objects ■ full description required
  4. Original architecture (diagram) ■ model: one file or one directory bundle ■ modules: model, browser, formatters, http, backends, server, workspace ■ one per serving ■ configuration: [workspace] backend=sql url=postgresql://localhost/database
  5. Browser (diagram): multiple backends available ■ Aggregation Browser ■ SQL Snowflake Browser ■ SQL Denormalized Browser ■ MongoDB Browser ■ Some HTTP Data Service Browser ■ ?
  6. Backend ■ implemented as a Python module with an entry point create_workspace() ■ provides Workspace and Browser; the workspace represents data storage ■ only one Workspace per serving, so only one kind of storage per serving
  7. Requirements
  8. Model ■ composed of multiple parts ■ external model definition: provided from an external source, such as an analytical service ■ shared dimension descriptions: only one dimension description is necessary per composed model
  9. Backend ■ heterogeneous storage: multiple data stores, different types of data stores ■ different schemas in the same store ■ multiple environments: dev, test, production, ...
  10. Redesign
  11. Backend ■ a “backend” is multiple objects: Provider, Browser, Store ■ better plug-in system instead of a Python module ■ more flexible composition
  12. Backend Objects ■ Browser – performs aggregated browsing ■ Store – maintains database connection ■ Model Provider – provides model ■ Note: not every kind has to be implemented
  13. Backend objects (diagram) ■ logical side: the Provider creates the model (cubes, dimensions) ■ physical side: the Store connects to the physical data store (database or API) ■ the Browser aggregates (∑), sitting between the model and the Store
  14. Browser
  15. Browser ■ depends on the logical model ■ implements aggregation: aggregate(), values(), … ■ gets data from the associated store
  16. Browser (diagram) ■ logical side: model ■ physical side: Store and the physical data store (database or API) ■ the Browser aggregates (∑) between the two
  17. Browser Methods ■ features() ■ aggregate() ■ members() ■ facts() ■ fact()
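
The five methods on slide 17 can be illustrated with a toy browser over an in-memory list of facts. This is only a sketch: the class name `ListBrowser`, the method signatures, the dictionary-based cell, and the return shapes are all assumptions made for illustration, not the Cubes API.

```python
# Hypothetical in-memory browser; names and signatures are assumptions,
# not taken from Cubes.

class ListBrowser:
    """Browses a list of fact dictionaries, each carrying an "id" key."""

    def __init__(self, facts, measure):
        self._facts = list(facts)
        self._measure = measure

    def features(self):
        # Advertise which actions this browser supports.
        return {"actions": ["aggregate", "members", "facts", "fact"]}

    def _in_cell(self, fact, cell):
        # A cell is modelled here as a dict of dimension -> required value.
        return all(fact.get(dim) == value for dim, value in cell.items())

    def facts(self, cell=None):
        # All facts that fall inside the cell (every fact if no cell given).
        cell = cell or {}
        return [f for f in self._facts if self._in_cell(f, cell)]

    def fact(self, key):
        # A single fact looked up by its "id", or None if absent.
        for f in self._facts:
            if f["id"] == key:
                return f
        return None

    def members(self, dimension, cell=None):
        # Distinct values of one dimension within the cell.
        return sorted({f[dimension] for f in self.facts(cell)})

    def aggregate(self, cell=None):
        # Record count and sum of the measure within the cell.
        selected = self.facts(cell)
        return {
            "record_count": len(selected),
            self._measure + "_sum": sum(f[self._measure] for f in selected),
        }


SALES = [
    {"id": 1, "year": 2013, "region": "EU", "amount": 10},
    {"id": 2, "year": 2013, "region": "US", "amount": 20},
    {"id": 3, "year": 2014, "region": "EU", "amount": 5},
]
browser = ListBrowser(SALES, measure="amount")
summary = browser.aggregate({"region": "EU"})
```

A real browser would translate the cell into a query against its store instead of filtering a list.
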
  18. Store
  19. Store* ■ provides database or API connection ■ might provide a model ■ slicer tool actions: physical mapping validation, model from schema generation, schema from model generation, schema conversions and optimization, ... (*former backend’s “Workspace” object)
  20. Store (diagram) ■ physical side: the Store connects to the physical data store (database or API) ■ the Browser sits on top of the Store
  21. Store Methods: Store is not required to implement any methods at this time. Future: ■ validate(cube) – does logical map to physical? ■ create(object) – create physical structure
  22. Model Provider
  23. Model Provider ■ creates a model from an external source ■ might suggest the store to be used
  24. Model Provider (diagram) ■ logical side: the Provider creates the model (cubes, dimensions)
  25. Provider Methods ■ dimension_metadata(name, temps, locale) ■ cube_metadata(name, locale), or ■ dimension(name, temps, locale) ■ cube(name, locale)
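
Slide 25 offers either the `*_metadata` pair or the object-returning pair. A minimal sketch of the metadata variant follows, assuming the model lives in a plain dictionary. `DictModelProvider`, the dictionary layout, and the `LookupError` behaviour are hypothetical; the `temps` argument is kept as spelled on the slide and is not interpreted here.

```python
# Hypothetical provider backed by a dictionary; not the Cubes base class.

class DictModelProvider:
    def __init__(self, metadata):
        # Index cube and dimension descriptions by name for quick lookup.
        self._cubes = {c["name"]: c for c in metadata.get("cubes", [])}
        self._dimensions = {
            d["name"]: d for d in metadata.get("dimensions", [])
        }

    def cube_metadata(self, name, locale=None):
        # Return a copy so callers cannot mutate the provider's state.
        try:
            return dict(self._cubes[name])
        except KeyError:
            raise LookupError("no such cube: %s" % name)

    def dimension_metadata(self, name, temps=None, locale=None):
        # `temps` is accepted as on the slide but unused in this sketch.
        try:
            return dict(self._dimensions[name])
        except KeyError:
            raise LookupError("no such dimension: %s" % name)


provider = DictModelProvider({
    "cubes": [{"name": "sales", "dimensions": ["date", "region"]}],
    "dimensions": [{"name": "date"}, {"name": "region"}],
})
sales = provider.cube_metadata("sales")
```

A provider for an external analytical service would fetch these descriptions over its API rather than from a local dictionary.
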
  26. Example backends ■ SQL Backend: Snowflake Browser, SQL Store ■ Mongo Backend: Mongo Browser, Mongo Store ■ Google Analytics Backend: GA Browser, GA Store, GA Model Provider
  27. Code example (annotation on the slide: “from slicer.ini”):

        from cubes import AggregateBrowser, Store


        class SQLStore(Store):
            default_browser_name = "sql_snowflake"

            def __init__(self, **options):
                # initialize the store here
                pass

            def validate_cube(self, cube):
                return True  # if valid


        class SQLSnowflakeBrowser(AggregateBrowser):
            def __init__(self, model, locale):
                # initialize the browser
                pass

            def features(self):
                # return list of browser features
                pass

            def aggregate(self, cell, **options):
                # return aggregation of the cell
                # (the slide elides the remaining arguments as "...")
                pass

  28. New Workspace* ■ global object at library level ■ provides appropriate browser ■ contains run-time configuration ■ might have state persistence (*the former backend Workspace is now Store)
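
One way to read "provides appropriate browser" is that the workspace looks up the store behind a cube and instantiates the browser that store prefers, using the `default_browser_name` attribute seen on slide 27. The sketch below assumes exactly that; every class and method name in it is hypothetical and none of it is the Cubes API.

```python
# Hypothetical names throughout; a sketch of the composition only.

class SQLStore:
    # Name of the browser type this kind of store prefers.
    default_browser_name = "sql_snowflake"

    def __init__(self, **options):
        self.options = options


class SnowflakeBrowser:
    def __init__(self, cube, store):
        self.cube = cube
        self.store = store


class Workspace:
    def __init__(self):
        self.stores = {}          # store name -> store instance
        self.browser_types = {}   # browser name -> browser class
        self.cube_stores = {}     # cube name -> store name

    def register_browser(self, name, browser_class):
        self.browser_types[name] = browser_class

    def add_store(self, name, store):
        self.stores[name] = store

    def add_cube(self, cube, store_name):
        self.cube_stores[cube] = store_name

    def browser(self, cube):
        # Find the cube's store, then the browser type that store prefers.
        store = self.stores[self.cube_stores[cube]]
        browser_class = self.browser_types[store.default_browser_name]
        return browser_class(cube, store)


workspace = Workspace()
workspace.register_browser("sql_snowflake", SnowflakeBrowser)
workspace.add_store("bidata", SQLStore(url="postgresql://localhost/crm"))
workspace.add_cube("sales", "bidata")
sales_browser = workspace.browser("sales")
```

Because the browser type is resolved per cube, cubes backed by different kinds of stores can be served from the same workspace.
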
  29. Future Workspace ■ caching ■ cube composition ■ …?
  30. Workspace Example
  31. Workspace (diagram): heterogeneous environment ■ Model Providers: API Model Provider, Static Model Provider ■ Cubes: sales, churn, activations, events ■ Stores: BI Data (Postgres), BI Data 2 (Mongo), Events (API)
  32. Workspace (diagram with configuration) ■ Model Providers: Static Model Provider with models crm, sales, events ■ Cubes: sales, churn, activations, events ■ Stores: BI Data (Postgres), BI Data 2 (Mongo) ■ configuration:

        [workspace]
        models_path: /var/lib/cubes/models

        [models]
        crm: crm.cubesmodel
        sales: sales.cubesmodel
        events: events.cubesmodel

        [datastore_bidata]
        type: sql
        url: postgresql://localhost/crm

        [datastore_bidata2]
        type: mongo
        host: localhost
        collection: events

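
The configuration on slide 32 is ordinary INI syntax, so the store definitions can be read with Python's standard `configparser`. The sketch below assumes that a section named `datastore_<name>` describes the store `<name>`, which is inferred from the slide; the helper `read_stores` is hypothetical and is not how Cubes itself loads the file.

```python
import configparser

# Configuration text as shown on slide 32.
SLICER_INI = """
[workspace]
models_path: /var/lib/cubes/models

[models]
crm: crm.cubesmodel
sales: sales.cubesmodel
events: events.cubesmodel

[datastore_bidata]
type: sql
url: postgresql://localhost/crm

[datastore_bidata2]
type: mongo
host: localhost
collection: events
"""

PREFIX = "datastore_"  # assumed naming convention for store sections


def read_stores(text):
    """Return {store name: options} for every [datastore_*] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    stores = {}
    for section in parser.sections():
        if section.startswith(PREFIX):
            stores[section[len(PREFIX):]] = dict(parser[section])
    return stores


stores = read_stores(SLICER_INI)
```

Each resulting options dictionary could then be passed as keyword arguments to a store constructor such as the `__init__(self, **options)` shown on slide 27.
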
  33. Conclusion
  34. Conclusion ■ heterogeneous pluggable environment ■ externally provided models ■ easier backend implementation
  35. Cubes ■ Home: cubes.databrewery.org ■ GitHub: github.com/Stiivi/cubes ■ Development documentation (for GitHub master HEAD): cubes.databrewery.org/dev/doc/
