2. Common Issues of Traditional BI
High reliance on pre-processing of information (cubes
and views) limits user ability to explore the data beyond
pre-programmed reports
Explosion of data stored in the DB: less than 10% is
productive data, the rest is derived data
Difficult for users to create new reports, which effectively
prevents managers and other decision makers from
using the tool in day-to-day work for real-time decision
support
Expensive consultants are needed to re-program reports
– costing both money and time
3. Short overview of Calligraph terminology and
design principles
Calligraph delivers integrated OLAP + Query & Reporting
functionality for decision support: On Line, On The Fly,
On Demand
Calligraph is aimed at the end user without any programming
skills
The user interacts with Calligraph in their native language and in
the terms of their subject domain. We call this the User Semantic
Layer. It is created according to the user's rights to access the
data
Tables of any complexity and size, in any theoretically possible
representation, are translated into a linear set of queries with
automatic parallelization. This key characteristic prevents an
“information blast” and keeps information processing time
linearly dependent on the volume of information
being processed.
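A minimal sketch of the idea, not Calligraph's actual implementation: each cell of a cross table is expressed as one independent query, and the queries are dispatched to a worker pool. The `sales` table, its columns, and the helper names (`cell_queries`, `run_query`, `build_cross_table`) are assumptions made for illustration, and SQLite stands in for the real database.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical boundary conditions for the rows and columns of a cross table.
ROW_CONDITIONS = {"Europe": "region = 'Europe'", "Asia": "region = 'Asia'"}
COL_CONDITIONS = {"2010": "year = 2010", "2011": "year = 2011"}


def cell_queries(rows, cols):
    """Build one independent SELECT per (row, column) cell."""
    return {
        (r, c): f"SELECT COALESCE(SUM(amount), 0) FROM sales WHERE ({rc}) AND ({cc})"
        for (r, rc), (c, cc) in product(rows.items(), cols.items())
    }


def run_query(db_path, sql):
    # Each worker opens its own connection, so the queries share no state.
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(sql).fetchone()[0]
    finally:
        conn.close()


def build_cross_table(db_path):
    """Run all cell queries through a thread pool and collect the results."""
    queries = cell_queries(ROW_CONDITIONS, COL_CONDITIONS)
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {cell: pool.submit(run_query, db_path, sql)
                   for cell, sql in queries.items()}
        return {cell: future.result() for cell, future in futures.items()}


def make_demo_db(db_path):
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
        ("Europe", 2010, 10.0), ("Europe", 2011, 20.0),
        ("Asia", 2010, 5.0), ("Europe", 2010, 2.5),
    ])
    conn.commit()
    conn.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        make_demo_db(db_path)
        return build_cross_table(db_path)
```

The number of queries equals the number of cells, and no query depends on another, which is what makes this kind of decomposition easy to run in parallel.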
4. Calligraph technology benefits
Flexible multidimensional on-line analysis of recent data
for decision making support
Users generate queries and reports by interacting directly
with the system in the terms of their subject domain
Each user's semantic layer is formed in strict conformity
with that user's rights to access the data
Queries are highly parallelizable and their structure
allows optimal execution
Supports connection to any relational DB via OLE DB
Full conformity to all 12 classical OLAP rules
5. Types of tables formed by Calligraph
Listing table example
Analytic (or cross) table example
6. Important notes to the previous slide
There are only two principal types of homogeneous tables:
Listing tables
Analytical tables (cross tables)
Any other table is non-homogeneous and can be
decomposed into components of either listing or analytic
tables
Ergonomics holds that people readily perceive only
homogeneous tables; any non-homogeneous (composite)
table is perceived piecemeal, by picking
out and analyzing its homogeneous components
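To make the two table types concrete, here is a small sketch that turns a listing table into an analytic (cross) table. The sample records and the helper name `to_cross_table` are hypothetical and only illustrate the distinction.

```python
from collections import defaultdict

# A listing table: one row per record, one column per field.
LISTING = [
    {"region": "Europe", "year": 2010, "amount": 10.0},
    {"region": "Europe", "year": 2011, "amount": 20.0},
    {"region": "Asia", "year": 2010, "amount": 5.0},
    {"region": "Europe", "year": 2010, "amount": 2.5},
]


def to_cross_table(listing, row_field, col_field, value_field):
    """Aggregate a listing table into a cross table (rows x columns of sums)."""
    cells = defaultdict(float)
    for rec in listing:
        cells[(rec[row_field], rec[col_field])] += rec[value_field]
    row_keys = sorted({r for r, _ in cells})
    col_keys = sorted({c for _, c in cells})
    # Fill missing combinations with 0.0 so every row has every column.
    return {r: {c: cells.get((r, c), 0.0) for c in col_keys} for r in row_keys}


CROSS = to_cross_table(LISTING, "region", "year", "amount")
```

In the listing table every row is a record; in the cross table every cell is an aggregate at the intersection of a row value and a column value.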
7. Definition of “Task”
A user (such as a manager) can have access to different
types of information (for example, commercial, HR,
logistics and warehouse, finance, etc.) from different
DBs deployed by the enterprise
To ease perception, the User Semantic Layer can be logically
split into linked fragments, which we call Tasks:
“Commerce”, “HR”, “Warehouse”, “Finance”, etc.
Technically, a Task is a set of fields from different DB
tables with all necessary connections between them.
Each field has its own user-friendly alias. This creates
an environment that is clear and convenient for the end
user.
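One possible way to model a Task in code is sketched below. The class names (`TaskField`, `Task`), the table and column names, and the SQL generation are assumptions for illustration; the slides do not describe how Calligraph stores Tasks internally.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskField:
    alias: str    # user-friendly name shown to the end user
    table: str    # physical DB table (hypothetical)
    column: str   # physical DB column (hypothetical)


@dataclass
class Task:
    name: str
    fields: list
    # Connections between tables as (table_a, table_b, join_condition).
    joins: list = field(default_factory=list)

    def select_sql(self, aliases):
        """Translate user-facing aliases into a SELECT over the mapped tables."""
        by_alias = {f.alias: f for f in self.fields}
        chosen = [by_alias[a] for a in aliases]
        columns = ", ".join(f'{f.table}.{f.column} AS "{f.alias}"' for f in chosen)
        tables = sorted({f.table for f in chosen})
        sql = f"SELECT {columns} FROM {', '.join(tables)}"
        # Add only the connections whose tables are both in use.
        links = [cond for a, b, cond in self.joins if a in tables and b in tables]
        if links:
            sql += " WHERE " + " AND ".join(links)
        return sql


hr = Task(
    name="HR",
    fields=[
        TaskField("Employee name", "employees", "full_name"),
        TaskField("Department", "departments", "title"),
    ],
    joins=[("employees", "departments", "employees.dept_id = departments.id")],
)
```

The user only ever sees the aliases ("Employee name", "Department"); the mapping to tables and join conditions stays inside the Task.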
9. Configuration of the user semantic layer (field names
and mapping)
In effect, this is an example of creating the User Semantic
Layer manually (automatic creation is also possible)
10. User Semantic Layer
A list of field names accessible to the user, in the user's language
and in the terms of the user's subject domain
11. Definition of “Gradation”
A gradation is any field from the user semantic layer together with
a set of boundary conditions
The boundary conditions of a gradation are connected by
logical “OR”
Conditions can be grouped into simple or extended totals
A gradation is used to create a dimension in an analytic table,
in a filter or in the “master-detail” section
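The definition above can be sketched as a small data structure. This is an illustrative assumption, not Calligraph's internal model: the names `Condition` and `Gradation`, the `age` field, and the SQL fragments are hypothetical, and the grouping of conditions into totals is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    label: str   # what the user sees, e.g. a row or column header
    sql: str     # boundary condition on the underlying field


@dataclass(frozen=True)
class Gradation:
    field_name: str
    conditions: tuple

    def labels(self):
        """Each boundary condition yields one row or column of a dimension."""
        return [c.label for c in self.conditions]

    def where_clause(self):
        """Boundary conditions of one gradation are combined with OR."""
        return " OR ".join(f"({c.sql})" for c in self.conditions)


age_groups = Gradation(
    field_name="Age",
    conditions=(
        Condition("under 30", "age < 30"),
        Condition("30 to 50", "age BETWEEN 30 AND 50"),
        Condition("over 50", "age > 50"),
    ),
)
```

Used as a dimension, `age_groups` would contribute three headers; used in a filter, it would contribute the single OR-combined clause.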
12. Difference from OLAP using cubes
Any field from the user semantic layer can form a dimension
for an analytical table
All DB fields are “equal”, without separating them into
“dimensions” and “facts”
Boundary conditions for gradations can also be
described as a range, a mask or a formula
The user can create a “virtual” gradation (i.e. a gradation that
is calculated by applying a formula), enabling “what-if”
analysis on the fly
No need to pre-process data and create (and then
continually update) cubes, which limit the analysis to
what was built in advance; the user can specify
any dimension through direct interaction with Calligraph
The user creates a table template in any theoretically possible
layout online
All queries are performed online and can be parallelized
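As a rough illustration of range, mask and formula conditions, and of a “virtual” gradation used for what-if analysis, consider the sketch below. All names (`in_range`, `matches_mask`, `with_virtual_field`, `classify`) and the sample orders are hypothetical; shell-style wildcards from Python's `fnmatch` stand in for whatever mask syntax Calligraph actually uses.

```python
from fnmatch import fnmatchcase


def in_range(field, low, high):
    """Range condition: low <= value <= high."""
    return lambda rec: low <= rec[field] <= high


def matches_mask(field, pattern):
    """Mask condition using shell-style wildcards (an assumed mask syntax)."""
    return lambda rec: fnmatchcase(str(rec[field]), pattern)


def with_virtual_field(records, name, formula):
    """Add a calculated field to every record without touching stored data."""
    return [{**rec, name: formula(rec)} for rec in records]


def classify(rec, gradation):
    """Return the labels of all boundary conditions the record satisfies."""
    return [label for label, condition in gradation.items() if condition(rec)]


ORDERS = [
    {"sku": "A-100", "qty": 10, "price": 5.0},
    {"sku": "B-200", "qty": 1, "price": 20.0},
]

# What if prices rose by 10%? The virtual field is computed on the fly.
what_if = with_virtual_field(
    ORDERS, "revenue_up_10pct", lambda r: r["qty"] * r["price"] * 1.1
)

REVENUE_BANDS = {
    "up to 30": in_range("revenue_up_10pct", 0, 30),
    "above 30": lambda rec: rec["revenue_up_10pct"] > 30,  # formula condition
}
SKU_FAMILY = {
    "A family": matches_mask("sku", "A-*"),
    "B family": matches_mask("sku", "B-*"),
}
```

Because the virtual field is derived at query time, changing the formula changes the analysis immediately, with no cube to rebuild.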
13. Definition of “Filter”
We use a filter when we need to apply certain conditions to all data
in a particular query
A gradation is the minimal element of a filter
Several gradations connected by a logical “AND” are
called an aggregate
A filter is a set of aggregates connected by
logical “AND” or “OR” (in any order)
Calligraph sets no limits on the “depth” of a filter or its
length
Filters give the user an easy, visual way to create data
filtering rules on the fly
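The gradation, aggregate and filter hierarchy can be sketched with plain predicates. This is only an illustration under assumed names (`gradation`, `aggregate`, `any_of`, `all_of`, `apply_filter`) and sample data; it does not reflect how Calligraph builds its queries.

```python
def gradation(field, *conditions):
    """Boundary conditions of one gradation are combined with OR."""
    return lambda rec: any(cond(rec[field]) for cond in conditions)


def aggregate(*gradations):
    """An aggregate is several gradations combined with AND."""
    return lambda rec: all(g(rec) for g in gradations)


def any_of(*parts):
    """Combine aggregates (or nested filters) with OR."""
    return lambda rec: any(p(rec) for p in parts)


def all_of(*parts):
    """Combine aggregates (or nested filters) with AND."""
    return lambda rec: all(p(rec) for p in parts)


def apply_filter(records, flt):
    return [rec for rec in records if flt(rec)]


ROWS = [
    {"region": "Europe", "amount": 1500, "segment": "Retail"},
    {"region": "Asia", "amount": 200, "segment": "VIP"},
    {"region": "Africa", "amount": 5000, "segment": "Retail"},
    {"region": "Asia", "amount": 50, "segment": "Retail"},
]

region = gradation("region", lambda v: v == "Europe", lambda v: v == "Asia")
large = gradation("amount", lambda v: v >= 1000)
vip = gradation("segment", lambda v: v == "VIP")

# (region AND large) OR (vip)
flt = any_of(aggregate(region, large), aggregate(vip))
selected = apply_filter(ROWS, flt)
```

Since `any_of` and `all_of` accept other filters as parts, the expression can be nested to any depth.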
14. Definition of “Master-Detail”
Any complex table can be automatically split into a set of
simpler tables by dragging and dropping any gradation into the
“master-detail” section of the query
The simpler tables are formed by using the boundary conditions
of the dropped gradation to select the information
Example: an analytic report on EMC business around the
world can contain the gradation “Continents”. If the user moves
this gradation into “master-detail”, the complex table is
split into several simpler tables, each containing only
information about business on one continent. If the user
then also moves the gradation “Country”, every
continent table is further split
into several tables, one per country.
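The continent and country example can be sketched as a recursive split. The function `split_master_detail`, the helper `equals`, and the sample records are hypothetical and serve only to illustrate the mechanism.

```python
def equals(expected):
    """Boundary condition: the field value equals the expected value."""
    return lambda value: value == expected


def split_master_detail(records, gradations):
    """Split records by each gradation in turn.

    gradations is a list of (field, {label: condition}) pairs; the result is
    nested dicts keyed by condition label, with lists of records as leaves.
    """
    if not gradations:
        return records
    (field, conditions), rest = gradations[0], gradations[1:]
    result = {}
    for label, condition in conditions.items():
        subset = [rec for rec in records if condition(rec[field])]
        if subset:  # skip tables that would be empty
            result[label] = split_master_detail(subset, rest)
    return result


RECORDS = [
    {"continent": "Europe", "country": "France", "revenue": 10},
    {"continent": "Europe", "country": "Germany", "revenue": 20},
    {"continent": "Asia", "country": "Japan", "revenue": 30},
]

CONTINENTS = ("continent", {"Europe": equals("Europe"), "Asia": equals("Asia")})
COUNTRIES = ("country", {
    "France": equals("France"),
    "Germany": equals("Germany"),
    "Japan": equals("Japan"),
})

by_continent = split_master_detail(RECORDS, [CONTINENTS])
by_country = split_master_detail(RECORDS, [CONTINENTS, COUNTRIES])
```

Moving one gradation into master-detail gives one table per continent; moving a second gives one table per country within each continent.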
15. Definition of “Drill-Down”
Any cell of the analytical table contains data that was
filtered by the boundary conditions set for its column
and row, as well as those defined by “master-detail”.
Decision making often requires a detailed understanding of
the information in the analytical table, for example to
understand the reasons behind unsatisfactory results.
Calligraph provides an easy way to achieve this, with
the maximum level of detail allowed by the user's data
access rights.
The user can select a cell (or cells) of the table, press
“Drill-Down”, and get an automatically generated listing table
with all fields from the analytical table.
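A minimal sketch of drill-down under stated assumptions: the function `drill_down`, its parameters, and the sample records are hypothetical. The listing is built from the records that satisfy the cell's row and column conditions plus any master-detail conditions, and is restricted to the fields the user is allowed to see.

```python
def drill_down(records, row_condition, column_condition,
               master_detail=(), allowed_fields=None):
    """Return the listing table behind one cell of an analytical table."""
    selected = [
        rec for rec in records
        if row_condition(rec)
        and column_condition(rec)
        and all(cond(rec) for cond in master_detail)
    ]
    if allowed_fields is None:
        return selected
    # Keep only fields the user has the right to access.
    return [{k: v for k, v in rec.items() if k in allowed_fields}
            for rec in selected]


RECORDS = [
    {"region": "Europe", "year": 2010, "amount": 10.0, "customer": "Acme", "margin": 0.2},
    {"region": "Europe", "year": 2010, "amount": 2.5, "customer": "Bolt", "margin": 0.1},
    {"region": "Europe", "year": 2011, "amount": 20.0, "customer": "Acme", "margin": 0.3},
    {"region": "Asia", "year": 2010, "amount": 5.0, "customer": "Core", "margin": 0.4},
]

# The user selects the cell (row "Europe", column "2010").
listing = drill_down(
    RECORDS,
    row_condition=lambda rec: rec["region"] == "Europe",
    column_condition=lambda rec: rec["year"] == 2010,
    allowed_fields={"region", "year", "amount", "customer"},
)
```

Here the `margin` field is dropped from the listing because it is not in the assumed set of fields this user may access.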
16. Demo block diagram
[Block diagram. Labels: Greenplum Master Server; Segment (x4); Windows Client Machine; Calligraph; Remote Desktop; Sample Database]
18. Current status of Calligraph
Version 5.2 is available as a standalone Windows
application
Hundreds of copies have been sold and are in use
at large and small enterprises
Calligraph customers include Atommash and
Novorosiyskiy port, large medical institutes and
hospitals, government bodies (Republican Statistical Service,
Russian State Parliament, tax authorities, police
departments, etc.) and small and medium businesses.
Calligraph is registered with the Russian agency for patents
and trademarks
19. Possible ways to further develop
Calligraph technology
Cloud service
External reporting unit for CASE system
Automatic configuration
Support for Hadoop
Voice input etc.
20. Benefits of Calligraph to
EMC/Greenplum
Full alignment with Greenplum data analytics and
exploration focus
On demand, on the fly analysis
Parallelization and speed of query execution
Calligraph can be developed as a cloud service, giving
every user in the enterprise access to data analytics
User loyalty through ease of use and convenience
Highly competitive offering in terms of functionality and
price
Easier to demonstrate the business value of Greenplum DB
and data analytics to customers
Editor's Notes
There are only two fundamentally different types of homogeneous tables: listing and analytic.
The configuration was done through ODBC.
To connect to the GreenPlum DB, the PostgreSQL DBMS driver recommended by EMC specialists was used.