Calligraph overview for EMC, 09.09.2011

Common Issues of Traditional BI
 Heavy reliance on pre-processed information (cubes
and views) limits the user's ability to explore data beyond
pre-programmed reports
 Explosion of data stored in the DB: less than 10% is
productive data, the rest is derived data
 Creating new reports is difficult for users, which in practice
prevents managers and other decision makers from
using the tool in day-to-day work for real-time decision
support
 Expensive consultants are needed to re-program reports,
costing both money and time

Short overview of Calligraph terminology and
design principles
 Calligraph delivers integrated OLAP + Query & Reporting
functionality for decision support: On Line, On The Fly,
On Demand
 Calligraph is aimed at end users without any programming
skills
 The user interacts with Calligraph in their native language and in
the terms of their subject matter. We call this the User Semantic
Layer. The User Semantic Layer is built according to the user's
data access rights
 Tables of any complexity and size, in any theoretically possible
layout, are translated into a linear set of queries with
automatic parallelization. This is a key characteristic: it
prevents an “information explosion” and keeps
processing time linearly dependent on the volume of information
being processed.

Calligraph technology benefits
 Flexible multidimensional on-line analysis of recent data
for decision making support
 The user generates queries and reports through direct interaction
with the system, in the terms of their subject domain
 The user's semantic layer is formed in strict conformity
with their data access rights
 Queries are highly parallelizable and their structure
allows optimal execution
 Supports connection to any relational DB via OLE DB
 Full conformity to all 12 classical OLAP rules

Types of tables formed by Calligraph
Listing table example
Analytic (or cross) table example

Important notes to the previous slide
There are only two principal types of homogeneous tables:
 Listing tables
 Analytical tables (cross tables)
Any other table is non-homogeneous and can be
decomposed into listing and analytic components.
Ergonomics holds that people readily perceive only
homogeneous tables; a non-homogeneous
(composite) table is perceived piecemeal, by picking
out and analyzing its homogeneous components

Definition of “Task”
 A user (such as a manager) can have access to different
types of information (for example commercial, HR,
logistics and warehouse, finance) from different
DBs deployed by the enterprise
 To ease perception, the User Semantic Layer can be logically
split into linked fragments, which we call Tasks:
“Commerce”, “HR”, “Warehouse”, “Finance”, etc.
 Technically, a Task is a set of fields from different DB
tables with all necessary connections between them.
Each field has its own user-friendly alias. This creates
an environment that is clear and convenient for the end
user.
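
As an illustration only, a Task could be modeled as a small alias-to-column mapping. This is a minimal sketch, not Calligraph's actual data model: the class names, the "HR" fields and the table and column names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Field:
    alias: str   # user-friendly name shown to the end user
    table: str   # physical DB table (hypothetical)
    column: str  # physical DB column (hypothetical)


@dataclass(frozen=True)
class Task:
    name: str
    fields: Tuple[Field, ...]
    joins: Tuple[str, ...]  # connections between the tables involved

    def resolve(self, alias: str) -> str:
        """Translate a user-facing alias into a qualified column name."""
        for field in self.fields:
            if field.alias == alias:
                return f"{field.table}.{field.column}"
        raise KeyError(alias)


# Hypothetical "HR" Task built from two tables.
hr = Task(
    name="HR",
    fields=(
        Field("Employee name", "employees", "full_name"),
        Field("Department", "departments", "title"),
    ),
    joins=("employees.dept_id = departments.id",),
)

print(hr.resolve("Department"))
```

The user only ever sees the aliases; the mapping to tables and joins stays inside the Task.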

Calligraph configuration for the “Company” DB,
converted by EMC into Greenplum format

Configuration of the user semantic layer (field names
and mapping)
In effect, this is an example of creating the User Semantic
Layer manually (automatic creation is also possible)

User Semantic Layer
A list of field names accessible to the user, in the user's language
and in the terms of the user's subject domain

Definition of “Gradation”
 A gradation is any field from the User Semantic Layer with
a set of boundary conditions
 The boundary conditions of a gradation are connected by
logical “OR”
 Conditions can be grouped into simple or extended totals
 A gradation is used to create a dimension in an analytic table,
in a filter or in a “master-detail” section
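
The definition above can be sketched as a field plus a list of boundary conditions joined by OR. This is an illustrative model under assumptions, not Calligraph's internal representation: the `Gradation` class, the SQL-like condition strings and the `continent` field are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Gradation:
    field: str                   # a field from the user semantic layer
    conditions: Tuple[str, ...]  # boundary conditions, written SQL-style

    def predicate(self) -> str:
        """Join the boundary conditions with logical OR."""
        parts = [f"{self.field} {cond}" for cond in self.conditions]
        return "(" + " OR ".join(parts) + ")"


continents = Gradation("continent", ("= 'Europe'", "= 'Asia'"))
print(continents.predicate())
# (continent = 'Europe' OR continent = 'Asia')
```

Each boundary condition would correspond to one row or column of an analytic table when the gradation is used as a dimension.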

Difference from OLAP using cubes
 Any field from the User Semantic Layer can form a dimension
of an analytic table
 All DB fields are “equal”; they are not separated into
“dimensions” and “facts”
 Boundary conditions of a gradation can also be
described as a range, a mask or a formula
 The user can create a “virtual” gradation (i.e. a gradation that
is calculated by applying a formula), enabling “what-if”
analysis on the fly
 No need to pre-process data and create (and then
continually update) cubes, which limit the user's ability to
analyze data as needed; the user can specify
any dimension through direct interaction with Calligraph
 The user creates a table template in any theoretically possible
layout on-line
 All queries are performed on-line and can be parallelized

Definition of “Filter”
 We use a filter when we need to apply certain conditions to all data
in a particular query
 A gradation is the minimal element of a filter
 Several gradations connected by logical “AND” form
an aggregate
 A filter is a set of aggregates connected by
logical “AND” or “OR” (in any order)
 Calligraph sets no limits on the “depth” of a filter or its
length
 Filters give the user an easy, visual way to create data
filtering rules on the fly
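
The filter structure described above (gradations, then aggregates, then the filter) can be sketched as nested predicate strings. This is a hedged illustration, not Calligraph's query generator: the helper names `any_of`, `aggregate` and `combine` and the sample fields are assumptions.

```python
def _join(op, parts):
    """Parenthesize and join predicate strings with a logical operator."""
    return "(" + f" {op} ".join(parts) + ")"


def any_of(field, *conditions):
    """A gradation: boundary conditions on one field, joined by OR."""
    return _join("OR", [f"{field} {cond}" for cond in conditions])


def aggregate(*gradations):
    """An aggregate: several gradations joined by AND."""
    return _join("AND", gradations)


def combine(op, *aggregates):
    """A filter: aggregates joined by AND or OR; calls can be nested."""
    if op not in ("AND", "OR"):
        raise ValueError(op)
    return _join(op, aggregates)


a1 = aggregate(
    any_of("continent", "= 'Europe'", "= 'Asia'"),
    any_of("year", "= 2010"),
)
a2 = aggregate(any_of("revenue", "> 1000000"))
where = combine("OR", a1, a2)
print(where)
```

Because `combine` accepts the output of another `combine`, the sketch places no limit on filter depth or length, mirroring the property stated above.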

Definition of “Master-Detail”
 Any complex table can be automatically split into a set of
simpler tables by dragging and dropping any gradation into
the “master-detail” section
 The simpler tables are formed by using the boundary conditions
of the dropped gradation to select the information
 Example: an analytic report on EMC business around the
world can contain the gradation “Continents”. If the user moves
this gradation into “master-detail”, the complex table is
split into several simpler tables, each containing only
information about business on one continent. If the user
then moves the gradation “Country”, every
per-continent table is further split
into several tables, one per country.
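
The continent and country example above can be sketched as a recursive split of rows. This is a simplified illustration under assumptions: each distinct field value stands in for one boundary condition of the gradation, and the sample rows and the `master_detail` function are hypothetical.

```python
rows = [
    {"continent": "Europe", "country": "France", "revenue": 10},
    {"continent": "Europe", "country": "Germany", "revenue": 20},
    {"continent": "Asia", "country": "Japan", "revenue": 30},
]


def master_detail(rows, fields):
    """Split rows into nested sub-tables, one level per field.

    Simplification: every distinct value of a field is treated as one
    boundary condition of the corresponding gradation.
    """
    if not fields:
        return rows
    field, rest = fields[0], fields[1:]
    # Distinct values in first-seen order.
    values = list(dict.fromkeys(row[field] for row in rows))
    return {
        value: master_detail([r for r in rows if r[field] == value], rest)
        for value in values
    }


by_continent = master_detail(rows, ["continent"])
by_country = master_detail(rows, ["continent", "country"])
print(list(by_continent))          # one sub-table per continent
print(list(by_country["Europe"]))  # each continent split per country
```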

Definition of “Drill-Down”
 Any cell of an analytic table contains data that was
filtered by the boundary conditions set for its column
and row, as well as those defined by “master-detail”.
 Decision making often requires a detailed understanding of
the information in the analytic table, for example to
understand the reasons behind unsatisfactory results.
 Calligraph provides an easy way to achieve this, with
the maximum level of detail allowed by the user's
data access rights.
 The user can select a cell (or cells) of the table and press
“Drill-Down” to get an automatically generated listing table
with all fields from the analytic table.
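
Drill-down can be pictured as re-running a cell's conditions as a listing query instead of an aggregate. The sketch below uses an in-memory SQLite database purely for illustration; the `sales` table, its columns and both helper functions are hypothetical, and conditions are interpolated as trusted strings only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (continent TEXT, year INTEGER, customer TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Europe", 2010, "A", 100.0),
        ("Europe", 2011, "B", 50.0),
        ("Asia", 2010, "C", 70.0),
        ("Europe", 2010, "D", 30.0),
    ],
)


def cell_value(row_cond, col_cond):
    """Aggregate shown in one cell of the analytic table."""
    sql = f"SELECT SUM(revenue) FROM sales WHERE ({row_cond}) AND ({col_cond})"
    return conn.execute(sql).fetchone()[0]


def drill_down(row_cond, col_cond):
    """Listing table with the records behind the same cell."""
    sql = (
        "SELECT customer, revenue FROM sales "
        f"WHERE ({row_cond}) AND ({col_cond}) ORDER BY customer"
    )
    return conn.execute(sql).fetchall()


print(cell_value("continent = 'Europe'", "year = 2010"))
print(drill_down("continent = 'Europe'", "year = 2010"))
```

Both functions share the same WHERE clause, which is the point: the listing is guaranteed to contain exactly the records that produced the cell's value.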

Demo block diagram
[Diagram labels: Greenplum Master Server; Segment (x4); Sample Database; Windows Client Machine; Calligraph; Remote Desktop]
DEMO
 Let's go to a live Calligraph demo

Current status of Calligraph
 Version 5.2 is available as a standalone Windows
application
 Hundreds of copies have been sold and are in use
at large and small enterprises
 Calligraph customers include
Atommash and Novorosiyskiy port, large medical institutes and
hospitals, government bodies (Republican Statistical Service,
Russian State Parliament, tax authorities, police
departments, etc.), and small and medium businesses
 Calligraph is registered with the Russian agency for patents
and trademarks

Possible ways to further develop
Calligraph technology
 Cloud service
 External reporting unit for CASE system
 Automatic configuration
 Support for Hadoop
 Voice input etc.

Benefits of Calligraph to
EMC/Greenplum
 Full alignment with Greenplum data analytics and
exploration focus
 On demand, on the fly analysis
 Parallelization and speed of query execution
 Calligraph can be developed as a cloud service, giving
every user in the enterprise access to data analytics
 User loyalty through ease of use and convenience
 Highly competitive offer in terms of functionality and
price
 Easier to demonstrate the business value of Greenplum DB
and data analytics to customers
