dBConf 2014
High-performance Business Intelligence
solution based on IBM Cognos and
ParAccel Analytic Database
Karol Chlasta
Agenda
About me
Plan for today:
– Business Intelligence Concepts & Technologies
IBM Cognos Business Intelligence
ParAccel Analytic Database
– Roll out in an Investment Banking division of a global
bank
– Adoption by Client & Sales BI Unit
Reporting execution summary
Report conversion results
Issues discovered & their solutions
Next steps
Credits: Sanjeev Aggarwal, Technology Architect
Motto
We did this to solve a business issue and not because
of technology...
The business need for and expectations of MI are evolving
rapidly as is the value we can deliver.
The current technology architecture struggles
(and sometimes fails) to deliver the complexity of
information required in a timely manner to the dispersed,
diverse user community via their channels of choice.
Business Intelligence
• Good decisions are the building blocks of great business performance.
• Understand and improve your business based on:
- How are we doing?
Monitoring KPIs with dashboards and scorecards, tracking key metrics.
- Why?
Reporting and analysis to get close to your data, gain context,
understand trends, and spot anomalies.
- What should we be doing?
Planning, budgets, and forecasts let you set and share a reliable view of
the future.
• Business intelligence (BI) is the set of techniques and tools for the
transformation of raw data into meaningful and useful information to get
competitive advantage:
Data → Information → Knowledge → Action
Concepts - BI System
Query & reporting
Analysis
Dashboards for on-line and off-line analysis
Scorecards
Planning & budgets
Statistics, predictive modeling & advanced
analytics
Real-time monitoring
Collaboration & social networking
Mobile applications
Concepts - DWH System
Real-Time
Massively Parallel Processing (MPP):
• Shared Nothing vs Shared Everything
• Near-linear Scalability
Big Data:
“Big data is like teenage sex:
everyone talks about it,
nobody really knows how to do
it, everyone thinks everyone
else is doing it, so everyone
claims they are doing it…”
–Dan Ariely
MPP OLAP Database                         | Typical OLTP Database
------------------------------------------+------------------------------------------
Large volumes of data (TB to PB)          | Smaller volumes of data (GB to TB)
Low number of power database users        | High number of concurrent light users
Complex analytic queries                  | Optimized for single-row access
Low level of data granularity (Facts)     | High use of tuning structures
Decision support and what-if scenarios    | Transaction processing and data integrity
Bulk data loading (TB / day is typical)   | Low-volume data loading
References: Press, G. (2013, June 03). [Text]. Retrieved from
http://whatsthebigdata.com/2013/06/03/big-data-quotes/
BI Platforms
Business Intelligence (BI) and
analytics systems are applications
and technologies for gathering,
storing, analyzing, and accessing
information for better business
decision making to gain
competitive advantage.
What is IBM Cognos BI?
What is ParAccel Analytic Database
(PADB)?
Strong hints:
- on my employer's name
- on why we actually should it PADB
- Data sheet
Data sheet of a new product
References: Henschen, D. (2014, February 26). Gartner BI Magic Quadrant: Winners & Losers [Image file]. Retrieved from http://www.informationweek.com/big-data/big-data-analytics/gartner-bi-magic-quadrant-winners-and-losers/d/d-id/1114013
References: Manoria, V. (2012, May 3). IBM Cognos 10 BI: Components & User Interfaces [Image files]. Retrieved from https://www.ibm.com/developerworks/community/blogs/ibm-bi-capabilities/entry/ibm_cognos_10_bi_components_user_interfaces1?lang=en
IBM Cognos BI
Actian Analytics Database
• Created by ParAccel and known as ParAccel Analytic Database (PADB), now rebranded as Actian Analytics Database - Matrix
• Parallel-processing database management system designed for high-performance advanced analytics for business intelligence:
- I/O: Columnar & Compressed
- CPU: Fully Compiled Queries
- Interconnect: MPP Grid Protocol for inter-node communications
• Core Components
- Leader node
- Compute node
- Interconnect
- SAN Integration
• Runs on Red Hat or CentOS!
References: Anderson, D. (2012, July 24). Column Oriented Database Technologies [Image files].
Retrieved from http://www.dbbest.com/blog/column-oriented-database-technologies/
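The leader node / compute node split listed under Core Components can be pictured with a small Python sketch. It is only an illustration of the shared-nothing idea, not PADB's actual implementation: rows are hash-distributed across a hypothetical number of compute nodes, each node aggregates its own slice, and the leader merges the partial results. Every name here (`NUM_NODES`, `distribute`, `node_partial_sum`, `leader_merge`, `TRADES`) is an assumption made for the example.

```python
import zlib
from collections import defaultdict

NUM_NODES = 3  # hypothetical number of compute nodes


def distribute(rows, dist_key, num_nodes=NUM_NODES):
    """Assign each row to exactly one node by hashing its distribution key."""
    slices = [[] for _ in range(num_nodes)]
    for row in rows:
        node = zlib.crc32(str(row[dist_key]).encode()) % num_nodes
        slices[node].append(row)
    return slices


def node_partial_sum(rows, group_key, value_key):
    """Work done locally on one compute node: aggregate only its own slice."""
    partial = defaultdict(float)
    for row in rows:
        partial[row[group_key]] += row[value_key]
    return dict(partial)


def leader_merge(partials):
    """Work done on the leader: combine the partial results from all nodes."""
    total = defaultdict(float)
    for partial in partials:
        for group, value in partial.items():
            total[group] += value
    return dict(total)


def run_query(rows, dist_key, group_key, value_key):
    slices = distribute(rows, dist_key)
    partials = [node_partial_sum(s, group_key, value_key) for s in slices]
    return leader_merge(partials)


# Made-up sample data, for illustration only.
TRADES = [
    {"trade_id": "T1", "region": "EMEA", "notional": 100.0},
    {"trade_id": "T2", "region": "US", "notional": 150.0},
    {"trade_id": "T3", "region": "EMEA", "notional": 250.0},
    {"trade_id": "T4", "region": "US", "notional": 250.0},
]

if __name__ == "__main__":
    print(run_query(TRADES, "trade_id", "region", "notional"))
```

Because no node needs another node's rows to compute its partial sums, adding nodes shrinks each slice, which is the intuition behind the near-linear scalability claim.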
• Adaptive Compression (~4x) and no performance structures:
- With columnar storage, each block holds column field values for more records than in row-based storage. As a result, reading the same number of column fields for the same number of records requires proportionally fewer I/O operations
- Blocks hold the same data type, so compression can be selected based on the type of data stored in a block (mostly, delta, bytedict, runlength, text)
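To make the encoding names above more concrete, here is a minimal Python sketch of three of them: run-length, delta, and a dictionary encoding in the spirit of bytedict. These are textbook simplifications written for this illustration; the function names are hypothetical and the output does not reflect PADB's actual on-disk formats.

```python
def runlength_encode(values):
    """Collapse runs of equal neighbouring values into [value, count] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def runlength_decode(runs):
    return [v for v, count in runs for _ in range(count)]


def delta_encode(values):
    """Keep the first value, then store differences between neighbours."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    out = []
    for d in deltas:
        out.append(d if not out else out[-1] + d)
    return out


def dict_encode(values):
    """Replace each value with a small integer index into a sorted dictionary."""
    dictionary = sorted(set(values))
    index = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [index[v] for v in values]


def dict_decode(dictionary, codes):
    return [dictionary[c] for c in codes]


if __name__ == "__main__":
    print(runlength_encode(["EMEA", "EMEA", "US", "US", "US"]))
    print(delta_encode([100, 101, 103, 106]))
    print(dict_encode(["US", "EMEA", "US"]))
```

Each scheme suits a different kind of column: run-length for long runs of repeated values, delta for slowly changing numbers such as sequential IDs or timestamps, and dictionary encoding for columns with few distinct values. That is why choosing the encoding per column pays off.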
• High-performance MPP optimizer (Omne)
- Cost-based, Columnar aware
- Founded on PostgreSQL Planner/Optimizer
• Unlimited Table Join Ability (patented) for use in complex database schema referenced by views
• View Folding for planning through views, eliminating unused columns
References: Saama blog (2014, October 20). On Big Data and In-Memory Data Clouds [Image file].
Retrieved from http://www.saama.com/on-big-data-and-in-memory-data-clouds/
• Columnar storage vs row-wise database storage:
- Columnar: data blocks store values of one column for consecutive rows
- Row-wise: data blocks store values for consecutive rows making up the entire record
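The I/O effect of the two block layouts can be estimated with a back-of-the-envelope Python sketch. It assumes, purely for illustration, that every value has the same size, that a block holds a fixed number of values, and that no compression is applied; the function names and figures are hypothetical.

```python
import math


def blocks_read_row_store(num_rows, num_columns, values_per_block):
    """Row-wise: blocks hold whole records, so a scan touches every block,
    no matter how few columns the query actually needs."""
    rows_per_block = values_per_block // num_columns
    return math.ceil(num_rows / rows_per_block)


def blocks_read_column_store(num_rows, columns_needed, values_per_block):
    """Columnar: each block holds values of one column, so a scan reads
    only the blocks of the columns the query references."""
    return columns_needed * math.ceil(num_rows / values_per_block)


if __name__ == "__main__":
    # Hypothetical table: 1,000,000 rows, 20 columns, 1,000 values per block,
    # and a query that references only 2 of the 20 columns.
    print(blocks_read_row_store(1_000_000, 20, 1_000))
    print(blocks_read_column_store(1_000_000, 2, 1_000))
```

Under these assumptions the saving is simply the ratio of total columns to referenced columns, before any additional gain from compression.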
Actian Analytics Database
Business Case
Business drivers:
Improved decision making
Reduced data latency
Perform deep, near real time analytics not possible on current platform
Simplified BI infrastructure
(We need to initiate, choose, purchase, engineer & implement the MPP
database)
Landscape before the solution:
• MS SQL Server 2008 with aggregate tables & MSAS cubes
• A number of front-ends to access data, incl.:
- Dozens of QlikView dashboards
- Hundreds of Cognos BI reports
• Reporting packs not always reconciling to each other due to timing differences or system issues
• Additional manual processing or inputs needed to build complex reports
Front Office Implementation
A PoC was done in 2011 and showed that PADB was the best fit for the business use case, winning against Sybase IQ SMP & PlexQ.
Implementation happened in Q3 2012 and brought:
- Introduction of MPP database technology, which reduced query times from minutes to sub-second
- Reduced data processing latency from T+1 to near real-time (within 5 minutes from trade received to available for reporting)
- Introduction of high-performance analytics
New strategic platform delivered to bank's clients
References: Slicker city blog (2014, June 24). Rogers' technology adoption curve [Image file]. Retrieved
from http://slickercity.net/tag/information-communication-technology/
Adoption by Finance BI
Client & Sales BI function based in Warsaw, Poland
Project Summary: Who is it about? What happens?
When did it take place? Where did it take place?
Why did it happen?
Report Execution Summary
Report Conversion Results
Region | Report Type (D/W/M) | Original Package      | Original Execution Time (min:sec) | PADB Execution Time (min:sec) | Result
-------+---------------------+-----------------------+-----------------------------------+-------------------------------+-------
EMEA   | D                   | Aggregated Relational | 16:56                             | 05:00                         | 339%
EMEA   | W                   | Standard Relational   | 20:00                             | 03:30                         | 571%
US     | D                   | Aggregated Relational | 01:20                             | 00:40                         | 200%
US     | D                   | Aggregated Relational | 01:46                             | 00:32                         | 331%
US     | M                   | Aggregated Relational | 00:17                             | 00:19                         | 89%
US     | D                   | Aggregated Relational | 01:38                             | 00:22                         | 445%
US     | D                   | Aggregated Relational | 04:01                             | 00:52                         | 463%
US     | M                   | Aggregated Relational | 00:25                             | 00:35                         | 71%
US     | D                   | Aggregated Relational | 08:01                             | 00:15                         | 3207%
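The Result figures appear to be the original execution time divided by the PADB execution time, expressed as a percentage, so values above 100% mean the PADB run was faster. This reading is inferred from the numbers rather than stated on the slide. A short Python sketch of that calculation follows; the helper names are hypothetical.

```python
def to_seconds(mm_ss):
    """Convert a 'min:sec' string such as '16:56' to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def result_percent(original, padb):
    """Original time as a percentage of PADB time, rounded to a whole number."""
    return round(100 * to_seconds(original) / to_seconds(padb))


# (original, PADB) execution times taken from the table above.
TIMINGS = [
    ("16:56", "05:00"),
    ("20:00", "03:30"),
    ("01:20", "00:40"),
    ("01:46", "00:32"),
    ("00:17", "00:19"),
    ("01:38", "00:22"),
    ("04:01", "00:52"),
    ("00:25", "00:35"),
    ("08:01", "00:15"),
]

if __name__ == "__main__":
    for original, padb in TIMINGS:
        print(original, padb, f"{result_percent(original, padb)}%")
```

The two monthly reports come out below 100%. Both already ran in well under a minute on the original package, so the conversion made them slightly slower rather than faster.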
Report Issues & Solutions
Issue: Report page containing a list does not run, returning error RQP-DEF-0177 (despite the underlying query returning tabular data with no issues).
Solution: The default list header might use ‘Data Item Label’ as a column source type. After changing the setting from ‘Data Item Label’ to ‘Text’, the error disappears.

Issue: Rank is not working as expected when applied in filters, returning an incorrect sequence.
Solution: Set Application to After Auto Aggregation.

Issue: When using case when {expression} then … end, the returned else result expression type is always BLOB. Report fails to run.
Solution: Perform CAST to char or any other desired type in the case statement definition.

Issue: PADB does not handle the DATETIME data type.
Solution: Use the TRUNC (date) function to return the date with no time portion.
Adoption Next Steps
Next steps are to:
Deliver additional end-user training
Finalize report conversion exercise
Add additional facts and dimensions to PADB
Connect ALL remaining standard MI reports to a single
real-time Cognos data source (PADB Cognos package)
Decommission:
aggregate tables in SQL Server
redundant Cognos packages
Make better use of off-the-shelf PADB analytic
functions with Cognos
Monitor and optimize...
Last slide
Thanks
Any questions?
