2. Agenda
About me
Plan for today:
– Business Intelligence Concepts & Technologies
IBM Cognos Business Intelligence
ParAccel Analytic Database
– Roll out in an Investment Banking division of a global bank
– Adoption by Client & Sales BI Unit
Reporting execution summary
Report conversion results
Issues discovered & their solutions
Next steps
Credits: Sanjeev Aggarwal, Technology Architect
3. Motto
We did this to solve a business issue and not because of technology...
The business need for and expectations of MI are evolving rapidly, as is the value we can deliver.
The current technology architecture struggles (and sometimes fails) to deliver the complexity of information required in a timely manner to the dispersed, diverse user community via their channels of choice.
4. Business Intelligence
Good decisions are the building blocks of great business performance.
Understand and improve your business based on:
How are we doing?
Monitoring KPIs with dashboards and scorecards, tracking key metrics.
Why?
Reporting and analysis to get close to your data, gain context, understand trends, and spot anomalies.
What should we be doing?
Planning, budgets, and forecasts let you set and share a reliable view of the future.
Business intelligence (BI) is the set of techniques and tools for transforming raw data into meaningful and useful information to gain competitive advantage:
Data → Information → Knowledge → Action
5. Concepts - BI System
Query & reporting
Analysis
Dashboards for on-line and off-line analysis
Scorecards
Planning & budgets
Statistics, predictive modeling & advanced analytics
Real-time monitoring
Collaboration & social networking
Mobile applications
6. Concepts - DWH System
Real-Time
Massively Parallel Processing (MPP):
Shared Nothing vs Shared Everything
Near-linear Scalability
Big Data:
“Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…” –Dan Ariely
MPP OLAP Database | Typical OLTP Database
Large volumes of data (TB to PB) | Smaller volumes of data (GB to TB)
Low number of power database users | High number of concurrent light users
Complex analytic queries | Optimized for single row access
Low level of data granularity (Facts) | High use of tuning structures
Decision support and what-if scenarios | Transaction processing and data integrity
Bulk data loading (TB / day is typical) | Low volume data loading
References: Press, G. (2013, June 03). [Text]. Retrieved from
http://whatsthebigdata.com/2013/06/03/big-data-quotes/
7. BI Platforms
Business Intelligence (BI) and analytics systems are applications and technologies for gathering, storing, analyzing, and accessing information for better business decision making to gain competitive advantage.
What is IBM Cognos BI?
What is ParAccel Analytic Database (PADB)?
Strong hints:
- on my employer's name
- on why we actually should use PADB
- Data sheet of a new product
References: Henschen, D. (2014, February 26). Gartner BI Magic Quadrant: Winners & Losers [Image file]. Retrieved from http://www.informationweek.com/big-data/big-data-analytics/gartner-bi-magic-quadrant-winners-and-losers/d/d-id/1114013
8. IBM Cognos BI
References: Manoria, V. (2012, May 3). IBM Cognos 10 BI: Components & User Interfaces [Image files]. Retrieved from https://www.ibm.com/developerworks/community/blogs/ibm-bi-capabilities/entry/ibm_cognos_10_bi_components_user_interfaces1?lang=en
9. Actian Analytics Database
Created by ParAccel and known as ParAccel Analytic Database (PADB), now rebranded as Actian Analytics Database - Matrix.
Parallel processing database management system designed for high-performance advanced analytics for business intelligence:
I/O: Columnar & Compressed
CPU: Fully Compiled Queries
Interconnect: MPP Grid Protocol for inter-node communications
Core Components
Leader node
Compute node
Interconnect
SAN Integration
Runs on Red Hat or CentOS!
References: Anderson, D. (2012, July 24). Column Oriented Database Technologies [Image files]. Retrieved from http://www.dbbest.com/blog/column-oriented-database-technologies/
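The leader node / compute node split listed under Core Components follows the general shared-nothing MPP pattern: rows are spread across nodes, each node works only on its own slice, and a leader combines the partial results. The sketch below illustrates that generic pattern in plain Python. It is not PADB code, and the function names, the CRC32-based distribution, and the sample trades are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict

def distribute(rows, key, n_nodes):
    """Assign each row to one compute node by hashing its distribution key."""
    slices = [[] for _ in range(n_nodes)]
    for row in rows:
        node = zlib.crc32(str(row[key]).encode()) % n_nodes
        slices[node].append(row)
    return slices

def compute_node_sum(rows, group_key, value_key):
    """A compute node aggregates only its own slice (shared nothing)."""
    partial = defaultdict(float)
    for row in rows:
        partial[row[group_key]] += row[value_key]
    return dict(partial)

def leader_merge(partials):
    """The leader node combines the partial aggregates from every compute node."""
    total = defaultdict(float)
    for partial in partials:
        for group, value in partial.items():
            total[group] += value
    return dict(total)

# Hypothetical sample data, not taken from the presentation.
trades = [
    {"trade_id": 1, "desk": "FX", "notional": 10.0},
    {"trade_id": 2, "desk": "Rates", "notional": 20.0},
    {"trade_id": 3, "desk": "FX", "notional": 5.0},
    {"trade_id": 4, "desk": "Rates", "notional": 7.0},
    {"trade_id": 5, "desk": "Credit", "notional": 3.0},
]

slices = distribute(trades, "trade_id", n_nodes=3)
result = leader_merge(compute_node_sum(s, "desk", "notional") for s in slices)
```

Because each slice is processed independently, adding nodes shrinks the work per node, which is the intuition behind the near-linear scalability claim for MPP systems.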
10. Actian Analytics Database
Adaptive Compression (~4x) and no performance structures:
Using columnar storage, each block holds column field values for more records than in row-based storage. As a result, reading the same number of column fields for the same number of records requires proportionally fewer I/O operations.
Blocks hold the same data type, so compression can be selected based on the column type of data stored in a block (mostly, delta, bytedict, runlength, text).
High-performance MPP optimizer (Omne):
Cost-based, columnar aware
Founded on PostgreSQL Planner/Optimizer
Unlimited Table Join Ability (patented) for use in complex database schema referenced by views
View Folding for planning through views, eliminating unused columns
Columnar storage vs row-wise database storage:
Columnar: data blocks store column values for consecutive rows
Row-wise: data blocks store values for consecutive rows making up the entire record
References: Saama blog (2014, October 20). On Big Data and In-Memory Data Clouds [Image file]. Retrieved from http://www.saama.com/on-big-data-and-in-memory-data-clouds/
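To make the columnar storage and per-column compression points more concrete, here is a minimal Python sketch. It pivots a few row-wise records into columns and applies run-length encoding to one of them. The sample rows and function names are hypothetical, and the encoding is a textbook run-length scheme, not the database's actual on-disk format.

```python
from itertools import groupby

# Row-wise layout: each tuple is one complete record (hypothetical data).
rows = [
    ("EMEA", "D", 100),
    ("EMEA", "D", 120),
    ("EMEA", "W", 90),
    ("US", "D", 80),
    ("US", "D", 80),
]

def to_columns(records):
    """Pivot row-wise records into one list of values per column."""
    return [list(column) for column in zip(*records)]

def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original value list."""
    return [value for value, count in pairs for _ in range(count)]

region, report_type, amount = to_columns(rows)
encoded_region = run_length_encode(region)
```

A column holds a single data type and often long runs of repeated values, so five region values collapse to two pairs here. A query that needs only the region column also never has to read the other two columns.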
11. Business Case
Business drivers:
Improved decision making
Reduced data latency
Perform deep, near real time analytics not possible on current platform
Simplified BI infrastructure
(We need to initiate, choose, purchase, engineer & implement the MPP database)
Landscape before the solution:
MS SQL Server 2008 with aggregate tables & MSAS cubes
A number of front-ends to access data, incl.:
• Dozens of QlikView dashboards
• Hundreds of Cognos BI reports
• Reporting packs not always reconciling to each other due to timing differences or system issues
• Additional manual processing or inputs needed to build complex reports
12. Front Office Implementation
A PoC was done in 2011 and showed that PADB was the best fit for the business use case, winning against Sybase IQ SMP & PlexQ.
Implementation happened in Q3 2012 and brought:
- Introduction of MPP database technology, which reduced query times from minutes to sub-second
- Reduced data processing latency from T+1 to near real-time (within 5 minutes from trade received to available for reporting)
- Introduction of high-performance analytics
New strategic platform delivered to bank's clients
References: Slicker city blog (2014, June 24). Rogers' technology adoption curve [Image file]. Retrieved
from http://slickercity.net/tag/information-communication-technology/
13. Adoption by Finance BI
Client & Sales BI function based in Warsaw, Poland
Project Summary: Who is it about? What happens?
When did it take place? Where did it take place?
Why did it happen?
15. Report Conversion Results
Region | Report Type (D/W/M) | Original Package | Original Execution Time (min:sec) | PADB Execution Time (min:sec) | Result
EMEA | D | Aggregated Relational | 16:56 | 05:00 | 339%
EMEA | W | Standard Relational | 20:00 | 03:30 | 571%
US | D | Aggregated Relational | 01:20 | 00:40 | 200%
US | D | Aggregated Relational | 01:46 | 00:32 | 331%
US | M | Aggregated Relational | 00:17 | 00:19 | 89%
US | D | Aggregated Relational | 01:38 | 00:22 | 445%
US | D | Aggregated Relational | 04:01 | 00:52 | 463%
US | M | Aggregated Relational | 00:25 | 00:35 | 71%
US | D | Aggregated Relational | 08:01 | 00:15 | 3207%
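The Result column appears to be the original execution time divided by the PADB execution time, expressed as a percentage, so values above 100% mean the PADB run was faster. The slides do not state the formula, so this is an assumption, but it reproduces the figures in the table. The helper names below are hypothetical.

```python
def to_seconds(mmss):
    """Convert a 'min:sec' string such as '16:56' to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

def result_percent(original, padb):
    """Original time as a percentage of PADB time (assumed Result formula)."""
    return 100.0 * to_seconds(original) / to_seconds(padb)

# Two rows from the table: EMEA weekly and the first US daily report.
emea_weekly = round(result_percent("20:00", "03:30"))  # 571
us_daily = round(result_percent("01:20", "00:40"))     # 200
```

Under this reading, the two monthly US reports (89% and 71%) are the only conversions that ran slower on PADB than on the original package.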
16. Report Issues & Solutions
Issue: Report page containing a list does not run, returning error RQP-DEF-0177 (this is despite the underlying query returning tabular data with no issues).
Solution: The default list header might use ‘Data Item Label’ as a column source type. After changing the setting from ‘Data Item Label’ to ‘Text’ the error disappears.
Issue: Rank is not working as expected when applied in the filters, returning an incorrect sequence.
Solution: Set Application to After Auto Aggregation.
Issue: When using case when {expression} then … end, the returned else result expression type is always BLOB. Report fails to run.
Solution: Perform CAST to char or any other desired type in the case statement definition.
Issue: PADB does not handle the DATETIME data type.
Solution: Use the TRUNC(date) function to return the date with no time portion.
17. Adoption Next Steps
Next steps are to:
Deliver additional end-user training
Finalize report conversion exercise
Add additional facts and dimensions to PADB
Connect ALL remaining standard MI reports to a single real-time Cognos data source (PADB Cognos package)
Decommission:
aggregate tables in SQL Server
redundant Cognos packages
Make better use of off-the-shelf PADB analytic functions with Cognos
Monitor and optimize...