2. Introductions
Today’s Speakers:
Mike Stegeman
Sr. Data Access Consultant
Heath Kath
Sr. Data Access Consultant
3. Data Warehouse
• What is data warehousing (DW)
• Signs you need a data warehouse
• Common pitfalls
• SEQUEL Data Warehouse
4. Data Warehouse
What is a data warehouse (or business intelligence)?
Sales Data
Budgets
Customers, Inventory, Financials
Industry Data
5. Why do you need more than just a query tool?
• You have multiple application databases
• The database is complex and not well designed
• Your data contains errors
• Your reporting needs are complex
• You have many query users
What’s real? Data is growing, data is becoming more complex, and
users need and want more reliable information.
SEQUEL Data Warehouse
6. Multiple Application Databases
Sales (DB2 for i5/OS)
BI Reporting
Financials (DB2 for i5/OS)
POS System (SQL Server)
Purchasing (Oracle)
You Need a Data Warehouse: Sign 1
7. You Need a Data Warehouse: Sign 2
Files (tables) are the same, but different…
Multiple instances of the same table, with duplicate key values:
Customer File - US
CUSTNO CUSTNAME
1001 John Smith
1002 Mary Jones
1003 Chris Anderson
1004 David Perry
Customer File - Canada
CUSTNO CUSTNAME
1001 Harry Potter
1002 Jeremy Carr
1003 Penny Hayes
1004 Debbie Thornton
Or different versions of the same entity with incompatible data types:
Customer File - Canada
CUSTID CUSTNAM
AA234 Julie Johnson
AA235 Fred Hunter
AB670 John Smith
BD309 Alan Jordan
Customer File - US
CUSTNO CUSTNAME
1001 John Smith
1002 Mary Jones
1003 Chris Anderson
1004 David Perry
9. Changing dimensions
You Need a Data Warehouse: Sign 4
100 Smith & Jones Electrical Small Retailer Jenny Brown
100 Smith & Jones Electrical Major Retailer Rob McAdam
100 Smith & Jones Electrical Major Retailer Jenny Brown
2012
2013
2014
10. You Need a Data Warehouse: Sign 5
Data errors
• Failed joins
• Invalid dates
• Missing values
Difficult dates
• Dates are in MDY format, but you need to sort by date
• Separate Year, Month, Day columns
Hidden meanings and conditional rules
• Second character of column X means…
• If column Y = ‘C’, value Z must be multiplied by -1
• If record type = 1, there must be a matching record in table B
• If type = 2, there may be a record
• If type = 3, there must not be a record
• For data older than 7/1/2005 column X will be zero, but it must be a value in the range of 1-5 from that date onwards
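Rules like these are easier to maintain when they are written down once as explicit checks in the load step, rather than remembered by each report writer. The following is a minimal Python sketch of two of the rules above. The names `flag_y`, `value_z`, `rec_type`, and `table_b_keys` are hypothetical stand-ins for the columns and tables on the slide; this is not how SEQUEL Data Warehouse expresses rules.

```python
def apply_sign_rule(flag_y, value_z):
    """If column Y = 'C', value Z must be multiplied by -1."""
    return -value_z if flag_y == "C" else value_z


def check_table_b_rule(rec_type, key, table_b_keys):
    """Return an error message, or None if the record satisfies the rule."""
    has_match = key in table_b_keys
    if rec_type == 1 and not has_match:
        return "type 1 requires a matching record in table B"
    if rec_type == 3 and has_match:
        return "type 3 must not have a record in table B"
    return None  # type 2 may or may not have a record


# Hypothetical keys present in table B.
table_b_keys = {"A1", "A2"}

print(apply_sign_rule("C", 125.0))
print(check_table_b_rule(1, "A9", table_b_keys))
print(check_table_b_rule(2, "A9", table_b_keys))
print(check_table_b_rule(3, "A1", table_b_keys))
```

Returning a message instead of raising an error lets a load process collect every failing row into an error report rather than stopping at the first one.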
11. You Need a Data Warehouse: Sign 6
A chaotic reporting environment!
Source systems: Sales, Financials, Purchasing
Extracts and reports: GL Summary (Excel), Summary Sales by Customer/Brand, Profitability Extract, Summary Sales by Region, Purchasing extract (MS Access)
• Joe downloads this manually via Client Access every Monday... except when he’s on vacation or out with the flu!
• Mary wrote this extract. She left last year and no one knows how it works.
• The Net Sales calculation in this extract is different from Mary’s.
• No one has yet realized that this is loaded incorrectly. The auditors will be the first to discover the problem.
• These reports don’t balance with each other.
• No one trusts this report.
• John spends 5 days every month generating this and massaging the numbers until he thinks it is correct.
12. You Need a Data Warehouse: Sign 7
Example of poor data quality:
• Property assessment incorrectly changed to $400M
• Property tax revenue of $8M was included in the county budget
• County had a huge revenue shortfall, resulting in lots of cuts
• The school district was forced to return $2.7M
13. Common Results
We found:
• 96% of data marts require change in the first year (usually
requiring them to be totally rebuilt)
• 75% of independent data marts do not survive past two years
• 60% of companies without a data warehouse architecture
abandon their BI investment within 5 years, citing
maintenance complexity and cost as the prime reasons
18. SALES PURCHASING FINANCIALS
OPERATIONAL SYSTEMS
SEQUEL
Implemented Against Operational Data
Simple Implementation
19. Front End Tools Implemented Against DW/DM Tables
Data Warehouse/Data Marts
OPERATIONAL SYSTEMS
SALES PURCHASING FINANCIALS
Adding SEQUEL Data Warehouse
20. SEQUEL Data Warehouse – Data Access
SEQUEL Data Warehouse on IBM i
+ Non DB2 Data Sources
• Oracle, MS SQL, MySQL, Sybase
• XML Files
• Text Files (fixed length or delimited)
• MS Excel
• Salesforce
• Apache Hive, Impala
22. Share information with SEQUEL Web Interface
Solves Business Problems
• Reduces the time required to deploy
• Requires no software to install for end users
• Builds web pages without needing another tool
• Provides a secure way to view your data
24. Why use a Data Warehouse
We are ready for your questions!
Today’s Speakers:
Mike Stegeman
Sr. Data Access Consultant
Heath Kath
Sr. Data Access Consultant
25. Thank You for Joining Us Today!
Website: www.helpsystems.com/sequel
Phone: 800-328-1000 or
+1 952-933-0609
Email: mike.stegeman@helpsystems.com
heath.kath@helpsystems.com
Editor's Notes
Mike -- Hello everyone. Welcome to today’s presentation on why you need a data warehouse.
As most of you know, the amount of data that we continue to collect and manage grows significantly each and every day. I am talking in the millions… The challenge today is to find the right tool or combination of tools that will let you analyze all that data and turn it into meaningful information for running your business.
Before we go any further,
Mike – Before I hand this over to Heath, today’s webinar is being recorded and we will send you a link to the recording, usually within a day or two. Also, if you have any questions, please use WebEx’s chat option and send your questions to ALL PANELISTS.
Heath – perfect…
What is Business Intelligence (BI), or what is Data Warehousing (DW)
Quick overview and why it’s important
Why do YOU need a Data Warehouse
Incompatible Sources
Operational Databases initial design
Databases Don’t Come with Instructions
SEQUEL Data Warehouse Implement Change Management
SEQUEL Data Warehouse (SDW) – two great tools!
History…
What is SDW
What is Business Intelligence (BI)
Business intelligence (BI) is a broad category of applications and technologies for gathering, storing, analyzing, and providing access to data to help enterprise users make better business decisions.
BI applications include data warehouses, data marts, decision support systems, query and reporting, online analytical processing (OLAP), statistical analysis, forecasting, and data mining.
Business intelligence applications can be:
Mission-critical and integral to an enterprise's operations, or occasional to meet a special requirement
Enterprise-wide or local to one division, department, or project
GOAL: In other words, Business Intelligence is a set of tools and technologies to get from here, the raw data, to here, informative dashboards or spreadsheets.
You may have a query tool already, but you could be doing more.
Now, let’s focus on some of the common pitfalls that could be surrounding you and your basic query tool.
You have many different environments. One could be for your sales data, stored locally on the IBM i. Then you have Financials, POS (point of sale) data, and Purchasing information…
So the data is in different structures, but the information is related.
It’s very difficult, if not impossible, to join tables across databases.
You could be running into different levels of security and availability, which adds significant complexity.
And then you have the challenge of getting all that information in a timely manner. Are your delays costing you money? The result is poor performance.
When working with multiple files that look very similar, we often find the following:
Duplicate key values: CUSTNO 1001 is John Smith in the US file, but in the Canada file, Harry Potter is customer # 1001.
Incompatible data types: CUSTNO is numeric in one file but character in another.
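One common way to deal with both problems when consolidating the files is to qualify every key with the system it came from and to store all identifiers as text. Here is a minimal Python sketch using a few rows from the slide. The source labels `US`, `CA`, and `CA_ALT` are hypothetical, and this illustrates the general idea rather than how SEQUEL Data Warehouse does it.

```python
# Hypothetical extracts of the customer files shown on the slide.
sources = {
    "US": [(1001, "John Smith"), (1002, "Mary Jones")],
    "CA": [(1001, "Harry Potter"), (1002, "Jeremy Carr")],
    "CA_ALT": [("AA234", "Julie Johnson"), ("AB670", "John Smith")],
}


def consolidate(sources):
    """Build one customer table keyed by (source, customer id as text)."""
    combined = {}
    for source, rows in sources.items():
        for cust_id, name in rows:
            # Store every id as text so numeric and character ids can coexist.
            key = (source, str(cust_id))
            if key in combined:
                raise ValueError(f"duplicate key within {source}: {cust_id}")
            combined[key] = name
    return combined


customers = consolidate(sources)
for key, name in sorted(customers.items()):
    print(key, name)
```

Customer 1001 from the US and customer 1001 from Canada now sit side by side without colliding, and the character ids fit in the same column.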
Heath – Another very common issue that we are faced with is PERFORMANCE.
The files keep growing; the amount of data being collected only increases.
Large transaction tables and many related tables.
Several different reports, most of which could be summary level over one or several of these tables.
When running reports and queries, many existing customers, and maybe you too, are seeing longer run times, and the jobs could be consuming a considerable amount of system resources.
Heath -
Changing Dimensions
e.g. customer attributes such as group, territory are periodically changed
What happens if….
I need to re-run a report from 2012
You cannot reproduce the same report!
The original groupings (customer group, territory) are no longer available in the operational database
I need to compare TY sales against LY for all sales reps
Same problem!
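A common warehouse technique for this problem is to keep a dated history row each time a customer attribute changes, instead of overwriting it (often called a type 2 slowly changing dimension). The sketch below uses customer 100 from the slide. The field names, the reading of the last column as a sales rep, and the assignment of each row to a year are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomerVersion:
    custno: int
    name: str
    group: str
    sales_rep: str
    valid_from: int            # first year this version applies
    valid_to: Optional[int]    # last year it applies, None = still current


# Assumed history: one row per change, never overwritten.
history = [
    CustomerVersion(100, "Smith & Jones Electrical", "Small Retailer", "Jenny Brown", 2012, 2012),
    CustomerVersion(100, "Smith & Jones Electrical", "Major Retailer", "Rob McAdam", 2013, 2013),
    CustomerVersion(100, "Smith & Jones Electrical", "Major Retailer", "Jenny Brown", 2014, None),
]


def version_for(custno, year):
    """Return the attributes that were in effect for a customer in a given year."""
    for v in history:
        still_valid = v.valid_to is None or year <= v.valid_to
        if v.custno == custno and v.valid_from <= year and still_valid:
            return v
    return None


print(version_for(100, 2012).group)
print(version_for(100, 2014).sales_rep)
```

With the history kept this way, a 2012 report can be re-run against the 2012 groupings, and this-year versus last-year comparisons use the attributes that applied in each year.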
Files and fields are different (field length, types, numeric date vs character type date fields and more)
6 character field names (RPG III legacy)
First 2 characters are file prefix – so only 4 characters left for the actual field name!
The database is designed to support transactions, not query access! Use of 3rd normal form to avoid redundancy results in many tables.
Date fields: one file may have a numeric or character type date as YYMMDD, another as MMDDYYYY, or the date might be in a Julian format.
Too many fields: some files could have hundreds of fields, yet most of the time only 10% of the available fields are useful for BI reporting.
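The date problem in particular is usually solved once, in the load step, by converting every layout to a real date. A minimal Python sketch follows. The format labels are hypothetical, and "Julian" is assumed here to mean a year followed by the day of the year (YYYYDDD), which is only one of the layouts in use.

```python
from datetime import datetime

# Hypothetical format labels mapped to strptime patterns and field widths.
FORMATS = {
    "YYMMDD": ("%y%m%d", 6),
    "MMDDYYYY": ("%m%d%Y", 8),
    "YYYYDDD": ("%Y%j", 7),  # assumed "Julian" layout: year + day of year
}


def to_date(raw, fmt):
    """Convert a numeric or character date field to a date, or None if invalid."""
    pattern, width = FORMATS[fmt]
    # Numeric fields lose their leading zeros, so pad back to the full width.
    text = str(raw).strip().zfill(width)
    try:
        return datetime.strptime(text, pattern).date()
    except ValueError:
        return None  # let the load step report the row instead of failing


print(to_date(50701, "YYMMDD"))
print(to_date("07012005", "MMDDYYYY"))
print(to_date(2005182, "YYYYDDD"))
print(to_date("99992005", "MMDDYYYY"))
```

Note that Python's two-digit year pattern treats 69 to 99 as the 1900s and 00 to 68 as the 2000s, so a real load would need to confirm that this cutoff suits the data.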
Data errors
Dates
Rules…
Independent ‘stovepipe’ data marts
Built independently without any coherent plan
Do not agree with each other
Duplicated effort
Usually undocumented
A major maintenance headache
Poor Data Quality
Very little if any validation and error management in the load process
Inconsistent calculations and business rules
Inflexible
Very difficult to modify to cater for emerging business needs
Leading to…
Bad decisions, based on incorrect or incomplete data
Eventual lack of trust, leading to disuse
Disillusionment with Business Intelligence
The results of various studies
2005 Valparaiso, Indiana - somehow a property assessment value for the home shown below was incorrectly changed to $400M in the property tax database
The expected property tax revenue was included in the county budget - but the $8M property tax bill on the house was (of course) not paid
The county had a huge revenue shortfall, resulting in lots of cuts
The school district was forced to return $2.7M
All extracurricular activities and sports were cancelled that year
We have found through various studies that
A solution: SEQUEL Data Warehouse, previously known as RODIN, has been around since the mid-80s.
SEQUEL Data Warehouse is an advanced, fully integrated, visual development environment primarily designed to define, build, load and manage data warehouses, data marts and/or operational data stores (ODS) for use in Business Intelligence applications. Similar tools are often referred to as Extract, Transform, Load (ETL) tools.
SEQUEL Data Warehouse also includes an extensive Management layer that monitors and controls the development and execution of these ETL processes to ensure a very high degree of data integrity in a rapid development environment. Hence SEQUEL Data Warehouse is a fully integrated development and deployment environment for ETLM applications. We call this "Data Engineering".
It is also important to realize that SEQUEL Data Warehouse is not just suitable for data warehousing applications, although it was specifically designed for and excels at that task. Because it can basically do anything in relation to selecting, manipulating, validating and moving data of all types, it has many other uses. For example, our customers use SEQUEL Data Warehouse to undertake data conversion projects when migrating from one software package to another. They also use it to move and integrate data between applications that aren't currently integrated (e.g. to update last payment date onto the customer master file whenever a payment is made) or to provide data such as price lists or inventory updates to suppliers and customers (and vice versa). There are ever increasing requirements to supply data and information about your business to government and regulatory agencies. This is an area that often re-directs valuable development resources away from core business applications. These and many other requirements can be easily handled by SEQUEL Data Warehouse, often within minutes, hours or days as opposed to days, weeks or months.
It is a data Extraction, Transformation and Load tool (ETL tool).
It is a Data Management tool to define and maintain user-defined Data Warehouses, Data Marts and Operational Data Stores.
It is based around an active metadata repository
It includes a fully managed development environment, used to define and maintain the Tables, ETL Definitions and Metadata
It includes a self-contained native IBM i run-time environment
Visual representations used where appropriate to aid understanding and building. Wizards guide you through step by step to create an object or definition
Powerful, easy to use visual development environment…. Drag and drop, double click, or right click for options…
With an extensive set of audit reports, error reports and metadata reports available, every step you make is being tracked.
One more time - Without SDW, users are forced to pull in data from multiple unstructured databases
With SDW, you have a structured environment, data is being managed and accessed in a controlled manner.
With SDW, you can also include non-iSeries data: data from Oracle, MS SQL, and other databases.
With SEQUEL, we can take the information gathered by SEQUEL Data Warehouse and distribute it in several different ways.
One is through reporting.
Reporting and distribution are very important because not all information is displayed on a screen. This is an area where SEQUEL really excels.
Reports can be initiated by a user, called from a program, or scheduled.
You can send custom reports to many users using a list based process.
And, your reports can be displayed; printed; saved to a folder (PC, network, IFS); delivered using FTP; and sent via e-mail, including attachments.
We all have different needs and preferences and SEQUEL is designed to meet them.
Some people want results in text format, such as text files (.txt) and rich text files (.rtf).
Others want XLS (Excel), PDF, XML, HTML, or CSV formats.
SEQUEL handles all these plus other standard and custom types.
SEQUEL’s report formatting tools let you change fonts, colors, insert pictures and graphs, and save the results.
The result—each time you access your information, it looks great!
With SWI
A BI tool like SEQUEL Data Warehouse can help your business manage, verify, control, and convert data, and distribute meaningful information to the users who need it, when they need it.
No more waiting around, no more worrying if the data is correct. The information will be correct and the data will be displayed in a modern way with SEQUEL.
MIKE: We hope that you now have a good feel for how SEQUEL can address your business needs by improving your productivity, allowing users not only to access data when they need it, from wherever they are, but also to see information in an easy-to-read, modernized format.
If you have any further questions, or would like to see more of SEQUEL, give us a call, or email Mike or me. We would be glad to schedule a demonstration with you or your team.
We appreciate your time and look forward to seeing you in future webinars.
This concludes our session on why you need a data warehouse.
Thank you, and have a great day!