CSI 5115: Using IBM DB2 at SITE
  • Professor: Dr. Herna L. Viktor
  • Presenter: Hongyu Guo (Harry)
  • {hlviktor, hguo028}@site.uottawa.ca
  • School of IT and Engineering, uOttawa, Oct. 2006
  • IBM DB2: www.ibm.com

Roadmap
  • About the project
  • Using IBM DB2 at SITE
  • Implementing your project

The Project
  • Design and implement a data mart
  • Two main aspects
      - Back room: data staging
      - Front room: OLAP GUI
  • Pipeline: data sources (in flat files, contain noise) -> ETL -> data mart -> OLAP GUI

Expected
  • Data staging
      - Data quality (complete, correct, consistent, etc.)
      - Staging method
  • OLAP GUI interface
      - Analytic capabilities (correctness, informativeness, etc.)
      - User friendliness

Implementation
  • Apply the four-step method as described in your textbook (grain, dimensions, facts…)
  • Know your data and requirements (project requirement)
  • Refer to the base-line dimensional model provided by the Prof.
  • Choose the full-fledged DBMS (e.g. DB2, Oracle, SQL Server) and GUI developing tools which you are familiar with
Outline (using DB2 at SITE)
  • Overview of DB2
  • Work in public labs at SITE
  • Work in research labs at SITE
  • Work from home
  • Survival commands

Overview
  • Product family
      - DB2 Universal Database V8.2 (UDB): flexible and cost-effective relational database
      - Business Intelligence: Data Warehouse, Intelligent Miner, Query Management
      - DB2 Connect
      - …

DB2 UDB (System Overview)
  • Hierarchy: machine -> instance -> database -> schema -> tables, triggers, stored procedures
  • Diagram labels: ID2C, DB22, your username at SITE

Public Labs at SITE: Use Unix
  • ssh ugate.site.uottawa.ca -l userID (using your SITE account password)
  • SSH is available at public labs at SITE; download: www.ccs.uottawa.ca/software/licensed/ssh/download.html
  • ssh ID2C.site.uottawa.ca -l userID (using your db2 account password)

Unix
  • Unix shell: db2 -t
      - The "-t" option allows you to use a semicolon (";") to terminate SQL commands
  • The prompt looks like: db2 =>
Remote Access Demo
  • See if we can connect to SITE…

Public Labs at SITE: Use Windows
  • GUI interface: Start -> Programs -> Database Applications -> IBM DB2 -> General Administration Tools -> Control Centre

Register your Database

Using GUI
Using Command Line
  • Command-line processor (CLP): Program -> Database Applications -> IBM DB2 -> Command Line Tools -> Command Window

Research Labs
  • Install the DB2 development client
  • Download from IBM.com or from SITE at www.site.uottawa.ca/~hguo028/DB2_development_win_dadv82w3.zip
  • Copyright: IBM Corp.

Access to DB2 Off Campus
  • Local machine
      - Download and install IBM DB2 personal edition from IBM.com
      - No difference for this project
  • Access to DB2 at SITE through Unix
      - ssh ugate.site.uottawa.ca -l userID
      - ssh ID2C.site.uottawa.ca -l userID

Off Campus Using Windows
  • Windows client (using the DB2 server at SITE)
      - Install the DB2 development client
      - Register your database from your local machine
Register your DB
  • Start -> Programs -> IBM DB2 -> Set-up Tools -> Configuration Assistant
  • Steps: Register… -> Discover… -> Select… -> Test… -> Add…
Survival Commands
  • Create your DW using DB2
  • Connect: connect to <YourDatabaseName>
  • Disconnect: disconnect <YourDatabaseName>
  • Exit the DB2 command-line processor: db2 => quit;
  • Get help at any time: ? or ? <command>

Run Commands

Warming Up Commands
  • List all your tables
      LIST TABLES;
    or
      LIST TABLES FOR SCHEMA <schema_name>;
  • Show table structure
      DESCRIBE TABLE <table_name>;
  • Insert values for a new record into a table
      INSERT INTO tablename VALUES (value1, value2, value3);
  • Update values for records already in a table
      UPDATE tablename
        SET field1 = 'newvalue1', field2 = 'newvalue2'
        WHERE conditions;
  • Query a table
      SELECT field1, field2 FROM tablename
        WHERE condition1 AND condition2;
  • Delete records from a table
      DELETE FROM tablename WHERE conditions;
    or, to delete all rows,
      DELETE FROM tablename;
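The warming-up statements can be tried without a DB2 account. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for the DB2 command-line processor; the `product` table, its columns, and the function name `warm_up` are hypothetical, and DB2-only commands such as LIST TABLES and DESCRIBE TABLE have no direct equivalent here.

```python
import sqlite3


def warm_up():
    """Run INSERT, UPDATE, SELECT and DELETE against an in-memory database."""
    # SQLite stands in for DB2; the table and its columns are hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")

    # INSERT INTO tablename VALUES (value1, value2, value3)
    conn.execute("INSERT INTO product VALUES (1, 'pen', 1.50)")
    conn.execute("INSERT INTO product VALUES (2, 'ink', 4.00)")

    # UPDATE tablename SET field1 = 'newvalue1' WHERE conditions
    conn.execute("UPDATE product SET price = 2.00 WHERE name = 'pen'")

    # SELECT field1, field2 FROM tablename WHERE condition1 AND condition2
    rows = conn.execute(
        "SELECT name, price FROM product "
        "WHERE price >= 2.00 AND id < 10 ORDER BY id"
    ).fetchall()

    # DELETE FROM tablename WHERE conditions
    conn.execute("DELETE FROM product WHERE name = 'ink'")
    remaining = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]

    # DELETE FROM tablename (no WHERE clause removes every row)
    conn.execute("DELETE FROM product")
    empty = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]

    conn.close()
    return rows, remaining, empty
```

Calling `warm_up()` returns the selected rows plus the row counts after each DELETE, which makes the difference between a filtered and an unfiltered DELETE easy to see.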
Two Aspects
  • Implementation of the DM
      - Data staging using DB2
      - OLAP GUI development

Create Dimension Model
  • Generate fact and dimension tables
      - Use DDL/DML (from DB2 CC or DB2 CLP) to create your DM and manage your data
      - Create table…
  • Option: use DB design tools to do that
      - Popular tools: ERwin from Logic Works, PowerDesigner from Sybase, …

ETL
  • Deal with your source data
  • Clean
      - Noise, missing values, inconsistency, duplication…
  • Transform
      - Denormalization, deduping, merge/purge, aggregation…
  • Load
      - Generate surrogate keys, maintain lookup tables, …
      - Bulk loading
      - Create index
      - Data content audit

Staging with UDB (an Example)
  • Example of doing it in DB2 UDB: build a staging area in DB2 UDB
      1) Load source data into the staging area (you may want to clean/convert your data using MS Excel, a text editor, etc.)
      2) Staging with SQL/stored procedures within UDB
      3) Load into DW dimension models
  • Diagram: data sources (data in txt format, contain noise) -> load -> staging area -> DML -> DW models; the staging area and DW models are inside DB2 UDB

Review of Commands
  • Create your DW using DB2
  • Retrieving and modifying your data and models
  • Useful commands for your data staging process

DB2 SQL
  • Staging with SQL
      - A nonprocedural language: just tell DB2 UDB what data to retrieve or modify
  • DML: data manipulation language
      - Select, insert, update, delete, etc.
  • DDL: data definition language
      - Create, drop, etc.
      - Create your dimension tables and fact tables
  • Rich built-in functions for staging
      - Conversion functions, aggregation and group functions
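The three staging steps (load into a staging area, clean with SQL, load into the dimension model) can be sketched end to end. This is an illustration under stated assumptions, not the course's staging design: sqlite3 stands in for DB2 UDB, and the tables `stg_customer` and `customer_d`, their columns, and the 'UNKNOWN' placeholder are all hypothetical.

```python
import sqlite3


def run_staging(raw_rows):
    """Load raw rows into a staging table, clean them, then fill a dimension table."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for DB2 UDB
    # Hypothetical staging and dimension tables.
    conn.execute("CREATE TABLE stg_customer (cust_no TEXT, city TEXT)")
    conn.execute(
        "CREATE TABLE customer_d (cust_no TEXT PRIMARY KEY, city TEXT NOT NULL)"
    )

    # 1) Load source data into the staging area, noise included.
    conn.executemany("INSERT INTO stg_customer VALUES (?, ?)", raw_rows)

    # 2) Staging with SQL: normalise spelling, then fill missing values.
    conn.execute("UPDATE stg_customer SET city = UPPER(TRIM(city))")
    conn.execute(
        "UPDATE stg_customer SET city = 'UNKNOWN' WHERE city IS NULL OR city = ''"
    )

    # 3) Load into the dimension table; DISTINCT removes duplicates.
    conn.execute(
        "INSERT INTO customer_d SELECT DISTINCT cust_no, city FROM stg_customer"
    )

    result = conn.execute(
        "SELECT cust_no, city FROM customer_d ORDER BY cust_no"
    ).fetchall()
    conn.close()
    return result
```

For example, `run_staging([("C1", " ottawa "), ("C1", "Ottawa"), ("C2", None)])` collapses the two spellings of the first customer into one dimension row and replaces the missing city.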
Basic SQL
  • Two examples for staging
  • 1. Generate a dimension table
      CREATE TABLE tablename (
        field1 INTEGER NOT NULL,
        field2 REAL,
        field3 CHAR(30),
        PRIMARY KEY (field1)
      );
  • 2. Clean your data in the staging area
      UPDATE tablename
        SET field1 = 'newvalue1', field2 = 'newvalue2'
        WHERE conditions;

Stored Procedure
  • Complex staging functions
  • A procedure that is stored on the database server
      - Encapsulates code
      - Greatly reduces network traffic
  • Programming to control the logic of database operations
      - Can contain SELECT, INSERT, UPDATE, and DELETE statements
      - Declare variables and constants, flow-of-control statements, functions, etc.

Stored Procedure in DB2
  • Two kinds of stored procedures in DB2
  • SQL procedure
      - Written in SQL (Structured Query Language)
  • External procedure
      - Uses languages such as Java, Visual Basic .NET, C/C++, C#, etc.
      - Implements more complex logic than SQL can support

SQL Procedure
  • An example in DB2 UDB
      CREATE PROCEDURE hguo028.UPDATE_SAL
        (IN empNum CHAR(6), IN rating SMALLINT)
      LANGUAGE SQL
      BEGIN
        IF rating = 1 THEN
          UPDATE employee
            SET salary = salary * 1.10
            WHERE empno = empNum;
        ELSE
          UPDATE employee
            SET salary = salary * 1.05
            WHERE empno = empNum;
        END IF;
      END @

Maintain Procedure
  • Store your procedure in script files
      - Select an alternate terminating character, other than the default terminating character at the end of the SQL statement
  • Run the procedure script
      db2 -td@ -vf myScript.db2
  • GUI interface
      - Build and maintain your stored procedures from the DB2 Development Center in the DB2 Windows client
      - Start -> Programs -> IBM DB2 -> Development Tools -> Development Centre

Cursor Processing
  • Complex SQL: staging with a cursor
  • Manipulate your data record by record in a set of rows selected with a SELECT statement (rec1, rec2, rec3, …)
  • Compare with UPDATE, which deals with the whole set
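The contrast between record-by-record cursor processing and a set-based UPDATE can be sketched with the salary rule from the UPDATE_SAL example. This is a sketch under assumptions: sqlite3 stands in for DB2 (SQLite has no stored procedures, so the per-row logic lives in Python), and the `rating` column and the sample rows are hypothetical.

```python
import sqlite3


def make_db():
    """Create a small employee table; the rows and rating column are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (empno TEXT PRIMARY KEY, salary REAL, rating INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [("000010", 1000.0, 1), ("000020", 2000.0, 2)],
    )
    return conn


def raise_row_by_row(conn):
    """Cursor style: fetch each record, decide in program logic, update one row."""
    # fetchall() materialises the rows first, so we never modify a table
    # while still iterating over an open cursor on it.
    rows = conn.execute("SELECT empno, salary, rating FROM employee").fetchall()
    for empno, salary, rating in rows:
        factor = 1.10 if rating == 1 else 1.05
        conn.execute(
            "UPDATE employee SET salary = ? WHERE empno = ?",
            (salary * factor, empno),
        )


def raise_set_based(conn):
    """Set style: one UPDATE statement handles every row at once."""
    conn.execute(
        "UPDATE employee "
        "SET salary = salary * CASE WHEN rating = 1 THEN 1.10 ELSE 1.05 END"
    )


def salaries(conn):
    """Return (empno, salary) pairs, rounded to cents for comparison."""
    query = "SELECT empno, salary FROM employee ORDER BY empno"
    return [(empno, round(salary, 2)) for empno, salary in conn.execute(query)]
```

Both functions leave the table in the same state. The row-by-row form is worth its extra round trips only when the per-record logic cannot be expressed in a single statement.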
Cursor Processing
  • An example (outline)
      DECLARE STAGING_CALCULATING CURSOR FOR
        SELECT com1, com2 FROM product
      BEGIN
        OPEN STAGING_CALCULATING
        LOOP
          FETCH STAGING_CALCULATING INTO var1, var2
          calculating…
        END LOOP;
        CLOSE STAGING_CALCULATING
      END @

Two-Phase Commit
  • Data consistency
      - Maintain data integrity and accuracy
  • Work with COMMIT and ROLLBACK
      - COMMIT: saving work
      - ROLLBACK: undoing changes

DB2 Load
  • Load your data into DB2 UDB
  • Import
      - Writes to the database via SQL INSERT
      - Provides logging for individual records
      - Not recommended
  • Load
      - Much faster loading tool
      - Moves massive amounts of data (bulk load)
      - Invoked from the CLP, the Control Centre, or the API (db2Load)
  • Diagram: control/commands file and input data files -> DB2 Load executable -> DB2 UDB, with a message/log file as output

DB2 Load: An example
  • source.txt:
      1,10/1/1994,"Saturday",1.01
      2,11/21/1995,"Tuesday",21.00
  • DDL.sql:
      CREATE TABLE TIME_D (
        TIME_KEY INTEGER NOT NULL,
        DATE1 DATE,
        DAY_OF_WEEK VARCHAR(255),
        DAY_NUMBER_IN_MONTH DECIMAL(7,2),
        PRIMARY KEY (TIME_KEY)
      );
  • source.log:
      SQL3520W Load Consistency Point was successful.
      …
      Number of rows read = 2
      Number of rows skipped = 0
      Number of rows loaded = 2
      Number of rows rejected = 0
      Number of rows deleted = 2
      Number of rows committed = 2
  • Complete info.: www.site.uottawa.ca/~hguo028/db2dme81.pdf (copyright: IBM Corp.)

DB2 Load: SQL from CLP
      CONNECT TO HGUO028 USER HGUO028 USING *******;
      LOAD CLIENT FROM "C:\source.txt" OF DEL
        MODIFIED BY CHARDEL"" COLDEL,
        METHOD P (1, 2, 3, 4)
        MESSAGES "C:\source.log"
        INSERT INTO HGUO028.TIME_D (TIME_KEY, DATE1, DAY_OF_WEEK, DAY_NUMBER_IN_MONTH)
        COPY NO INDEXING MODE AUTOSELECT;
      CONNECT RESET;
  • METHOD P (1, 2, 3, 4): field numbers
  • REPLACE (in place of INSERT): delete all tuples first
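The interplay of delimited input, COMMIT, and ROLLBACK can be sketched on the two sample rows from source.txt. This is an illustration only: sqlite3 stands in for DB2, the rows go in through plain SQL INSERTs (closer to the Import path than to the bulk Load utility), DATE1 is kept as text, and the function names are hypothetical.

```python
import csv
import io
import sqlite3

# The two sample rows from source.txt.
SOURCE_TXT = '1,10/1/1994,"Saturday",1.01\n2,11/21/1995,"Tuesday",21.00\n'


def new_db():
    """Create TIME_D in an in-memory SQLite database (a stand-in for DB2)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TIME_D ("
        " TIME_KEY INTEGER NOT NULL PRIMARY KEY,"
        " DATE1 TEXT,"  # kept as text here; the DB2 table uses DATE
        " DAY_OF_WEEK TEXT,"
        " DAY_NUMBER_IN_MONTH REAL)"
    )
    return conn


def load_time_d(conn, text):
    """Insert every row of a comma-delimited text, or none if any row fails."""
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar='"')
    try:
        for key, date1, day_of_week, day_number in reader:
            conn.execute(
                "INSERT INTO TIME_D VALUES (?, ?, ?, ?)",
                (int(key), date1, day_of_week, float(day_number)),
            )
        conn.commit()  # COMMIT: save the work
        return True
    except (ValueError, sqlite3.IntegrityError):
        conn.rollback()  # ROLLBACK: undo every insert of this batch
        return False


def row_count(conn):
    """Number of rows currently in TIME_D."""
    return conn.execute("SELECT COUNT(*) FROM TIME_D").fetchone()[0]
```

A batch containing a malformed row or a duplicate TIME_KEY leaves the table exactly as it was before the call, which is the consistency guarantee that COMMIT and ROLLBACK provide.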
DB2 Load (GUI)
  • Invoke Load from the GUI: Control Centre

Generate Surrogate Keys
  • CREATE SEQUENCE
      CREATE SEQUENCE ORDER_SEQ
        START WITH 1
        INCREMENT BY 1
        NO MAXVALUE
        NO CYCLE
        CACHE 20;
      INSERT INTO ORDERS (ORDERNO, CUSTNO)
        VALUES (NEXT VALUE FOR ORDER_SEQ, 123456);

Execute SQL Script File
  • Create a SQL file
      - Contains a set of commands such as create tables, stored procedures, triggers, etc.
  • To execute a SQL script file
      - From a db2 command line use: db2 -t -f <name of file>

Transfer Data/Schema
  • Load (import)
      - Load data into the DW from a flat file
  • Unload (export)
      - Unload DB2 data into delimited files, then use these files as input into another DB2 database (for example between home and school)
      - Parts of the meta data of your DM

DB2 Manuals
  • Big helpers
  • Complete DB2 manuals are available at http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db2.doc/db2proghome.htm

Two Aspects
  • Data staging using DB2
  • OLAP GUI development
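Assigning surrogate keys usually pairs a key generator with a lookup from the natural key. The sketch below assumes sqlite3 as a stand-in: SQLite has no CREATE SEQUENCE, so an auto-assigned INTEGER PRIMARY KEY plays the role of NEXT VALUE FOR, and the table `customer_d` and the function `surrogate_key` are hypothetical.

```python
import sqlite3


def new_db():
    """Create a dimension table whose key is generated by the database."""
    conn = sqlite3.connect(":memory:")
    # INTEGER PRIMARY KEY is auto-assigned by SQLite when omitted on insert;
    # in DB2 the value would come from NEXT VALUE FOR a sequence instead.
    conn.execute(
        "CREATE TABLE customer_d ("
        " customer_key INTEGER PRIMARY KEY,"
        " custno TEXT UNIQUE NOT NULL)"
    )
    return conn


def surrogate_key(conn, custno):
    """Return the surrogate key for a natural key, creating it on first sight."""
    row = conn.execute(
        "SELECT customer_key FROM customer_d WHERE custno = ?", (custno,)
    ).fetchone()
    if row is not None:
        return row[0]  # already known: reuse the existing key
    cur = conn.execute("INSERT INTO customer_d (custno) VALUES (?)", (custno,))
    return cur.lastrowid  # key generated for the new dimension row
```

Fact rows can then store the returned key in place of the natural customer number, so the same customer always maps to the same dimension row.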
OLAP Interface
  • Present your data: DW models -> OLAP interface
  • Key: understand the requirements of the project
  • Analytic capabilities
      - Basic OLAP functions: dice, slice, roll up, drill down, comparison, trend analysis…
      - Advanced functions: ad hoc query, …

OLAP Interface: Interface example
  • Choosing dimensions, facts
  • OLAP capabilities
  • Data, graph
  • User friendliness
  • Sufficiently analytic functions
  • Copyright: screenshot from www.contourcomponents.com

OLAP GUI Design
  • Creative work
  • Choose tools that you are familiar with
      - PowerBuilder, Delphi database programming
      - JBuilder, C++Builder, VB, VC, C#, J#...
  • Microsoft Visual Studio, including VB, VC, VC#, VJ#
      - Available in public labs and as a free download through your MSDN academic account

ODBC
  • Widely accepted API for database access
  • Keeps communication between your program and the RDBMS
  • Usually different RDBMSs use different drivers

ODBC Configuration
  • Configure your connection
      - Control Panel -> Administrative tools -> Datasource (ODBC)
      - Set up an ODBC connection (create your data source name, DSN)
  • Different DSNs
      - USER DSN: local machine, the current user
      - SYSTEM DSN: local machine, any user
      - FILE DSN: local or remote machine, any user
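The basic OLAP functions map onto ordinary GROUP BY queries, which an OLAP front end can issue behind its controls. The sketch below is illustrative only: sqlite3 stands in for the DBMS, and the denormalized `sales_f` table with its sample rows is hypothetical (a star schema would join the fact table to dimension tables instead).

```python
import sqlite3


def new_db():
    """Create a tiny, denormalized fact table with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_f (year INTEGER, region TEXT, product TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales_f VALUES (?, ?, ?, ?)",
        [
            (2005, "East", "pen", 10.0),
            (2005, "West", "pen", 20.0),
            (2006, "East", "ink", 30.0),
            (2006, "East", "pen", 40.0),
        ],
    )
    return conn


def roll_up_by_year(conn):
    """Roll up: aggregate away region and product, keeping only year."""
    return conn.execute(
        "SELECT year, SUM(amount) FROM sales_f GROUP BY year ORDER BY year"
    ).fetchall()


def drill_down_year_region(conn):
    """Drill down: add region back as a grouping level below year."""
    return conn.execute(
        "SELECT year, region, SUM(amount) FROM sales_f "
        "GROUP BY year, region ORDER BY year, region"
    ).fetchall()


def slice_region(conn, region):
    """Slice: fix one value of the region dimension and analyse the rest."""
    return conn.execute(
        "SELECT year, product, SUM(amount) FROM sales_f WHERE region = ? "
        "GROUP BY year, product ORDER BY year, product",
        (region,),
    ).fetchall()
```

A GUI can bind each control (a roll-up button, a region drop-down) to one such query and refresh its grid or chart from the result.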
ODBC: Testing
  • Test your ODBC connection

JDBC
  • For Java developers at the labs
      - Driver = "COM.ibm.db2.jdbc.app.DB2Driver"
      - URL = "jdbc:db2:yourDatabaseName"
  • Copy the DB2 driver DB2JAVA.ZIP to your work directory

Showing DB Data
  • Usually three things
  • What data: specify the data you want
      - DML (SQL, procedure..) to retrieve data from the RDBMS
  • Where: to cache the data retrieved from the RDBMS
      - Specify a data source or data set
  • How: to show your data in the cache
      - DB Grid or DB Charts in your development tools
  • Flow: RDBMS -> DML -> cache -> grid/chart

Visual J# (Microsoft Corp.)
  • Available in public labs and through the MSDN Academic Alliance (download for free using your university account)
  • Java-language syntax + MS Visual Studio IDE
      - Does not run on a Java Virtual Machine
      - GUI integrated development environment
  • Supports most of the functionality of Java 1.1.4 and Visual J++ 6.0 (stopped because of the lawsuits…)

Using Visual J#

IDE Environment
  • File -> New -> New Project
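The what/where/how split can be sketched independently of any GUI toolkit. The example below is a sketch under assumptions, written in Python (not Java or J#) with sqlite3 as the data source: the query is the "what", an in-memory list is the "where", and a plain-text grid is the "how". The function names `fetch_cache` and `render_grid` are hypothetical.

```python
import sqlite3


def fetch_cache(conn, sql):
    """What + where: run the query and cache column names and rows in memory."""
    cur = conn.execute(sql)
    columns = [description[0] for description in cur.description]
    return columns, cur.fetchall()


def render_grid(columns, rows):
    """How: format the cached rows as a fixed-width text grid."""
    table = [list(columns)] + [[str(value) for value in row] for row in rows]
    widths = [max(len(line[i]) for line in table) for i in range(len(columns))]
    return "\n".join(
        " | ".join(cell.ljust(width) for cell, width in zip(line, widths))
        for line in table
    )


def demo():
    """Build a hypothetical table, cache a query result, and render it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stock (name TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO stock VALUES (?, ?)", [("pen", 3), ("ink", 12)])
    columns, rows = fetch_cache(conn, "SELECT name, qty FROM stock ORDER BY rowid")
    conn.close()
    # The cache outlives the connection, as a data set does in a GUI tool.
    return render_grid(columns, rows)
```

Keeping the cache separate from the rendering means the same cached rows could feed a grid and a chart without querying the database twice.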
Data Components (what)

Connection Configuration

OLE DB Provider

Using DML to Retrieve

Generate Data Sets (where)

Save Parameters
DataGrid to Show Data (how)

Generate Data Sets...

DataGrid: Data Sources

Fill Dataset Component

Using Graphs (how)
  • MS Chart Control Comp.
  • Other chart tools are available as well
Adding Charts

Coding

Using VB and Delphi
  • Choose popular tools and get help (sample code) from the internet!
  • Slides from previous years will be available…
  • VB: Microsoft Corp.; Delphi: Borland Software Corp.

Using Oracle (Oracle Corp.)
  • Download any version of Oracle
  • Data staging
      - SQL*Loader ~ Load from DB2
      - SQL Plus (Toad) ~ Control Centre or CLP
      - PL/SQL ~ Stored Procedure/Embedded SQL
      - Surrogate key: CREATE SEQUENCE

Embedded DB2 SQL in C++
  • Embed SQL queries in your program code
  • Use EXEC SQL to tell the compiler to send them to the DB directly
  • Cursor declaration example in C++
      #include <stdio.h>
      EXEC SQL INCLUDE SQLCA;
      EXEC SQL BEGIN DECLARE SECTION;
        char account_no[8];   /* host variable filled by FETCH */
      EXEC SQL END DECLARE SECTION;
      int main() {
        …
        EXEC SQL DECLARE c1 CURSOR FOR
          SELECT account_no FROM account;
        EXEC SQL OPEN c1;
        do {
          EXEC SQL FETCH c1 INTO :account_no;
          if (SQLCODE != 0) break;   /* stop at end of data or on error */
          printf("%s\n", account_no);
          …
        } while (1);
        EXEC SQL CLOSE c1;
        return 0;
      }

More…
  • www.site.uottawa.ca/~hguo028/csi5115-2006.htm
  • Examples of
      - A complete data staging process
      - Running cursors and procedures
      - Generating and assigning surrogate keys to fact and dimension tables
      - …
  • Frequently asked questions
Summary
  • Two main aspects of a DM: a good chance to experience and learn
      - Useful practical project
      - Hands-on experience
  • Choose the full-fledged DBMS (e.g. DB2, Oracle, SQL Server) and GUI developing tools which you are familiar with

Thank You/Merci
  • Further questions: hguo028@site.uottawa.ca
