    SQL*Loader Presentation Transcript

    • SQL Loader Fundamentals © 2003 Kanbay Incorporated - All rights reserved CONFIDENTIAL - INTERNAL USE ONLY
    • AGENDA
      Day 1: SQL*Loader Basics
      Day 2: SQL*Loader Scripting and SQL*Loader Do’s & Don’ts
      CONFIDENTIAL - INTERNAL USE ONLY 10/26/07 04:32
    • SQL*Loader
      SQL*Loader is an Oracle-supplied utility that loads data from a flat file into one or more database tables. That is the sole reason for SQL*Loader's existence.
    • SQL*Loader Environment [figure]
    • SQL*Loader Data Loading: Data Loading Methods
      ▲ Conventional Path Load: A conventional path load (the default) uses the SQL INSERT statement and a bind array buffer to load data into database tables. This method is used by all Oracle tools and applications.
      ▲ Direct Path Load: Instead of filling a bind array buffer and passing it to the Oracle database server with a SQL INSERT statement, a direct path load uses the direct path API to pass the data to the load engine in the server. The load engine builds a column array structure from that data and uses it to format Oracle data blocks and build index keys. The newly formatted database blocks are written directly to the database.
      Note: During a direct path load, data conversion occurs on the client side rather than on the server side.
    • SQL*Loader Data Loading: Conventional & Direct Path Loads [figure]
    • SQL*Loader Control File
      The SQL*Loader control file is a text file that contains data definition language (DDL) instructions. It is the key to any load process. DDL is used to control the following aspects of a SQL*Loader session:
      ▲ Where SQL*Loader will find the data to load
      ▲ How SQL*Loader expects that data to be formatted
      ▲ How SQL*Loader will be configured (memory management, rejecting records, interrupted load handling, and so on) as it loads the data
      ▲ How SQL*Loader will manipulate the data being loaded
      ▲ The correspondence between the fields in the input record and the columns in the database tables being loaded
      ▲ Selection criteria defining which records from the input file contain data to be inserted into the destination database tables
      ▲ The names and locations of the bad file and the discard file
      Comments in the control file:
        -- This is a comment
    • SQL*Loader Control File
      Example 5-1 Sample Control File
         1  -- This is a sample control file
         2  LOAD DATA
         3  INFILE 'sample.dat'
         4  BADFILE 'sample.bad'
         5  DISCARDFILE 'sample.dsc'
         6  APPEND
         7  INTO TABLE emp
         8  WHEN (57) = '.'
         9  TRAILING NULLCOLS
        10  (hiredate SYSDATE,
             deptno POSITION(1:2) INTEGER EXTERNAL(2)
                    NULLIF deptno=BLANKS,
             job    POSITION(7:14) CHAR TERMINATED BY WHITESPACE
                    NULLIF job=BLANKS "UPPER(:job)",
             mgr    POSITION(28:31) INTEGER EXTERNAL
                    TERMINATED BY WHITESPACE NULLIF mgr=BLANKS,
             ename  POSITION(34:41) CHAR
                    TERMINATED BY WHITESPACE "UPPER(:ename)",
             empno  POSITION(45) INTEGER EXTERNAL
                    TERMINATED BY WHITESPACE,
             sal    POSITION(51) CHAR TERMINATED BY WHITESPACE
                    "TO_NUMBER(:sal,'$99,999.99')");
      In this sample control file, the numbers that appear to the left would not appear in a real control file. They are keyed in this sample to the explanatory notes in the following list.
    • SQL*Loader Control File
      Notes on the numbered lines of the sample control file:
      ▲ Line 2: The LOAD DATA statement tells SQL*Loader that this is the beginning of a new data load.
      ▲ Line 3: The INFILE clause specifies the name of a datafile containing data that you want to load.
      ▲ Line 4: The BADFILE parameter specifies the name of a file into which rejected records are placed.
      ▲ Line 5: The DISCARDFILE parameter specifies the name of a file into which discarded records are placed.
      ▲ Line 6: The APPEND parameter is one of the options you can use when loading data into a table that is not empty. To load data into a table that is empty, you would use the INSERT parameter.
      ▲ Line 7: The INTO TABLE clause allows you to identify tables, fields, and datatypes. It defines the relationship between records in the datafile and tables in the database.
      ▲ Line 8: The WHEN clause specifies one or more field conditions. SQL*Loader decides whether or not to load the data based on these field conditions.
      ▲ Line 9: The TRAILING NULLCOLS clause tells SQL*Loader to treat any relatively positioned columns that are not present in the record as null columns.
      ▲ Line 10: The remainder of the control file contains the field list, which provides information about column formats in the table being loaded.
    • SQL*Loader Log File
      The log file is a record of SQL*Loader's activities during a load session. It contains information such as the following:
      ▲ The names of the control file, log file, bad file, discard file, and data file
      ▲ The values of several command-line parameters
      ▲ A detailed breakdown of the fields and datatypes in the data file that was loaded
      ▲ Error messages for records that cause errors
      ▲ Messages indicating when records have been discarded
      ▲ A summary of the load that includes the number of logical records read from the data file, the number of rows rejected because of errors, the number of rows discarded because of selection criteria, and the elapsed time of the load
      Note: Always review the log file after a load to be sure that no errors occurred, or at least that no unexpected errors occurred. This type of information is written to the log file, but is not displayed on the terminal screen.
    • SQL*Loader Log File
      Exit Codes for Inspection and Display:
        Result                                                        Exit code
        All rows successfully loaded                                  EX_SUCC (0)
        All or some rows rejected                                     EX_WARN (2)
        All or some rows discarded                                    EX_WARN (2)
        Discontinued load                                             EX_WARN (2)
        Command-line or syntax errors                                 EX_FAIL (1)
        Oracle errors nonrecoverable for SQL*Loader                   EX_FAIL (1)
        Operating system errors (such as file open/close and malloc)  EX_FTL (3 on UNIX)
      File extension: “.log”
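
The exit codes in the table above can be turned into readable messages in a wrapper script. The following is a minimal sketch, assuming UNIX exit values; `describe_sqlldr_exit` is a hypothetical helper name, and the value 3 (EX_FTL, which Oracle's documentation assigns to operating system errors on UNIX) is an assumption beyond the values printed on the slide.

```shell
#!/bin/sh
# Translate a sqlldr exit status into a short description (UNIX values).
# describe_sqlldr_exit is a hypothetical helper, not part of SQL*Loader.
describe_sqlldr_exit() {
    case "$1" in
        0) echo "EX_SUCC: all rows loaded successfully" ;;
        1) echo "EX_FAIL: command-line, syntax, or nonrecoverable Oracle error" ;;
        2) echo "EX_WARN: rows rejected or discarded, or load discontinued" ;;
        3) echo "EX_FTL: operating system error" ;;  # assumption: not on the slide
        *) echo "unknown exit status: $1" ;;
    esac
}

# Typical use, right after the sqlldr call:
#   $ORACLE_HOME/bin/sqlldr ... ; describe_sqlldr_exit $?
describe_sqlldr_exit 2
```

Branching on the exit status complements reading the log file: the status says which class of outcome occurred, and the log says which records were affected.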
    • SQL*Loader Log File
      Sections of the SQL*Loader log file:
      ▲ Header Information
      ▲ Global Information
      ▲ Table Information
      ▲ Datafile Information
      ▲ Table Load Information
      ▲ Summary Statistics
      ▲ Additional Summary Statistics for Direct Path Loads and Multithreading
      ▲ Log File Created When EXTERNAL_TABLE=GENERATE_ONLY
      Description of some sections:
      ▲ Datafile Information
        - SQL*Loader and Oracle data record errors
        - Records discarded
        E.g.:
          Record 8: Rejected - Error on table emp, column deptno.
          ORA-01722: invalid number
    • SQL*Loader Log File
      ▲ Table Load Information
        - Number of rows loaded
        - Number of rows that qualified for loading but were rejected due to data errors
        - Number of rows that were discarded because they did not meet the specified criteria for the WHEN clause
        - Number of rows whose relevant fields were all null
        - Date cache statistics, if applicable
        E.g.:
          Table EMP:
            25000 Rows successfully loaded.
            2 Rows not loaded due to data errors.
            0 Rows not loaded because all WHEN clauses were failed.
            0 Rows not loaded because all fields were null.
          Date Cache:
            Max Size: 2000
            Entries: 1000
            Hits: 11000
            Misses: 0
    • SQL*Loader Log File
      ▲ Summary Statistics
        - Amount of space used:
          o For bind array (what was actually used, based on what was specified by BINDSIZE)
          o For other overhead (always required, independent of BINDSIZE)
        - Cumulative load statistics. That is, for all datafiles, the number of records that were:
          o Skipped
          o Read
          o Rejected
          o Discarded
        - Beginning and ending time of run
        - Total elapsed time
        - Total CPU time (includes all file I/O but may not include background Oracle CPU time)
    • SQL*Loader Log File
      E.g.:
        Space allocated for bind array: 65336 bytes (64 rows)
        Space allocated for memory less bind array: 6470 bytes
        Total logical records skipped: 0
        Total logical records read: 7
        Total logical records rejected: 0
        Total logical records discarded: 0
        Run began on Wed Feb 27 10:46:53 1990
        Run ended on Wed Feb 27 10:47:17 1990
        Elapsed time was: 00:00:15.62
        CPU time was: 00:00:07.76
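
The summary counts shown in the sample log can be extracted by a script after each run. This is a sketch only, assuming the log uses the `Total logical records <kind>:` wording of the sample above with one statistic per line; `log_total` is a hypothetical helper name.

```shell
#!/bin/sh
# Print one "Total logical records <kind>:" count from a SQL*Loader log.
#   $1 = path to the log file
#   $2 = one of: skipped, read, rejected, discarded
# log_total is a hypothetical helper; the line wording is taken from the sample log.
log_total() {
    awk -v kind="$2" '
        $1 == "Total" && $2 == "logical" && $3 == "records" && $4 == (kind ":") {
            print $5
        }' "$1"
}

# Typical use after a load (logfile is whatever was passed as log=...):
#   rejected=$(log_total "$logfile" rejected)
```

Matching on whole fields rather than on a substring such as `ORA` avoids false positives from unrelated text in the log.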
    • SQL*Loader Bad File
      Whenever you insert data into a database, you run the risk of that insert failing because of some type of error. E.g.:
      ▲ Integrity constraint violations are the most common type of error.
      ▲ Lack of free space in a tablespace can also cause insert operations to fail.
      Whenever SQL*Loader encounters a database error while trying to load a record, it writes that record to a file known as the bad file.
      A bad file filename specified on the command line becomes the bad file associated with the first INFILE statement in the control file. If the bad file filename is also specified in the control file, the command-line value overrides it.
      File extension: “.bad”
    • SQL*Loader Discard Files
      Discard files are used to hold records that do not meet selection criteria specified in the SQL*Loader control file.
      A discard file filename specified on the command line becomes the discard file associated with the first INFILE statement in the control file. If the discard file filename is also specified in the control file, the command-line value overrides it.
      File extension: “.dsc”
      Note:
      ▲ Bad files are not optional. Discard files are optional.
      ▲ The format of your bad files and discard files will exactly match the format of your input files.
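
Because bad and discard files keep the format of the input file, the number of rejected or discarded records can be counted the same way input records would be. The sketch below assumes newline-terminated records (one record per line); `rejected_count` is a hypothetical helper name.

```shell
#!/bin/sh
# Count the records in a bad or discard file.
# Assumes one record per line; a missing or empty file counts as zero records.
# rejected_count is a hypothetical helper, not part of SQL*Loader.
rejected_count() {
    if [ -s "$1" ]; then
        wc -l < "$1" | tr -d ' '
    else
        echo 0
    fi
}

# Typical use after a load (badfile is whatever was passed as bad=...):
#   n=$(rejected_count "$badfile")
```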
    • SQL*Loader Scripting
      Let us consider an example of uploading employee details into EMP_DAT, an Oracle table. The following example shows 3 records from a file holding employees' personal details. It is a delimited file with lines wrapped.
        "E1"}"ABC"}"E1 AB Street"}"BBC"}"2001-AUG-08"}"26"}"M"
        "E2"}"BBC"}"F1 AB Street"}"BBC"}"2002-DEC-07"}"25"}"M"
        "E3"}"BCC"}"G1 AB Street"}"BBC"}"2003-JUL-01"}"24"}"M"
      As you can see, the data in the file is '}'-delimited, and each field is enclosed within double quotes.
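
To make the record layout concrete, the following sketch splits one of the sample records on `}` and strips the enclosing double quotes. It only illustrates the file format; it is not how SQL*Loader itself parses the data, and `split_record` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the fields of one '}'-delimited, double-quote-enclosed record,
# one field per line. split_record is a hypothetical helper for inspection only.
split_record() {
    printf '%s\n' "$1" | awk -F'}' '{
        for (i = 1; i <= NF; i++) {
            gsub(/^"|"$/, "", $i)   # drop the enclosing double quotes
            print $i
        }
    }'
}

split_record '"E1"}"ABC"}"E1 AB Street"}"BBC"}"2001-AUG-08"}"26"}"M"'
```

Each record yields seven fields, in the order employee number, name, address, state, date of joining, age, and sex.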
    • SQL*Loader Scripting
      Table Design:
        Column Name   Constraint        Data Type
        EMP_NO        NOT NULL <PK>     NUMBER(11)
        EMP_NAME      NOT NULL          VARCHAR2(100)
        EMP_ADDR      NOT NULL          VARCHAR2(250)
        EMP_STATE     NOT NULL          VARCHAR2(100)
        EMP_DOJ       NOT NULL          DATE
        EMP_AGE       NOT NULL          NUMBER(4)
        EMP_SEX       NOT NULL          CHAR(1)
    • SQL*Loader Scripting
        OPTIONS (ERRORS=50, DIRECT=TRUE, PARALLEL=TRUE)
        UNRECOVERABLE
        LOAD DATA
        INFILE 'emp_data.txt'
        BADFILE 'emp_data.bad'
        DISCARDFILE 'emp_data.dsc'
        APPEND
        INTO TABLE EMP_DAT
        FIELDS TERMINATED BY '}' ENCLOSED BY '"'
        TRAILING NULLCOLS
        ( EMP_NO    NULLIF EMP_NO=BLANKS
        , EMP_NAME  NULLIF EMP_NAME=BLANKS
        , EMP_ADDR  NULLIF EMP_ADDR=BLANKS
        , EMP_STATE NULLIF EMP_STATE=BLANKS
        , EMP_DOJ   DATE 'YYYY-MON-DD' NULLIF EMP_DOJ=BLANKS
        , EMP_AGE   NULLIF EMP_AGE=BLANKS
        , EMP_SEX   NULLIF EMP_SEX=BLANKS
        );
    • SQL*Loader Scripting
      Consider another example of uploading data into table EMP_DAT, using a file with the following format:
        E1ABCE1 AB StreetBBC2001-AUG-0826M
        E2BBCF1 AB StreetBBC2002-DEC-0725M
        E3BCCG1 AB StreetBBC2003-JUL-0124M
      As you can see, there is no field delimiter, and the fields are not enclosed in quotation marks.
    • SQL*Loader Scripting
        OPTIONS (ERRORS=50, DIRECT=TRUE, PARALLEL=TRUE)
        UNRECOVERABLE
        LOAD DATA
        INFILE 'emp_data.txt'
        BADFILE 'emp_data.bad'
        DISCARDFILE 'emp_data.dsc'
        APPEND
        INTO TABLE EMP_DAT
        ( EMP_NO    POSITION(01:02) INTEGER EXTERNAL
        , EMP_NAME  POSITION(03:05) CHAR
        , EMP_ADDR  POSITION(06:17) CHAR
        , EMP_STATE POSITION(18:20) CHAR
        , EMP_DOJ   POSITION(21:31) DATE "YYYY-MON-DD"
        , EMP_AGE   POSITION(32:33) INTEGER EXTERNAL
        , EMP_SEX   POSITION(34:34) CHAR
        );
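
Column offsets for a fixed-width file are easy to get wrong, so it can help to slice a sample record before writing the POSITION clauses. The sketch below uses `cut -c` on the first sample record; `field` is a hypothetical helper name. For this record the age falls in columns 32-33 and the sex in column 34.

```shell
#!/bin/sh
# Print the characters of a record that fall in a given column range,
# using the same 1-based inclusive ranges as POSITION(start:end).
# field is a hypothetical helper for checking offsets only.
field() {
    printf '%s\n' "$1" | cut -c "$2"
}

rec='E1ABCE1 AB StreetBBC2001-AUG-0826M'
field "$rec" 1-2     # employee number
field "$rec" 3-5     # name
field "$rec" 6-17    # address
field "$rec" 18-20   # state
field "$rec" 21-31   # date of joining
field "$rec" 32-33   # age
field "$rec" 34      # sex
```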
    • SQL*Loader Scripting
      Command to execute SQL*Loader:
        $ORACLE_HOME/bin/sqlldr userid=$uid/$pwd, control=$ctlfile, log=$logfile, bad=$badfile, data=$datafile >> /dev/null
      Where:
        $ORACLE_HOME/bin/sqlldr : path of the SQL*Loader executable
        userid=$uid/$pwd        : user ID and password
        control=$ctlfile        : assigns the control file name
        log=$logfile            : assigns the log file name
        bad=$badfile            : assigns the bad file name
        data=$datafile          : assigns the data file name
    • SQL*Loader Scripting
      Household Standard: Example of how to execute SQL*Loader from a shell script:
        #! /bin/ksh
        # Run date stamp (YYMMDD) used in the log file name
        r_dt=`date +%y%m%d`
        datafile=$DATDIR/emp_data.txt                    # Data file details
        logfile="$LOGDIR/ld_emp_dat_$r_dt.log"           # Log file details
        badfile="$DATDIR/baddir/ld_emp_dat.bad"          # Bad file details
        ctlfile="$SCRDIR/ctldir/sqlldr_emp_dat.ctl"      # Control file details
        $ORACLE_HOME/bin/sqlldr userid=$uid/$pwd, control=$ctlfile, log=$logfile, bad=$badfile, data=$datafile >> /dev/null
        # Count the log lines that mention an Oracle error
        stat=`grep ORA $logfile | wc -l`
        if ((stat==0)); then
            exit 0
        else
            exit 2
        fi
    • SQL*Loader Scripting
      Household Standard: CTL file (sqlldr_emp_dat.ctl):
      For the above shell script, where the data file name is passed directly on the command line, the CTL file will be:
        OPTIONS (ERRORS=50, SKIP=0, DIRECT=TRUE, SKIP_UNUSABLE_INDEXES=TRUE, SKIP_INDEX_MAINTENANCE=TRUE)
        LOAD DATA
        APPEND
        INTO TABLE EMP_DAT
        FIELDS TERMINATED BY '}'
        TRAILING NULLCOLS
        ( EMP_NO
        , EMP_NAME  NULLIF EMP_NAME=BLANKS
        , EMP_ADDR  NULLIF EMP_ADDR=BLANKS
        , EMP_STATE NULLIF EMP_STATE=BLANKS
        , EMP_DOJ   DATE 'YYYY-MON-DD' NULLIF EMP_DOJ=BLANKS
        , EMP_AGE   NULLIF EMP_AGE=BLANKS
        , EMP_SEX   NULLIF EMP_SEX=BLANKS
        );
    • Do’s and Don’ts: SQL*Loader
      If load speed is most important to you, you should use a direct path load because it is faster than a conventional path load. However, certain restrictions on direct path loads may require you to use a conventional path load. You should use a conventional path load in the following situations:
      ▲ When loading data into a clustered table. A direct path load does not support loading of clustered tables.
      ▲ When loading a relatively small number of rows into a large table with referential and column-check integrity constraints. Because these constraints cannot be applied to rows loaded on the direct path, they are disabled for the duration of the load. Then they are applied to the whole table when the load completes. The costs could outweigh the savings for a very large table and a small number of new rows.
      ▲ When loading records and you want to ensure that a record is rejected under any of the following circumstances:
        - If the record, upon insertion, causes an Oracle error
        - If the record is formatted incorrectly, so that SQL*Loader cannot find field boundaries
    • Do’s and Don’ts: SQL*Loader
      ▲ Make logical record processing efficient: Make it easy for the software to identify physical record boundaries. If you use the default (stream mode) on most platforms (for example, UNIX and NT), the loader must scan each physical record for the record terminator (newline character).
      ▲ Make field setting efficient: Field setting is the process of mapping fields in the datafile to their corresponding columns in the table being loaded. The mapping function is controlled by the description of the fields in the control file. Field setting (along with data conversion) is the biggest consumer of CPU cycles for most loads. Avoid delimited fields; use positional fields. If you use delimited fields, the loader must scan the input data to find the delimiters. If you use positional fields, field setting becomes simple pointer arithmetic (very fast).
      ▲ Avoid unnecessary NULLIF and DEFAULTIF clauses. Each clause must be evaluated on each column that has a clause associated with it for every row loaded.
    • Thank You..! Good Luck..! Best Regards, Archana Prasad, Business Intelligence, Kanbay Software (I) Pvt. Ltd. Tel: 91-40-23125000 Extn - 8228 SEP 2004