COLLABORATE 14 – IOUG Forum
Engineered Systems
Migrate 10 TB to Exadata X3 – Tips and Tricks
White Paper
Migrate 10TB to Exadata -- Tips and Tricks



White Paper associated with Presentation at IOUG COLLABORATE 14, Las Vegas, April 2014



Migrate 10 TB to Exadata X3 – Tips and Tricks
Amin Adatia, KnowTech Solutions Inc.

ABSTRACT

Tips and tricks for migrating 10 TB of data from AIX (2 Nodes, v10.2.0.4) to Exadata X3 (8 Nodes, v11.2.0.3) with only 6 hours of downtime. The database included Oracle Label Security, 40 Partitioned Tables, 4 Tables with Oracle Text Indexes, 5 Tables with a CLOB column, and one Table for which a BLOB column had to be migrated while dropping a CLOB column during migration. On the Target Exadata, the indexes had to be in place because data loading was being done just prior to the migration and immediately afterwards.

TARGET AUDIENCE

The intermediate-to-advanced Developer or Database Administrator will find something useful and innovative here for tackling data migration to Exadata when the application makes use of Partitioned Tables, Sub-Partitioned Tables, Oracle Text and Oracle Label Security, especially when there is limited downtime available and standard methods such as transportable tablespaces and Data Pump prove impractical within the available downtime.

EXECUTIVE SUMMARY

The reader will learn:
1. Which techniques worked best for different types of Tables
   a. Non-Partitioned Tables
   b. Partitioned Tables
   c. Hash Sub-Partitioned Tables
2. How to deal with Tables with LOB data
3. How to deal with Oracle Text Indexes when the transportable tablespaces approach is not practical
4. How to address the Oracle Label Security setup when custom Label Tags are not used

Migration to Exadata surpassed expectations! We neither expected nor encountered any issues with non-partitioned tables. Of the Partitioned Tables, our optimistic expectation was that we would migrate 480 of approximately 2500 partitions for each table in the 6-hour downtime allowed. We managed to completely migrate 26 of the 30 Partitioned Tables that did not have any LOB columns.
The remaining tables took from 8 to 20 hours to complete, partly because of diverting resources once the threshold number of Partitions had been migrated. Tables with LOB columns did not seem to be able to take advantage of parallelism. It appeared that while data from the non-LOB columns moved with the parallelism applied, the LOB column reverted to a single process operating one record at a time! This behaviour was probably masked by the CLOB size being less than 4000 characters. When migrating into a table with 32 sub-partitions the issue became so significant that the approach had to be abandoned for this Table. Eventually, it was determined that the best approach was not to use Parallel on the Source for data migration, but rather to use up to 45 jobs on each node for the data migration part. Also, instead of Partition Exchange, which locked the Table and stopped ETL, a much better method was to insert into the table with the input rows sorted by the ORA_HASH key. The table below provides a summary of the data migration for each type of table.
Table Data Migration Summary

Object Type: Non Partitioned Tables (65)
  Estimate: 30 minutes
  Actual Time: 12 minutes

Object Type: Partitioned Tables, Non-LOB Columns (30 of 33, ~2500 Partitions/Table)
  Estimate: 480/2500 partitions per table
  Actual within 6-Hour Downtime: 26 Tables completed; 1 Table 880, 2 Tables 440, 1 Table 200 partitions
  Actual Time: 8, 11 and 20 Hours for the unfinished tables

Object Type: CLOB Column Tables (2 Tables)
  Estimate: 480/2500 (Table 1) and 40/1800 (Table 2)
  Actual within 6-Hour Downtime: 1000 and 160 partitions
  Actual Time: 11 and 15 Hours

Object Type: BLOB Column and Sub-Partitioned (1 Table)
  Estimate: 28/1200 partitions
  Actual within 6-Hour Downtime: 16 partitions
  Actual Time: 8 Days!!

The approach to Oracle Text was to rebuild the Text Indexes rather than use transportable tablespaces to preserve the Oracle Text Index data. One of the main reasons was that the transportable tablespaces method would have taken about 20 days of downtime. Besides, we would have needed the source document files if any documents were added or deleted. Utilizing the processing power of the Exadata, we were able to tune the Oracle Text indexing so as to SYNC_INDEX a Partition within about 30 minutes, which was well within the time to regenerate the XML CLOB and the combined documents file. Sometimes we had to increase the Parallel Degree parameter to 96, but the normal setting was 32.

For the Oracle Label Security deployment, rather than update the Label Tag value after the data was migrated, it was simpler to use the Target Label Tag values within the migration process to replace the Source Label Tag. An issue we ran into, which led to this replacement approach, was that updating the Label Tags once the OLS Policy was applied to the Table performed extremely slowly. A prerequisite of this approach to the OLS setup is that the OLS Policy and Labels be defined on the Target and that the Labels be synchronized between the Source and the Target prior to data migration. We had a catch-all Label Tag so that discrepancies could be corrected afterwards.
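The partition-level SYNC_INDEX tuning described above can be sketched as below. The index and partition names are illustrative assumptions, not those of the actual system; the parallel-degree values are the settings quoted in the text.

```sql
-- Hedged sketch: sync one partition of a local Oracle Text index with an
-- explicit parallel degree. Index and partition names are hypothetical.
BEGIN
  CTX_DDL.SYNC_INDEX
    ( idx_name        => 'DOC_XML_CTX_IDX'   -- hypothetical index name
    , part_name       => 'P_20140401'        -- hypothetical partition name
    , parallel_degree => 32                  -- normal setting; raised to 96 at times
    );
END;
/
```

Because the index is locally partitioned, omitting part_name would not sync the intended partition; the procedure must be called once per pending partition.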
BACKGROUND

The source and target environments were as below. Exadata X3 became available a month before the migration.

Source:
 • Oracle 10gR2 (10.2.0.4)
 • AIX
 • Big Endian
 • 2-Node RAC
 • 48 CPUs per Node

Target:
 • Oracle 11.2.0.3
 • Exadata X3 (Linux)
 • Little Endian
 • 8-Node RAC
 • 32 CPUs per Node

 • Tablespaces => 450 (9.8 TB disk space used, measured as segment size)
 • Table Partitions distributed across all Tablespaces
 • Tables (for Migration)
   o Partitioned => 33 (~2500 Partitions/Table)
   o Non Partitioned => 65 (2 with a large number of rows)
   o Oracle Text Indexes => 4
   o Tables with LOBs => 4 (CLOB and BLOB)
 • Oracle Label Security => ~280 Labels
 • Network Link => 1 MBit and 10 MBit

Migration Constraints and Conditions
 • Users mostly query the most recent 120 partitions
 • Some queries span 1500 partitions
 • XML (for Text Indexing)
   o Built from 17 Tables
   o Took 30–45 minutes
   o Sync Index for a Partition took 20–30 minutes
 • Documents stored as Files
   o Took about 45–60 minutes to copy
   o Sync Index took about 30 minutes

Migration Approach
 • Migrate at least 240 partitions' data before ETL resumed
 • Downtime allowed was 6 hours
 • Text Indexing done as an "assembly line" triggered by a control table which was updated as each Partition was migrated and loaded on the Target
 • Safety Net
   o Dual ETL to Source and Target Environments
   o Keep Users on Source Environment
   o Journal User Actions and periodically apply to Target to synchronize
   o Switch Users when all data migrated:
     - Stop feed to both Source and Target
     - Apply User Actions Journal to Target
     - Switch Users to Target
     - Resume dual-mode ETL for a few days

TECHNICAL DISCUSSIONS AND EXAMPLES

Target Setup

The application environment consisted of four schemas, one of which acted as a proxy account for users. All the schema definitions were exported from the Source Environment with the Rows = N option. On the Target Environment, the schemas were established, Oracle Text Preferences and Parameters created, and the Oracle Label Security Policy and Labels established. Once this was completed, the schema definitions were imported. Invalid objects were reviewed and recompiled, and cross-schema grants and privileges tested. The three types of Tables into which data had to be migrated were:
1. Non Partitioned – 65 Tables
2. Partitioned Tables
   a. Without LOB Columns – 30 Tables
   b. With a LOB Column – 2 Tables
3. Partitioned Table with 32 Sub-Partitions by Hash and LOB Columns – 1 Table

Given the Source Environment, it was not practical to make use of the transportable tablespaces option for migration to save on rebuilding the Oracle Text Indexes. A test of the method gave an estimate of 20 days of downtime to complete the migration. An alternative approach was developed whereby we would rebuild the Text Indexes from the source data as it was migrated.

The Migration

1. Non Partitioned Tables

For the most part, these were migrated in under 10 minutes. Three sessions were invoked, each dealing with about 20 tables. One of the two larger-volume tables took 25 minutes using parallel 24 at the source. The other took 2 hours, but it was discovered too late that only parallel 12 had been designated. Appendix A1 shows the script used.

2. Partitioned Tables

There were two parts to migrating data into Partitioned Tables. One was to get the data across from the Source, and the other was to load the data into the appropriate partition on the Target using the Partition Exchange mechanism. Given that the Source had 48 CPUs on each Node and that one of the Nodes had a 10 Mbit network, tables with more data were set to migrate over the larger bandwidth. Scheduled Jobs were submitted for each table in a controlled manner such that at most 46 parallel sessions were running on any one of the source nodes. Once the data was migrated over to the Target Exadata, the vastly greater number of CPUs was utilized for parallelism as required. Sixteen of the tables were required to generate the XML used for Text Indexing. Four of the 8 Nodes were used exclusively for data migration and Partition Exchange. Three were used for three Text Indexing Jobs.
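The controlled submission of migration jobs described above might look like the sketch below. MIGRATION_CONTROL and Migrate_Partition are hypothetical names standing in for the paper's control table and migration procedure; the cap of 46 matches the per-node session limit quoted in the text.

```sql
-- Hedged sketch: submit one DBMS_SCHEDULER job per pending partition,
-- capping the number submitted so a source node is never oversubscribed.
-- MIGRATION_CONTROL and Migrate_Partition are illustrative, not actual names.
DECLARE
  v_Job_Name VARCHAR2(128);
BEGIN
  FOR r IN (SELECT Table_Name, Partition_Name
              FROM MIGRATION_CONTROL            -- hypothetical control table
             WHERE Status = 'PENDING'
               AND ROWNUM <= 46)                -- cap on concurrent source sessions
  LOOP
    v_Job_Name := 'MIG_'||r.Table_Name||'_'||r.Partition_Name;
    DBMS_SCHEDULER.CREATE_JOB
      ( job_name   => v_Job_Name
      , job_type   => 'PLSQL_BLOCK'
      , job_action => 'BEGIN Migrate_Partition('''||r.Table_Name||''','''
                      ||r.Partition_Name||'''); END;'
      , enabled    => TRUE
      );
  END LOOP;
END;
/
```

Submitting jobs from a control-table query makes the throttle a simple WHERE clause, which is easy to adjust as resources are diverted between tables.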
Three Tables in this class of objects failed to complete within the downtime period. The primary reason was resource re-allocation on the Source in favour of Tables with LOB columns, given that the minimum set of Partitions had already been migrated. Appendix A2 shows the code snippets used for migration of the data. The deployment was such that each Partition was migrated to a Queue Table. This was then copied to a Working Table for Partition Exchange. Indexes on the Working Table were created corresponding to the Target Table Partition. The Working Table-to-Partition exchange sequence was initiated by the Partition status updated in the control table established to manage the migration. Allocating more CPU at the Source increased the data migration speed, so that we could divert resources to Tables with more data to try to get everything to complete at about the same time. The technique was guided purely by "gut feel" and the number of partitions already migrated. The graph below displays the time (in hours) for the migration of each of the Tables.
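The control table that drove the Queue Table, Working Table and Partition Exchange sequence might have a shape like the sketch below. The table and column names are assumptions for illustration; the paper does not give the actual definition.

```sql
-- Hypothetical shape of the migration control table described above.
-- Each row tracks one partition through the migration pipeline.
CREATE TABLE MIGRATION_CONTROL
( Table_Name      VARCHAR2(30)  NOT NULL
, Partition_Name  VARCHAR2(30)  NOT NULL
, Status          VARCHAR2(20)  -- e.g. PENDING / MIGRATED / EXCHANGED / INDEXED
, Start_Time      DATE
, End_Time        DATE
, CONSTRAINT MIGRATION_CONTROL_PK
    PRIMARY KEY (Table_Name, Partition_Name)
);
```

Each step of the pipeline would update Status for its partition, and the next step would poll for rows in the state it consumes.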
Total Time to Migrate Data for each Table
(Migration staggered according to the resources available on the AIX Source)

3. Partitioned Tables (LOB Columns)

Tables with LOB columns did not seem to be able to take advantage of parallelism. It appeared that while data from the non-LOB columns moved with the parallelism applied, the LOB column reverted to a single process operating one record at a time! This behaviour was probably masked by the CLOB size being less than 4000 characters. Eventually, it was determined that the best approach was not to use multiple parallel processes on the Source for data migration, but rather to use up to 45 separate jobs, one for each Partition, on each node for the data migration part. The time to migrate dropped from 45–75 minutes to 10–15 minutes per Partition. The graph below shows the relative time to migrate data from Partitioned Tables with LOB columns.

Data Migration of LOB Column Tables
(Same Partition using different Degree Parallel on the SELECT from Source)
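A single per-partition pull of the kind described above can be sketched as follows. The table name, database link and key value are hypothetical; the point is that each job reads exactly one partition's key range serially, with no PARALLEL hint on the LOB-bearing source table.

```sql
-- Hedged sketch: one job pulls one source partition's rows over the
-- database link into the Queue Table. Partition-extended table names
-- cannot be used on remote tables, so the partition is selected via
-- its key range instead. Names and key values are illustrative.
INSERT /*+ APPEND */ INTO Q_TABLE
SELECT *
  FROM Source_Lob_Table@DbLink_Pipe a
 WHERE a.PartKey = 20140401;   -- hypothetical partition key value
COMMIT;
```

Running up to 45 such jobs per node replaces intra-statement parallelism, which the LOB reads could not exploit, with inter-job parallelism, which they can.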
4. Sub Partition by Hash (LOB Columns)

Instead of Partition Exchange, which locked the Table and thus stopped ETL, a much better method was to insert into the table with the input rows sorted by (ORA_HASH(<Key>, 31, 0) + 1), since we had 32 hash partitions. The graph below shows the relative time to migrate data using different degree-of-parallel settings for the data migration, and different methods for exchanging data into the sub-partitions.

Sub Partition Table Migrate and Exchange (Relative Times)
  PX32 – Partitions Migrated – 36
  PX8 – Partitions Migrated – 112
  PX1 – Partitions Migrated – 1800

The graph below shows the relative time per record for inserting into the Table with Sub-Partition by Hash, with the input stream sorted by (ORA_HASH(<Key>, 31, 0) + 1). The data was already on the Exadata.

Relative Time/Record for Sorted vs Unsorted Input into Sub-Partitioned Table
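A table of the shape discussed above might be declared as in the sketch below; the column names and partition bounds are illustrative assumptions. Note that ORA_HASH(<Key>, 31, 0) returns a bucket number in 0–31, so adding 1 gives a sort key that groups the input rows to line up with the 32 hash sub-partitions.

```sql
-- Hedged sketch of a range-partitioned table with 32 hash sub-partitions.
-- Column names and partition bounds are hypothetical.
CREATE TABLE HASH_LOB_TABLE
( PartKey      NUMBER  NOT NULL
, Doc_Key      NUMBER  NOT NULL   -- the <Key> fed to ORA_HASH
, Doc_Content  BLOB
)
PARTITION BY RANGE (PartKey)
SUBPARTITION BY HASH (Doc_Key) SUBPARTITIONS 32
( PARTITION P_2013 VALUES LESS THAN (20140101)
, PARTITION P_2014 VALUES LESS THAN (20150101)
);
```

Sorting the insert stream this way means each sub-partition's rows arrive together, so the insert touches one sub-partition segment at a time rather than scattering single-row writes across all 32.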
5. Solving for Oracle Text Indexing

Since we could not use the transportable tablespaces approach to preserve the Oracle Text Index (DR#xxxnnnn$I, etc.) tables, we relied on the massive memory and CPU capacity available to speed up the regeneration of the indexes. The Oracle Text preferences were created prior to importing the schema from the source. We had two types of data to be indexed using Oracle Text: one was XML generated from 17 tables, and the other was documents in multi-lingual text stored on a file system and generated by recombining individual child detail records. Since both of these data objects had to be in place, we could use an assembly-line approach to Text Indexing. The steps in the assembly line were:
1. Migrate the data for the Partition in the 17 tables
2. Copy the documents to the file system
3. Generate the recombined document
4. Generate the XML
5. Text index the recombined document file
6. Text index the XML

The assembly-line throughput was about 90 minutes. All of these steps were controlled via entries in a table that triggered the steps based on the Partitions migrated. Since we were dealing with partitioned tables, the CTX_DDL.SYNC_INDEX procedure requires passing in the Partition Name. The view CTXSYS.CTX_USER_PENDING appeared to be well suited to getting the partition name for the pending index sync data. However, the query used to get the Partition Name did not perform well, taking anywhere from 10 to 20 minutes. Appendix A4.1 shows the script used. An alternative approach had to be found to determine the pending Partitions. The CTXSYS tables that provided the data needed are
  o CTXSYS.DR$PENDING
  o CTXSYS.DR$INDEX
  o CTXSYS.DR$INDEX_PARTITION
Appendix A4.2 shows the script for the view created to provide the Partition Names for which records were pending the SYNC_INDEX operation.
The Text Indexes were created with the parameter SYNC (MANUAL), and SYNC_INDEX was invoked using a DBMS_SCHEDULER Job with a 10-second interval. This was preferred over the SYNC (AUTO) option because of the large number of Partitions involved, and consequently the large number of Jobs that would be invoked and the management overhead involved in enabling/disabling those Jobs. Also, there was no way to predetermine the Partitions for which data would be received. The SYNC_INDEX Parallel Degree parameter settings, based on the type of data, were:
  o Text default was 8 (but we had to use 16, 24, 32)
  o XML default was 16 (we have used 24, 32, 64, 96)
In order to reduce the number of TOKENS generated during the SYNC_INDEX process and loaded into the DR#xxxnnnn$I Table, the indexes need to be optimized. However, SYNC_INDEX, especially when running with a Parallel Degree greater than 1, conflicts and locks when OPTIMIZE_INDEX is also running. The SYNC_INDEX locking option CTX_DDL.LOCK_NOWAIT did not resolve the problem. So we now perform OPTIMIZE_INDEX on the Partition previous to the one currently loading. However, since we can receive data for previous Partitions along with the current Partition, the Partitions involved in the OPTIMIZE_INDEX had to be removed from the PENDING list. Appendix A4.3 shows the script used to identify the partitions involved in the OPTIMIZE_INDEX process.
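The 10-second scheduler job described above might be set up as in the sketch below. The job name and the wrapper procedure are assumptions; the wrapper would read the pending-partitions view of Appendix A4.2 and call CTX_DDL.SYNC_INDEX for each partition it returns.

```sql
-- Hedged sketch: one repeating job drives all manual Text index syncs.
-- TEXT_SYNC_JOB and SYNC_PENDING_PARTITIONS are hypothetical names.
BEGIN
  DBMS_SCHEDULER.CREATE_JOB
    ( job_name        => 'TEXT_SYNC_JOB'
    , job_type        => 'STORED_PROCEDURE'
    , job_action      => 'SYNC_PENDING_PARTITIONS'
    , repeat_interval => 'FREQ=SECONDLY; INTERVAL=10'
    , enabled         => TRUE
    );
END;
/
```

A single repeating job avoids the per-partition job sprawl of SYNC (AUTO) while still picking up whichever partitions happen to have pending rows.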
An unforeseen impact of OPTIMIZE_INDEX was the large number of redo logs generated. There was a three- to four-fold increase in the log switch frequency. Increasing the size of the redo log files may solve this particular problem.

6. Dealing with Oracle Label Security

The Label Tags used by Oracle Label Security (OLS) were not custom tags but were generated using the OLS procedure. Thus, on the Target environment, we had to update the Label Tags coming in from the Source data. In testing the Label Tag update after the data was migrated, it was found that the update was extremely slow when the OLS Policy was enabled on the Table. So we had to replace the Label Tags during the data migration. The method for finding the matching Label Tag for a given Label was to create an equivalent OLS environment on the Target Exadata using the Labels and Policy used on the Source. The replacement of the Label Tags was done within the INSERT INTO ... SELECT FROM construct, using the following construct for the Label Tag conversion:

(CASE WHEN <Label_Tag> = <Source_Label_Tag>
      THEN <Target_Label_Tag>
      ELSE <Not_Matched_Tag>
 END)

APPENDICES

Appendix A1: Non Partitioned Tables Migrate

INSERT INTO <Table_Name>
SELECT /*+ PARALLEL (T1, p_Degree) */ *
  FROM <Source_Table>@DbLink_Pipe T1;

Appendix A2: Migrate with Partition Exchange

v_Step := 'Create => '||v_Q_Table||' => '||p_Table_Name;
EXECUTE IMMEDIATE 'create table '||v_Q_Table
  ||chr(10)||'TABLESPACE '||v_Tablespace_Name
  ||chr(10)||'NOLOGGING'
  ||chr(10)||'as select'
  ||chr(10)||'  /*+'
  ||chr(10)||'    parallel (a,'||p_Parallel_Source||')'
  ||chr(10)||'    no_index ('||p_No_Index_Hint||')'
  ||chr(10)||'  */'
  ||chr(10)||'  * from '||p_Table_Name||'@'||p_Data_Source||' a'
  ||chr(10)||'where a.PartKey'
  ||chr(10)||'  BETWEEN '||(p_Start_Number - p_Offset_Start)
  ||chr(10)||'      AND '||(p_Start_Number - p_Offset_End);
v_Step := 'Create Matching Partition Indexes';
-- If RECORDS > 0
WORKING_INDEX (p_Table_Name      => p_Table_Name
              ,p_Working_Table   => v_Working_Table
              ,p_Tablespace_Name => v_Tablespace_Name
              ,p_Parallel_Target => p_Parallel_Target
              );

v_Step := 'Create Index ('||v_Q_Table||')';
EXECUTE IMMEDIATE 'create index '||v_Q_Table||'_I on '||v_Q_Table
  ||chr(10)||'(PartKey, PartKey_Keep)'
  ||chr(10)||'TABLESPACE '||v_Tablespace_Name
  ||' NOLOGGING PARALLEL '||p_Parallel_Target;

v_Step := 'Gather Stats => '||v_Q_Table;
DBMS_STATS.GATHER_TABLE_STATS
  ( OWNNAME          => USER
  , TABNAME          => v_Q_Table
  , GRANULARITY      => 'AUTO'
  , DEGREE           => DBMS_STATS.AUTO_DEGREE
  , ESTIMATE_PERCENT => DBMS_STATS.AUTO_SAMPLE_SIZE
  , METHOD_OPT       => 'FOR ALL COLUMNS SIZE AUTO'
  , CASCADE          => TRUE
  );

v_Step := 'Exchange Partition => '||v_Q_Table;
EXECUTE IMMEDIATE 'ALTER TABLE '||p_Table_Name
  ||chr(10)||' EXCHANGE PARTITION '||v_Partition_Name
  ||chr(10)||' WITH TABLE '||v_Working_Table
  ||chr(10)||' INCLUDING INDEXES'
  ||chr(10)||' WITHOUT VALIDATION'
  ||chr(10)||' UPDATE GLOBAL INDEXES';
Appendix A3: Migrate into Sub Partitioned Table

INSERT INTO HASH_LOB_TABLE
SELECT /*+ NO_INDEX(A) */ *
  FROM Q_TABLE A
 ORDER BY A.PartKey_Keep
         ,(ORA_HASH(A.<hashkey>, 31, 0) + 1);

Appendix A4.1: View for Partitions with Pending Text Index Sync using CTX_USER_PENDING

SELECT T1.IDX_NAME
      ,T1.IDX_PARTITION_NAME
      ,COUNT(*) RECORDS
  FROM CTX_USER_PENDING T1
 GROUP BY T1.IDX_NAME
         ,T1.IDX_PARTITION_NAME
/

Appendix A4.2: View for Partitions with Pending Text Index Sync using CTXSYS.DR$PENDING

CREATE OR REPLACE VIEW PARTS_PENDING_SYNC_V
AS
SELECT Z1.INDEX_NAME
      ,Z1.PARTITION_NAME
      ,Z1.PENDING_RECORDS
  FROM (select /*+ parallel (t2,2) */
               (select t3.idx_name
                  from ctxsys.dr$index t3
                 where t3.idx_id = a1.pnd_cid
               ) index_name
              ,t2.ixp_name partition_name
              ,a1.records pending_records
          from ctxsys.dr$index_partition t2
              ,(select t1.pnd_cid
                      ,t1.pnd_pid
                      ,t1.records
                  from (select /*+ parallel (t0,2) */
                               t0.pnd_cid
                              ,t0.pnd_pid
                              ,count(*) records
                          from ctxsys.dr$pending t0
                         group by t0.pnd_cid
                                 ,t0.pnd_pid
                       ) t1
               ) a1
         where t2.ixp_idx_id = a1.pnd_cid
           and t2.ixp_id = a1.pnd_pid
       ) Z1
 WHERE NOT EXISTS
       (SELECT NULL
          FROM PARTS_OPTIMIZING_V Z2   -- See Appendix A4.3
         WHERE Z2.PARTITION_NAME = Z1.PARTITION_NAME
       )
/

Appendix A4.3: View for Partitions undergoing OPTIMIZE_INDEX (and DBMS_SCHEDULER Job)

create or replace view PARTS_OPTIMIZING_V
as
select A1.table_name
      ,A1.partition_name
  from user_tab_partitions A1
      ,(select (CASE WHEN t2.job_name = '<Optimize_Index_Job>'
                     THEN '<table_name>'
                .... END) table_name
              ,to_char(t2.last_start_date - 1, 'YYYYMMDD') partdate
          from user_scheduler_jobs t2
         where t2.job_name in (<List of Optimize_Index Jobs>)
           and t2.state = 'RUNNING'
       ) A2
 where A1.table_name = A2.table_name
   and substrb(A1.partition_name, 12, 8) = A2.partdate
/

REFERENCES

1. Expert Oracle Exadata – Kerry Osborne, Randy Johnson, Tanel Põder – Apress – ISBN-13: 978-1-4302-3392-3 – Published 2011-08-07
2. Oracle Exadata Recipes – John Clarke – Apress – ISBN-13: 978-1-4302-4914-6 – Published 2013-02-05
3. Oracle Database 11g Release 2 Performance Tuning Tips & Techniques – Richard Niemiec – Oracle Press – ISBN-13: 978-0-07-178026-1 – Published 2012-02-27
4. Expert Oracle Database Architecture, Second Edition – Thomas Kyte – Apress – ISBN-13: 978-1-4302-2946-9 – Published 2010-07-25
