Collaborate 2009 - Migrating a Data Warehouse from Microsoft SQL Server to Oracle 11g



Database (Session 387)

MIGRATING A DATA WAREHOUSE FROM MICROSOFT SQL SERVER TO ORACLE 11G
Dylan Kucera, Senior Manager – Data Architecture
Ontario Teachers’ Pension Plan

INTRODUCTION
IT infrastructure is often sized according to a 3 to 5 year growth projection, which can generally be understood as a sensible and cost-effective practice. A sign of success for any deployment is when demand begins to outstrip the capability of the technology after this time period has passed. When looking at the organization’s central Data Warehouse, a DBA or senior technology architect may foresee the need for a stronger technology capability; management, however, may not be so easily convinced. Furthermore, core services such as an organization’s central Data Warehouse can be difficult to replace with a different vendor solution once dozens or even hundreds of mission-critical applications are wired to the existing deployment. With patience, extensive metrics gathering, and a strong business case, management buy-in may be attainable. This paper outlines a number of hints that are worth considering while crafting a business case for a Data Warehouse migration to present to IT management.

Oracle Database 11g provides a number of key technologies that allow a gradual Data Warehouse migration strategy to unfold over a period of staged deployments, minimizing and in some cases completely eliminating disruption to the end-user experience. The main purpose of this paper is to outline these technologies and how they can be employed as part of the migration process. This paper is also meant to point out a number of pitfalls within these technologies, and how to avoid them.

GAINING BUY-IN FOR A DATA WAREHOUSE MIGRATION
When it comes to pitching a Data Warehouse migration, patience isn’t just a virtue, it is a requirement. Go in expecting the acceptance process to take a long time.

Armed with the knowledge that you will return to this topic with the management group a number of times, plan for these iterations and make your first presentation about your timeline for evaluation. Unfold your message in a staged fashion.

Start thinking about metrics before anything else; management understands metrics better than technology. The challenge, however, is to make sure the metrics encompass the entire problem at hand. If the bar is set too low because your metrics do not capture the full scope of your Data Warehouse challenges, your goal of beginning a migration path may be compromised, as management may not understand the severity or urgency of the issues.

Remember to tie every message to management about the Data Warehouse to business requirements and benefits. As a technologist you may naturally see the benefit to the business, but don’t expect management to make this leap with you. Highlight ways in which the technology will help meet service levels or business goals.
ARTICULATING BENEFITS OF A DATA WAREHOUSE MIGRATION
Benefits of a Data Warehouse migration must be tailored to the specific business requirements of the organization in question. There are a number of general areas where Oracle Database 11g is particularly strong and will likely stand out as providing significant benefit in any circumstance.

• Scalability
  While RAC will be the obvious key point around scalability, RAC is actually only part of the solution. Consider how the Locking Model in Microsoft SQL Server causes one uncommitted writer to block all readers of the same row until the writer commits. Oracle Database 11g, on the other hand, allows readers to proceed with reading all committed changes up to the point in time where their query began. The latter is the more sensible behaviour in a large-scale Data Warehouse. Microsoft SQL Server will choose to perform a lock escalation during periods of peak load, in the worst case causing an implicit lock of the temporary space catalog, effectively blocking all work until this escalation is cleared. Oracle has no such concept of escalating a row-level lock. Again, the latter behaviour will provide superior service in a busy multi-user Data Warehouse. Also evaluate the mature Workload Balancing capabilities of Oracle Database 11g, which allow preferential treatment of priority queries.

• Availability
  RAC is of course the key availability feature in Oracle 11g. Be sure to also consider Flashback capabilities, which can allow for much faster recovery from data corruption than a traditional backup/restore model. Evaluate other availability issues in your environment; for example, perhaps your external stored procedures crash your SQL Server because they run in-process, unlike Oracle extprocs, which run safely out-of-process.

• Environment Capability
  PL/SQL is a fully featured language based on Ada and as such may simplify development within your environment. The Package concept allows for code encapsulation and avoids global namespace bloat for large and complex solutions. Advanced Data Warehousing features such as Materialized Views may greatly simplify your ETL processes and increase responsiveness and reliability.

• Maintainability
  Oracle Enterprise Manager is a mature and fully featured management console capable of centralizing the management of a complex data warehouse infrastructure. Your current environment may involve some amount of replication that was put in place to address scalability. Consider how RAC could lower maintenance costs or increase data quality by eliminating data replication.

• Fit with strategic deployment
  Perhaps your organization is implementing new strategic products or services that leverage the Oracle Database. Should this be the case, be sure to align your recommendations to these strategies, as this could be your strongest and best understood justification.
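As a small illustration of the Flashback capability mentioned under Availability, the sketch below reads a table as it stood 15 minutes earlier and re-inserts a row from that earlier image. The table PLAY.COLLABORATE_SCHEDULE and its columns are borrowed from the Streams examples later in this paper purely for illustration; the 15 minute window and the SESSION_ID value are assumptions, and the query only works if undo retention covers that window.

```sql
-- Sketch only: assumes undo retention covers the last 15 minutes.
-- View the row as it was before an unwanted change or delete.
SELECT session_id, title
  FROM play.collaborate_schedule
       AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '15' MINUTE)
 WHERE session_id = 387;

-- Put back a row that was deleted by mistake, using the earlier image.
INSERT INTO play.collaborate_schedule (date_, duration, session_id, title)
SELECT date_, duration, session_id, title
  FROM play.collaborate_schedule
       AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '15' MINUTE)
 WHERE session_id = 387;

COMMIT;
```

The same repair under a backup/restore model would typically require restoring a copy of the database or tablespace first, which is the recovery-time difference the Availability point refers to.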
EXECUTING A DATA WAREHOUSE MIGRATION
If you are lucky enough to manage a Data Warehouse that has a single front-end, such as an IT managed Business Intelligence layer, then it may be possible for you to plan a “Big Bang” migration. More likely, however, your Data Warehouse has dozens or hundreds of direct consumers, ranging from Business Unit developed Microsoft Access links to complex custom legacy applications. Given this circumstance, a phased migration approach must be taken over a longer period of time. You will need to expose a new Data Warehouse technology that can be built against while continuing to support the legacy Data Warehouse containing a synchronized data set. This paper outlines three Oracle Database capabilities that are key to the success of a seamless large-scale Data Warehouse Migration: Oracle Migration (Workbench), Transparent Gateway (Data Gateway as of Oracle 11g), and Oracle Streams Heterogeneous Replication.

ORACLE MIGRATION (WORKBENCH)
The Oracle Migration features of Oracle’s SQL Developer (formerly known as Oracle Migration Workbench, hereafter referred to as such for clarity) can help fast-track Microsoft Transact-SQL to Oracle PL/SQL code migration. Be aware, though, that a machine will only do a marginal job of translating your code. The translator doesn’t know how to do things “Better” with PL/SQL than was possible with Transact-SQL. The resulting code product almost certainly will not conform to your coding standards in terms of variable naming, formatting, or syntax. You need to ask yourself the difficult question as to whether the effort or time saved in using Oracle Migration Workbench is worth the cost of compromising the quality of the new Data Warehouse code base.

Executing the Oracle Migration Workbench is as simple as downloading the necessary Microsoft SQL Server JDBC driver, adding it to Oracle SQL Developer, creating a connection to the target SQL Server, and executing the “Capture Microsoft SQL Server” function, as shown below.

Figure 1 : Oracle Migration – Capturing existing code
Once Oracle captures the model of the target SQL Server, you will be able to view all of the Transact-SQL code. The sample below shows a captured Transact-SQL stored procedure that employs a temporary table and uses a number of Transact-SQL functions such as “stuff” and “patindex”.

Figure 2 : Oracle Migration – Existing code captured

Using Oracle Migration Workbench to convert this Transact-SQL to Oracle PL/SQL produces this result:

Figure 3 : Oracle Migration – Translated Code

Notice that the Temporary table is converted to the necessary DDL that will create the analogous Oracle Global Temporary Table. The name, however, may be less than desirable because tt_ as a prefix does not necessarily conform to your naming standards. Furthermore, the Global Temporary Table is now global to the target schema and should probably have a better name than the Transact-SQL table “Working”, which was isolated to the scope of the single stored procedure. Also notice that, because there are often subtle differences between the built-in Transact-SQL functions and similar PL/SQL functions, the Oracle Migration Workbench creates a package of functions called “sqlserver_utilities” to replicate the behaviour of the Transact-SQL functions precisely. Again, this might not be the best choice for a new code base.
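To make the trade-off concrete, here is a minimal sketch of what a hand-written alternative to the generated code might look like. It is not the output shown in Figure 3: the table name PLAY.TMP_SCHEDULE_WORKING and its columns are hypothetical, and the two queries only illustrate how simple uses of “stuff” and “patindex” can be expressed with native Oracle functions instead of the generated “sqlserver_utilities” package. Edge cases, such as Transact-SQL returning NULL from STUFF when the start position is out of range, would still need to be checked against your own code.

```sql
-- Hypothetical, schema-level name chosen to follow local naming standards
-- rather than the generated tt_ prefix.
CREATE GLOBAL TEMPORARY TABLE play.tmp_schedule_working (
  session_id NUMBER,
  title      VARCHAR2(256)
) ON COMMIT DELETE ROWS;

-- Transact-SQL: STUFF('abcdef', 2, 3, 'XY') returns 'aXYef'.
-- Native Oracle: keep the text before the start position, add the
-- replacement, then keep the text after the removed characters.
SELECT SUBSTR('abcdef', 1, 2 - 1) || 'XY' || SUBSTR('abcdef', 2 + 3) AS stuffed
  FROM dual;   -- returns 'aXYef'

-- Transact-SQL: PATINDEX('%[0-9]%', 'Session 387') returns 9.
-- Native Oracle: REGEXP_INSTR returns the position of the first match,
-- or 0 when there is no match.
SELECT REGEXP_INSTR('Session 387', '[0-9]') AS first_digit_pos
  FROM dual;   -- returns 9
```

Writing the equivalents by hand costs more time up front, but the resulting code carries no dependency on a compatibility package.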
Oracle Migration Workbench can also be used to migrate tables, data, and other schema objects. Taking this approach, however, considerably limits your ability to rework the data model in the new Oracle Data Warehouse. Using Oracle Migration Workbench to migrate tables and data is not well suited to a “Parallel support” model where both the legacy Data Warehouse and the new Oracle Data Warehouse are kept in sync as applications are migrated. The remaining sections of this paper describe an alternate approach to table and data migration that provides a seamless and paced migration path.

TRANSPARENT GATEWAY (DATA GATEWAY)
Oracle Transparent Gateway (branded Data Gateway as of 11g; this paper uses Transparent Gateway to avoid confusion with the general word “Data”) is an add-on product for Oracle Database that provides access to foreign data stores via Database Links. Transparent Gateway is similar to Heterogeneous Services (included as part of the base Oracle Database license); however, Transparent Gateway is built for specific foreign targets and as such enables features not available in Heterogeneous Services, such as foreign Stored Procedure calls and Heterogeneous Streams Replication.

VIEWS EMPLOYING TRANSPARENT GATEWAY
One tool that can be used to fast-track the usefulness of your new Oracle Data Warehouse is to employ Views that link directly to the legacy data store. This approach can be used for key tables that will require some planning and time to fully migrate, and yet whose availability will greatly influence the adoption of the new Oracle Data Warehouse. Once you have settled on a new table name and column names that meet your standards, a View can be created similar to the following example:

    CREATE OR REPLACE VIEW PLAY.VALUE_TABLE_SAMPLE AS
    SELECT "IDENTIFIER" AS ID_,
           "VALUE"      AS VALUE_,
           FILE_DATE    AS FILE_DATE
    FROM SampleLegacyTable@MSSQL

Figure 4 : Oracle View to Legacy table

Perhaps your naming standards suggest that the prefix VIEW_ should be used for all Views. Keep in mind, though, that this View is destined to become a physical table on Oracle once the data population process can be moved and a synchronization strategy employed. This paper will assume some sort of ETL process is used for data population, but even transactional tables can be considered for a staged migration using this approach, so long as the locking model of the legacy system is considered carefully.

PITFALLS OF TRANSPARENT GATEWAY (VIEWS)
When developing queries against a View that uses Transparent Gateway, such as the one shown above, it is important to remember that these Views are meant as a stop-gap measure. Creating complex queries against these sorts of Views is a risky venture. For example, consider the following query:
    DECLARE
        tDate DATE := '2008-12-31';
    BEGIN
        INSERT INTO PLAY.TEMP_SAMPLE_7445
            (ID_, NAME_, PREV_VALUE, CURR_VALUE, VALUE_SUPPLIER, DATE_VALUE_CHANGED)
        SELECT ID_, '', '', '', 'SAMPLE', MAX(FILE_DATE)
        FROM PLAY.VALUE_TABLE_SAMPLE
        WHERE FILE_DATE <= tDate
        GROUP BY ID_;
    END;

Figure 5 : Complex use of View can cause internal errors

This query, because it inserts to a table, selects a constant, uses an aggregate function, filters using a variable, and employs a Group By clause, throws an ORA-03113: end-of-file on communication channel error. (The alert log shows ORA-07445: exception encountered: core dump [intel_fast_memcpy.A()+18] [ACCESS_VIOLATION] [ADDR:0x115354414B] [PC:0x52A9DFE] [UNABLE_TO_READ] [].)

While this particular problem is fixed in Oracle Database patch 10 and patch 7, getting this patch from Oracle took several months. The example is meant to illustrate that queries of increased complexity have a higher likelihood of failing or hanging. Keeping this in mind, Views over Transparent Gateway can be a powerful tool to bridge data availability gaps in the short term.

STORED PROCEDURES EMPLOYING TRANSPARENT GATEWAY
In a similar vein to creating pass-through Views to quickly expose legacy data to the Oracle Data Warehouse, Stored Procedure wrappers can be created to provide an Oracle PL/SQL entry point for legacy stored procedures. This method can be particularly useful in preventing the creation of new application links directly to stored procedures within the legacy Data Warehouse when it is not possible to immediately migrate the logic contained within the stored procedure.

Consider the following Microsoft Transact-SQL stored procedure:

    CREATE PROCEDURE dbo.GetScheduleForRange
        @inStartDate DATETIME,
        @inEndDate   DATETIME
    AS
    SELECT DATE, DURATION, SESSION_ID, TITLE
    FROM NorthWind..COLLABSCHED
    WHERE DATE BETWEEN @inStartDate AND @inEndDate

Figure 6 : Transact SQL procedure
The following PL/SQL wrapper produces a simple yet effective Oracle entry point for the legacy procedure above:

    CREATE OR REPLACE PROCEDURE PLAY.RPT_COLLABORATE_SCHEDULE_RANGE (
        inStart_Date DATE,
        inEnd_Date   DATE,
        RC1          IN OUT SYS_REFCURSOR) IS
        -- Step 1: one variable per column of the Transact-SQL result set
        tRC1_MS     SYS_REFCURSOR;
        tDate       DATE;
        tDuration   NUMBER;
        tSession_ID NUMBER;
        tTitle      VARCHAR2(256);
    BEGIN
        DELETE FROM PLAY.TEMP_COLLABORATE_SCHEDULE;
        -- Step 2: call the legacy procedure through the gateway link
        dbo.GetScheduleForRange@MSSQL(inStart_Date, inEnd_Date, tRC1_MS);
        LOOP
            -- Step 3: fetch the result one row at a time
            FETCH tRC1_MS INTO tDate, tDuration, tSession_ID, tTitle;
            EXIT WHEN tRC1_MS%NOTFOUND;
            BEGIN
                -- Step 4: stage the row in the Oracle temporary table
                INSERT INTO PLAY.TEMP_COLLABORATE_SCHEDULE
                    (DATE_, DURATION, SESSION_ID, TITLE)
                VALUES (tDate, tDuration, tSession_ID, tTitle);
            END;
        END LOOP;
        CLOSE tRC1_MS;
        -- Step 5: return the staged rows as the Oracle result set
        OPEN RC1 FOR
            SELECT DATE_, DURATION, SESSION_ID, TITLE
            FROM PLAY.TEMP_COLLABORATE_SCHEDULE
            ORDER BY SESSION_ID;
    END RPT_COLLABORATE_SCHEDULE_RANGE;

Figure 7 : PL/SQL wrapper for legacy procedure

Regardless of the complexity of the body of the Transact-SQL stored procedure, a simple wrapper similar to the one above can be created using only the knowledge of the required parameters, the structure of the result set, and a simple 5 step formula:

  1. Declare Variables for all Transact-SQL Result set columns
  2. Call Transact-SQL Procedure
  3. Fetch Result one row at a time
  4. Insert row to Oracle Temporary Table
  5. Open Ref Cursor result set

PITFALLS OF TRANSPARENT GATEWAY (STORED PROCEDURES)
Oracle Data Gateway for Microsoft SQL Server version for Windows 32-bit contains a rather serious bug with respect to calling remote stored procedures that return result sets and actually attempting to retrieve the contents of the result set. Calling the procedure above using an ODBC driver:

    {CALL PLAY.RPT_COLLABORATE_SCHEDULE_RANGE('2009-05-06 12:00:00', '2009-05-06 17:00:00')}

results in ORA-06504: PL/SQL: Return types of Result Set variables or query do not match. This bug is not fixed until 11.1.0.7 Patch 7, which needs to be applied to the Gateway home (assuming the Gateway is installed in a different Oracle home than the Database).

ORACLE STREAMS AS AN ENABLER OF MIGRATION
Proxies for Views and Stored Procedures like the ones shown above can be helpful in making your new Oracle Data Warehouse useful in the early stages of a migration effort. How can you then begin to migrate tables and data to Oracle while still providing a transition period for applications, during which data is equally available in the legacy Data Warehouse? In any case you will need to start by developing a new load (ETL) process for the Oracle Data Warehouse. Perhaps you could just leave the old ETL process running in parallel. With this approach, reconciliation would be a constant fear unless you have purchased an ETL tool that will somehow guarantee that either both Data Warehouses are loaded or neither is loaded. A more elegant approach that won’t overload your ETL support people is to employ Oracle Streams Heterogeneous Replication.

Oracle Streams combined with Transparent Gateway allows for seamless Heterogeneous Replication back to the legacy Data Warehouse.
Using this approach, the Data Warehouse staff need build and support only one ETL process, and DBAs support Oracle Streams like any other aspect of the Database Infrastructure.

ORACLE STREAMS – IF WE BUILD IT, WILL THEY COME?
Old habits die hard for Developers and Business users. Legacy systems have a way of surviving for a long time. How can you motivate usage of the new Oracle Data Warehouse?

A strong set of metadata documentation describing how the new model replaces the old model will be a key requirement in helping to motivate a move toward the new Data Warehouse. Easy to read side-by-side tables showing the new vs. old data structures will be welcomed by your developers and users. Try to make these available in paper as well as online form, just to make sure you’ve covered everyone’s preference in terms of work habits. Be prepared to do a series of road-shows to display the new standards and a sample of the metadata. You will need to commit to keeping this documentation up to date as you grow your new Data Warehouse and migrate more of the legacy.

Occasionally your development group will find that it can no longer support a legacy application because it is written in a language or manner that no one completely understands any more. You need to make sure standards are put in place early, and have your Architecture Review staff enforce that the newly designed and engineered application accesses only the new Data Warehouse. Try to prioritize your warehouse migration according to the data assets this re-engineered application requires, to minimize exceptions and the number of View and Stored Procedure proxies.

Some Data Warehouse access will never be motivated to migrate by anything other than a grass roots effort from the Data Warehouse group. You may find that Business Unit developed applications have this characteristic. You should plan for a certain amount of Data Warehouse staff time to be spent with owners of these (often smaller departmental) solutions to help users re-target their data access to the new Data Warehouse.
Once in a while, a project will be sponsored that requires a significant overhaul of an application; so much so that the effort is essentially a full re-write. Much like the circumstance of the development group refreshing the technology behind an application, you want to be sure that the right members of the project working group are aware of the new Data Warehouse standards. You should try to help them understand the benefits to the project in order to create an ally in assuring that the proper Warehouse is targeted.

Finally, completely new solutions will be purchased or built. You should aim to be in the same position to have these deployments target the new Oracle Data Warehouse as described in some of the situations above.

ORACLE STREAMS – BUILDING A HETEROGENEOUS STREAM
When building a Heterogeneous Streams setup, the traditional separated Capture and Apply model must be used. Much can be learned about the architecture of Oracle Streams by reading the Oracle Streams Concepts and Administration manual. In a very small nutshell, the Capture Process is responsible for mining the archive logs and finding/queueing all DML that needs to be sent to the legacy Data Warehouse target. The Apply Process takes from this queue and actually ships the data downstream to the legacy target.

In general, Streams is a very memory hungry process. Be prepared to allocate 2 to 4 gigabytes of memory to the Streams Pool. Explicitly split your Capture and Apply processes over multiple nodes if you are employing RAC, in order to smooth the memory usage across your environment. The value that Streams will provide to your Data Warehouse migration strategy should hopefully pay for the cost of the memory resources it requires.

ORACLE STREAMS – CAPTURE PROCESS AND RULES
The Capture process is created the same way as any Homogeneous capture process would be and is well described in the manual Oracle Streams Concepts and Administration. This paper will therefore not focus on the creation of the Capture process further, except to show a script that can be used to create an example Capture process called “SAMPLE_CAPTURE” and a Capture rule to capture the table “PLAY.COLLABORATE_SCHEDULE”:

    BEGIN
        DBMS_STREAMS_ADM.SET_UP_QUEUE(
            queue_table => 'SAMPLE_STREAM_QT',
            queue_name  => 'SAMPLE_STREAM_Q',
            queue_user  => 'STRMADMIN');
    END;
    /

    BEGIN
        DBMS_CAPTURE_ADM.CREATE_CAPTURE(
            queue_name                => 'SAMPLE_STREAM_Q',
            capture_name              => 'SAMPLE_CAPTURE',
            capture_user              => 'STRMADMIN',
            checkpoint_retention_time => 3);
    END;
    /

Figure 8 : Oracle Streams – Standard Capture
    BEGIN
        DBMS_STREAMS_ADM.ADD_TABLE_RULES(
            table_name         => 'PLAY.COLLABORATE_SCHEDULE',
            streams_type       => 'CAPTURE',
            streams_name       => 'SAMPLE_CAPTURE',
            queue_name         => 'SAMPLE_STREAM_Q',
            include_dml        => true,
            include_ddl        => false,
            include_tagged_lcr => false,
            inclusion_rule     => true);
    END;
    /

Figure 9 : Oracle Streams – Standard Capture Rule

ORACLE STREAMS – TRANSPARENT GATEWAY CONFIGURATION
Before you begin building the Streams Apply process, a Transparent Gateway Database Link must first be in place. The recommended configuration is to create a separate Database Link for your Streams processes, even if you already have a Database Link to the same remote target available to applications and users. Doing so allows you to use different permissions for the Streams user (e.g. the Streams link must be able to write to remote tables, while applications must not write to these same tables or the replication will fall out of sync), and also provides flexibility in configuring or even upgrading and patching the gateway for Streams in a different way than the gateway for applications and users.

Creating and configuring the Database Link for Streams is therefore like any other Database Link, except that we will make it owned by the database user STRMADMIN. This example shows a link named MSSQL_STREAMS_NORTHWIND that links to the SQL Server Northwind database on a server named SQLDEV2:

    #
    # HS init parameters
    #
    HS_FDS_CONNECT_INFO=SQLDEV2//Northwind
    HS_FDS_TRACE_LEVEL=OFF
    HS_COMMIT_POINT_STRENGTH=0
    HS_FDS_RESULTSET_SUPPORT=TRUE
    HS_FDS_DEFAULT_OWNER=dbo

Figure 10 : Text file “initLDB_STREAMS_NORTHWIND.ora”

    CREATE DATABASE LINK MSSQL_STREAMS_NORTHWIND
        CONNECT TO STRMADMIN IDENTIFIED BY ********
        USING 'LDB_STREAMS_NORTHWIND';

Figure 11 : DDL to create Database Link MSSQL_STREAMS_NORTHWIND
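Before moving on to the Apply process it can be worth confirming that the new link actually works. The two queries below are a minimal sketch of such a check, assuming they are run as STRMADMIN (or a user who can see DBA_DB_LINKS) and that the legacy table dbo.COLLABSCHED used elsewhere in this paper exists in the Northwind database.

```sql
-- Any answer, even a count of 0, confirms that the listener, the gateway
-- and the SQL Server credentials behind the link are all working.
SELECT COUNT(*)
  FROM "dbo"."COLLABSCHED"@MSSQL_STREAMS_NORTHWIND;

-- Confirm the link is owned by the Streams administrator. LIKE is used
-- because a database domain may be appended to the stored link name.
SELECT owner, db_link, username, host
  FROM dba_db_links
 WHERE db_link LIKE 'MSSQL_STREAMS_NORTHWIND%';
```

If the first query fails, the Apply process would fail in the same way later, and a gateway or listener problem is usually easier to diagnose from a plain query than from an Apply error.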
ORACLE STREAMS – APPLY PROCESS AND RULES
The Streams Apply Process is where the work to send rows to the Heterogeneous target occurs. Each step in the Apply Process and Rules creation/configuration is worth looking at in some detail, and so this paper will focus more closely on the Apply Process configuration than on previous steps.

When creating a Heterogeneous Apply Process, a Database Link is named. This means that in the design of your Streams Topology, you will need to include at least one Apply Process for each “Database” on the target server. This is especially important to consider when targeting Microsoft SQL Server or Sybase, as a Database in those environments is more like a Schema in Oracle. Below is a script to create a sample Heterogeneous Apply process called “SAMPLE_APPLY_NORTHWIND”:

    BEGIN
        DBMS_APPLY_ADM.CREATE_APPLY(
            queue_name          => 'SAMPLE_STREAM_Q',
            apply_name          => 'SAMPLE_APPLY_NORTHWIND',
            apply_captured      => TRUE,
            apply_database_link => 'MSSQL_STREAMS_NORTHWIND');
    END;
    /

Figure 12 : Oracle Streams – Heterogeneous Apply

In a Heterogeneous Apply situation, the Apply Table Rule itself does not differ from a typical Streams Apply Table Rule. Below is an example of an Apply Table Rule that includes the same table we captured in the sections above, PLAY.COLLABORATE_SCHEDULE, as a part of the table rules for the Apply Process SAMPLE_APPLY_NORTHWIND.

    BEGIN
        DBMS_STREAMS_ADM.ADD_TABLE_RULES(
            table_name   => 'PLAY.COLLABORATE_SCHEDULE',
            streams_type => 'APPLY',
            streams_name => 'SAMPLE_APPLY_NORTHWIND',
            queue_name   => 'SAMPLE_STREAM_Q',
            include_dml  => true,
            include_ddl  => false);
    END;
    /

Figure 13 : Oracle Streams – Standard Apply Rule
ORACLE STREAMS – APPLY TRANSFORMS – TABLE RENAME
The Apply Table Rename transform is one of the most noteworthy steps in the process of setting up Heterogeneous Streams because it is absolutely required, unless you are applying to the same schema on the legacy Data Warehouse as the schema owner of the table in the new Oracle Data Warehouse. It is more likely that you have either redesigned your schemas to be aligned with the current business model, or, in the case of a Microsoft SQL Server legacy, you have made Oracle Schemas out of the Databases on the SQL Server, and the legacy owner of the tables is “dbo”. You may also have wanted to take the opportunity to create the table in the Oracle Data Warehouse using more accurate or standardized names. Below is an example of an Apply Table Rename transform that maps the new table PLAY.COLLABORATE_SCHEDULE to the legacy dbo.COLLABSCHED table in the Northwind database:

    BEGIN
        DBMS_STREAMS_ADM.RENAME_TABLE(
            rule_name       => 'COLLABORATE_SCHEDULE554',
            from_table_name => 'PLAY.COLLABORATE_SCHEDULE',
            to_table_name   => '"dbo".COLLABSCHED',
            step_number     => 0,
            operation       => 'ADD');
    END;
    /

Figure 14 : Oracle Streams – Apply Table Rename rule

Notice that the rule name is suffixed in this example with the number 554. This number was chosen by Oracle in the Add Table Rule step. You will need to pull the rule name out of the view DBA_STREAMS_RULES after executing the ADD_TABLE_RULES step, or write a more sophisticated script that stores the rule name in a variable using the overloaded ADD_TABLE_RULES procedure, which returns it through an OUT parameter.

One final note about the Rename Table transform: it is not possible to Apply to a Heterogeneous target table whose name is in Mixed Case. For example, Microsoft SQL Server allows mixed case table names. You will need to have your DBAs change the table names to upper case on the target before the Apply process will work. Luckily, Microsoft SQL Server under its default case-insensitive collation ignores case when tables are referenced, and so while changing the table names to upper case may make a legacy “Camel Case” table list look rather ugly, nothing should functionally break as a result of this change.

ORACLE STREAMS – APPLY TRANSFORMS – COLUMN RENAME
The Column Rename transform is similar in nature to the Table Rename transform. Notice in the example below how a column is being renamed because the legacy table contains a column named “DATE”, which cannot be used as an unquoted column name in Oracle because DATE is a reserved word (a data type). The same restriction applies to Column names as to Table names in a Heterogeneous Apply configuration: all column names on the target must be in upper case. Again, this should have no impact on your legacy code, as systems that allow mixed case column names, such as Microsoft SQL Server, are typically not case sensitive when the column is referenced.
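Both rename transforms need the system-generated rule name, so it helps to have a repeatable way of finding it. The sketch below shows the two options described under the Table Rename transform: a dictionary query against DBA_STREAMS_RULES, and the overloaded ADD_TABLE_RULES call that returns the rule names through OUT parameters. The variable names are hypothetical, and the second block is meant to be used in place of the Apply rule script in Figure 13, not in addition to it. The exact form of the returned name (for example, whether it is qualified with the rule owner) should be checked in your environment before it is passed on.

```sql
-- Option 1: look up the generated rule name after the fact.
SELECT rule_owner, rule_name
  FROM dba_streams_rules
 WHERE streams_type = 'APPLY'
   AND streams_name = 'SAMPLE_APPLY_NORTHWIND'
   AND schema_name  = 'PLAY'
   AND object_name  = 'COLLABORATE_SCHEDULE';

-- Option 2: capture the rule name while creating the rule.
SET SERVEROUTPUT ON
DECLARE
  v_dml_rule VARCHAR2(100);  -- hypothetical variable names
  v_ddl_rule VARCHAR2(100);
BEGIN
  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name    => 'PLAY.COLLABORATE_SCHEDULE',
    streams_type  => 'APPLY',
    streams_name  => 'SAMPLE_APPLY_NORTHWIND',
    queue_name    => 'SAMPLE_STREAM_Q',
    include_dml   => TRUE,
    include_ddl   => FALSE,
    dml_rule_name => v_dml_rule,
    ddl_rule_name => v_ddl_rule);
  -- v_dml_rule can now be passed to RENAME_TABLE and RENAME_COLUMN.
  DBMS_OUTPUT.PUT_LINE('DML rule: ' || v_dml_rule);
END;
/
```

Option 2 is the better fit for scripted deployments, since the generated suffix will differ from one database to the next.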
    BEGIN
        DBMS_STREAMS_ADM.RENAME_COLUMN(
            rule_name        => 'COLLABORATE_SCHEDULE554',
            table_name       => 'PLAY.COLLABORATE_SCHEDULE',
            from_column_name => '"DATE_"',
            to_column_name   => '"DATE"',
            value_type       => '*',
            step_number      => 0,
            operation        => 'ADD');
    END;
    /

Figure 15 : Oracle Streams – Apply Column Rename rule

ORACLE STREAMS – EXERCISING THE STREAM
Assuming we have tables set up in both the legacy and new Data Warehouse that differ in structure only by the table name and one column name, the steps above are sufficient to put the Stream into action. Streams has no ability to synchronize tables that are out of sync. Before setting up the Stream you must ensure that the table content matches exactly. Let’s assume for now that you are starting with zero rows and plan to insert all the data after the Stream is set up. The screenshot below illustrates for this example that the legacy target table on Microsoft SQL Server is empty:

Figure 16 : Oracle Streams – Empty Microsoft SQL Server target table

Below is a rudimentary script showing some seed data being inserted into the Oracle table. You would of course want to use a more sophisticated approach such as SQL*Loader; however, this sample is meant to be simple for the purposes of understanding and transparency:
    SQL> INSERT INTO PLAY.COLLABORATE_SCHEDULE (DATE_,DURATION,SESSION_ID,TITLE)
         VALUES ('2009-05-06 08:30:00',60,359,'Oracle Critical Patch Updates: Insight and Understanding');
    1 row inserted
    SQL> INSERT INTO PLAY.COLLABORATE_SCHEDULE (DATE_,DURATION,SESSION_ID,TITLE)
         VALUES ('2009-05-06 11:00:00',60,237,'Best Practices for Managing Successful BI Implementations');
    1 row inserted
    SQL> INSERT INTO PLAY.COLLABORATE_SCHEDULE (DATE_,DURATION,SESSION_ID,TITLE)
         VALUES ('2009-05-06 12:15:00',60,257,'Best practices for deploying a Data Warehouse on Oracle Database 11g');
    1 row inserted
    SQL> INSERT INTO PLAY.COLLABORATE_SCHEDULE (DATE_,DURATION,SESSION_ID,TITLE)
         VALUES ('2009-05-06 13:30:00',60,744,'Business Intelligence Publisher Overview and Planned Features');
    1 row inserted
    SQL> INSERT INTO PLAY.COLLABORATE_SCHEDULE (DATE_,DURATION,SESSION_ID,TITLE)
         VALUES ('2009-05-06 15:15:00',60,387,'Migrating a Data Warehouse from Microsoft SQL Server to Oracle 11g');
    1 row inserted
    SQL> INSERT INTO PLAY.COLLABORATE_SCHEDULE (DATE_,DURATION,SESSION_ID,TITLE)
         VALUES ('2009-05-06 16:30:00',60,245,'Data Quality Heartburn? Get 11g Relief');
    1 row inserted
    SQL> COMMIT;
    Commit complete

Figure 17 : Oracle Streams – Inserting to the new Oracle Data Warehouse Table

After allowing the Capture and Apply processes a few seconds to catch up, re-executing the query from above on the legacy Data Warehouse shows that the rows have been replicated through Streams to the target.

Figure 18 : Oracle Streams – Populated Microsoft SQL Server target Table
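The scripts in this paper create the Capture and Apply processes but do not show them being started, and rows will only flow once both are running. The sketch below is one way to start them and to check on progress, assuming the process names used in the earlier figures and a user with the Streams administrator privileges. It is a monitoring aid under those assumptions, not part of the original setup scripts.

```sql
-- Start the Apply process first so that it is ready when changes arrive.
BEGIN
  DBMS_APPLY_ADM.START_APPLY(apply_name => 'SAMPLE_APPLY_NORTHWIND');
  DBMS_CAPTURE_ADM.START_CAPTURE(capture_name => 'SAMPLE_CAPTURE');
END;
/

-- Both processes should report a status of ENABLED.
SELECT capture_name, status FROM dba_capture;
SELECT apply_name, status FROM dba_apply;

-- If rows do not arrive on the legacy side, failed transactions are
-- recorded here together with the error raised at the target.
SELECT apply_name, source_commit_scn, error_number, error_message
  FROM dba_apply_error
 ORDER BY source_commit_scn;
```

An empty DBA_APPLY_ERROR result combined with two ENABLED processes is a reasonable sign that any delay is simply the Stream catching up.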
ORACLE STREAMS – STREAMS SPEED AND SYNCHRONIZING TABLES IN ADVANCE
While the capabilities of Oracle Streams to seamlessly replicate data to a heterogeneous legacy target are phenomenal, Streams, and especially Heterogeneous Streams over Transparent Gateway, won’t be knocking your socks off in terms of speed. At best, with today’s hardware, you will see 500-600 rows per second flowing through to the target. In a fully built up Data Warehouse, you’re more likely to see 100-200 rows per second. Hopefully you’ll be able to engineer your ETL processes so that this limited speed won’t be an issue, due to the incremental nature of the Data Warehouse. But let’s say your Data Warehouse table needs to be seeded with 2 million rows of existing data. At 100 to 600 rows per second, that seed load would take roughly one to five and a half hours to flow through Streams. The smarter way to start in this case is to synchronize the tables before setting up the Stream. This approach comes with some extra considerations outlined below.

ORACLE STREAMS – EMPTY STRINGS VS. NULL VALUES
Microsoft SQL Server treats empty strings as distinct from NULL values. Oracle, on the other hand, does not. If you synchronize your tables outside of Streams, you must ensure there are no empty strings in the Microsoft SQL Server data before doing so. If you find a column that contains empty strings, there may be some leg work required in advance to make sure there are no consuming systems that will behave differently if they see a NULL instead of an empty string.

ORACLE STREAMS – SYNCHRONIZING TABLES CONTAINING FLOATS
One of the simplest ways one can imagine to synchronize the new Oracle Data Warehouse table with the legacy table is to use an Insert/Select statement to select the data from Transparent Gateway and insert the data to the Oracle target. A set operation via Transparent Gateway will, after all, work orders of magnitude faster than Streams operating row by row. Unfortunately, if your data contains Float or Real columns in Microsoft SQL Server, this method will not work due to a limitation in Transparent Gateway. This limitation is best illustrated with an example. Below is a sample of a couple of floating point numbers being inserted to a Microsoft SQL Server table. Notice the final two digits of precision:

Figure 19 : Oracle Streams – Floats in Microsoft SQL Server

Now have a look at the very same table selected via Oracle Transparent Gateway. Notice how in either case, using the default display precision or explicitly forcing Oracle to show us 24 digits of precision, the last two digits of precision are missing when compared to the Select done straight on the SQL Server above:
  18. Figure 20 : Oracle Streams – Floats over Transparent Gateway

      One fact is unintuitive and yet becomes undeniably clear once you begin working with Heterogeneous Streams: the manner in which Oracle Streams uses Transparent Gateway requires the very digits of precision that are missing from the Gateway Select statement. If you were to sync up the table shown above to an equivalent Oracle table using an Insert/Select over Transparent Gateway, set up a Capture and Apply process linking the tables, and finally delete from the Oracle side, the Streams Apply Process would fail with a "No Data Found" error when it went to find the SQL Server rows to delete.

      The most reliable way to synchronize the two sides in preparation for Streams is to extract the rows to a comma separated value (CSV) file, and then use SQL*Loader to import the data to Oracle. Below is an example of using Microsoft DTS to generate the CSV file, and beside it, proof that the CSV file contains all required digits of precision:

      Figure 21 : Microsoft DTS used to extract seed data to CSV

      Figure 22 : CSV file produced by DTS contains full precision
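      The effect of the missing digits can be sketched outside of Oracle entirely. The short Python example below is only an illustration, not a reproduction of what Transparent Gateway does internally: it relies on the fact that a Python float is an IEEE 754 double, the same 8-byte format as a default SQL Server FLOAT, and it assumes that 15 significant digits stand in for a read that has lost its last two digits, while 17 stand in for a full-precision extract such as the CSV file. The value and the helper names (render, survives_round_trip) are hypothetical.

```python
# Illustration only: a Python float is an IEEE 754 double, the same
# 8-byte format as a default SQL Server FLOAT column.

def render(value: float, digits: int) -> str:
    """Format a double using the given number of significant digits."""
    return f"{value:.{digits}g}"


def survives_round_trip(value: float, digits: int) -> bool:
    """Return True if parsing the rendered text yields the identical double."""
    return float(render(value, digits)) == value


# Hypothetical value held in the legacy table.
stored = 0.1 + 0.2

# Assumption: 15 digits models a read that drops the last two digits,
# 17 digits models an extract that keeps every digit.
for digits in (15, 17):
    text = render(stored, digits)
    print(digits, text, survives_round_trip(stored, digits))
```

      With 15 digits the text parses back to a different double, so an equality lookup against the original row finds nothing, which is the same kind of mismatch that surfaces as "No Data Found" in the Apply Process. With 17 digits the value survives unchanged, which is why an extract that preserves every digit is the safer way to seed the Oracle side.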
  20. CONCLUSION

      Committing to a new Data Warehouse technology is a difficult decision for an organization to make. The effort of executing the migration is costly in terms of time and resources. Remember to respect these facts when making your case to management. Remain confident in your recommendations and plan, but unfold these in a paced fashion that allows you time to build your message and allows those around you the space to come to terms with the requirements.

      While migration tools such as Oracle Migration Workbench can help with the migration of certain Data Warehouse assets, the bigger challenge comes in executing a seamless migration over a period of time. Focus on your strategy to enable a new Oracle Data Warehouse while maintaining reliable service to the legacy system throughout your parallel period.

      Employ tools such as Oracle Transparent Gateway or Oracle Heterogeneous Streams to enable your migration strategy, but be prepared to weather the storm. Because these products are more niche than the core features of the Oracle Database, limitations and product bugs will surface along the way.

      Finally, old habits will be hard to break for your developers and business users. Be sure to consider the standards, metadata, education, and mentoring that your consumers will require in order to make your new Oracle Data Warehouse deployment an overwhelming success.