Oracle Data Integrator in Swedbank EDW - Rein Adamson and Mart Tudre


Event: Oracle Technology Day 2011
Date: 20.10.2011
Place: Nordic Hotel Forum
Country: ESTONIA

Published in: Technology, Business

  1. 1. Oracle Data Integrator
     ETL software in Swedbank EDW
     2007 – 2011
     Mart Tudre – Swedbank Baltic DW architect
     Rein Adamson – Project Manager
     © Swedbank
  2. 2. Agenda
     • EDW - Enterprise Data Warehouse
       – EDW, BI definitions
       – Swedbank Baltic DW - general facts
     • ETL software evaluation 2007
       – ETL Software evaluation and Proof of Concept 2007
       – ODI Implementation project
       – User roles today
     • ODI implementation in Swedbank Baltic DW
       – ODI defining features
       – Usage specifics and custom components
  3. 3. Data Warehouse – a definition
     • A data warehouse is a repository of an organization's electronically stored data, designed to facilitate reporting and analysis.
     • An expanded definition of data warehousing includes tools for
       – business intelligence
       – extracting, transforming and loading data into the repository
       – managing and retrieving metadata.
     Business intelligence - computer-based techniques used in spotting, digging out, and analyzing business data
     Source: wikipedia.org
     ETL – Extract, Transform, Load
     EDW – Enterprise Data Warehouse (also an IT org. unit in Swedbank)
  4. 4. Business Intelligence functions
     • predictive analytics (statistics, data mining)
     • online analytical processing (OLAP)
     • business performance management
     • benchmarking
     • text mining
     • reporting
  5. 5. Data Warehouse architecture
     [Diagram; labels: Analytical Users, Replication, Enterprise Data Warehouse, Integrated Data Marts, Data Transformation, Operational Data Source, Business Users]
  6. 6. Data flows
     [Diagram; layer labels: Analytical services (FM, RM, CM, CB), Data delivery, Central data store, Data store, Data acquisition, Source systems (LOAN, DEPOSIT, CARDS, LEASING, GL, ...)]
     Subject areas shown in the central data store:
     • PARTY: An individual, business or group of individuals of interest to the financial institution.
     • ASSET: Things parties have an interest in that have value.
     • AGREEMENT: A contract or any type of agreement of interest between Parties.
     • FINANCE: The internal accounting of the business.
     • EVENT: Something of interest that happened that may or may not involve contact with the customer.
     • PRODUCT: Any marketable product or service including terms, conditions and features.
     • INTERNAL ORGANIZATION: A Party that is a unit of business.
     • CHANNEL: The vehicle by which a party may interact with the financial institution.
     • LOCATION: A physical address, electronic address or geographical area.
     • CAMPAIGN: A communication plan to deliver a message.
  7. 7. Swedbank Baltic DW
     Swedbank Baltic Data Warehouse (EDW) is a subject-oriented, integrated, time-variant, non-volatile collection of enterprise data.
     – Subject Oriented: Information is organized by subject areas instead of business-line-specific source system data structures. Subject areas are Party, Product, Agreement, Channel, Organization, Event etc.
     – Integrated: Data is gathered into the data warehouse from a variety of sources and merged into a coherent whole under unified governance by using agreed dimensions, such as Party, Product, Agreement, Channel, Organisation etc.
     – Time-variant: All data in the data warehouse is identified with a particular time period. DW stores history.
     – Non-volatile: Data in the data warehouse is usually not overwritten or deleted. Once committed, the data is read-only, and retained for future reporting and analysis.
     – Detailed: The granularity is detailed business events.
     – Based on reference industry model: Teradata financial services logical data model.
  8. 8. Multiple usage of data warehouse
     • Different business services have different requirements for
       – data availability frequency and timing (e.g. daily 6 am, daily 6 pm, monthly on the 1st day at 8 am)
       – data quality (some services have near-zero tolerance for errors)
       – performance and workload
  9. 9. Enterprise model (High level)
     • ASSET: Items that belong to parties and which have value.
     • PARTY: An individual or group of individuals.
     • FINANCE: The internal accounting of the business.
     • LOCATION: A geographic or spatial area, physical address or electronic address.
     • CAMPAIGN: A communication plan directed at parties or a market for a purpose.
     • AGREEMENT: A contract or deal between parties that is of interest.
     • EVENT: Financial or non-financial event which may involve contact with the customer.
     • CHANNEL: The vehicle by which a customer interacts with the financial institution/insurance company.
     • PRODUCT: Any marketable or tradable product or service including terms and conditions.
     • INTERNAL ORGANIZATION: A unit of business within the financial institution or insurance company. Is a type of Party.
     Not all relationships are shown
  10. 10. Swedbank Baltic DW Statistics
      External
      • 30 source systems (containing 1000 source objects)
      • 50 business services
      • 75 employees in Baltic DW
      Internal
      • 20 Terabytes of storage (50 TB planned for 2012)
      • 650 objects in main data store
      • 500 ETL processes
      • 4000 database objects
      • 40 database schemas in DW
      • 245 direct db users, 500 reporting users
  11. 11. How to manage
      • everyday operations
      • development
      • testing
      • releasing
      • migration (both technical and business)
      • ETL workflow optimisation
      Answer: an enterprise metadata system is needed
  12. 12. ETL is part of METADATA
      [Diagram: Enterprise Metadata]
  13. 13. ETL Software evaluation and POC 2007
      Rein Adamson – project manager
      • Request for Proposal to 4 Vendors
      • 2 Vendors selected for Proof of Concept (POC)
        – Oracle “ODI”
        – Informatica “PowerCenter” (ETL market leader)
      • POC budget 20 kEUR
      • Evaluation process duration 5-8 months:
        – 2 months: RFP and selection of 2 Vendors for POC
        – 4 months: POC preparation
        – 1 month: POC action + results to management decision
        – 1 month: License and Implementation Contract with Winner
  14. 14. POC - Proof Of Concept 2007
      • POC budget of 10 kEUR per Vendor included:
        – 1 day system installation on bank IT infrastructure
        – 2 days preparation before arrival (5 tasks sent)
        – 5 days onsite consultant
      • POC scope in 5 days with consultant:
        – Day 1: Training for POC team (5 persons)
        – Days 2, 3, 4: guidance to team for development of 5 ETL tasks
        – Last day: 2 hrs demo to IT managers
  15. 15. POC Loading tasks scenarios
      • 3 days to complete 5 ETL tasks
      • 1 task for each POC team member. Experienced DWH specialists: developer, analyst, DBA, Admin, 2 architects
      • Consultant was a trainer to support our specialists
      TASKS CONTENT:
      • Task 1 – Agreement loading (incl. historisation)
      • Task 2 – Trigger filled to history table (incl. country context)
      • Task 3 – Rows to Columns and vice versa
      • Task 4 – Aggregation within Teradata
      • Task 5 – Bank transactions (events) loading – from 3 sources into 1 target, capacity performance test with 7 million rows
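Task 3 (rows to columns and vice versa) is a pivot and unpivot transformation. The slides do not show how it was implemented in either tool, so the following is only a minimal Python sketch of the idea; the record layout `(key, attribute, value)` and the sample values are assumptions made for illustration.

```python
from collections import defaultdict

def rows_to_columns(rows):
    """Pivot (key, attribute, value) rows into one dict of columns per key."""
    pivoted = defaultdict(dict)
    for key, attribute, value in rows:
        pivoted[key][attribute] = value
    return dict(pivoted)

def columns_to_rows(records):
    """Unpivot {key: {attribute: value}} back into sorted (key, attribute, value) rows."""
    return sorted(
        (key, attribute, value)
        for key, attrs in records.items()
        for attribute, value in attrs.items()
    )

# Hypothetical sample data: contact details stored one attribute per row.
rows = [
    ("P1", "phone", "555-0101"),
    ("P1", "fax", "555-0102"),
    ("P2", "phone", "555-0201"),
]
wide = rows_to_columns(rows)
```

In a warehouse the same reshaping would normally be expressed in SQL on the database side; the round trip `columns_to_rows(rows_to_columns(rows))` returns the original rows in sorted order.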
  16. 16. KSF - Key Success Factors evaluated
      • Reusability and standardization of loadings (high)
      • Impact analysis on attribute level
      • Resources for EDW services performance
      • Release deployment and configuration
      • Functionality of metadata repository (medium priority)
      • Improve EDW development process
      • EDW loading and calculation workflow management
      • Faster analysis stage of development task
      • Faster process and error maintenance (low priority)
  17. 17. Reusability and standardization of loading patterns
      Flexibility of loading templates. Customizable, but robust. Target is to shorten development time by reusing existing patterns.
      ODI
      • All the objects in ODI are reusable because of the substitution method used.
      • ELT architecture supports today's skill sets.
      • Business and technical information has been separated from data load logic.
      • 2,8 points out of 3
      INFORMATICA
      • Templates are fixed source/target templates. Technical options are integrated with business logic.
      • It is possible to create reusable components, but while doing the tasks it was clear that at some point it is easier to start from a blank page.
      • 1,8 points out of 3
  18. 18. Release deployment and configuration
      Time and understanding needed for maintenance and deployment of new loading procedures. Easier and faster release management.
      ODI
      • Topology is transparent and easily understandable.
      • Monitoring, together with debugging, is at the necessary level of detail.
      • No additional environments needed, information is moving between repositories only.
      • Versioning with install/rollback functionality is available.
      • ...
      • 2,6 points out of 3
      IFA
      • Topology is not clear and transparent.
      • Release complexity can grow, by estimation, to a level comparable to today's situation.
      • Monitoring and debugging are available at a high level until steps have been completed, with no intermediate access.
      • Country-based approach is not supported in the central repository.
      • 1,8 points out of 3
  19. 19. POC results summary comment
      – ODI utilizes the existing infrastructure. There is no (new) proprietary transformation server/database. The tool uses the Source and Target database engines and their tools to unload/load and transform the data. It is transparent. No need for entirely new skills or more specialists.
      – Informatica brings in totally new technology; additional specialists are needed, and more training and consultancy has to be bought.
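The point about using the target database engine instead of a separate transformation server can be illustrated with a small E-LT sketch. This is not ODI-generated code: `sqlite3` stands in for the target database, and the table and column names (`stg_agreement`, `dw_agreement_total`) are hypothetical.

```python
import sqlite3

# sqlite3 stands in for the target database engine (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_agreement (agreement_nbr TEXT, amount_cents INTEGER, country TEXT);
    CREATE TABLE dw_agreement_total (country TEXT PRIMARY KEY, total_cents INTEGER);
""")

# Extract + Load: source rows are copied unchanged into a staging table on the target.
source_rows = [("A1", 10000, "EE"), ("A2", 25000, "LV"), ("A3", 5000, "EE")]
conn.executemany("INSERT INTO stg_agreement VALUES (?, ?, ?)", source_rows)

# Transform: a set-based statement executed by the target engine itself,
# so no separate transformation server touches the data.
conn.execute("""
    INSERT INTO dw_agreement_total (country, total_cents)
    SELECT country, SUM(amount_cents)
    FROM stg_agreement
    GROUP BY country
""")

totals = dict(conn.execute("SELECT country, total_cents FROM dw_agreement_total"))
```

The design choice is that the integration tool only orchestrates: the heavy lifting is plain SQL that existing database specialists can read and tune.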
  20. 20. KSF evaluation points (max 3)
      [Bar chart comparing ODI and IFA on a scale from 1,0 to 3,0 for each key success factor:]
      1. Reusability and standardisation of loadings
      2. Impact analysis on attribute level
      3. Resources for EDW services performance
      4. Release deployment and configuration
      5. Functionality of metadata repository
      6. Improve EDW development process
      7. EDW loading and calculation workflow management
      8. Faster analysis stage of development task
      9. Faster process and data error maintenance
  21. 21. ODI implementation, Sept 2007 - Sept 2008
      • Oracle ODI partner consultancy used
        – 1 standard 4-day training, 10 persons in class
        – 1 two-day onsite visit (consultant from Italy)
        – 5 days of off-site consultancy over 3 months (Poland)
        – 5 Oracle support cases
      • Customer resource
        – 1 experienced ETL developer assigned 100% for 1 year
      • Custom solutions design and implementation:
        – ETL Process registry design and development (2 months duration)
        – Common Wrapper development (3 months)
        – Process Registry and Common Wrapper testing, debugging (2 months)
        – ODI release process procedures implementation (2 months)
  22. 22. 83 active ODI Users today
      • 59 users in EDW (71%), 22 users in CRM area (27%)
      • 35 Analyst-Developers; 16 SQAs. Dev+SQA=61%
      [Bar chart of user counts, axis 0 to 40. Labels: Sys.admin-DBA, App.admin, CRM, other, manager, EDW, LOANS, Implementator, Service Manager, SQA, Developer]
  23. 23. Oracle Data Integrator
  24. 24. Oracle Data Integrator
      • Oracle Data Integrator is a comprehensive data integration platform that covers all data integration requirements from high-volume, high-performance batch loads, to event-driven, trickle-feed integration processes, to SOA-enabled data services.
      ODI is Oracle’s Strategic Product for Data Integration
      • Heterogeneous E-LT Architecture
      • Optimized Connectivity Architecture
      • Modular Implementation Architecture
      • SOA-Native Architecture
  25. 25. ODI Component Architecture
  26. 26. Repository Set-Up Pattern
      Development – Test – Production Cycle
      • Master Repository: Security, Topology, Versioning
      • Work Repository (Development): Models, Projects, Execution
      • Work Repository (Test & QA): Models, Projects, Execution
      • Execution Repository (Production): Execution
      Flow between repositories:
      • Create and archive versions of models, projects and scenarios
      • Import released versions of models, projects and scenarios for testing
      • Import released and tested versions of scenarios for production
  27. 27. E+LT approach
  28. 28. [Diagram: example source and target data models with column-level detail. Legible entity names: ORDER, ORDER ITEM BACKORDERED, ORDER ITEM SHIPPED, CUSTOMER, ITEM, CL_PARTY, CL_BANK_ACCOUNT, CL_CONTRACT, HOST_PARTY, HOST_PARTY_ACCOUNT, HOST_PARTY_RELATION, HOST_PARTY_IDENTIFICATION_HISTORY, MASTER_PARTY]
  29. 29. ODI Topology usage example
      • Logical schema is mapped through Context to Physical Server and Physical Schema
      LOGICAL SCHEMAS: CORE_CARD, DW_MAIN
      CONTEXTS: PROD_EE, PROD_LV, PROD_GR
      PHYSICAL SERVERS - PRODUCTION:
        – ODI Server Name: PROD_CORE_EE; Server Name: TALLINN (LDAP); Schema: CARD
        – ODI Server Name: PROD_CORE_LV; Server Name: RIGA (LDAP); Schema: CARD
        – ODI Server Name: PROD_DW_GR; Server Name: EDW.DOMAIN.EE (IP); Schema: MAIN
  30. 30. Features of ODI topology
      • Physical server has a fixed user name and password
      • One logical schema can map to exactly one physical schema in one context
        To have multiple users in the same database, define more contexts or duplicate the data model
      • Logical schema cannot change technology
      Conclusion: a database schema has to be defined as many times as there are database users
      A single shared database connection is preferred to maximize ELT, which means compromising on per-user resource management on the database side
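The rule that one logical schema maps to exactly one physical schema per context can be pictured as a lookup keyed by (logical schema, context). The sketch below is not the ODI API; the schema and server names come from the topology example, while the pairing of contexts to servers and the helper names `resolve` and `qualified_name` are assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PhysicalSchema:
    odi_server: str  # ODI data server name
    schema: str      # physical schema on that server

# (logical schema, context) -> exactly one physical schema.
# Which context points to which server is assumed for illustration.
TOPOLOGY = {
    ("CORE_CARD", "PROD_EE"): PhysicalSchema("PROD_CORE_EE", "CARD"),
    ("CORE_CARD", "PROD_LV"): PhysicalSchema("PROD_CORE_LV", "CARD"),
    ("DW_MAIN", "PROD_EE"): PhysicalSchema("PROD_DW_GR", "MAIN"),
    ("DW_MAIN", "PROD_LV"): PhysicalSchema("PROD_DW_GR", "MAIN"),
    ("DW_MAIN", "PROD_GR"): PhysicalSchema("PROD_DW_GR", "MAIN"),
}

def resolve(logical_schema: str, context: str) -> PhysicalSchema:
    """Return the single physical schema behind a logical schema in a context."""
    try:
        return TOPOLOGY[(logical_schema, context)]
    except KeyError:
        raise LookupError(
            f"{logical_schema} is not mapped in context {context}"
        ) from None

def qualified_name(logical_schema: str, context: str, table: str) -> str:
    """Build the physical table name that generated code would reference."""
    return f"{resolve(logical_schema, context).schema}.{table}"
```

Because the key is the pair, running the same mapping for another country only means switching the context, for example `resolve("CORE_CARD", "PROD_LV")` instead of `resolve("CORE_CARD", "PROD_EE")`.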
  31. 31. ODI developer basic steps
      1. Reverse engineer data models from source and target
      2. Define column level data mappings, specify join and filter conditions. Every data mapping (ODI interface) can have exactly one target and multiple sources
      3. Select knowledge module (code generator)
      4. Generate code (ODI scenario) and execute scenario
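Steps 2 to 4 can be pictured as filling a reusable code template with the mapping metadata. The sketch below is a heavily simplified stand-in for a knowledge module, not ODI's actual substitution API; the template, the `interface` dictionary layout and all table names are hypothetical.

```python
from string import Template

# Hypothetical, simplified "knowledge module": a reusable SQL template.
KM_INSERT_SELECT = Template(
    "INSERT INTO $target ($target_cols)\n"
    "SELECT $expressions\n"
    "FROM $sources\n"
    "WHERE $condition"
)

def generate_code(interface: dict, km: Template = KM_INSERT_SELECT) -> str:
    """Combine one mapping (one target, many sources) with a template."""
    mappings = interface["mappings"]  # target column -> source expression
    conditions = interface["joins"] + interface["filters"]
    return km.substitute(
        target=interface["target"],
        target_cols=", ".join(mappings),
        expressions=", ".join(mappings.values()),
        sources=", ".join(interface["sources"]),
        condition=" AND ".join(conditions) or "1 = 1",
    )

# Hypothetical interface definition with two sources and one target.
interface = {
    "target": "DW.AGREEMENT",
    "sources": ["STG.LOAN l", "STG.CUSTOMER c"],
    "mappings": {"AGREEMENT_NBR": "l.LOAN_NBR", "PARTY_ID": "c.CUSTOMER_ID"},
    "joins": ["l.CUSTOMER_ID = c.CUSTOMER_ID"],
    "filters": ["l.STATUS = 'ACTIVE'"],
}
sql = generate_code(interface)
```

Keeping business mappings in the interface and technical loading logic in the template is what lets one template serve many loadings.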
  32. 32. ODI scenario generation and execution
      [Diagram: in ODI Designer, Data Objects, Interfaces, Knowledge modules and a Package pass through Code Generation into a Scenario; in ODI Agent, the Scenario, Runtime variables and Context (Topology) pass through Code Execution, which connects to DB 1 and DB 2 and executes commands]
      • When a knowledge module changes – rebuild and deploy all related scenarios
      • When database objects change – refresh data structure definitions from the source database, rebuild and deploy all related scenarios
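Because scenarios are generated code, a change to a knowledge module or a database object means finding every scenario built from it. A minimal sketch of that impact lookup follows, assuming a hypothetical inventory `SCENARIO_SOURCES` that records what each scenario was generated from; none of these names come from ODI or from the slides.

```python
# Hypothetical inventory: for each scenario, the knowledge modules and
# data objects it was generated from.
SCENARIO_SOURCES = {
    "LOAD_LOAN_AGRMT": {"kms": {"IKM_HISTORY"}, "objects": {"STG.LOAN", "DW.AGREEMENT"}},
    "LOAD_BANK_ACCOUNT": {"kms": {"IKM_HISTORY"}, "objects": {"STG.ACCOUNT", "DW.AGREEMENT"}},
    "LOAD_CARD_EVENTS": {"kms": {"IKM_APPEND"}, "objects": {"STG.CARD_TXN", "DW.EVENT"}},
}

def scenarios_to_rebuild(changed_kms=(), changed_objects=()):
    """Return, sorted by name, the scenarios affected by the given changes."""
    changed_kms = set(changed_kms)
    changed_objects = set(changed_objects)
    return sorted(
        name
        for name, src in SCENARIO_SOURCES.items()
        if src["kms"] & changed_kms or src["objects"] & changed_objects
    )
```

For example, `scenarios_to_rebuild(changed_kms=["IKM_HISTORY"])` lists both loadings that use the history module, which is the set that would have to be regenerated and redeployed.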
  33. 33. Custom components to manage 500 ETL processes
      • Process registry – all processes and their dependencies
      • Common wrapper – special scenario wrapping all others
      • ODI monitor – Web access to process registry
      • Release builder – Used for deploying from test to development
  34. 34. Process registry
      • List of all ETL processes regardless of technology
        – Create, change, retire process
        – All necessary information for maintaining the list
      • Process scheduling information
      • Dependencies between processes
        – Process to process dependencies
        – Dependencies through “Dependency Group”
        – Based on process bookmarks
  35. 35. Common Wrapper
      • Special single-instance ODI scenario, through which all other scenarios are executed (pre and post steps)
      • Implements common functionality needed for all processes
        – Checks if the preliminaries of the process have been fulfilled
        – Checks if the process is allowed to run at the moment
        – Assigns common process control variables and passes their values to the executed scenario
        – Logs execution bookmarks, ODI session ids, run result
        – Alerts monitoring in case of failure
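The wrapper pattern described above (common pre and post steps around every scenario) can be sketched in a few lines. This is not Swedbank's implementation: the function signature, the `run_log` list standing in for the process registry, and the result codes are all assumptions.

```python
from datetime import datetime, timezone

run_log = []  # stands in for the run table of the process registry (assumption)

def common_wrapper(process_name, scenario, preliminaries_ready, run_allowed, alert=print):
    """Run `scenario` with the shared pre and post steps around it."""
    # Pre steps: dependency and permission checks.
    if not preliminaries_ready(process_name):
        return "WAITING"
    if not run_allowed(process_name):
        return "BLOCKED"

    # Common control variables handed to the wrapped scenario.
    control = {"process_name": process_name, "started_at": datetime.now(timezone.utc)}
    try:
        bookmark = scenario(control)  # the scenario returns its new data bookmark
        result = "OK"
    except Exception as exc:
        bookmark, result = None, "FAILED"
        alert(f"{process_name} failed: {exc}")  # post step: alert monitoring

    # Post step: log the run result and bookmark.
    run_log.append({"process": process_name, "result": result, "bookmark": bookmark})
    return result
```

A call such as `common_wrapper("Loan Agrmt loading", load_loans, ready, allowed)` keeps checks, logging and alerting in one place, so individual scenarios contain only loading logic; `load_loans`, `ready` and `allowed` are placeholder callables.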
  36. 36. Custom components overview
  37. 37. Dependency group
      • A dependency group is defined by the data content that a process delivers. It corresponds to business concept / subject area + data availability.
      • Processes are either:
        – Suppliers of a Dependency group
        – Consumers of a Dependency group
      • Dependency groups are also used to show the data availability bookmarks to users in the ad-hoc reporting environment
  38. 38. Dependencies between processes
      [Diagram in three layers, linked by "Is Consumer of" and "Is Supplier for" arrows:]
      • Consuming processes: Value added calculation 1, Value added calculation 2
      • Dependency Groups: Fin Agrmt Bal Dly, Credit Agrmt Dly, Fin Agrmt Dly
      • Supplying processes: Bank Account loading, Loan Agrmt loading, Leasing Agrmt Loading, Factoring Agrmt Loading
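The supplier and consumer relationship can be turned into a simple readiness check: a consumer may run for a business date once every dependency group it consumes has been delivered up to that date by all of its suppliers. The sketch below uses process and group names from the diagram, but which supplier feeds which group, and which calculation consumes which group, is assumed for illustration.

```python
from datetime import date

# Dependency group -> supplying processes (edges are assumptions).
SUPPLIERS = {
    "Fin Agrmt Dly": {"Bank Account loading", "Loan Agrmt loading",
                      "Leasing Agrmt Loading", "Factoring Agrmt Loading"},
    "Credit Agrmt Dly": {"Loan Agrmt loading", "Leasing Agrmt Loading",
                         "Factoring Agrmt Loading"},
}
# Consuming process -> dependency groups it needs (also assumptions).
CONSUMES = {
    "Value added calculation 1": {"Fin Agrmt Dly"},
    "Value added calculation 2": {"Credit Agrmt Dly"},
}

def group_bookmark(group, process_bookmarks):
    """A group is available up to the oldest bookmark among its suppliers."""
    dates = [process_bookmarks.get(p) for p in SUPPLIERS[group]]
    return None if None in dates else min(dates)

def can_run(consumer, business_date, process_bookmarks):
    """True when every consumed group is delivered up to business_date."""
    for group in CONSUMES[consumer]:
        bookmark = group_bookmark(group, process_bookmarks)
        if bookmark is None or bookmark < business_date:
            return False
    return True
```

The same `group_bookmark` value is what could be shown to ad-hoc reporting users as the data availability bookmark of a subject area.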
  39. 39. Enterprise metadata context
      [Diagram with the Enterprise Metadata Repository at the centre:]
      • METADATA USER INTERFACE: Manual Metadata (Services, business requirements etc), Metadata reports
      • TECHNICAL OPERATIONAL METADATA: Transformation metadata (ETL tools), CASE tool metadata (Logical data models), Presentation metadata (Reporting tools), RDBMS metadata
  40. 40. Metadata model – ETL related
      [Entity-relationship diagram; entities and attributes as far as legible:]
      • IMPACT_LAYER: Impact_Layer_Name
      • DEPENDENCY_GROUP: Dependency_Group_Name, Service_Component_ShortName (FK)
      • SERVICE: Service_Component_ShortName
      • PROCESS: Process_Name, Service_Component_ShortName (FK), ETL_Server_Name, Process_Status_ShortName, Process_Executable_Name (FK)
      • PROCESS_DEPENDENCY: Process_Name (FK), Dependency_Group_Name (FK)
      • PROCESS_EXECUTABLE: Process_Executable_Name, Package_Name (FK)
      • PROCESS_SCHEDULE_TIME: Process_Schedule_Type_Code, Process_Schedule_No, Process_Name (FK), Frequency_Type, Frequency_Value
      • PROCESS_PARAM: Process_Name (FK), Param_Name, Param_Value
      • PROCESS_RUN: Process_Name (FK), Process_Execution_Dtime, Process_Boomark_Values
      • PACKAGE: Package_Name
      • PACKAGE_SOURCE_LAYER, PACKAGE_TARGET_LAYER: Package_Name (FK), Impact_Layer_Name (FK)
      • PACKAGE_SOURCE_OBJECT, PACKAGE_TARGET_OBJECT: Package_Name (FK), DB_Object_Name (FK), Impact_Layer_Name (FK)
      • DB_OBJECT: DB_Object_Name, Impact_Layer_Name (FK), Service_Component_ShortName (FK)
      Sources of metadata: PROCESS REGISTRY, TRANSFORMATION, RDBMS, MANUAL CONFIGURATION
  41. 41. Process execution preliminaries
  42. 42. Process execution Daemon
      • Planned component for automatic ETL workflow management, start process when:
        – It is time to process new data
        – Preliminaries are ready
        – Process run is allowed
      • Replacement of enterprise job scheduler
      • Utilizing the framework of Process Registry and Common Wrapper
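The three start conditions can be expressed as one selection pass over the registry. Since the daemon was only planned at the time of the talk, the following is purely a sketch of the stated rule; the registry layout (`start_after`, `last_run_date`) and the callback names are assumptions.

```python
from datetime import datetime, time

def processes_to_start(now, registry, preliminaries_ready, run_allowed):
    """Return the names of processes that meet all three start conditions at `now`."""
    due = []
    for name, entry in registry.items():
        if entry["last_run_date"] == now.date():
            continue  # today's data has already been processed
        if now.time() < entry["start_after"]:
            continue  # not yet time to process new data
        if preliminaries_ready(name) and run_allowed(name):
            due.append(name)
    return due

# Hypothetical registry entries with a daily earliest start time.
registry = {
    "Loan Agrmt loading": {"start_after": time(2, 0), "last_run_date": None},
    "Value added calculation 1": {"start_after": time(6, 0), "last_run_date": None},
}
now = datetime(2011, 10, 20, 3, 30)
started = processes_to_start(now, registry, lambda name: True, lambda name: True)
```

A daemon built on this would call the function in a loop and hand each returned process to the Common Wrapper, replacing fixed time slots in an external scheduler with dependency-driven starts.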
  43. 43. Our experience with ODI (10g)
      • Performance concerns
        – Educate developers to use existing patterns
        – Optimize knowledge modules, while keeping them as generic as possible
        – Made a lightweight, quick web application for accessing execution logs
      • Functionality
        – Modified almost every KM which is now in use
        – Created new KMs for common needs (new history integration, SAX XML parsing for loadings, streamed XML output etc.)
        – Made workarounds for missing features: OLAP function support, subqueries
        – Utilized the ODI code substitution framework to the maximum
        – Made a command line utility to start an ODI session on a remote Agent
        – Use DTS Agent for scheduling – single high-level workflow management system
  44. 44. Our experience with ODI (10g), continued
      • Deployment
        – We use a single ODI project per Area – shared sets of KMs and Variables
        – To test – install separately changed data models, knowledge modules and ODI folders (common releasable unit, based on custom export script)
        – The huge ODI project import operation required a custom solution to do an incremental restore of the whole project
      • ETL Administrator concerns
        – No way to change the code directly in production (in case of urgent issues)
  45. 45. Migration of ETL processes from MS-DTS to ODI started in 2008. Current status:
      [Two bar charts comparing DTS and ODI: number of ETL processes (bar values 261 and 257) and number of tasks in ETL processes (bar values 3224 and 3819)]
  46. 46. Questions?
