ODI 11g in the Enterprise - BIWA 2013

Presentation by Mark Rittman, Technical Director, Rittman Mead, on ODI 11g features that support enterprise deployment and usage. Delivered at BIWA Summit 2013, January 2013.

Published in: Technology

  • 1. Deploying ODI 11g in the Enterprise
      Mark Rittman, Technical Director, Rittman Mead
      BIWA Summit 2013, San Francisco, January 2013
      T : +44 (0) 8446 697 995 E : W:
  • 2. About the Speaker
      • Mark Rittman, Co-Founder of Rittman Mead
      • Oracle ACE Director, specialising in Oracle BI&DW
      • 14 Years Experience with Oracle Technology
      • Regular columnist for Oracle Magazine
      • Author of two Oracle Press Oracle BI books
          ‣ Oracle Business Intelligence Developers Guide
          ‣ Oracle Exalytics Revealed
      • Writer for Rittman Mead Blog :
      • Email :
      • Twitter : @markrittman
  • 3. About Rittman Mead
      • Oracle BI and DW platinum partner
      • World leading specialist partner for technical excellence, solutions delivery and innovation in Oracle BI
      • Approximately 50 consultants worldwide
      • All expert in Oracle BI and DW
      • Offices in US (Atlanta), Europe, Australia and India
      • Skills in broad range of supporting Oracle tools:
          ‣ OBIEE
          ‣ OBIA
          ‣ ODIEE
          ‣ Essbase, Oracle OLAP
          ‣ GoldenGate
          ‣ Exadata
          ‣ Endeca
  • 4. Oracle Data Integrator 11g
      • Oracle’s strategic data integration tool, originally came via the Sunopsis acquisition (2006)
      • Java architecture, part of the wider Oracle Data Integration Suite, and Oracle Fusion Middleware
      • Heterogeneous database and source/target support
      • Long-term successor to OWB, most common ETL tool on new projects now
      • Commonly used alongside OBIEE, Essbase and Oracle RDBMS for BI/DW projects
  • 5. Oracle Data Integrator 11g Key Features
      • Same philosophy as OWB and Oracle RDBMS – DB as the ETL engine
      • Declarative design - separates logic from implementation
          ‣ Business rules define what goes where, and using which transformation rules
          ‣ Technical implementation defines how data is moved
      • Built for SOA environments
          ‣ Support for Web Services, EII etc
      • Supports batch, event-based and real-time integration
      • Extensible through “Knowledge Modules”
          ‣ Change Data Capture
          ‣ Slowly Changing Dimensions
          ‣ Bulk load
      • Java client application with server elements (“agents”)
  • 6. Part of the Wider Oracle Data Integration Suite
      • Oracle Data Integrator for large-scale data integration across heterogeneous sources and targets
      • Oracle GoldenGate for heterogeneous data replication and changed data capture
      • Oracle Enterprise Data Quality for data profiling and cleansing
      • Oracle Data Services Integrator for SOA message-based data federation
  • 7. Part of Oracle Fusion Middleware 11g
      • Oracle’s complete set of middleware servers and technologies
      • Based around Java, SOA, Oracle WebLogic Server and non-Java technologies
      • Foundation for Oracle’s applications and platforms such as Oracle Fusion Applications
  • 8. Deploying ODI within an Enterprise
      • As ODI becomes more mainstream, and data integration more mission-critical, ODI needed to evolve
      • Data warehousing and BI projects don’t just access (Oracle) relational sources and targets any more
      • Data quality requires more thought than just ad-hoc corrections and filtering
      • ODI needs to participate in modern software development techniques such as “continuous integration”
      • It’s no longer acceptable for ODI jobs to fail, and be unavailable all day or weekend
      • The stakes are raised - can ODI deliver?
  • 9. Loading More than a Data Warehouse, Accessing More than Oracle RDBMS
      • Most of us know ODI through its ability to load Oracle data warehouses
      • Data typically sourced from Oracle databases, files, maybe the odd non-Oracle RDBMS source
      • Enterprises now work with many and varied data sources and applications, such as
          ‣ Multidimensional servers such as Oracle Essbase, and associated EPM apps
          ‣ XML sources, and JMS queues
          ‣ SOA environments, using messaging and service buses, typically in real-time
          ‣ More recently - big data sources such as Hadoop clusters, NoSQL databases
  • 10. Working with Essbase data, and Hyperion Planning
      • ODI11g is the strategic, long-term DI tool for Essbase and associated EPM applications
      • IKMs and LKMs for loading, and extracting from, Essbase databases and EPM metadata stores
      • Data models for Essbase databases represented as tables, columns, the same as with other data sources
      • Data loads via rules files, Essbase / Planning / HFM APIs
      • However ... not really Essbase native, learning curve for admins
      • Good sources of ODI + Essbase/EPM Suite information:
          ‣ Cameron Lackpour OOW2012 Presentation “Slay the Bad Data in Essbase with ODI”
          ‣ Rittman Mead Blog
  • 11. Support for SOA Environments, and Messaging
      • ODI has technology adapters and features for many SOA, queue and messaging-type technologies
          ‣ JMS Queue, JMS Topic (plain message or XML), SOAP messages via Web Services etc
      • Main role for ODI in SOA environments is bulk-data movement, invoked by web service calls
          ‣ Regular inter-service messaging for low volume, switching to ODI for high-volume
      • Web services provided by runtime agents
          ‣ Start, monitor, stop and restart scenarios
          ‣ Start, monitor, stop and restart load plans
      • Public introspection web service
          ‣ List contexts
          ‣ List scenarios
          ‣ Requires deployment in Java EE container
      • Call from BPEL or any other standard process
  • 12. Example: ODI 11g for Bulk Data Handling in an Orders Process
      1. Large file arrives, detected by BPEL file
      2. Execution starts (BPEL / ESB) - and a step for transforming a large document payload occurs
      3. Pass XML payload, by reference, to ODI
      4. ODI loads payload
      5. ODI transforms payload
      6. ODI sends payload wherever instructed
      7. ODI notifies BPEL / ESB that job has completed
      8. Core BPEL / ESB processing completes
  • 13. Oracle Data Services Integrator - A Data Federation Alternative to ODI in SOA
      • ODI is essentially a batch-orientated DI tool, though batches can be micro-batches (and event-driven)
      • ODI moves and transforms data, loading it into a central, integrated location
      • In some cases though, you may wish to take a different approach
          ‣ Data federation vs. integration - read and transform data in-place
          ‣ Data read and integrated on-demand, as a service
      • Approach could be preferable for many reasons
          ‣ Security rules don’t allow data to be replicated
          ‣ Development is dynamic, sources frequently added or changed
          ‣ Data volumes don’t warrant a full ETL solution
          ‣ Data format is inherently nested and does not easily map onto relational model
  • 14. ODI, ODSI, GoldenGate and OEDQ in a SOA Environment
  • 15. Big Data, Hadoop and Unstructured Data Sources
      • “Big data” is the hot topic in BI, DW and Analytics circles
      • The ability to harness vast datasets, at a highly-granular level, by harnessing massively-parallel computing
      • Crunching loosely-structured and modelled datasets using simple algorithms: Map (project) + Reduce (agg)
      • Largely based around open-source projects, non-relational technologies
          ‣ Apache Hadoop
          ‣ MapReduce
          ‣ Hadoop Distributed File System
          ‣ Apache Hive, Sqoop, HBase etc
      • Emerging commercial vendors
          ‣ Cloudera
          ‣ Hortonworks etc
      • Can be used standalone, or linked to an enterprise DW/BI architecture
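The “Map (project) + Reduce (agg)” idea on this slide can be illustrated with a minimal, single-process Python sketch. It does not use Hadoop or any Hadoop API; the record layout (`customer,amount`) and the function names are assumptions made purely for illustration.

```python
from collections import defaultdict


def map_phase(records):
    """Map (project): turn each raw record into a (key, value) pair."""
    for line in records:
        customer, amount = line.split(",")
        yield customer, float(amount)


def reduce_phase(pairs):
    """Reduce (aggregate): sum the values emitted for each key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Hypothetical input: one "customer,amount" record per line.
orders = ["acme,10.0", "globex,5.5", "acme,2.5"]
totals = reduce_phase(map_phase(orders))
print(totals)
```

In a real cluster the map and reduce steps run in parallel across many nodes, with the framework grouping pairs by key between the two phases; the sketch only shows the shape of the computation.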
  • 16. ODI as Part of Oracle’s Big Data Strategy
      • ODI is the data integration tool for extracting data from Hadoop/MapReduce, and loading into Oracle Big Data Appliance, Oracle Exadata and Oracle Exalytics
      • Oracle Application Adapter for Hadoop provides required data adapters
          ‣ Load data into Hadoop from local filesystem, or HDFS (Hadoop clustered FS)
          ‣ Read data from Hadoop/MapReduce using Apache Hive (JDBC) and HiveQL, load into Oracle RDBMS using Oracle Loader for Hadoop
      • Supported by Oracle’s Engineered Systems
          ‣ Exadata
          ‣ Exalytics
          ‣ Big Data Appliance (w/Cloudera Hadoop Distrib)
  • 17. How ODI Accesses Hadoop and MapReduce
      • ODI accesses data in Hadoop clusters through Apache Hive
          ‣ Metadata and query layer over MapReduce
          ‣ Provides SQL-like language (HiveQL) and a metadata store (data dictionary)
          ‣ Provides a means to define “tables”, into which file data is loaded, and then queried via MapReduce
          ‣ Accessed via Hive JDBC driver (separate Hadoop install required on ODI server, for client libs)
      • Additional access through Oracle Direct Connector for HDFS and Oracle Loader for Hadoop
      [Diagram labels: Hadoop Cluster, MapReduce, Hive Server, HiveQL, ODI 11g, Oracle RDBMS; direct-path loads using Oracle Loader for Hadoop, transformation logic in MapReduce]
  • 18. Running a MapReduce / Hive Job in ODI
      • Data is extracted and loaded using regular interfaces
      • LKMs and IKMs generate HiveQL queries
      • Functionally identical to RDBMS access/loading
  • 19. Oracle Loader for Hadoop
      • Oracle technology for accessing Hadoop data, and loading it into an Oracle database
      • Pushes data transformation, “heavy lifting” to the Hadoop cluster, using MapReduce
      • Direct-path loads into Oracle Database, partitioned and non-partitioned
      • Online and offline loads
      • Key technology for fast load of Hadoop results into Oracle DB
  • 20. Profiling Data, and Managing Data Quality Issues
      • ODI has built-in capabilities for defining data rules, data firewalls
          ‣ Static controls, Flow controls, constraints etc
      • But what if you don’t know what issues your data actually has?
      • What if you need to profile, deduplicate, merge or otherwise manage your data?
      • This is almost a topic in itself...
  • 21. Oracle Enterprise Data Quality
      • Data profiling, auditing and cleansing based on the industry-leading Datanomic platform
      • Integration with Oracle Data Integrator for a complex data management solution
  • 22. Oracle EDQ Features Relevant to ODI 11g
      • Ability to profile data from many sources (file, RDBMS, JNDI, XML, MS Office)
      • Create data quality cases, track and assign to owner
      • Cleanse, transform, parse and match incoming data via a palette of operators (processors)
      • Batch or real-time operation
      • All-Java architecture, thin-client and runs in WebLogic Server
      • Replaces previous Trillium-based OEM solution (but extra-cost option, as was Trillium solution)
  • 23. ODI 11g Integration with Oracle EDQ
      • Limited integration at present, but Datanomic only just acquired
      • Can run in same WLS domain, environment
      • EDQ result schema can be on same DB as ODI staging area
      • EDQ processes can be executed from ODI package or load plan using EDQ Open Tool
      [Screenshot callout: connection details to EDQ Server, and details of job]
  • 24. Participation in Large-Scale Enterprise Projects, and “DevOps”
      • As ODI and data integration become more integral to enterprises, expectations rise
      • ODI project elements, and executable code, need to go into source control
      • Build systems need to be able to include ODI functionality in their releases
      • Development Operations (“DevOps”) systems need to be able to spin-up ODI environments automatically
      • Ideas such as “continuous integration” and “smoke testing” can also apply to ODI projects
      • ODI topologies need to be flexible enough to deal with DEV/PROD network & responsibility separations
  • 25. Typical ODI Repository Topology : DEV, TEST and PROD
      • Typical enterprise customers deploy all non-PROD environments on their own network, isolated from the main production systems
      • This stops you having a single master repository for all ODI work repositories
      • Good practice is to have all non-DEV environments use execution work repositories
          ‣ Only allows load plans and scenarios to be imported
          ‣ Can only run existing code, not alter or change code
      • Challenge is how you deploy code without DEV assistance
          ‣ Requires command-line tools
          ‣ Requires scripting
          ‣ Requires an API?
  • 26. Accessing ODI 11g Admin Features from the Command-Line
      • ODI’s admin functions are available through ODI Tools
          ‣ Run from the command-line, from an ODI procedure, or other methods
          ‣ Scriptable using the startcmd.bat|sh utility
          ‣ Run from the agent home directory, connects to master and work repositories
      • The key to automating the deployment and administration of ODI projects and environments

        cd c:\oracle\product\11.1.1\Oracle_ODI_2\oracledi\agent\bin
        startcmd.bat OdiImportObject -FILE_NAME=c:\Test_Build_Files\SCEN_LOAD_PROD_DIM_Version_001.xml -WORK_REP_NAME=PROD_EXECREP -IMPORT_MODE=INSERT_UPDATE
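As a rough illustration of scripting this call, the Python sketch below assembles the same `startcmd.bat OdiImportObject` argument list from its parts. Only the tool name and the three parameters shown on the slide are used; the function name `build_import_command`, the default import mode and the backslash-separated file path are assumptions for illustration, and the sketch only builds the command rather than running it.

```python
def build_import_command(export_file, work_rep_name,
                         import_mode="INSERT_UPDATE",
                         startcmd="startcmd.bat"):
    """Assemble an OdiImportObject call as an argument list.

    Only the parameters shown on the slide are emitted; the values
    are passed through unchanged.
    """
    return [
        startcmd,
        "OdiImportObject",
        f"-FILE_NAME={export_file}",
        f"-WORK_REP_NAME={work_rep_name}",
        f"-IMPORT_MODE={import_mode}",
    ]


# Hypothetical export file and repository name, taken from the slide's example.
cmd = build_import_command(
    r"c:\Test_Build_Files\SCEN_LOAD_PROD_DIM_Version_001.xml",
    "PROD_EXECREP",
)
print(" ".join(cmd))
```

A build script could pass such a list to `subprocess.run`, with the working directory set to the agent's `bin` directory, and treat a non-zero exit code as a failed deployment.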
  • 27. Continuous Integration and “Smoke Testing” using ODI
      • For complex, multi-developer projects, continuous integration is a good practice
      • Continuously taking shipped code and testing it in a “smoke test” environment
          ‣ Identifies changes that “break the build” early
          ‣ Use a suite of regression tests that run the code with optimal coverage, end-to-end ETL runs
          ‣ Gives you confidence that a release shipped into test will actually compile, deploy and pass functional tests
          ‣ Enables more agile development, through having a robust build and regression testing process that welcomes change
      [Diagram labels: DEV/TEST Master Repository (Security, Topology, Versioning); DEV Development Work Repository (Models, Projects, Execution); CI / SMOKE TEST Execution Work Repository; TEST Execution Work Repository; Regression Test #1, Regression Test #2, Regression Test #n]
  • 28. Using Jenkins and OdiImportObject Tool for Continuous Integration
      • Jenkins is an open-source build automation and continuous integration tool
      • Supports a range of build and source-control tools including Ant, Maven, Subversion, Git etc
      • Use to detect new ODI export files in a given directory, and then automatically deploy them to the CI / Smoke-Test environment
          ‣ Or monitor a source-control system for new check-ins
      • Deploy ODI code through ODI Tools (OdiImportScen, OdiImportObject)
      [Diagram labels: Jenkins CI Server with scheduler; startcmd.bat OdiImportObject -FILE_NAME = %1.xml ...; PROD Master Repository (Security, Topology, Versioning); PROD Execution Work Repository]
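The “detect new ODI export files in a given directory” step can be sketched in a few lines of Python. This is not how Jenkins itself is implemented or configured; it only shows the underlying logic a build job would need. The function name `find_new_exports`, the file names and the idea of tracking already-deployed files by name are all assumptions for illustration.

```python
import tempfile
from pathlib import Path


def find_new_exports(export_dir, already_deployed):
    """Return the *.xml export files in export_dir that have not been deployed yet."""
    return sorted(
        path for path in Path(export_dir).glob("*.xml")
        if path.name not in already_deployed
    )


# Demonstrate against a throwaway directory with hypothetical export files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("SCEN_A_001.xml", "SCEN_B_001.xml", "notes.txt"):
        (Path(tmp) / name).write_text("<export/>")
    pending = [p.name for p in find_new_exports(tmp, {"SCEN_A_001.xml"})]

print(pending)
```

Each pending file would then be handed to the ODI import tool, and its name recorded once the import succeeds so it is not deployed twice.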
  • 29. Steps to Set up a Continuous Integration Environment using Jenkins
      • Download Jenkins from
      • Set up a new build job, optionally integrate with SVN etc
      • Run ODI tools through “Execute a Batch File” function
          ‣ Or take it further using Maven, Ant etc
      • Run the build process manually, to a schedule, or on check-in of new code to the source control system
      • Report on stability of build, see last failure, reason for fail
  • 30. The ODI SDK
      • For other automation tasks, the ODI SDK can be used to perform all functions available in ODI Studio
      • Java-based API analogous to OMB+ within Warehouse Builder
      • Script the creation of repositories & interfaces, updating of models, registering of data sources and topologies etc
      • Used either within Java applications (compiled), or interpreted using Groovy (editor now shipped with ODI)

        import oracle.odi.domain.project.OdiProject;
        import oracle.odi.core.persistence.transaction.support.DefaultTransactionDefinition;

        // Start a transaction against the connected repository
        txnDef = new DefaultTransactionDefinition();
        tm = odiInstance.getTransactionManager()
        txnStatus = tm.getTransaction(txnDef)

        // Create a new project and persist it
        project = new OdiProject("Project For Demo", "PROJECT_DEMO")
        odiInstance.getTransactionalEntityManager().persist(project)

        // Commit the transaction
        tm.commit(txnStatus)
  • 31. Making ODI ETL Processes Resilient and Highly-Available
      • ODI routines when deployed in the enterprise, need to be resilient, fail gracefully, be restartable
      • They are often considered “mission critical”
      • You need to code defensively, and anticipate #fail
      [Image captions: “Make your ETL routines like this...”, “Not like this...”, “.. or this.”]
  • 32. Why do ODI ETL and Data Integration Jobs Fail?
      • ODI ETL processes typically fail for one of two main reasons
          ‣ Reason #1 : An error in your code, unexpected data, run out of disk space etc - the process fails
          ‣ Reason #2 : An agent crashes, ODI repositories go down etc - the infrastructure fails
      • Most modern databases (Oracle 11g+ etc) have capabilities to recover from DB process issues
      • Can we make use of these within ODI packages, KMs etc?
  • 33. Enabling ETL Resumption : Resumable Space Allocation and ODI
      • Oracle Database 9i+ has provided “resumable space allocation”
          ‣ When enabled, suspends INSERT operations when out of disk space, rather than fail load
          ‣ Datafiles can then be extended, or new ones added
          ‣ Can be incorporated into ODI KM to enable more load operations to complete
  • 34. Resumable Space Allocation in Action
      • Insert process becomes suspended, ODI Operator shows step as still running
      • Suspended operation can be detected using DBA_RESUMABLE, USER_RESUMABLE
      • Once more disk space added, step will resume, operation can complete

        select name from dba_resumable;
        NAME
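To make the two slides above more concrete, here is a small Python sketch that generates the SQL a knowledge module step might issue before a load, plus a monitoring query. `ALTER SESSION ENABLE RESUMABLE` and the `DBA_RESUMABLE` view are standard Oracle Database features; the helper name, the timeout value and the operation name are assumptions for illustration, and the sketch only builds strings rather than connecting to a database.

```python
def enable_resumable_sql(timeout_seconds, name):
    """Build the ALTER SESSION statement that turns on resumable space allocation."""
    safe_name = name.replace("'", "''")  # escape single quotes for the SQL literal
    return (
        f"ALTER SESSION ENABLE RESUMABLE "
        f"TIMEOUT {int(timeout_seconds)} NAME '{safe_name}'"
    )


# Query a DBA could run to list statements currently waiting for space.
SUSPENDED_QUERY = (
    "SELECT name, status, error_msg FROM dba_resumable "
    "WHERE status = 'SUSPENDED'"
)

# Hypothetical timeout (two hours) and operation name.
stmt = enable_resumable_sql(7200, "ODI load PROD_DIM")
print(stmt)
print(SUSPENDED_QUERY)
```

Giving each load a recognisable name makes it easier to match a suspended row in `DBA_RESUMABLE` to the ODI step that is waiting.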
  • 35. Making ETL and Data Integration Processes Restartable
      • Recoverability is another enterprise ETL requirement - graceful failure and ability to restart process
      • Can be as simple as re-running the job, but some failures may be catastrophic - how do you “unwind”?
      • Oracle RDBMS has several “flashback” technologies that can help
          ‣ Flashback database, to a given SCN or restore point
          ‣ Flashback table, etc
      • Example : An ETL process performs an UPDATE, then an INSERT - if the INSERT fails, the UPDATE stays present. Can we use FLASHBACK TABLE to restore the table back to original state, so the process can be restarted safely?
  • 36. Using Load Plans to Aid Restartability
      • Alternative to packages for sequencing interfaces and other steps
      • Helps organize an optimal execution schedule for a batch
      • Advanced sequencing capabilities
          ‣ Parallel or Serial, Conditional branching
          ‣ Exception handling
      • Complements Scenarios and Packages, does not replace them
      • Exception handling feature could be very useful in restart / graceful failure scenarios
          ‣ Run ODI procedure, package, to correct errors
          ‣ Run commands to roll-back/flashback the database or tables
          ‣ Let’s use one for our example...
  • 37. Defining a Load Plan Exception to Handle Catastrophic ETL Failures : I
      • Flashback table requires an SCN (System Change Number) to “flashback-to”
      • Record the current SCN in a project variable before performing the integration
          ‣ Requires SELECT privilege on V$DATABASE
  • 38. Defining a Load Plan Exception to Handle Catastrophic ETL Failures : II
      • Load plan will define an exception, to be raised if the final INSERT operation fails
      • Exception will call an ODI Procedure that runs the FLASHBACK TABLE command, using the saved SCN
  • 39. Defining a Load Plan Exception to Handle Catastrophic ETL Failures : III
      • Now, when the INSERT step fails due to an error, the UPDATE is rolled-back as well through the FLASHBACK TABLE feature
          ‣ Table restored to state at original recorded SCN
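The three slides above describe saving an SCN and later flashing the table back to it. The Python sketch below generates the SQL involved; it is an illustration under stated assumptions, not the exact code from the presentation. `SELECT current_scn FROM v$database` and `FLASHBACK TABLE ... TO SCN` are standard Oracle statements, and FLASHBACK TABLE needs row movement enabled on the table. The table name `PROD_DIM`, the SCN value and the helper name are hypothetical.

```python
# Statement an ODI variable refresh could run to capture the SCN before the load.
CAPTURE_SCN_SQL = "SELECT current_scn FROM v$database"


def flashback_statements(table_name, scn):
    """Build the statements an exception step could run to undo a failed load."""
    if not isinstance(scn, int) or scn <= 0:
        raise ValueError("scn must be a positive integer")
    return [
        # FLASHBACK TABLE requires row movement; normally enabled once, up front.
        f"ALTER TABLE {table_name} ENABLE ROW MOVEMENT",
        f"FLASHBACK TABLE {table_name} TO SCN {scn}",
    ]


# Hypothetical table and SCN, as if read back from the project variable.
stmts = flashback_statements("PROD_DIM", 1234567)
for statement in stmts:
    print(statement)
```

Because the SCN is captured before the UPDATE runs, flashing back to it undoes both the UPDATE and any partially completed INSERT, so the load plan can be restarted from a clean state.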
  • 40. Agent and ODI Infrastructure Failure
      • Enterprises typically deploy ODI using standalone agents, in a parent/child load-balancing configuration
      • Repository database has regular backups, or ideally uses DataGuard / log-shipping
      • Scheduled jobs assigned to the parent, master runtime agent
      • Jobs then delegated to the child agents, that then do the work based on load factor, availability
      • But what if the parent agent goes down? What about the schedule?
  • 41. Using OPMN To Manage, and Restart, Standalone Agents
      • OPMN (Oracle Process Manager and Notification Server) can be installed to manage standalone agents
          ‣ Not part of the base install or license, but you probably have it somewhere
      • Standalone agents are then run, stopped, restarted and monitored using OPMN server
      • Ensures that failed agents are restarted, including the parent agent for load balancing
  • 42. Deploying Agents within WebLogic Server - New with ODI 11g
      • Runtime agents can now be deployed in WebLogic Server managed servers (requires WebLogic Server license)
      • Benefit from WebLogic clustering, Enterprise Manager (+ODI Console), more resilient JVM
      • Better for high-availability - protects the scheduler
          ‣ how?
  • 43. How JEE Agents, WebLogic and Coherence Protect Against Agent Failure
      • Hardware load balancer provides the load-balancing
      • Agents are all equal - one elects to be the scheduler on cluster start, another takes over if that one crashes
      • Oracle Coherence cache grid holds details of the schedule, available to all nodes in the cluster
      • WebLogic Server clustering restarts failed managed servers, and Java processes (JEE runtime agents)
      • However ... more complex setup, extra license cost, and may not be necessary if external scheduler used instead
          ‣ Still benefits from running agents in “production” JVM though
          ‣ And you get Enterprise Manager, ODI Console etc
  • 44. Further Reading - “ODI11g in the Enterprise” series on Rittman Mead Blog
      • Five-part series on the Rittman Mead Blog: “ODI 11g in the Enterprise”
          ‣ “Part 1: Beyond Data Warehouse Table Loading”
          ‣ “Part 2 : Data Integration using Essbase, Messaging, and Big Data Sources and Targets”
          ‣ “Part 3: Data Quality and Data Profiling using Oracle EDQ”
          ‣ “Part 4: Build Automation and Devops using the ODI SDK, Groovy and ODI Tools”
          ‣ “Part 5: ETL Resilience and High-Availability”
      • odi11g-in-the-enterprise-part-1-beyond-data-warehouse-table-loading/
  • 45. Thank You for Attending!
      • Thank you for attending this presentation, and more information can be found at
      • Contact us at or
      • Look out for our book, “Oracle Business Intelligence Developers Guide” out now!
      • Follow us on Twitter (@rittmanmead) or Facebook (
  • 46. Deploying OBIEE 11g in the Enterprise
      Mark Rittman, Technical Director, Rittman Mead
      UKOUG Conference & Exhibition, Birmingham December 2012