Data Integration Solutions

837 views

Published on

Published in: Technology
0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
837
On SlideShare
0
From Embeds
0
Number of Embeds
4
Actions
Shares
0
Downloads
0
Comments
0
Likes
3
Embeds 0
No embeds

No notes for slide
  • Before I present how we help you achieve pervasive continuous access to trusted data with high performance and low TCO, I’d like to remind you on our data integration product portfolio. You will see our complete capabilities for enterprise data integration and how their architectures are designed for maximum performance and lower TCO. Oracle Data Integration products support the diverse set of environments in today’s IT, delivering bulk and real time data movement, transformation, data quality, and service integration. They are certified for all the leading technology offerings to enable very fast time to value.We have proven solutions where customers report 80% lower TCO, 5x higher application performance, and 70% reduction in development costs,(FYI 80% TCO comes from Sabre Holdings. They used GG to offload “airtravel reservations from mainframe (NonStop) to a set of Oracle databases)
  • Platform consists of:Big Data Appliance to source unstructured/semi structured dataExadata to combine the data customer has now structured, alongside traditional schema-based data, running in-DB analytics on itExalytics for in-memory extreme analyticsAll connected by InfiniBand
  • In the context of the different challenges of Big Data, we classify it into these 4 important categories. Acquire, Organize, Analyze, and finally Decide. For each solution area of the industry challenge, Oracle has a best-in-class solution. One that can be componentized to address a single aspect of the problem, or built together as a whole. In addition, these solutions leverages optimized hardware and software which is especially critical to address the need of those 3 Vs that we talked about.For the remainder of this webcast, we’re going to focus on Organize component which features ‘integration’ as it’s core element. Those products we’re going to be unveiling in a little more detail are Oracle Data Integrator and Big Data Connectors. But before that, let’s first hear from Helen Sun who will walk us through the Architecture Principles and Best Practices around Big Data.
  • ODI is at the heart of our integration strategy, and has the necessary capabilities for integrating both structured, and now semi-structured, unstructured data. A little bit about ODI in general first:Optimized E-LT Architecture, which allows for us to run transformations flexibly on the source or target, without the need for running transformation in an intermediary server. This improves performance and efficiency which is critical for supporting the volumes of loads in Big DataEasy to use interface for Declarative design for Lower TCO, this can be applied to generating mapReduce Code which we’ll dive into in a minute. Extensible Knowledge Modules for enriching functionality (i.e. big data transformation). These knowledge modules allow the ODI product to be extended, just like you can extend your mobile phone with new ‘apps’ ODI has the same pluggable logic to allow for customizing logic in loading, transformation, data services, and more.
  • Speaker: Graham Hainbach
  • Oracle Data Integrator (ODI) Application Adapter for Hadoop provides native Hadoop integration within ODI. Specific ODI Knowledge Modules optimized for Hive and Oracle Loader for Hadoop are included within ODI Application Adapter for Hadoop. The knowledge modules can be used to build Hadoop metadata within ODI, load data into Hadoop, transform data within Hadoop, and load data easily and directly into any Hadoop environmentHadoop implementations, whether it’s HDFS or NoSQL require complex Java MapReduce code to be written and executed on the Hadoop cluster. Using ODI and the ODI Application Adapter for Hadoop developers use a graphical user interface to create these programs. Utilizing the ODI Application Adapter for Hadoop, ODI generates optimized HiveQL which in turn generates native MapReduce programs that are executed on the Hadoop cluster.
  • The capability of Oracle Data Integrator for Hadoop is bundled with Big Data Connectors along with Oracle Loader for Hadoop and other tools for interfacing and accessing Hadoop environments, these can be heterogeneous environments as long as they support Hadoop standards.In this case, ODI loads the data directly into Oracle Database utilizing the Oracle Loader for Hadoop support within the Application Adapter for Hadoop.What makes the loading optimized: Oracle Loader for Hadoop sorts, partitions, and converts data into Oracle Database formats in Hadoop, then loads the converted data into the database. By preprocessing the data to be loaded as a Hadoop job on a Hadoop cluster, Oracle Loader for Hadoop dramatically reduces the CPU and IO utilization on the database commonly seen when ingesting data from Hadoop. An added benefit of presorting data is faster index creation on the data once in the database.
  • Takeaway – ODI is part of Oracle’s complete data integration strategy, but specifically for Big Data, ODI is the key element you want to start with. The unique advantages of the offering include:Simplifies creation of Hadoop and MapReduce code to boost productivityIntegrates big data heterogeneously via industry standards: Hadoop, MapReduce, Hive, NoSQL, HDFSUnifies integration tooling across unstructured/semi-structured and structured dataOptimizes loading of big data to Oracle Exadata using Oracle Loader for HadoopIntegrated and optimized for running on Oracle Big Data Appliance via Big Data Connectors
  • For 'automated processes' ODI calls an EDQ job.
  • Here is the picture that combines the two big pieces of the Analytics puzzle. You can use the same technology for both your structured, transactional data with robust real-time data integration capabilities and your Big Data which includes unstructured and semi-structured data. Oracle Data Integrator has capabilities to support all your enterprise data and enable analytics that deliver the complete picture.
  • I’d like to briefly introduce you Oracle Data Integration products and core capabilities before we discuss how Oracle enables companies transform their information management infrastructure using a strong set of DI prodcuts.There are 3 key products within Oracle’s DI portfolio:Oracle GoldenGate provides real-time, log-based change data capture, and delivery between heterogeneous systems. Oracle GoldenGate moves committed transactions with transaction integrity and minimal overhead on the infrastructure. Its wide variety of use cases includes real-time business intelligence, query offloading, zero-downtime upgrades/migrations, disaster recovery, and active-active databases for data distribution and high availabilityOracle Data Integrator offers high performance bulk data movement and transformation via its unique Extract Load Transform architecture. It has Fastest transformation in the industry on Exadata platform, because there is no other platform like Exadata on the market today. We will discuss that shortly and in more detail in our first breakout session. ODI is perfect for analytical systems but also as the standard for data integration across OLTP systems.Oracle Data Quality: While other vendors struggle to accommodate the different technology and use-case requirements of widely different domains such as customer and product data with a single DQ engine, Oracle has recognized that these different domains require a different approach. Oracle Product Data Quality uses deep semantic-based technology to extract and standardize key information from highly disparate product data, while Oracle Data Quality for DI handles the specific requirements of customer data.Overall Oracle Data Integration products provide data in any latency, from daily batches to subseconds. They have resilient architecture for mission-critical systems and provide fast-simple deployment for fast time to value.
  • Here’s something interesting I thought I would share with you. You can’t talk about big data without talking about the internet. These statistics around what can be accomplished in 60 seconds amaze me, and I’m sure you as well. Almost 700,000 status updates on Facebook? And half of those 168 million emails are completely junk mail. Look at the amount of data, volumes, But how can we make this data relevant to our business? What do we do with all this data? Well, we can analyze it for patterns? Understand customers sentiment? Build better advertising? Optimize our business operations. And companies are doing just that now. While not all ‘internet data’ = big data, I still thought it’s a good warmup for today’s discussion! So before another 60 seconds pass again, let’s continue!
  • Data Integration Solutions

    1. 1. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.1
    2. 2. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.2 Data Integration Solutions Magdalena Bartnik DIS Sales Manager CEE&MENA In association with
    3. 3. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration3 Oracle Data Integration Strategy
    4. 4. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration4 Change is Inevitable and Unstoppable Data Volumes and Retention Times Grow Continually DataVolume Data Lifetime Data volumes continue to increase exponentially Data lifetimes are getting longer Infrastructure costs go up Data processing tasks take longer Overall data complexity increases Storage requirements and costs increase “Active” vs. “Archive” data must be defined Regulatory requirements = immediate data access
    5. 5. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration5 Change is Inevitable and Unstoppable Batch Processing Windows Are Shrinking Insufficient Time Batch processing takes too long Global 24 x7 operations limit downtime Processing volumes exceed batch windows Even the best hardware will struggle eventually Amount of Data Hours Minutes Seconds MB GB TB 8x5 24x7
    6. 6. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration6 Data Integration Trends Data Integration Tools Today Need to Do More  Big Data is the new normal  Real-time data jumps to the forefront  Application data now lives in the Cloud and in a multitude of disparate environments  Conventional ETL tools are struggling to meet performance demands
    7. 7. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration7 New Requirements for Data Integration Integrated and High Productivity Tooling High Performance, High Availability Cloud Ready Any Data Latency, Real-time Any Data, Any Source
    8. 8. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration8 Modernization MDM SOABig Data Oracle Data Integration  Future Ready  Extreme Performance  Fast Time to Value Complete Offering for Enterprise Data Integration Oracle Data Integrator Oracle GoldenGate Oracle Enterprise Data Quality Active Data Guard Oracle Customers Report:  80% lower TCO  Five times higher performance  70% reduction in development costs SynchronizationCustom BI Big Data Database Apps Cloud
    9. 9. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration9 Oracle Data Integrator Bulk Data Processing and Data Transformation  High-performance, low cost of ownership E-LT architecture  Lightweight deployment  Flexible, easy to enrich functionality Oracle Data Integrator OLTP Applications Legacy Unstructured High Performance E-LT Declarative Design CEP Data Services Extensible Knowledge Modules
    10. 10. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration10 Oracle GoldenGate Real-time Data Integration  High-performance, low- impact real-time data integration  Timely data for improved business insight  Continuous availability for 24/7 operations Oracle GoldenGate OLTP Applications Legacy Unstructured Log-based Change Data Capture and Delivery CEP Integration, Real-time Events Heterogeneous Sources and Targets Reliability and Transaction Integrity Bidirectional Replication
    11. 11. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration11 Oracle Enterprise Data Quality Data Quality for Customer and Product Data Oracle Enterprise Data Quality OLTP Applications Legacy Unstructured  Improves data accuracy, usability and „fitness for purpose‟  Unified interface for ease- of-use, lower TCO  Depth of capability in multiple data domains reduces project risk Governance, Case Management Match, Merge, Enrich Parse, Standardize, Cleanse Profile, Explore, Audit
    12. 12. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration12 Big Data and Data Integration
    13. 13. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.13 Big Data Drives Big Benefits Any Data, Any Source TransactionsYour Data: Decisions based on your data Big Data: Decisions based on all data relevant to you Machine-Generated Data Social Data Documents Oracle SAP
    14. 14. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.14 Oracle Big Data Platform Exadata Exalytics ACQUIRE ORGANIZE DECIDEANALYZE Big Data Appliance
    15. 15. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration15 Oracle Integrated Solution Stack for Big Data ACQUIRE Oracle NoSQL Database HDFS Oracle Database ORGANIZE Hadoop (MapReduce) Oracle Big Data Connectors Oracle Data Integrator DECIDE Analytic Applications ANALYZE In-Database Analytics Data Warehouse
    16. 16. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration16 Oracle Data Integrator  Optimized E-LT Architecture  Declarative design for Lower TCO  Extensible Knowledge Modules for enriching functionality (i.e. big data transformation) Bulk Data Processing and Big Data Transformation Any Data Warehouse Any Planning System CEP, Data ServicesOracle Data Integrator
    17. 17. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration17 Pluggable Knowledge Modules Architecture Hadoop Siebel SAP eBusiness Suite Oracle Datapump Oracle DBLink DB2 Exp/Imp JMS Queues External Tables J.D Edwards Oracle SQL*Loader TPump/ Multiload HDFS Load Oracle Merge Siebel EIM Schema Oracle Web Services Hive Append Sample out-of-the-box Knowledge Modules Key Architecture Benefits  Ease of maintenance  Flexibility, tailored to existing best practices  Reduces cost of ownership Reverse Engineer Metadata Journalize (CDC) Load from Source to Staging Check Constraints Integrate, T ransform Data Service Knowledge Modules Pluggable Architecture for Improved Flexibility
    18. 18. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration18 ODI for Big Data  Supports Hadoop standards  Reverse Engineer Hadoop metadata  Check, Validate and Ensure Data Integrity with Hadoop  Load Data into HDFS/Hive  Generate Hive QL and execute in Hadoop  Leverage existing Hadoop transformations Heterogeneous Integration to Hadoop Environments Access Transform Loads Oracle Data Integrator  Supports Hadoop standards  Reverse Engineer Hadoop metadata  Check, Validate and Ensure Data Integrity with Hadoop  Load Data into HDFS/Hive  Generate HiveQL and execute in Hadoop  Leverage existing Hadoop transformations
    19. 19. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration19 Selectable Hive Transformations Metadata Focused Approach Big Data Transformation Services Oracle Data Integrator – Hive Control Knowledge Module Easy to use & Guided Hive Function Support
    20. 20. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration20 Oracle Big Data Application Adapters Reverse Engineering RKM Hive – Extract Hive metadata into ODI. These table and view definitions can then be used in ODI Interfaces (mappings) Loading into Hadoop IKM File to Hive (LOAD DATA) – Uses Hive to one or many Load Flat Files into Hadoop. Supports Hive Partitioning. Files can be local or inside HDFS Transformations within Hadoop IKM Hive Control Append – SQL like transformations can be made on the data. Utilizes the Hive function library. IKM Hive Transform – Performs transformations via a user defined shell script Check Constraints CKM Hive – Use this to define constraints when loading data into Hive. Loading into Oracle IKM File/Hive to Oracle (OLH) – Uses OLH to load Data from Hadoop into Oracle. This KM will either use the Delimeted File InputFormat, the Hive InputFormat or a user defined InputFormat(Java class) to stream the data. Different Output Modes can be chosen (JDBC, Direct Path, Datapump, Direct HDFS)
    21. 21. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.21 Oracle Loader for Hadoop Generates MapReduce Code Supports Several Load Options Supports Partitioned and Non-partitioned tables Online and Offline loads
    22. 22. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration22 ODI for Big Data to Oracle Optimized Integration to Oracle Exadata Oracle Database, Oracle Exadata Transforms Via MapReduce Loads Activates Oracle Loader for Hadoop Oracle Data Integrator Oracle Big Data Connectors Hadoop Cluster
    23. 23. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration23 Oracle Data Integrator for Big Data Putting Together the Unique Advantages Simplifies creation of Hadoop and MapReduce code to boost productivity Integrates big data heterogeneously via industry standards: Hadoop, MapReduce, Hive, NoSQL, HDFS Unifies integration tooling across unstructured/semi-structured and structured data Optimizes loading of big data to Oracle Exadata using Oracle Big Data Connectors Engineered for running on and integrating with Oracle Big Data Appliance via Big Data Connectors
    24. 24. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.24 Oracle Data Integration for Business Analytics
    25. 25. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration25 Next Generation Architecture LoadExtract Conventional ETL Architecture Extract Load Transform Data Loading through E-LT The key to improved performance and reduced costs • E-LT provides flexible architecture for optimized performance • Benefits: • Leverage Set-based transformations • Improved performance for loading, no network hop • Takes advantage of existing hardware 25 TransformTransform EL-T
    26. 26. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration26 1.8 TB /hr 1 Exadata X2-8 7.5 TB /hr • Run ODI Directly on Exadata • Complex Data Transformations • Linear ETL Scalability • Fully Leverages DBFS/Infiniband, Smart Storage, and Advanced Compression 4:1 advantage* * TPC-H data sets with transformations ** Production hardware savings (not including Dev + Test environments, management costs or software savings) ODI Other ETL 7tb/hr 1.8 tb/hr HP Superdome 64 + XP24000 w/ Flash $5m ETL H/W $0 ETL H/W $5m 3yr savings** E E T L TL ODI Outperforms “Other ETL” for Less 4x Performance on Exadata at a Fraction of the Ownership Cost
    27. 27. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.27 ODI and Exadata – Real World Example: A Disruptive Improvement in Performance  A complex branch of the customer‟s tax allocation process runs 5 hrs 11 mins during quarter close  Exadata and ODI (E-LT) combined is able to execute the process 42X faster (7mins 20 secs)0 50 100 150 200 250 300 350 ODI on Exadata Conventional ETL Jobexecutiontimeinminutes 5hrs 11mins 7min 20sec 42XIMPROVEMENT
    28. 28. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration28 Design & Drill Data Flow Bulk and Real-Time Data Processing Oracle Data Integrator Enterprise Edition Information Assets OBI EE Suite and ODI-EE Data Lineage and Data Quality provides trusted data Data Warehouse • Support OLAP/R-OLAP sources and targets • Report-to-source lineage • Integrated Data Quality • Common management: administration, monitoring, scheduling, auditing, exceptions • Works with Oracle GoldenGate for improved real-time data access 28 Oracle Business Intelligence EE Plus
    29. 29. 29 Copyright © 2014, Oracle and/or its affiliates. All rights reserved. EDQ With Oracle Data Integrator: Use Cases Sources Target(s) E.g. Data Warehouse such as Exadata Oracle Data Integrator Data Profiling Analyze and understand data to build ODI mappings Automated Processes De-duplication, complex cleansing and parsing invoked in ODI workflow Measure Ongoing Data Quality Assess quality of data in target system. How well is ETL working? Enterprise Data Quality
    30. 30. 30 Copyright © 2014, Oracle and/or its affiliates. All rights reserved. EDQ and ODI: Comprehensive Data Quality Process Sources Oracle Enterprise Data Quality Parsing Standardization Cleansing Matching Merging Targets Oracle Data Integrator E-LT/ETL Process - Continuous Quality Monitoring - Quality Alerts 4 Create new Data Quality Rules 2 - Add Data Quality to E-LT/ETL Flow 3 Profile Data 1
    31. 31. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration31 Connecting Big Data for Analytics Oracle Database, Oracle Exadata Transforms Via MapReduce Loads Activates Oracle Loader for Hadoop Oracle Data Integrator Oracle Big Data Connectors Hadoop Cluster Oracle GoldenGate Transaction/ Application data (Heterogeneous sources) Big Data Real-Time Transactions Transformations
    32. 32. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration32 Certified for 100s of Sources Certified for 100s of Targets • Any data latency • Highest performance ETL/ELT, CDC • Fast, simple development • Resilient architecture • Cleansing & matching for any data Oracle Data Integration Next Generation Data Integration for the Enterprise Oracle GoldenGate High Performance, Real Time Data Integration Oracle Data Integrator Simpler, Faster, Lower TCO Bulk Data Movement & Transformation Oracle Enterprise Data Quality Trusted Data for the Enterprise
    33. 33. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. #OracleDataIntegration33
    34. 34. Copyright © 2014, Oracle and/or its affiliates. All rights reserved.34

    ×