Oracle Database 12c for Big Data and Data Warehousing

Published in: Technology, Business

  • Heat Map is a new Oracle Database feature that collects usage information at the block and segment levels. Used together with Automatic Data Optimization (ADO), it lets Oracle Database 12c automate compression and storage policies based on how the data is actually used, reducing storage costs and improving performance. ADO enables organizations to create policies that implement compression and storage tiering automatically; each policy defines conditions and the corresponding actions to apply to specific objects. With the data collected by Heat Map, Oracle Database can compress each partition of a table independently, implementing compression tiering. Heat Map and ADO make it easy to use existing Oracle Database compression technologies, which can help reduce the cost of managing large amounts of data while also improving application and database performance. The Advanced Compression Option includes a comprehensive set of compression features designed to reduce costs and improve performance by enabling compression for structured data, unstructured data, database backups, and Data Guard redo network traffic.
  • Plan decision deferred until runtime. Final decision is based on statistics collected during execution. Alternate sub-plans are pre-computed and stored in the cursor. Statistics collectors are inserted at key points in the plan. Each sub-plan has a valid range for the statistics collected. If statistics prove to be out of range, sub-plans can be swapped. Requires buffering near the swap point to avoid returning rows to the user. Only join methods and distribution methods can change.
  • Let’s take a look at data integration for Oracle Exadata. Oracle’s DI offering is designed and engineered to work with Exadata. Only Oracle supplies best-of-breed data integration tools that install on Exadata, are certified and benchmarked with Exadata, and are engineered to take advantage of the powerful innovations that Exadata brings to bear. Oracle’s data integration solutions support feeding data from legacy systems to Exadata, provide non-invasive capture capabilities, and also support traditional use cases such as data warehousing and data distribution. Oracle Data Integrator offers the most cost-effective and high-performance data loading for Exadata. For the full data integration solution, Oracle Data Integrator and Oracle GoldenGate can be combined with Oracle Enterprise Data Quality and Oracle Active Data Guard to ensure that the Exadata solution is fully future-proof, optimized for the best performance, and simplified for the most cost-efficient operations. Solution: Oracle Data Integrator and Oracle GoldenGate to capture data and immediately move information where it is needed; Enterprise Data Quality to cleanse and standardize data; real-time and bulk data movement, data synchronization, data quality, and data services; Big Data transformation and real-time capture for Fast Data. Benefits: no resource or performance impact on source systems; live data with 100% change detection, available for better informed decision making; double duty from the database investment by using it for transformations; maximized availability of source systems and the DW due to smaller batch windows.
  • Faster performance in data warehousing can have very tangible benefits in verticals like Healthcare.
  • Recognizing patterns in a sequence of rows has long been a widely desired capability, but it was not possible with native SQL until now. There were many workarounds, but these were difficult to write, hard to understand, and inefficient to execute. Beginning in Oracle Database 12c, you can use the MATCH_RECOGNIZE clause to achieve this in native SQL that executes efficiently. Oracle worked with IBM on the standard. SQL Pattern Matching is in the final stages of being accepted into the ANSI standard; what remains is only a matter of formalism. The important point is that we wanted to provide not only an ANSI-compliant language construct but also the internally optimized processing needed to ensure performance and scalability. Check the execution plan for SQL Pattern Matching, and you will see a new row source ..
  • The Oracle Spatial and Graph option for Oracle Database 12c includes advanced features for spatial data and analysis; physical, network, and social graph applications; and a foundation to help location-enable business applications. It is the most widely used enterprise spatial database in the world, with thousands of customers including most of the world’s mapping agencies; state, provincial, local, county, and municipal governments worldwide; nearly every brand-name telco, energy company, and utility; and, increasingly, companies in insurance, banking, retail, and transportation. It has a very rich feature set, unequaled by other database providers, and manages all kinds of spatial content with associated analysis. Oracle Spatial and Graph, an option for Oracle Database 12c Enterprise Edition, includes advanced features for geospatial, location-based, and graph data management and analysis. Formerly known as the Oracle Spatial option, Oracle Spatial and Graph underlines its existing graph capabilities, which comprise the most robust, mature database graph technologies available in the industry. Oracle Spatial and Graph provides two graph data models: Network Data Model graph (NDM) and RDF Semantic Graph. NDM is a property graph model used to model and analyze physical and logical networks in industries such as transportation, logistics, and utilities. RDF Semantic Graph supports the World Wide Web Consortium (W3C) Resource Description Framework (RDF) standards. It provides RDF data management, querying, and inferencing that are commonly used in applications ranging from semantic data integration to social network analysis and linked open data. Oracle Spatial and Graph RDF support has become the industry’s leading open, scalable, and secure RDF database.
  • Oracle brings the power and value of location analysis to your business applications, with advanced spatial data management features to support geospatial applications in domains ranging from land management and utilities to life sciences. Only Oracle provides world-class performance, scalability, security, and manageability to your spatial data assets, while reducing costs, with support from every leading geospatial vendor. Oracle includes native support for a variety of data types and analytical functions. Example: the GeoRaster data type natively manages georeferenced raster imagery (e.g., satellite imagery, gridded data) in Oracle Database 11g. Oracle Spatial provides features for you to perform location analysis on your customer, employee, competitor, and supplier data, and view it with partner or Oracle mapping tools. With Oracle Spatial’s native geocoding engine, routing engine, and eLocation Quick Start APIs, application developers can quickly and easily deploy mapping, geocoding, and routing services right "out of the box", from data stored in Oracle Spatial.
  • Oracle Multitenant, with its pluggable databases, is a new architecture for consolidating databases on cloud architectures. Taking advantage of the flexible resource sharing and cost savings that Cloud computing offers can be a challenge for many IT organizations. Designed for the Cloud, Oracle Multitenant simplifies a key step on the journey to the Cloud: database consolidation. In this new architecture, a multitenant container database can hold many pluggable databases. An administrator deals with the multitenant container database, but application code connects to one pluggable database, just as it does with previous releases of Oracle Database. Customers can now easily consolidate multiple databases onto private Clouds without changing their applications, and still control the prioritization of resources between consolidated databases. Oracle Multitenant is also suited to SaaS vendors looking for the power of Oracle Database in a secure and isolated multitenant model. Oracle Multitenant helps customers reduce IT costs by simplifying consolidation, provisioning, upgrades, and more. It fully complements other options, including Oracle Real Application Clusters and Oracle Active Data Guard. An existing database can simply be adopted, with no change, as a pluggable database, and no changes are needed in the other tiers of the application. The benefits of Oracle Multitenant therefore come from a pure deployment choice.
  • Multitenant is particularly useful in analytical environments where different data science and analytics teams can make rapid clones of production data for their own use. A cloned sandbox can have destructive changes made to it and then simply be thrown away when the analysis is complete.
  • Before: Production data had to be subsetted first and sensitive data then masked separately. Now: Production data is subsetted and sensitive data masked in one step using On-the-Fly Masking. How: As subsetted data is read from Production, Data Masking masks the sensitive data before it gets written to the Data Pump file. E-Business Suite Masking Template: metadata-driven data masking XML; columns, relationships, and masking rules for PII and sensitive attributes for E-Business Suite products; instructions for wiping credentials after cloning (Support Note 419475.1); 950 columns / 1900 rules; 65% HCM (Payroll, Employment Details, Personal Info); also TCA, ATG, Financials, Projects…
  • Points to communicate: Emphasize the security and compliance benefit. Redacts specific columns in the database. Works for tables, views, and materialized views. Use cases: existing applications (screens, reports, dashboards, panels …); decision support systems (data warehouse and BI); exported spreadsheets. Target data: sensitive or regulated data; data with structure, stored in columns; data that needs to be redacted in almost all cases.
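The Data Redaction notes above describe policy-enforced redaction of specific columns. A minimal sketch of such a policy, using the `DBMS_REDACT.ADD_POLICY` procedure in Oracle Database 12c, might look like the following. The schema `HR`, the table `CUSTOMERS`, the column `SSN`, the policy name, and the `DATA_ANALYST` role are hypothetical names chosen for illustration; the column is assumed to be a VARCHAR2 holding values in the form 115-69-3428.

```sql
-- Hypothetical objects: HR.CUSTOMERS with a VARCHAR2 column SSN,
-- and a database role DATA_ANALYST. None of these come from the slides.
BEGIN
  DBMS_REDACT.ADD_POLICY(
    object_schema       => 'HR',
    object_name         => 'CUSTOMERS',
    policy_name         => 'redact_customer_ssn',
    column_name         => 'SSN',
    -- Partial redaction: hide part of the value, keep the rest readable.
    function_type       => DBMS_REDACT.PARTIAL,
    -- Predefined shortcut that masks the first five digits of a US SSN.
    function_parameters => DBMS_REDACT.REDACT_US_SSN_F5,
    -- Redact only for sessions that have the DATA_ANALYST role enabled.
    expression          => 'SYS_CONTEXT(''SYS_SESSION_ROLES'', ''DATA_ANALYST'') = ''TRUE'''
  );
END;
/
```

Under these assumptions, a session with the role enabled should see only the last four digits of each value, while other sessions (for example ETL or data quality processes) continue to read the stored data unchanged. The stored data itself is not modified, which is what distinguishes redaction from the at-source masking described in the previous note.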
  • Oracle Database 12c for Big Data and Data Warehousing

    1. Abstract: Information architectures are undergoing tremendous transformations, with new data sources becoming more readily available than ever. Using Oracle Database 12c, big data and data warehousing teams can deliver even better analytics to end users, while also dramatically improving performance, availability, security, and storage efficiency. New features for improved query optimization, simplified delivery of data labs, and in-database SQL analytics for big data at scale make Oracle Database 12c an even better platform for big data and data warehousing applications. Join this session to learn about all of the new big data and data warehousing capabilities of Oracle Database 12c. Copyright © 2013, Oracle and/or its affiliates. All rights reserved.
    3. Oracle Database 12c for Big Data and Data Warehousing. Krzysztof Marciniak. In association with
    4. Agenda: Delivering Higher Performance and Scalability; Faster Analysis With Embedded In-Database Analytics; Embedded In-Database Security; Summary
    5. Industry Trends in Data Warehousing: Data has more volume, velocity, and variety, requiring higher performance and scale to process. Insights and patterns are needed more quickly, requiring in-database approaches to analytics. Information is more critical to the business, requiring in-database security and access control.
    6. DELIVERING HIGHER PERFORMANCE AND SCALABILITY
    7. Oracle Advanced Compression: Distribute Partitions Across Multiple Compression Tiers. Benefits: free up storage space and execute queries faster; no changes to existing applications. Tiers: Active Data, 3x OLTP Compression; Read Only Data, 10-15x DW Compression; Archive Data, 15-50x Archive Compression.
    8. Storage Optimization and ILM. New in 12c: Heat Map & Automatic Data Optimization features offer automatic, policy-based compression and tiering; significant savings in CAPEX and OPEX; performance is ensured by keeping “hot” data in the highest performing storage tier. Diagram: partitions of an ORDERS table labeled Active, Frequent Access, Occasional Access, and Dormant (Oracle Advanced Compression; Oracle Database 12c Automatic Data Optimization & Heat Map).
    9. Optimizer evolution: In the beginning there were rules. Rules are not enough: databases became more feature rich, and the optimizer evolved to be cost based (CBO). Reactive tuning with the use of advisors and auto jobs: as the environment changes, there is potential for plan changes. Reactive tuning is not enough: databases become more real-time, ad hoc environments, and the optimizer proactively adapts to become self-learning.
    10. Adaptive Execution Plans: Good SQL execution without intervention. Query performance acceleration: plan decision deferred until runtime; final decision is based on statistics collected during execution; if statistics prove to be out of range, sub-plans can be swapped; bad effects of skew eliminated and queries significantly accelerated. Diagram: a nested loops (NL) join of a table scan of T1 with an index scan of T2; when the threshold is exceeded, the plan switches to a hash join (HJ) of table scans of T1 and T2.
    11. High Performance Data Integration: Top Performance, Integrated & “Red Optimized”. Non-invasive real-time data capture from heterogeneous sources; no mid-tier, set-based transformations use the database engine; mini-batch loading throughout the day for a minimized batch window; live data for improved decision making; most cost-effective and high-performance Oracle Exadata data loading. Diagram labels: batch feeds; non-invasive real-time transaction feeds; Oracle GoldenGate; Oracle Data Integrator; source tables DEPT and EMP; target DIM and FACT tables.
    12. Faster Performance Enables Powerful Insights: “Personalized medicine is really understanding the individual treatment for every single patient…to develop our analytics platform and to really digitize and bring all that information into an interactive experience for our 50,000 employees, we studied the market… With Oracle, their high performance computing capabilities…enabled us to move the needle faster, better, sooner.” Lisa Khorey, Vice President, Enterprise Systems and Data Management, University of Pittsburgh Medical Center, USA
    13. FASTER ANALYSIS WITH EMBEDDED IN-DATABASE ANALYTICS
    14. Pattern Matching in Sequences of Rows: Historically requires complex SQL or external code to execute. Example: find a W-shape pattern in a ticker stream: output the beginning and ending date of the pattern; calculate the average price in each W-shape; find only patterns that lasted less than a week. Example event stream (EVENT, TIME, LOCATION): (A, 1, SFO), (A, 1, SFO), (A, 2, ATL), (A, 2, LAX), (B, 2, SFO), (C, 2, LAX), (C, 3, LAS), (A, 3, SFO), (B, 3, NYC), (C, 4, NYC). Query: “Find one or more event A followed by one B followed by one or more C in a 1 minute interval”. The slide highlights the rows (A, 2, ATL), (A, 2, LAX), (B, 2, SFO), (C, 2, LAX) as the match and carries a “>1min” annotation.
    15. SQL Pattern Matching For Fast Analysis. Example: double bottom pattern (W-shape) in stock ticker stream. Provides a native SQL language construct; aligns with well-known regular expression declaration (Perl); applies expressions across rows; dramatically simplifies and accelerates pattern matching analysis. SELECT * FROM Ticker MATCH_RECOGNIZE ( PARTITION BY symbol ORDER BY tstamp MEASURES STRT.tstamp AS start_tstamp, LAST(DOWN.tstamp) AS bottom_tstamp, LAST(UP.tstamp) AS end_tstamp ONE ROW PER MATCH AFTER MATCH SKIP TO LAST UP PATTERN (STRT DOWN+ UP+) DEFINE DOWN AS DOWN.price < PREV(DOWN.price), UP AS UP.price > PREV(UP.price) ) MR ORDER BY MR.symbol, MR.start_tstamp;
    16. Oracle Advanced Analytics: Fastest Way to Deliver Scalable Enterprise-wide Predictive Analytics. Embedded analytics engine: combination of data mining and open source R algorithms for scalable, in-database execution; accessible via SQL, PL/SQL, R, and database APIs; range of GUI and IDE options for any analytics end users; fastest and simplest platform for delivering analytics capabilities to end users. Techniques shown: clustering models, classification models, regression models, association models, market basket analysis, feature selection and reduction, anomaly detection.
    17. Oracle Spatial and Graph: High performance, simplified geospatial analysis all through SQL. Native geometry data types; self-balancing R-tree indexing; full query and analysis (select, join, buffer, within distance, nearest neighbor, intersection, union, convex hull, centroid, ...). Diagram: a polygon with an outer ring (Element 0, points P1 to P8) and a hole (Element 1, points H1 to H4). Example ROADS table (RNAME, ID, TYPE, LANES, GEOMETRY): (M40, 140, HWY, 6), (M25, 141, HWY, 4). SELECT a.owner_name, a.acquisition_status FROM properties a, projects b WHERE sdo_within_distance (a.property_geom, b.project_geom, 'distance = .1 unit = mile') = 'TRUE' AND b.project_id = 189498;
    18. Oracle Spatial and Graph: New Data Types and Analysis. Spatial data types and models: 2D and 3D geometries; raster imagery and gridded data; land management persistent topology; geographic and whole earth (geodetic); Network Data Model graph. Spatial analysis: spatial search (containment and proximity); geocoding (address conversion); routing (turn-by-turn directions); indexing of 2D and 3D; full coordinate system support.
    19. Oracle OLAP: Built-in Access to Analytic Calculations. Multidimensional analytic engine that analyzes summary data; offers improved query performance and fast, incremental updates; embedded in Oracle Database instance and storage. Example analytical questions: How do sales in the Western region this quarter compare with sales a year ago? What will sales next quarter be? What factors can we alter to improve the sales forecast?
    20. New Multitenant Architecture: Create a pluggable database in a multitenant container database. Memory and processes are allocated to the container database and shared. Benefits: improve quality of service; provide isolation and multitenancy; keep applications unchanged; greater resource utilization; provision and move databases rapidly; manage and back up many databases as one. Diagram: three DW pluggable databases in one container database, with shared background processes and memory.
    21. Sandboxes for Data Scientists: Pluggable Databases provide virtual, resource managed sandboxes. Diagram labels: EDW production data warehouse; EDW virtual sandboxes (analytical sandboxes); Top Priority; Low Priority.
    22. In-database analytics accelerates the business: “So the in-database analytics takes advantage of a couple of things: not moving the data, and using all the horsepower in the engineered system…we have reduced our key deliverable runtimes from, say, 49 or 50 hours down to an hour or so…we've had other cases where we've been able to do analysis that we couldn't do in the past -- so it's allowing us to bid on new business, and that obviously is important to our growth.” Chris Wones, Director, Data Solutions, Dunnhumby USA
    23. EMBEDDED IN-DATABASE SECURITY
    24. Data Masking: Securely Provisioning Data Warehouses. Mask at-source; minimize sensitive data exposure. Diagram (At-Source Masking, 12.1): Prod, a subsetted and masked Data Pump file, Test.
    25. Data Redaction: Dynamic Masking for Data Warehouses. Policy enforced redaction of sensitive data. Example fields: Soc. Sec. # 115-69-3428; DOB 11/06/71; NAME SARA JONES. Consumers shown: Data Analyst; ETL / Data Quality Processes.
    26. Oracle Advanced Security: Oracle Database 12c Data Redaction and Data Masking. Opportunities: call centers; decision support systems; systems with PII, PHI, PCI data. New features (Data Redaction and Data Masking): mask data dynamically; discover sensitive data; securely provision test systems. Dynamic Masking for Data Warehouses: policy enforced redaction of sensitive data. Example fields: Soc. Sec. # 115-69-3428; DOB 11/06/71; PIN 5623. Consumers shown: Call Center Operator; Payroll Processing; Data Analyst.
    27. In-database security provides defense in-depth: “We looked at trying to write application changes to try to take care of [security]; we found that to be very costly. We went with Oracle…because we wanted to make sure that not only did we get the performance and reliability part, but also a fast and easy implementation.” “[Oracle] gives us the way to control access, audit that access, and it really gives us that comprehensive solution that protects our data but still gives users the ability to access that data but not see the sensitive data.” Kyle Nelson, Director, Information Technology, National Marrow Donor Program
    28. Oracle is the Industry Leader in Data Warehousing
    29. Additional Resources. Oracle Product Information: http://www.oracle.com/us/products/database/datawarehousing/ Oracle Technology Network: http://www.oracle.com/technetwork/database/bi-datawarehousing/index.html Blogs: https://blogs.oracle.com/bigdataconnectors/ and https://blogs.oracle.com/datawarehousing/ Social: https://twitter.com/OracleDatabase, https://twitter.com/OracleBigData, http://www.linkedin.com/groups?gid=2129659, https://www.facebook.com/OracleBigData
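Slide 14 asks for a W-shape (double bottom) with its start and end dates, its average price, and a duration under a week, while the query on slide 15 uses `PATTERN (STRT DOWN+ UP+)`, which matches a single dip and recovery (a V-shape). The following is a sketch of how the full W-shape request could be written, assuming the same `Ticker` table with columns `symbol`, `tstamp`, and `price`, and assuming `tstamp` is a DATE so that subtracting two values yields a number of days. The measure names `avg_price`, `start_tstamp`, and `end_tstamp` are illustrative choices.

```sql
-- Sketch only: assumes Ticker(symbol, tstamp DATE, price NUMBER).
SELECT *
FROM Ticker MATCH_RECOGNIZE (
  PARTITION BY symbol
  ORDER BY tstamp
  MEASURES
    STRT.tstamp     AS start_tstamp,  -- first row of the pattern
    LAST(UP.tstamp) AS end_tstamp,    -- last row of the final rise
    AVG(price)      AS avg_price      -- average over all rows in the match
  ONE ROW PER MATCH
  AFTER MATCH SKIP TO LAST UP
  -- Two consecutive fall-and-rise legs form the W.
  PATTERN (STRT DOWN+ UP+ DOWN+ UP+)
  DEFINE
    DOWN AS DOWN.price < PREV(DOWN.price),
    UP   AS UP.price   > PREV(UP.price)
) MR
-- Keep only patterns that lasted less than a week.
WHERE MR.end_tstamp - MR.start_tstamp < 7
ORDER BY MR.symbol, MR.start_tstamp;
```

If `tstamp` were a TIMESTAMP instead, the difference would be an interval rather than a number, and the `WHERE` condition would need to compare against an interval value instead of 7.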

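The Heat Map and Automatic Data Optimization material (slide 8 and the first speaker note) describes policy-based compression and tiering without showing a policy. A minimal sketch of what such a setup might look like follows. The table `orders` echoes the ORDERS label on the slide, but the partition name `orders_2012`, the tablespace name `low_cost_ts`, and the 30-day threshold are assumptions made for illustration only.

```sql
-- Sketch only: partition name, tablespace name, and thresholds are hypothetical.

-- Turn on Heat Map tracking so the database records segment and block usage.
ALTER SYSTEM SET HEAT_MAP = ON;

-- Compression policy: compress a segment of ORDERS once it has gone
-- 30 days without modification.
ALTER TABLE orders ILM ADD POLICY
  ROW STORE COMPRESS ADVANCED SEGMENT
  AFTER 30 DAYS OF NO MODIFICATION;

-- Storage tiering policy: allow an older partition to be moved to a
-- lower-cost tablespace.
ALTER TABLE orders MODIFY PARTITION orders_2012 ILM ADD POLICY
  TIER TO low_cost_ts;
```

Policies like these are evaluated by the database in the background, which is what allows the tiering to happen without changes to the applications that query the table.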