Summer Tech Day 2014 -Michał Grochowski

424 views
367 views

Published on

Architektura ekosystemu Big
Data w Oracle

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
424
On SlideShare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
0
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Summer Tech Day 2014 -Michał Grochowski

  1. 1. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Architektura ekosystemu Big Data w Oracle Omówienie & Przykład Michał Grochowski Principal Sales Consultant Oracle Oracle Confidential – Internal/Restricted/Highly Restricted
  2. 2. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Agenda Architektura Big Data & Komponenty Przykładowy scenariusz Podsumowanie 1 2 3 Oracle Confidential – Internal/Restricted/Highly Restricted 3
  3. 3. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Architektura Big Data & Komponenty Oracle Confidential – Internal/Restricted/Highly Restricted 4
  4. 4. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Information Management Architecture Security and Metadata Source Data Enterprise Data Warehouse Information Access Data Integration
  5. 5. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Traditional Data Architecture Security and Metadata Source Data Layer External COTS/ERP Processes Enterprise Data Warehouse Data Integration Staging Data Layer Performance Layer Embedded Data Marts Data Quality Strongly Typed Data Foundation Layer Enterprise Data with full history Weakly Typed Data
  6. 6. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Add New Data Sources Security and Metadata Source Data Layer External COTS/ERP Processes Streaming Sensors Social/Text Enterprise Data Warehouse Data Integration Staging Data Layer Performance Layer Embedded Data Marts Data Quality Strongly Typed Data Weakly Typed Data Foundation Layer Enterprise Data with full history Architecture Insights Goals: Transform low density data to extract value Balance schema design with time to value Prevent Data Loss Provide Historic Data Retention Value: Enrich data, enable correlation How: Maximum parallelism, Map-Reduce, Targeted languages, Statistics Key Decisions: Software Tools, Hardware, Storage, Network Repeatable Process - Productivity Standards and Support
  7. 7. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Add Knowledge Discovery Security and Metadata Source Data Layer External COTS/ERP Processes Streaming Sensors Social/Text Enterprise Data Warehouse Data Integration Staging Data Layer Performance Layer Knowledge Discovery Layer Embedded Data Marts Data Quality Strongly Typed Data Weakly Typed Data Foundation Layer Enterprise Data with full history Rapid Dev SandboxData Mining Sandbox Architecture Insights • Goals: Enable business centric, rapid exploration Enable deep analytic capabilities • Value: Assess value  find patterns • How: Sampling, correlation, statistics, modeling • Key Decisions:  Minimize data movement  Support individualized data discovery  requires enterprise scale, security
  8. 8. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Knowledge Pools and Correlation Security and Metadata Source Data Layer External COTS/ERP Processes Streaming Sensors Social/Text Enterprise Data Warehouse Data Integration Staging Data Layer Performance Layer Knowledge Discovery Layer Data Mining Sandbox Rapid Dev Sandbox Embedded Data Marts Data Quality Strongly Typed Data Weakly Typed Data Foundation Layer Enterprise Data with full history Architecture Insights • Goal: Analytical Performance • Value: Rapid modeling iterations • How: Move analytics to the data • Key Decisions:  Minimize data movement  Use optimizations: hardware & SQL  Consider “R” as portable statistical platform
  9. 9. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Add Information Access Security and Metadata Source Data Layer Enterprise Data Warehouse Data Integration Staging Data Layer Performance Layer Knowledge Discovery Layer Information Access BIAbstraction&QueryFederation Alerts, Dashboards, Reporting Advanced Analysis & Data Science Services Performance Management Information Discovery Foundation Layer Architecture Insights • Goal: Secure, timely data sharing (No silos!) • Value: Widespread access • How: Standardized services, federation • Key Decisions:  Leverage existing skills, standards, and tools  Build services interfaces for easy sharing and operational integration  Plan architecture from access to ingest
  10. 10. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Translated into Oracle Product Architecture Security and Metadata Source Data Layer External COTS/ERP Processes Streaming Sensors Social/Text Enterprise Data Warehouse Data Integration Staging Data Layer Performance Layer Knowledge Discovery Layer Embedded Data Marts Data Quality Strongly Typed Data Weakly Typed Data Information Access BIAbstraction&QueryFederation Alerts, Dashboards, Reporting Advanced Analysis & Data Science Services Performance Management Information Discovery Foundation Layer Enterprise Data with full history Rapid Dev SandboxData Mining Sandbox Oracle Big Data Appliance Oracle Exalytics Oracle Exadata Oracle Big Data Connectors
  11. 11. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Acquire Organize & Discover Analyze Visualize & Decide Oracle’s Big Data Platform Stream
  12. 12. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Przykładowy scenariusz Oracle Confidential – Internal/Restricted/Highly Restricted 14
  13. 13. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | An Example Multi-Channel Retail Shopping Conversion • Increase sales through all channels • Reduce abandoned on-line carts • Reduce out-of-stock promotional items • Improve satisfaction & word-of-mouth Business Goals • Linking structured and unstructured • Capturing where items are placed • Customer sentiment on social media Challenges
  14. 14. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Goal: Make an Informed Recommendation Sensors Real-Time Recommendations Sentiment & Influence Store Placement Customer Analytics Product Analytics Customer History Inventory Position Structured Sources Unstructured Sources Analytics Cart Search Promotions Customer Behavior Product Impact +
  15. 15. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Using all Data Sources to Understand Customer Website Logs & Data NoSQL DB Sensors Data Warehouse Shopping Cart Site
  16. 16. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Discovering Valuable Data Knowledge Discovery Engine High Volume Distributed File System Website Logs & Data NoSQL DB Sensors Data Warehouse Unstructured Structured Semi-structured
  17. 17. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Valuable Data Found – Now Store it Securely Knowledge Discovery Engine High Volume Distributed File System Website Logs & Data NoSQL DB Sensors Data Warehouse Persistent Data Store for All Data of Value MapReduce code separates valued data, then sent to via specialized adapters to Data Warehouse Discoveries
  18. 18. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Deploy Widely Available Reports & Analytics Knowledge Discovery Engine High Volume Distributed File System Website Logs & Data NoSQL DB Sensors Data Warehouse BI Tools and Dashboards Persistent Data Store for All Data of Value + In-DB Analytics MapReduce code separates valued data, then sent to via specialized adapters to Data Warehouse Enterprise-class for reporting & analysis
  19. 19. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Feed the Recommendation Engine Knowledge Discovery Engine High Volume Distributed File System Website Logs & Data NoSQL DB Sensors Data Warehouse BI Tools and Dashboards Real-Time Analytics and Recommendations Persistent Data Store for All Data of Value + In-DB Analytics MapReduce code separates valued data, then sent to via specialized adapters to Data Warehouse Update Website Recommendations
  20. 20. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Make Well-Tuned Real-Time Recommendations Knowledge Discovery Engine High Volume Distributed File System Website Logs & Data NoSQL DB Sensors Data Warehouse BI Tools and Dashboards Real-Time Analytics and Recommendations Persistent Data Store for All Data of Value + In-DB Analytics MapReduce code separates valued data, then sent to via specialized adapters to Data Warehouse RecommendLocation & User Profile
  21. 21. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Only Oracle Offers this Entire Solution Endeca Information Discovery on Exalytics Cloudera HDFS on Big Data Appliance Reliable, Available, Secure Source of Truth Fast, Intuitive Data Discovery Website Logs & Data Oracle NoSQL DB Real-time Recommendations Analyst Friendly Reporting Query and Analysis Tools Unstructured Data Analysis Advanced Analytics Sensors Oracle Database DW on Exadata Oracle BI Foundation Suite on Exalytics Oracle ERP & CRM Solutions on Exadata Oracle Real-Time Decisions Structured Data Analysis Unstructured Data Analysis
  22. 22. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Exadata Oracle Big Data Appliance MoviePlex Architecture Application Log Log of all activity on site Capture activity nec. for MoviePlex site Streamed into HDFS using Flume Load Recommendations Customer Profile (e.g. recommended movies) Oracle NoSQL DB HDFS Map Reduce ORCH - CF Recs. Map Reduce Hive - Activities Map Reduce Pig - Sessionize Clustering/Market Basket Oracle Advanced Analytics Oracle Exalytics Endeca Information Discovery Oracle Business Intelligence EE “Mood” Recommendat ions Load Session & Activity Data Oracle Big Data Connectors
  23. 23. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Review HDFS Data from Oracle Database • Goal – Need to leverage analytic capabilities, query performance and multi-user support of Oracle Database – Access data in HDFS from Oracle DB – Combine data in HDFS w/data in Oracle tables • Challenge – How do you effectively combine data across these platforms • Products Featured – Oracle Connector for HDFS • Value – Combine latest data in HDFS w/data in database – Move data into DB from HDFS at 12 TB/hour In SQL Developer…
  24. 24. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Combine HDFS Data with Data in Oracle Database • Goal – Need to leverage analytic capabilities, query performance and multi-user support of Oracle Database – Access data in HDFS from Oracle DB – Combine data in HDFS w/data in Oracle tables • Challenge – How do you effectively combine data across these platforms • Products Featured – Oracle Connector for HDFS • Value – Combine latest data in HDFS w/data in database – Move data into DB from HDFS at 12 TB/hour
  25. 25. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Identify Opportunities w/Advanced Analytics • Goal – Maximize uplift from marketing campaign – Use “market basket” analysis to identify the right movies to offer • Challenge – How do you implement advanced analytics - leveraging existing skill sets? – How do you perform analytics at scale? • Products Featured – Oracle Advanced Analytics • Value – Utilize industry standard analytic language R – Leverage scalability of Oracle Database * Note: arules graphics are not available in VM due to licensing. You can install separately but not distribute.
  26. 26. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Identify Opportunities w/Advanced Analytics lhs rhs support confidence lift 1 {Candyman} => {The Time Machine} 0.2256607 0.8560411 2.842981 Support: 22% of people bought both Candyman & The Time Machine Confidence: 85% of people who bought Candyman bought Time Machine Lift: it is 2.8x more impactful than offering The Time Machine with a random movie
  27. 27. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Identify Opportunities w/Advanced Analytics plot(sort(assoc,by="support" )[1:100], method="grouped",control=lis t(k=30))
  28. 28. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Monitor the Business w/BI Dashboards
  29. 29. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Monitor the Business w/BI Dashboards Best Sellers Are people buying recommended movies? What is our close rate? Browse vs Buy Better understand our customer demographics Comedy’s really sell - will look at this later - regardless of recommendations
  30. 30. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Identify Customers & Movies for a Campaign
  31. 31. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Podsumowanie Oracle Confidential – Internal/Restricted/Highly Restricted 33
  32. 32. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Challenge: Exploiting Synergies Big Data Adds to Your Information Architecture ANALYZE DECIDE ACQUIRE ORGANIZE
  33. 33. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Big Data Architecture Capabilities Transactions Management Security,Governance Advanced Analytics Interactive Discovery DBMS (OLTP) Master & Reference Structured Warehouse Text Analytics and Search Reporting & Dashboards Real-Time Machine Generated Social Media Text, Image Video, Audio NoSQL UnstructuredSemi- structured Alerting & Recommendations In-Database Analytics EPM, BI, Social Applications Message-Based ETL/ELT ChangeDC ODS Streaming (EP Engine) Acquire Organize Analyze Decide Hadoop (MapReduce) Specialized Hardware HDFS Data In Memory Analytics RDBMS Cluster Big Data Cluster High Speed Network Files
  34. 34. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Exadata Oracle Exalytics Oracle Big Data Platform Stream Acquire Organize Analyze and Visualize Oracle Big Data Appliance Oracle Big Data Connectors Optimized for Analytics & In-Memory Workloads “System of Record” Optimized for DW/OLTP Optimized for Hadoop, R, and NoSQL Processing Oracle Enterprise Performance Management Oracle Business Intelligence Applications Oracle Business Intelligence Tools Oracle Endeca Information Discovery Cloudera & Hadoop Oracle NoSQL Open Source R Applications Oracle Big Data Connectors Oracle Data Integrator In-DatabaseAnalytics Data Warehouse Oracle Advanced Analytics Oracle Database
  35. 35. Copyright © 2014, Oracle and/or its affiliates. All rights reserved. | Przydatne linki architektura referencyjna Big Data architektura referencyjna Big Data (dodatkowe informacje) informacje o Endeca Information Discovery Maszyna wirtualna z Endeca Information Discovery Maszyna wirtualna z Oracle Business Intelligence Maszyna wirtualna z Oracle Big Data portal Oracle Business Intelligence

×