Big Data Appliance: Innovations in Data Centers - Maciej Tomkiewicz, Oracle

Social Media and Big Data through the lens of Oracle Applications, Oracle Fusion Middleware, and Oracle Engineered Systems, Warsaw, 30 March 2012.


Published in: Technology

  • 1. Big Data Appliance - Innovations in Data Centers, Maciej
  • 2. The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
  • 3. Agenda
      • The Business of Big Data
      • Oracle HW Strategy
      • Inside the Big Data Appliance
          • Overview
          • Applications
      • Summary
      • Q&A
  • 4. The Business of Big Data
  • 5. Big Data: Acting on New Data
      • [Diagram: data sources feeding Retail Decisions, arranged from looking back ("PAST") to looking ahead ("FUTURE"): Stores, Catalog/Call Center, Web, Search, Social Networks; annotated "I found it", "I think", "I want"]
      • 60%: potential increase in retailers’ operating margins possible with Big Data. Source: McKinsey Global Institute, "Big Data: The next frontier for innovation, competition and productivity" (May 2011)
  • 6. Tapping into Diverse Data Sets
      • Information Architectures Today: decisions based on database data (Transactions)
      • Big Data: decisions based on all your data (Video and Images, Documents, Social Data, Machine-Generated Data)
  • 7. Case: On-line Ads and Content
      • Low latency (real-time): look up the user profile in NoSQL DB; add the user if not present; an Expert System determines the best ad to place on the page for this user
      • High scale (batch): actual ads served and web logs land in HDFS; batch data reductions feed BI and Analytics and Billing
      • Feedback: predictions on browsing are input into the Expert System; profiles are stored in NoSQL DB
  • 8. Case: On-line Ads and Content
      • NoSQL DB
          • Dynamic and rapidly changing schema
          • Scalable single record lookup
      • HDFS
          • Low cost, high scale storage
          • Write once, read many times
      • Hadoop
          • High scale batch processing
          • Highly customizable infrastructure
      • RDBMS
          • Deep analytics and BI value add
          • Reporting for large user community
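The on-line ads case can be sketched as a small program. This is only an illustration, assuming plain in-memory Python structures stand in for NoSQL DB and the HDFS web logs; the names `serve_ad`, `choose_ad`, and `batch_update_profiles`, and the interest-overlap scoring, are hypothetical and not part of any Oracle product.

```python
import time

profiles = {}    # stand-in for the NoSQL DB profile store (assumption)
event_log = []   # stand-in for web logs that would later land in HDFS (assumption)


def choose_ad(profile, ads):
    """Hypothetical 'expert system': pick the ad sharing the most topics with the user."""
    interests = set(profile["interests"])
    return max(ads, key=lambda ad: len(set(ad["topics"]) & interests))


def serve_ad(user_id, ads, now=None):
    """Real-time path: look up the profile, add the user if absent, choose and log an ad."""
    profile = profiles.get(user_id)
    if profile is None:
        profile = {"interests": []}
        profiles[user_id] = profile
    ad = choose_ad(profile, ads)
    timestamp = now if now is not None else time.time()
    event_log.append({"user": user_id, "ad": ad["id"], "ts": timestamp})
    return ad["id"]


def batch_update_profiles(browsing_log):
    """Batch path: reduce browsing events into per-user interests."""
    for event in browsing_log:
        profile = profiles.setdefault(event["user"], {"interests": []})
        if event["topic"] not in profile["interests"]:
            profile["interests"].append(event["topic"])


ads = [
    {"id": "a1", "topics": ["cars"]},
    {"id": "a2", "topics": ["travel"]},
]
first = serve_ad("u1", ads)    # unknown user: profile is created on the fly
batch_update_profiles([{"user": "u1", "topic": "travel"}])
second = serve_ad("u1", ads)   # profile now reflects the batch result
```

The split mirrors the slide: the single-record lookup is kept cheap, while profile building is deferred to a batch step over the logs.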
  • 9. Big Data: Infrastructure Requirements
      • Acquire
          • Low, predictable Latency
          • High Transaction Volume
          • Flexible Data Structures
      • Organize
          • High Throughput
          • In-Place Preparation
          • All Data Sources/Structures
      • Analyze
          • Deep Analytics
          • Agile Development
          • Massive Scalability
          • Real Time Results
  • 10. Oracle Big Data Platform
      • Products: Big Data Appliance, Exadata, Exalytics
      • Stages: ACQUIRE, ORGANIZE, ANALYZE, DECIDE
  • 11. Why Build A Hadoop Appliance?
      • Time to Build
      • Optimizations
      • Maintenance
  • 13. Engineered Together, Tested Together, Certified Together, Deployed Together, Upgraded Together, Managed Together, Supported Together
  • 15. Inside the Big Data Appliance: Applications
  • 16. Oracle NoSQL DB: A distributed, scalable key-value database
      • Simple Data Model
          • Key-value pair with major+sub-key paradigm
          • Read/insert/update/delete operations
      • Scalability
          • Dynamic data partitioning and distribution
          • Optimized data access via intelligent driver
      • High availability
          • One or more replicas
          • Disaster recovery through location of replicas
          • Resilient to partition master failures
          • No single point of failure
      • Transparent load balancing
          • Reads from master or replicas
          • Driver is network topology & latency aware
      • Elastic (Planned for Release 2)
          • Online addition/removal of Storage Nodes
          • Automatic data redistribution
      • [Diagram: Applications, each with a NoSQLDB Driver, connected to Storage Nodes in Data Center A and Data Center B]
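The major+sub-key data model and data partitioning can be illustrated with a toy in-memory store. This is a minimal sketch, not the Oracle NoSQL DB driver API: the class `MajorMinorStore`, its method names, and the choice to hash only the major key with MD5 are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


class MajorMinorStore:
    """Toy key-value store keyed by (major, minor) key paths, given as tuples of strings."""

    def __init__(self, num_partitions=4):
        self.num_partitions = num_partitions
        # One mapping per partition: {major_key: {minor_key: value}}
        self.partitions = [defaultdict(dict) for _ in range(num_partitions)]

    def partition_for(self, major):
        # Hash only the major key so all of its sub-keys land in the same partition.
        digest = hashlib.md5("/".join(major).encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % self.num_partitions

    def put(self, major, minor, value):
        """Insert or update a single record."""
        self.partitions[self.partition_for(major)][major][minor] = value

    def get(self, major, minor):
        """Read a single record, or None if it does not exist."""
        return self.partitions[self.partition_for(major)].get(major, {}).get(minor)

    def multi_get(self, major):
        """Read every sub-key stored under one major key."""
        return dict(self.partitions[self.partition_for(major)].get(major, {}))

    def delete(self, major, minor):
        """Delete a single record; returns True if something was removed."""
        records = self.partitions[self.partition_for(major)].get(major, {})
        return records.pop(minor, None) is not None


store = MajorMinorStore()
user = ("user", "42")
store.put(user, ("profile",), {"name": "Ann"})
store.put(user, ("prefs",), {"lang": "pl"})
```

Keeping all sub-keys of a major key in one partition is what lets a single lookup fetch a whole group of related records without touching other nodes.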
  • 17. Big Data Appliance: System Layout [Diagram labels: Master Node, NoSQL DB Replicas]
  • 18. Oracle NoSQL DB Differentiation
      • Commercial Grade Software and Support
          • General-purpose
          • Reliable: based on proven Berkeley DB JE HA
          • Easy to install and configure
      • Scalable throughput, bounded latency
      • Simple Programming and Operational Model
          • Simple Major + Sub key and Value data structure
          • ACID transactions
          • Configurable consistency & durability
      • Easy Management
          • Web-based console, API accessible
          • Manages and Monitors: Topology; Load; Performance; Events; Alerts
      • Completes Oracle large scale data storage offerings
  • 19. Oracle Loader for Hadoop: partition and transform into Oracle ready format [Diagram labels: Input, Oracle Loader for Hadoop, Load, Table, Query]
  • 20. Streaming Access to HDFS [Diagram: HDFS data files (Datafile_part_1, Datafile_part_2, Datafile_part_m, Datafile_part_n, Datafile_part_x) written by Map and Reduce tasks; Oracle Database reaches them through an External Table, via FUSE or a Table Function, and exposes them to a Query or View]
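The idea of presenting several HDFS data file parts to a query as one stream of rows can be sketched as follows. This is an illustration only: the function `stream_rows`, the comma delimiter, and the use of in-memory `io.StringIO` objects in place of real HDFS files are assumptions, and nothing here reflects how the Oracle external table is actually implemented.

```python
import io


def stream_rows(part_files, delimiter=","):
    """Yield rows from several file parts in order, as if they were one table."""
    for part in part_files:
        for line in part:
            line = line.rstrip("\n")
            if line:  # skip blank lines between records
                yield line.split(delimiter)


# In-memory stand-ins for Datafile_part_1 and Datafile_part_2 (assumption).
parts = [
    io.StringIO("1,alice\n2,bob\n"),
    io.StringIO("3,carol\n"),
]
rows = list(stream_rows(parts))
```

Because the function is a generator, rows are produced as they are read, so a consumer never needs all parts in memory at once.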
  • 21. Oracle Data Integrator
      • Easily integrate data from any source
      • Expanded functionality:
          • Construct Hadoop jobs to transform and load data into Oracle
          • Leverage Oracle Loader for Hadoop and/or Hive
  • 22. Oracle Big Data Appliance: Software Overview
  • 23. Oracle Big Data Appliance: Pre-Installed, Optimized
      • Oracle Linux 5.6
      • Java Hotspot VM
      • Cloudera CDH
      • Cloudera Manager
      • Open Source R Distribution
      • Oracle NoSQL Database CE
      • Oracle Big Data Connectors
      ** Optional. Available and licensed separately.
  • 24. Why Cloudera CDH?
      • Fast evolution in critical features
          – Built by the Hadoop experts in the community
          – Practical instead of esoteric
      • Proven at very large scale
          – In production at all the large consumers of Hadoop
          – Extremely stable in those environments
      • Managed and Tested by Cloudera
          – Managed Open Source components
          – Contains a rich management GUI tool
  • 25. Cloudera CDH: Apache Hadoop, Apache Sqoop, Apache Hive, Apache Mahout, Apache Pig, Apache Whirr, Apache HBase, Apache Oozie, Apache Zookeeper, Fuse-DFS, Apache Flume, Hue. Latest details at:
  • 26. Hadoop Software Layout (Masters)
      • Node 1:
          – M: Name Node, Balancer & HBase Master
          – S: HDFS Data Node, NoSQL Database Storage Node*
      • Node 2:
          – M: Secondary Name Node, Cloudera Manager, Zookeeper, MySQL Slave
          – S: HDFS Data Node, NoSQL Database Storage Node*
      • Node 3:
          – M: JobTracker, MySQL Master, ODI Agent, Hive Server
          – S: HDFS Data Node, NoSQL Database Storage Node*
      * Optional
  • 27. Oracle Big Data Connectors*: Software Overview
  • 28. Oracle Data Integrator: Simplifying MapReduce
      • Automatically generates MapReduce code
      • Manages the process
      • Loads into Data Warehouse
      • [Diagram labels: Oracle Data Integrator, Oracle Loader for Hadoop]
  • 29. Oracle Loader for Hadoop: Use The Cluster
      • Last stage in MapReduce workflow
      • Partitioned and non-partitioned tables
      • Online and offline loads
      • [Diagram: MAP tasks, then SHUFFLE/SORT, then REDUCE tasks, inside Oracle Loader for Hadoop]
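The map, shuffle/sort, and reduce stages for preparing a partitioned load can be sketched in a few lines. This is a minimal single-process illustration, not Oracle Loader for Hadoop itself: the function names, the `year` partitioning column, and the in-memory lists are all assumptions.

```python
from collections import defaultdict


def map_phase(records, partition_of):
    """Map: tag each record with the table partition it belongs to."""
    for record in records:
        yield partition_of(record), record


def shuffle_sort(pairs, sort_key):
    """Shuffle/sort: group records by partition id and sort within each group."""
    groups = defaultdict(list)
    for partition_id, record in pairs:
        groups[partition_id].append(record)
    for partition_id in sorted(groups):
        yield partition_id, sorted(groups[partition_id], key=sort_key)


def reduce_phase(grouped):
    """Reduce: emit one load-ready batch per partition."""
    return {partition_id: rows for partition_id, rows in grouped}


def prepare_load(records, partition_of, sort_key):
    """Run the three stages in sequence."""
    return reduce_phase(shuffle_sort(map_phase(records, partition_of), sort_key))


records = [
    {"year": 2012, "id": 2},
    {"year": 2011, "id": 5},
    {"year": 2012, "id": 1},
]
batches = prepare_load(
    records,
    partition_of=lambda r: r["year"],  # hypothetical partitioning column
    sort_key=lambda r: r["id"],
)
```

Doing the partitioning and sorting on the cluster means the database receives data that is already grouped the way the target table is laid out.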
  • 30. Oracle Direct Connector for HDFS: Direct Access from Oracle Database
      • SQL access to HDFS
      • External table view
      • Data query or import
      • [Diagram: Oracle Database (SQL Query, External Table, DCH) connected over InfiniBand to HDFS through the HDFS Client]
  • 31. Oracle R Hadoop Connector: Native R Access to Hadoop
      • Native R MapReduce
      • Native R HDFS access
      • [Diagram labels: Client Host (R Engine, ORHC); Oracle Big Data Appliance (Hadoop Cluster Nodes, R Engine, ORHC, MapReduce, HDFS); Oracle Exadata (R Engine, ORE); Software]
  • 32. Oracle Big Data Appliance
      • Optimized and Complete
          – All you need to store and integrate big data
      • Integrated with Oracle Exadata
          – Analyze all your data
      • Easy to Deploy
          – Risk Free, Quick Installation and Setup
      • Single Vendor Support
          – Full Oracle support for all hardware and software