Presentation big dataappliance-overview_oow_v3

Oracle Big Data Appliance
Jacco Draaijer
Jean-Pierre Dijcks

The following is intended to outline our general product
direction. It is intended for information purposes only, and
may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality,
and should not be relied upon in making purchasing
decisions.
The development, release, and timing of any features or
functionality described for Oracle’s products remain at the
sole discretion of Oracle.

Agenda
• The Business of Big Data
• Big Data Technology
• Inside the Big Data Appliance
• Overview
• Applications
• Summary
• Q&A

The Business of Big Data

Big Data: Acting on New Data
“I think”
“I want”
“I found it”
Retail Decisions
Stores
Web
Search
Social Networks
Catalog/Call Center
Looking back: “PAST”
Looking ahead: “FUTURE”
60% potential increase in retailers’ operating margins possible with Big Data
Source: McKinsey Global Institute, Big Data: The next frontier for innovation, competition, and productivity (May 2011)

Tapping into Diverse Data Sets
Transactions
Information Architectures
Today: Decisions based on database data
Big Data: Decisions based on all your data
Video and Images
Machine-Generated Data
Social Data
Documents

Case: On-line Ads and Content
NoSQL DB
Expert System
Real-time: Determine best ad to place on page for this user
Input into
Lookup user profile
Add user if not present
Web logs
HDFS
Profiles
NoSQL DB
High scale data reductions
BI and Analytics
Billing
Predictions on browsing
Actual ads served
Low Latency
Batch
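The real-time path on this slide (look up the user profile, add the user if not present, then determine the best ad) can be sketched roughly as follows. This is a minimal illustration only: a plain Python dict stands in for the NoSQL DB profile store, and `choose_ad` is a hypothetical stand-in for the expert system, not anything described in the deck.

```python
# Illustrative only: a plain dict stands in for the NoSQL DB profile store.
profiles = {}

def lookup_or_add_user(user_id):
    """Look up the user's profile; add an empty one if not present."""
    if user_id not in profiles:
        profiles[user_id] = {"interests": []}
    return profiles[user_id]

def choose_ad(profile, ads):
    """Hypothetical stand-in for the expert system: pick the ad whose topic
    matches the most profile interests (the first ad wins ties)."""
    def score(ad):
        return sum(1 for topic in profile["interests"] if topic == ad["topic"])
    return max(ads, key=score)

ads = [{"id": "ad-1", "topic": "shoes"}, {"id": "ad-2", "topic": "travel"}]

profile = lookup_or_add_user("user-42")
# On the slide, batch-side predictions would populate the profile; here we fake one.
profile["interests"].append("travel")

best = choose_ad(lookup_or_add_user("user-42"), ads)
print(best["id"])
```

The point of the sketch is the access pattern: a single-record lookup keyed by user, with insert-on-miss, sits on the low-latency path, while anything that enriches the profile can happen in batch.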

Case: On-line Ads and Content
NoSQL DB
• Dynamic and rapidly changing schema
• Scalable single record lookup
HDFS
• Low cost, high scale storage
• Write once, read many times
Hadoop
• High scale batch processing
• Highly customizable infrastructure
RDBMS
• Deep analytics and BI value add
• Reporting for large user community
Big Data Technology

Big Data: Infrastructure Requirements
Acquire
• Low, predictable Latency
• High Transaction Volume
• Flexible Data Structures
Organize
• High Throughput
• In-Place Preparation
• All Data Sources/Structures
Analyze
• Deep Analytics
• Agile Development
• Massive Scalability
• Real Time Results

Divided Solution Spectrum
Acquire Organize Analyze
MapReduce Solutions
DBMS (DW)
DBMS (OLTP)
Advanced Analytics
Distributed File Systems
Transaction (Key-Value) Stores
ETL
NoSQL: Flexible, Specialized, Developer Centric
SQL: Trusted, Secure, Administered
Schema-less
Unstructured
Data Variety
Schema
Information Density

Copyright © 2011, Oracle and/or its affiliates. All rights reserved.
Oracle Integrated Software Solution Stack
Oracle Database (DW)
Oracle Database (OLTP)
In-DB Analytics: “R”, Mining, Text, Graph, Spatial
Oracle BI EE
Oracle NoSQL DB
HDFS
Hadoop
Oracle Data Integrator
Oracle Loader for Hadoop
Data Variety
Information Density
Unstructured
Schema

Why build a Hadoop Appliance?
• Time to Build?
• Required Expertise?
• Cost and Difficulty Maintaining?

Oracle Engineered Solutions
Oracle Database (DW)
Oracle Database (OLTP)
In-DB Analytics: “R”, Mining, Text, Graph, Spatial
Oracle BI EE
Oracle NoSQL DB
HDFS
Hadoop
Oracle Data Integrator
Oracle Loader for Hadoop
Data Variety
Information Density
Unstructured
Schema
Big Data Appliance
• Hadoop
• NoSQL Database
• Oracle Loader for Hadoop
• Oracle Data Integrator
Oracle Exadata
• OLTP & DW
• Data Mining & Oracle R
• Semantics
• Spatial
Exalytics
• Speed of Thought Analytics

The Engineered Systems Story Evolves…
Engineered Systems for All of Your Big Data Needs
• Oracle Exadata – best engineered solution for:
• Data Warehousing and In-database Analytics
• Relational Database Consolidation
• Oracle Exalytics – best engineered solution for:
• Business Intelligence
• “Speed of Thought” Data Visualization & Discovery

Oracle Engineered Systems
Oracle Big Data Appliance Hardware
• 18 Sun X4270 M2 Servers
– 48 GB memory per node = 864 GB memory
– 12 Intel cores per node = 216 cores
– 24 TB storage per node = 432 TB storage
• 40 Gb/s InfiniBand
• 10 Gb/s Ethernet
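The per-rack totals on this slide follow directly from the per-node figures. A small sketch of that arithmetic, using only the numbers stated on the slide (the function and constant names are illustrative):

```python
# Per-node figures as stated on the hardware slide.
NODES_PER_RACK = 18
MEMORY_GB_PER_NODE = 48
CORES_PER_NODE = 12
STORAGE_TB_PER_NODE = 24

def rack_totals():
    """Multiply each per-node figure by the number of nodes in one rack."""
    return {
        "nodes": NODES_PER_RACK,
        "memory_gb": NODES_PER_RACK * MEMORY_GB_PER_NODE,
        "cores": NODES_PER_RACK * CORES_PER_NODE,
        "storage_tb": NODES_PER_RACK * STORAGE_TB_PER_NODE,
    }

totals = rack_totals()
print(totals)
```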

Big Data Appliance
Cluster of industry standard servers for Hadoop and NoSQL Database
• Focus on Scalability and Availability at low cost
Compute and Storage
• 18 High-performance low-cost servers acting as Hadoop nodes
• 24 TB Capacity per node
• 2 6-core CPUs per node
• Hadoop triple replication
• NoSQL Database triple replication
10GigE Network
• 8 10GigE ports
• Datacenter connectivity
InfiniBand Network
• Redundant 40Gb/s switches
• IB connectivity to Exadata

Big Data Appliance Building Block
• High-performance storage server built from industry standard components
• 12 disks: 2 TB 7200 RPM High Capacity SAS
• 2 Six-Core Intel Xeon Processors (L5640)
• Dual ported 40 Gb/sec InfiniBand
• Optimized software layout:
• Hadoop HDFS
• HBase and Hive
• NoSQL Database and Replicas
• Hardware by Sun
• Software by Oracle

Scale Out to Infinity
Scale out by connecting racks to each other using InfiniBand
• 72 Nodes (4 racks of 18)
• 864 Cores
• 1.7 PB Storage

Oracle Big Data Appliance Software
• Oracle Linux 5.6
• Java HotSpot VM
• Apache Hadoop Distribution v0.20.x
• R Distribution
• Oracle NoSQL Database Enterprise Edition
• Oracle Data Integrator Application Adapter for Hadoop
• Oracle Loader for Hadoop

Why Open-Source Apache Hadoop?
• Fast evolution in critical features
• Built by the Hadoop experts in the community
• Practical instead of esoteric
• Focus on what is needed for large clusters
• Proven at very large scale
• In production at all the large consumers of Hadoop
• Extremely stable in those environments
• Well-understood by practitioners

Software Layout
• Node 1:
• M: Name Node, Balancer & HBase Master
• S: HDFS Data Node, NoSQL DB Storage Node
• Node 2:
• M: Secondary Name Node, Management, Zookeeper, MySQL Slave
• Node 3:
• M: JobTracker, MySQL Master, ODI Agent, Hive Server
• Node 4 – 18:
• S: HDFS Data Nodes, Task Tracker, HBase Region Server, NoSQL DB Storage Nodes
• Your MapReduce runs here!
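For readers new to the model, the kind of job that runs on the Task Tracker nodes can be illustrated with an in-memory simulation of the map, shuffle, and reduce phases. This is a teaching sketch in plain Python, not Hadoop API code; the web-log line format (`user_id url`) is an assumption made for the example.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit (user_id, 1) for each log line of the assumed form 'user_id url'."""
    user_id, _url = line.split(maxsplit=1)
    yield user_id, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: sum the counts for one key."""
    return key, sum(values)

log_lines = ["alice /home", "bob /cart", "alice /checkout"]
mapped = [pair for line in log_lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

In a real cluster the map and reduce functions run in parallel on the data nodes that hold the input blocks, which is why the layout above co-locates Task Trackers with HDFS Data Nodes.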

Big Data Appliance
Usage Model
Oracle
Big Data Appliance
Oracle
Exadata
InfiniBand
Stream Acquire Organize Analyze & Visualize
Oracle
Exalytics
InfiniBand

Big Data Appliance
Big Data for the Enterprise
• Optimized and Complete
• Everything you need to store and integrate your lower information density data
• Integrated with Oracle Exadata
• Analyze all your data
• Easy to Deploy
• Risk Free, Quick Installation and Setup
• Single Vendor Support
• Full Oracle support for the entire system and software set

Inside the Big Data Appliance
Applications

Oracle NoSQL DB
A distributed, scalable key-value database
• Simple Data Model
• Key-value pair with major+sub-key paradigm
• Read/insert/update/delete operations
• Scalability
• Dynamic data partitioning and distribution
• Optimized data access via intelligent driver
• High availability
• One or more replicas
• Disaster recovery through location of replicas
• Resilient to partition master failures
• No single point of failure
• Transparent load balancing
• Reads from master or replicas
• Driver is network topology & latency aware
• Elastic (Planned for Release 2)
• Online addition/removal of Storage Nodes
• Automatic data redistribution
Storage Nodes
Data Center A
Storage Nodes
Data Center B
NoSQLDB Driver
Application
NoSQLDB Driver
Application
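The major + sub-key data model and its read/insert/update/delete operations can be sketched with a toy in-memory store. This is not the Oracle NoSQL Database API; the class and method names are hypothetical, and the sketch assumes that a hash of the major key selects the partition, so that all sub-keys of one major key land together.

```python
import hashlib

class KeyValueSketch:
    """Toy key-value store with (major key, sub-key) addressing."""

    def __init__(self, num_partitions=4):
        self.partitions = [dict() for _ in range(num_partitions)]

    def _partition_for(self, major_key):
        # Assumption: only the major key is hashed, so sub-keys are co-located.
        digest = hashlib.md5(major_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self.partitions)

    def put(self, major_key, sub_key, value):
        """Insert or update one record."""
        self.partitions[self._partition_for(major_key)][(major_key, sub_key)] = value

    def get(self, major_key, sub_key):
        """Read one record; returns None if it does not exist."""
        return self.partitions[self._partition_for(major_key)].get((major_key, sub_key))

    def delete(self, major_key, sub_key):
        """Delete one record; returns True if something was removed."""
        part = self.partitions[self._partition_for(major_key)]
        return part.pop((major_key, sub_key), None) is not None

    def sub_keys(self, major_key):
        """List the sub-keys stored under one major key."""
        part = self.partitions[self._partition_for(major_key)]
        return sorted(s for (m, s) in part if m == major_key)

store = KeyValueSketch()
store.put("user-42", "profile", {"name": "Ann"})
store.put("user-42", "history", ["/home", "/cart"])
print(store.sub_keys("user-42"))
```

Keeping every sub-key of a major key in one partition is what makes it cheap to fetch or update related records for a single user in one place.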

NoSQL DB
Big Data Appliance
System Layout
Master Node
Replicas
Note: For illustration purposes only!

Oracle NoSQL DB Differentiation
• Commercial Grade Software and Support
• General-purpose
• Reliable – Based on proven Berkeley DB JE HA
• Easy to install and configure
• Scalable throughput, bounded latency
• Simple Programming and Operational Model
• Simple Major + Sub key and Value data structure
• ACID transactions
• Configurable consistency & durability
• Easy Management
• Web-based console, API accessible
• Manages and Monitors: Topology; Load; Performance; Events; Alerts
• Completes Oracle large scale data storage offerings

Oracle Loader for Hadoop
Input
Input
Query
Table
Load
Partition and transform into Oracle ready format
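The "partition and transform" step on this slide can be pictured as grouping input records by the target table's partition key and flattening each record into column order before loading. The sketch below only illustrates that idea in plain Python; it is not Oracle Loader for Hadoop code, and the record fields, partition key, and function name are all hypothetical.

```python
from collections import defaultdict

def partition_and_transform(records, partition_key, columns):
    """Group records by partition key and flatten each into a column-ordered tuple."""
    batches = defaultdict(list)
    for record in records:
        row = tuple(record[column] for column in columns)
        batches[record[partition_key]].append(row)
    return dict(batches)

# Hypothetical input records, e.g. parsed from files in HDFS.
records = [
    {"region": "EU", "user": "alice", "clicks": 3},
    {"region": "US", "user": "bob", "clicks": 5},
    {"region": "EU", "user": "carol", "clicks": 1},
]

batches = partition_and_transform(records, "region", ["user", "clicks"])
print(batches)
```

Doing this work on the Hadoop side means each batch arrives already grouped for one table partition, so the database has less preparation to do at load time.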

Streaming Access to HDFS
HDFS
HDFS
HDFS
HDFS
HDFS
Datafile_part_1
Datafile_part_2
Datafile_part_m
Datafile_part_n
Datafile_part_x
Oracle Database
FUSE
External Table
View or Table Function
Reduce
Map
Query

Oracle Data Integrator
Easily integrate data from any source
Expanded functionality:
=> Construct Hadoop jobs to transform and load data into Oracle
=> Leverage Oracle Loader for Hadoop and/or Hive

Big Data
Opportunity
• Big data can improve your top line today!
• Big data can make you much more agile
• Provides an edge over your competitors
Threat
• Big data is here – now
• Your competitors will not miss out on the opportunity
• Act now! Start building a big data platform for your organization

Big Data Appliance and Exadata
NoSQL DB

HDFS

Hadoop

RDBMS 

Big Data Appliance
• Optimized and Complete
• Everything you need to store and integrate your lower information density data
• Integrated with Oracle Exadata
• Analyze all your data
• Easy to Deploy
• Risk Free, Quick Installation and Setup
• Single Vendor Support
• Full Oracle support for the entire system and software set
