Oracle's Big Data solutions comprise a number of new products designed to help customers gain maximum business value from data sets such as weblogs, social media feeds, smart meters, sensors and other devices that generate massive volumes of data (commonly called 'Big Data'), data that is not readily accessible in today's enterprise data warehouses and business intelligence applications.
Slides from a presentation I gave at the 5th SOA, Cloud + Service Technology Symposium (September 2012, Imperial College, London). The goal of this presentation was to explore with the audience use cases at the intersection of SOA, Big Data and Fast Data. If you are working with both SOA and Big Data, I would be very interested to hear about your projects.
Strata 2015 presentation from Oracle for Big Data. We are announcing several new big data products, including GoldenGate for Big Data, Big Data Discovery, Oracle Big Data SQL and Oracle NoSQL.
Expand a Data Warehouse with Hadoop and Big Data (jdijcks)
After investing years in the data warehouse, are you now supposed to start over? Nope. This session discusses how to leverage Hadoop and big data technologies to augment the data warehouse with new data, new capabilities and new business models.
A modern approach to streaming data integration and event processing with a big data (kappa-style) architecture. Key patterns are discussed, with the pros and cons of newer approaches and open source technologies. Focus on Oracle and GoldenGate technology. OpenWorld 2018 presentation.
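The core idea behind the kappa-style architecture mentioned above is a single append-only event log from which every materialized view is derived by replay. A minimal sketch of that idea, with all names purely illustrative and not tied to GoldenGate or any other product:

```python
# Kappa-style sketch: one append-only event log is the source of truth,
# and any view is (re)built by replaying the log from the beginning.
from collections import defaultdict

event_log = []  # the single append-only log

def append(event):
    event_log.append(event)

def replay(view_fn):
    """Rebuild a materialized view by replaying the full log."""
    state = defaultdict(int)
    for event in event_log:
        view_fn(state, event)
    return dict(state)

def count_by_user(state, event):
    state[event["user"]] += 1

append({"user": "alice", "action": "click"})
append({"user": "bob", "action": "click"})
append({"user": "alice", "action": "view"})

print(replay(count_by_user))  # {'alice': 2, 'bob': 1}
```

Because every view is a pure function of the log, changing the view logic only requires a replay, not a separate batch pipeline; that is the main contrast with lambda-style architectures.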
Hortonworks Oracle Big Data Integration (Hortonworks)
Slides from joint Hortonworks and Oracle webinar on November 11, 2014. Covers the Modern Data Architecture with Apache Hadoop and Oracle Data Integration products.
Oracle Big Data Appliance and Big Data SQL for advanced analytics (jdijcks)
Overview presentation showing Oracle Big Data Appliance and Oracle Big Data SQL in combination, and why this really matters. Big Data SQL brings you the unique ability to analyze data across the entire spectrum of systems: NoSQL, Hadoop and Oracle Database.
The Value of the Modern Data Architecture with Apache Hadoop and Teradata (Hortonworks)
This webinar discusses why Apache Hadoop is most typically the technology underpinning "Big Data", how it fits into a modern data architecture, and the current landscape of databases and data warehouses already in use.
Hadoop-based data lakes have become increasingly popular within today's modern data architectures for their scalability, ability to handle data variety, and low cost. Many organizations start slow with their data lake initiatives, but as the lakes grow, they run into challenges with data consistency, quality and security, and lose confidence in the initiative.
This talk will discuss the need for good data governance mechanisms for Hadoop data lakes, their relationship with productivity, and how they help organizations meet regulatory and compliance requirements. The talk advocates adopting a different mindset for designing and implementing flexible governance mechanisms on Hadoop data lakes.
The Next Generation of Big Data Analytics (Hortonworks)
Apache Hadoop has evolved rapidly to become a leading platform for managing and processing big data. If your organization is examining how you can use Hadoop to store, transform, and refine large volumes of multi-structured data, please join us for this session, where we will discuss: the emergence of "big data" and opportunities for deriving business value; the evolution of Apache Hadoop and future directions; essential components required in a Hadoop-powered platform; and solution architectures that integrate Hadoop with existing data discovery and data warehouse platforms.
10 Amazing Things To Do With a Hadoop-Based Data Lake (VMware Tanzu)
Greg Chase, Director, Product Marketing, presents "Big Data: 10 Amazing Things to do With A Hadoop-based Data Lake" at the Strata Conference + Hadoop World 2014 in NYC.
Insurance companies of all sizes are challenged to keep up with emerging technologies that deliver a competitive advantage. Recording: https://www.brighttalk.com/webcast/9573/192877
Big data holds the key to greater customer insight and stronger customer relationships. But risk of sensitive data exposure — and compliance violations — keeps many insurers from pursuing big data initiatives and reaping the rewards of business-driven analytics. Join Dataguise and Hortonworks for this live webinar to learn how you can free your organization from traditional information security constraints and unlock the power of your most valuable business assets.
• What do you need to know about PII/PHI privacy before embarking on big data initiatives?
• Why do so many big data initiatives fail before they’ve even begun—and what can you do about it?
• How can IT security organizations help data scientists extract more business value from their data?
• How are leading insurance companies leveraging big data to gain competitive advantage?
Hortonworks and Clarity Solution Group (Hortonworks)
Many organizations are leveraging social media to understand consumer sentiment and opinions about brands and products. Analytics in this area, however, is in its infancy and does not always provide a compelling result for effective business impact. Learn how consumer organizations can benefit by integrating social data with enterprise data to drive more profitable consumer relationships. This webinar is presented by Hortonworks and Clarity Solution Group, and will focus on the evolution of Hadoop, the clear advantage of Hortonworks distribution, and business challenges solved by “Consumer720.”
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork... (Hortonworks)
As big data analytics and the Apache Hadoop ecosystem have matured and gained traction in established industries, with faster adoption in the insurance market than originally anticipated, it is clear that the potential benefits for data management and business intelligence are staggering. At the same time, many big data programs have stalled or failed to deliver on their aspirational value proposition, leaving a substantial gap between the expectations of analytics consumers and what big data analytics programs actually deliver. Join Hortonworks and Clarity as we review the common needs of Property and Casualty (P&C) insurers and how to unlock the true value of big data analytics:
- Information agility: centralization of data and decentralization of analysis
- Expanded capability: conventional analysis combined with real-time analytics demands
- Reduced expense: lower costs through cheaper storage while maintaining scalability
We will discuss a modern data architecture that constitutes a mature, enterprise-strength Hadoop framework for P&C insurers and answers the need for governance processes across the enterprise stack. We will cover how a modern data architecture allows organizations to collect, store, analyze and manipulate massive quantities of data on their own terms, regardless of the source of that data, accelerating the real lifetime value of big data and Hadoop analytics for claims, customer sentiment and telematics.
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle) - Rittman Analytics
Set of product roadmap + capabilities slides from Oracle Data Integration Product Management, and thoughts on data integration on big data implementations by Mark Rittman (Independent Analyst)
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake... (NoSQLmatters)
Come to this deep dive on how Pivotal's Data Lake Vision is evolving by embracing next generation in-memory data exchange and compute technologies around Spark and Tachyon. Did we say Hadoop, SQL, and what's the shortest path to get from past to future state? The next generation of data lake technology will leverage the availability of in-memory processing, with an architecture that supports multiple data analytics workloads within a single environment: SQL, R, Spark, batch and transactional.
Create a Smarter Data Lake with HP Haven and Apache Hadoop (Hortonworks)
An organization’s information is spread across multiple repositories, on-premise and in the cloud, with limited ability to correlate information and derive insights. The Smart Content Hub solution from HP and Hortonworks enables a shared content infrastructure that transparently synchronizes information with existing systems and offers an open standards-based platform for deep analysis and data monetization.
- Leverage 100% of your data: Text, images, audio, video, and many more data types can be automatically consumed and enriched using HP Haven (powered by HP IDOL and HP Vertica), making it possible to integrate this valuable content and insights into various line of business applications.
- Democratize and enable multi-dimensional content analysis: empower your analysts, business users, and data scientists to search and analyze Hadoop data with ease, using the 100% open source Hortonworks Data Platform.
- Extend the enterprise data warehouse: Synchronize and manage content from content management systems, and crack open the files in whatever format they happen to be in.
- Dramatically reduce complexity with enterprise-ready SQL engine: Tap into the richest analytics that support JOINs, complex data types, and other capabilities only available with HP Vertica SQL on the Hortonworks Data Platform.
Speakers:
- Ajay Singh, Director, Technical Channels, Hortonworks
- Will Gardella, Product Management, HP Big Data
Hadoop 2.0: YARN to Further Optimize Data Processing (Hortonworks)
Data is exponentially increasing in both types and volumes, creating opportunities for businesses. Watch this video and learn from three Big Data experts: John Kreisa, VP Strategic Marketing at Hortonworks, Imad Birouty, Director of Technical Product Marketing at Teradata and John Haddad, Senior Director of Product Marketing at Informatica.
Multiple systems are needed to exploit the variety and volume of data sources, including a flexible data repository. Learn more about:
- Apache Hadoop 2 and YARN
- Data Lakes
- Intelligent data management layers needed to manage metadata and usage patterns as well as track consumption across these data platforms.
Beyond a Big Data Pilot: Building a Production Data Infrastructure - Stampede... (StampedeCon)
At StampedeCon 2014, Stephen O’Sullivan (Silicon Valley Data Science) presented "Beyond a Big Data Pilot: Building a Production Data Infrastructure."
Creating a data architecture involves many moving parts. By examining the data value chain, from ingestion through to analytics, we will explain how the various parts of the Hadoop and big data ecosystem fit together to support batch, interactive and realtime analytical workloads.
By tracing the flow of data from source to output, we’ll explore the options and considerations for components, including data acquisition, ingestion, storage, data services, analytics and data management. Most importantly, we’ll leave you with a framework for understanding these options and making choices.
Big Data Discovery + Analytics = Datengetriebene Innovation! (Harald Erb)
Talk from the DOAG 2015 conference: Implementing data projects does not have to be left to so-called data scientists alone. Data and tool complexity in dealing with big data are no longer insurmountable hurdles for the teams that are already responsible for building and maintaining the data warehouse and for managing and evolving the business intelligence platform. In an interdisciplinary team, business users and business analysts contribute their domain knowledge to the data project from the start, alongside the technical roles.
If you've also got the Big Data itch, here is something to ease the pain :-)
Answers to these questions will be available soon (more info in the attached link).
Which Big Data Appliance should YOU use?
(click on the attached link for Poll results)
Appliances are Small and Quick, Right?
Revealing the 6 Types of Big Data Appliances
Uncovering the Main Players
Challenges, Pitfalls, and Winning the Big Data Game
Where is all this leading YOU to?
Agile BI Development Through Automation (Manta Tools)
How can code life cycle automation satisfy the growing demands in modern enterprise business intelligence?
Whilst an agile approach to BI development is useful for delivering value in general, the use of advanced automation techniques can also save significant resources, prevent production errors, and shorten time to market.
Speakers from Data To Value, Manta Tools, Volkswagen and M&G Investments presented and discussed different approaches to agile BI development. Take a look!
Actionable Data: Mastering the Hybrid Analytics Mix (Perficient, Inc.)
With an increase in the adoption of cloud applications, most organizations today are in some form of hybrid state (i.e. using a combination of on-premise and cloud applications to run their business). Regardless of where the data resides, you need a complete view of the company spanning across different parts of the business, combining insightful data across both onsite and public cloud instances.
In this webinar, we looked at multiple approaches that organizations have successfully used to consolidate data from multiple cloud and on-premise applications and to perform seamless analytics across these varied data sources.
2016 VLDB - Messing Up with Bart: Error Generation for Evaluating Data-Cleani... (Boris Glavic)
We study the problem of introducing errors into clean databases for the purpose of benchmarking data-cleaning algorithms. Our goal is to provide users with the highest possible level of control over the error-generation process, and at the same time develop solutions that scale to large databases. We show in the paper that the error-generation problem is surprisingly challenging, and in fact, NP-complete. To provide a scalable solution, we develop a correct and efficient greedy algorithm that sacrifices completeness, but succeeds under very reasonable assumptions. To scale to millions of tuples, the algorithm relies on several non-trivial optimizations, including a new symmetry property of data quality constraints. The trade-off between control and scalability is the main technical contribution of the paper.
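To make the benchmarking setup concrete, here is a toy sketch of the basic idea of error generation: perturb values in clean tuples while recording ground truth so a cleaning tool's recall can be measured. This is purely illustrative and is not the constraint-aware, NP-complete problem or the greedy algorithm from the paper; all names are hypothetical.

```python
# Toy error generator: swap adjacent characters in some values to
# simulate typos, and keep the indices of dirtied rows as ground truth.
import random

def inject_typos(rows, column, error_rate, seed=0):
    rng = random.Random(seed)  # seeded for reproducible benchmarks
    dirty, errors = [], []
    for i, row in enumerate(rows):
        row = dict(row)  # don't mutate the clean input
        if rng.random() < error_rate and len(row[column]) > 1:
            s = list(row[column])
            j = rng.randrange(len(s) - 1)
            s[j], s[j + 1] = s[j + 1], s[j]  # adjacent-character swap
            row[column] = "".join(s)
            errors.append(i)  # ground truth for evaluating cleaners
        dirty.append(row)
    return dirty, errors

clean = [{"city": "Chicago"}, {"city": "Boston"}, {"city": "Berlin"}]
dirty, changed = inject_typos(clean, "city", error_rate=0.5)
```

A real generator, as the abstract stresses, must also control *which* constraints each error violates and remain detectable or repairable, which is where the hardness and the greedy optimizations come in.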
Leveraging Hadoop with OBIEE 11g and ODI 11g - UKOUG Tech'13 (Mark Rittman)
The latest releases of OBIEE and ODI come with the ability to connect to Hadoop data sources, using MapReduce to integrate data from clusters of "big data" servers complementing traditional BI data sources. In this presentation, we will look at how these two tools connect to Apache Hadoop and access "big data" sources, and share tips and tricks on making it all work smoothly.
You've seen the basic 2-stage example Spark Programs, and now you're ready to move on to something larger. I'll go over lessons I've learned for writing efficient Spark programs, from design patterns to debugging tips.
The slides are largely just talking points for a live presentation, but hopefully you can still make sense of them for offline viewing as well.
Left Brain, Right Brain: How to Unify Enterprise Analytics (Inside Analysis)
The Briefing Room with Robin Bloor and Teradata
Live Webcast on Jan. 29, 2013
Despite its name, effective Data Science requires a certain amount of artistic flair. Analysts must be creative about how and where they find the insights that will drive business value. One classic roadblock to that kind of frictionless process? Programming. Not everyone can code Java, which makes the unstructured domain of Hadoop quite challenging for the average business analyst.
Check out the slides from this episode of the Briefing Room to hear veteran Analyst Dr. Robin Bloor explain how a new generation of analytical platforms will solve the complexity of unifying structured and unstructured data. He'll be briefed by Steve Wooledge of Teradata Aster who will tout his company's Big Data Appliance, which leverages the SQL-H bridge, an innovation designed to connect Hadoop with SQL.
Visit: http://www.insideanalysis.com
SAS Big Data Forum - Transforming Big Data into Corporate Gold (Louis Fernandes)
Synopsis: How SAS believes organisations can turn Big Data into competitive advantage through the use of High Performance Analytics.
In this presentation, we look at how SAS is seeing organisations take the outputs from big data analysis and turn them into tangible business outcomes through real-time decision-making.
In it, we explore:
- Why we believe organisations need to exploit their data assets to create the insights that build competitive advantage
- How to develop the infrastructures required to support multi-dimensional insight
- What SAS is doing to make this a reality
Key topics include:
- Data governance
- Big data infrastructure
- High performance analytics
- Data visualisation
About SAS:
- World’s largest privately held software company
- 35 years old
- Focus on advanced and predictive analytics right from the word go
- Big data has been in our DNA since before it became mainstream
Hadoop World 2011: Big Data Analytics – Data Professionals: The New Enterpris... (Cloudera, Inc.)
This presentation will explore how Hadoop and Big Data are re-inventing enterprise workflows, and the pivotal role of the Data Analyst. It will examine the changing face of analytics and the streamlining of iterative queries through evolved user interfaces. The speaker will cut through hype around “shorter time to insight” and explain how combining Hadoop and SQL-based analytics help companies discover emergent trends hidden in unstructured data, without having to retrain data miners or restaff. In particular, it will highlight changes to Big Data analysis from this paradigm and illustrate stepwise how analysts can now connect to Big Data platforms, assemble working data sets from disparate sources, analyze and mine that data for actionable insight, publish the results as visualizations and for feeding reporting tools, and operationalize Map-Reduce and Big Data outcomes into company workflows – all without touching the command line.
Two Keys to Analytic Success: Cooperation, Collaboration (Inside Analysis)
The Briefing Room with Robin Bloor and ParAccel
Live Webcast on Feb. 19, 2013
Experienced analysts know there is no single platform that can handle all types of analytic processing efficiently. Invariably, data-driven organizations will use a variety of engines to refine their raw data into usable insights. There are several down sides to this heterogeneity, not the least of which is poor collaboration. But that's starting to change, as many companies focus on creative ways to foster analytical cooperation.
Check out the slides from this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor explain why collaboration in the design and use of analytical applications can have wide-ranging impacts on an organization. He'll be briefed by John Santaferraro of ParAccel, who will tout his company's Cooperative Analytic Processing Architecture, designed to perform sophisticated deep analytics on large amounts of data quickly. CAPA can orchestrate the processing power of other engines in its ecosystem, including data warehouses and Hadoop implementations.
Visit: http://www.insideanalysis.com
Almost all developers face the challenge of reactively debugging failed business transaction processes. Not only does this require extensive navigation of enormous volumes of log data, but determining root cause becomes a laborious and time-consuming task.
Additionally, business managers often ask developers and operations to provide analytics on applications, resulting in the tedious task of charting the information, usually from intangible data. Learn how to capture, extract and analyze your event data by having analytics embedded in the application. Download the white paper that details how to gain Application Intelligence through effective logging.
Check out the webinar here: http://www.splunk.com/goto/analytics_webcast
Simplifying Big Data Analytics for the Business (Teradata Aster)
Tasso Argyros, Co-Founder & Co-President, Teradata Aster presents at the 2012 Big Analytics Roadshow.
The opportunity exists for organizations in every industry to unlock the power of iterative, big data analysis with new applications such as digital marketing optimization and social network analysis to improve their bottom line. Big data analysis is not just the ability to analyze large volumes of data, but the ability to analyze more varieties of data by performing more complex analysis than is possible with more traditional technologies. This session will demonstrate how to bring the science of data to the art of business by empowering more business users and analysts with operationalized insights that drive results. See how data science is making emerging analytic technologies more accessible to businesses while providing better manageability to enterprise architects across retail, financial services, and media companies.
This talk was given by Bruno Ungermann at the 13th meeting, on Sept 23rd, 2014.
Conceptual overview of Hadoop-based analytics, a comparison between data warehouse architecture and Big Data architecture, characteristics of “schema on read”, typical Big Data use cases such as customer analytics, operational analytics and EDW optimization, and a short software demo.
This talk was held at the 13th meeting on Sept 23rd 2014 by André Vocat.
In the process of proposing a highly available, redundant and performant infrastructure for a large Swiss telco operator, the project team opted for Cassandra as one of the key components. After more than a year in operation, the resulting platform has proven to be the right choice. The session will show the chosen architecture, give insight into the development and deployment, and present the current status of the platform, which is just about to see its first upgrade.
This talk was held at the 12th meeting on July 22 2014 by Karen Zhang.
Customers in business-to-consumer (B2C) and business-to-business (B2B) markets go through a similar buying journey: need, search, evaluate, and finally order. Thus similar customer analytics approaches are applicable to both scenarios. However, a company’s go-to-market strategies usually differ between B2C and B2B. This study discusses the unique characteristics of analytic methodologies applied in B2B vs. B2C. Two case studies will be presented to illustrate the similarities and differences.
This talk was held at the 12th meeting on July 22 2014 by Romeo Kienzler.
After giving a short contextual overview of SQL-on-Hadoop projects in the ecosystem (Hive, Impala, Presto, Cascading Lingual, ...), we will hear about the latest SQL features in Big SQL. Big SQL delivers some exciting capabilities, including low-latency and high-performance queries, while maintaining backwards compatibility with Hive and HCatalog. This is achieved by an optimizer and a dedicated execution framework, which will be covered in detail. Finally, a demo of Big SQL v3.0 on a cluster in the Silicon Valley Lab (SVL) will be shown.
This talk was held at the 11th meeting on April 7 2014 by Marcel Kornacker.
Impala (impala.io) raises the bar for SQL query performance on Apache Hadoop. With Impala, you can query Hadoop data – including SELECT, JOIN, and aggregate functions – in real time to do BI-style analysis. As a result, Impala makes a Hadoop-based enterprise data hub function like an enterprise data warehouse for native Big Data.
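As a rough illustration of the BI-style SQL that Impala serves, here is the SELECT/JOIN/aggregate shape using Python's built-in sqlite3 as a stand-in engine (Impala itself is queried via its shell or ODBC/JDBC; the tables and values here are hypothetical):

```python
import sqlite3

# sqlite3 stands in for Impala here; the SQL shape (SELECT/JOIN/GROUP BY)
# is the same kind of query a BI tool would submit.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sales (region_id INTEGER, amount REAL);
    CREATE TABLE regions (region_id INTEGER, name TEXT);
    INSERT INTO sales VALUES (1, 100.0), (1, 50.0), (2, 75.0);
    INSERT INTO regions VALUES (1, 'EMEA'), (2, 'APAC');
""")
rows = con.execute("""
    SELECT r.name, SUM(s.amount) AS total
    FROM sales s JOIN regions r ON s.region_id = r.region_id
    GROUP BY r.name
    ORDER BY total DESC
""").fetchall()
print(rows)  # [('EMEA', 150.0), ('APAC', 75.0)]
```

The point of Impala is that this same query pattern runs interactively over data already sitting in HDFS, without an ETL step into a separate warehouse.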
This talk was held at the 11th meeting on April 7 2014 by Karolina Alexiou.
Analysis of big data is useless (and a lot harder to sell) when you can't measure whether the resulting insights are correct. In order to develop sophisticated data analysis methodologies tailored to your particular use case, you need to be able to figure out what works and what doesn't. It is crucial to gather data independently of your analysis (ground truth), compare it to your results using the correct metrics, and account for biases. The sheer volume of data means that you also need a strategy for slicing and dicing the data to isolate the really valuable parts, and a keen eye for visualization so that you can quickly compare methodologies and support the validity of your insights to third parties.
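The comparison against ground truth can be as simple as set overlap with precision and recall; a minimal sketch with hypothetical item IDs:

```python
def precision_recall(predicted, ground_truth):
    # Compare the analysis's positives against independently gathered ground truth.
    predicted, ground_truth = set(predicted), set(ground_truth)
    tp = len(predicted & ground_truth)  # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Hypothetical example: items flagged by an analysis vs. hand-labelled truth.
p, r = precision_recall(predicted={"a", "b", "c", "d"}, ground_truth={"b", "c", "e"})
print(p, r)  # precision 2/4 = 0.5, recall 2/3
```

Tracking both metrics matters: a method can look impressive on one while quietly failing on the other.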
This talk was held at the 10th meeting on February 3rd 2014 by Daniel Fasel.
Many traditional Swiss companies, such as banks, insurance companies and government agencies, are highly interested in Big Data and Data Science but don’t know exactly what the business value of Big Data is for them. Often Big Data is misinterpreted as merely large amounts of data, and companies are unaware of the innovation behind the new technologies of Big Data and how these technologies can be profitable for them. In this presentation, I discuss sample cases that demonstrate a set of these new technologies and how they can be applied not only to large web-scale data but also to the data sets of traditional companies. First, I demonstrate how multi-structured data can be indexed and searched using Autonomy. I then show how quickly new analytical applications can be built, using a real-time streaming example with STORM, Redis and Node.js. The last demonstration shows how machine learning algorithms and visualization can be applied to improve analytics using AsterData.
This talk was held at the 10th meeting on February 3rd 2014 by Sean Owen.
Having collected Big Data, organizations are now keen on data science and “Big Learning”. Much of the focus has been on data science as exploratory analytics: offline, in the lab. However, building from that a production-ready large-scale operational analytics system remains a difficult and ad-hoc endeavor, especially when real-time answers are required. Design patterns for effective implementations are emerging, which take advantage of relaxed assumptions, adopt a new tiered "lambda" architecture, and pick the right scale-friendly algorithms to succeed. Drawing on experience from customer problems and the open source Oryx project at Cloudera, this session will provide examples of operational analytics projects in the field, and present a reference architecture and algorithm design choices for a successful implementation.
This talk was held at the 10th meeting on February 3rd 2014 by Dr. Thilo Stadelmann.
Many companies are struggling with Big Data. Some argue that Big Data is the new answer to all problems while others are more critical about it. What is common to many discussions with IT professionals is that almost everyone has a different understanding of the topic. Moreover, many enterprises find it very hard to recruit the perfect data scientist to solve Big Data problems.
In this talk we give an overview of our understanding of data science and present the driving factors for the newly established Datalab at Zurich University of Applied Sciences. The goal of the lab is to establish a sound curriculum and research agenda to prepare data scientists for the ever-increasing demand from industry, and to allow industry partners to collaborate with academia on problems that go beyond everyday routines.
Big data is an opportunity for communications service providers (CSPs) to create the intelligence for operating their infrastructures more efficiently, to analyze the success of their services, and to create a better personal experience for their customers.
CSP top executives, network and IT managers, and marketing leaders are eager to exploit the large amounts of available information to make better business decisions. They expect their Chief Technical Officer to provide end-to-end analytic solutions based on the data available in their IT and network infrastructure.
This presentation analyzes the complete value chain that can transform CSPs’ data to knowledge. It covers the sources of information, the data collection tools, the analytic platforms providing quick data access, and finally the business intelligence use cases with the presentation and visualization of the results and predictions.
The "Babelfish" system is built with Scala and runs in the Java Virtual Machine. For graph persistence, a neo4j database with Lucene index is used. A generic importer module reads data from various data sources and persists them in a version-aware way, using the domain model as a schema. The schema is used by our domain specific language to statically verify queries. Query results can either be in the form of graphs or tables. For the latter, an additional step uses an in-memory SQL-Database for further processing of the results. Queries in the generated DSL can be submitted via a REST interface. The server uses json4s for serialization of the results. This interface as well as the deployable war-file is generated by the web framework Scalatra.
While user tracking with WebTrends, comScore, Google Analytics etc. is a de-facto standard in the online world, tracking visitors in the real world is still fragmented. From a wide perspective, potential tracking data is produced by various sensors. For a real ‘bricks and mortar’ store, one can identify several possible sensors: customer frequency counters at the doors, the cashier system, free WiFi access points, video capture, temperature, background music, smells and many more. Many of these sensors would require additional hardware and software, but for a few, solutions already exist, e.g. video capture with face or even eye recognition. The most interesting sensor data that doesn’t require additional hardware and software may well be the WiFi access points, especially given that many visitors carry WiFi-enabled mobile phones. This talk demonstrates how WiFi access point log files can be used to answer different questions for a particular store.
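As a toy version of the idea, here is a sketch that answers one such question, distinct visitors per hour, from access-point log lines. The log format shown is hypothetical, not that of any particular AP vendor:

```python
from collections import defaultdict

# Hypothetical AP log format: "timestamp MAC event", one association per line.
LOG = """\
2014-07-22T10:05 aa:bb:cc:00:00:01 assoc
2014-07-22T10:17 aa:bb:cc:00:00:02 assoc
2014-07-22T10:40 aa:bb:cc:00:00:01 assoc
2014-07-22T11:02 aa:bb:cc:00:00:03 assoc
"""

def visitors_per_hour(log_text):
    # Count distinct devices (MAC addresses) seen in each hour;
    # re-associations by the same device do not inflate the count.
    seen = defaultdict(set)
    for line in log_text.splitlines():
        timestamp, mac, _event = line.split()
        seen[timestamp[:13]].add(mac)  # truncate to YYYY-MM-DDTHH
    return {hour: len(macs) for hour, macs in seen.items()}

print(visitors_per_hour(LOG))  # {'2014-07-22T10': 2, '2014-07-22T11': 1}
```

Note that counting MACs approximates counting people; MAC randomization on modern phones is a known source of bias for this approach.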
ParaView is an open-source graphical user interface for VTK with additional functionality including the capability to perform rendering in parallel and a client-server architecture enabling visualization and analysis to be performed on a server while being viewed and driven from a client. ParaView, like VTK, is open-sourced under a BSD license and its development is overseen by the commercial entity, Kitware, Inc. ParaView is multi-platform, extensible via its plugin architecture, and natively supports many common data analysis tasks and data formats. As it builds upon VTK, any VTK functionality can in principle be invoked. In practice not all VTK functionality is exposed by default but can easily be exposed or extended via the plugin architecture previously mentioned and discussed in more detail below. Exposing VTK functionality is as easy as writing a short XML file. In this talk I present the process of plugging into ParaView to do visualization and analysis of terabytes of data in real time.
Apache Drill [1] is a distributed system for interactive analysis of large-scale datasets, inspired by Google’s Dremel technology. It is a design goal to scale to 10,000 servers or more and to be able to process Petabytes of data and trillions of records in seconds. Since its inception in mid 2012, Apache Drill has gained widespread interest in the community. In this talk we focus on how Apache Drill enables interactive analysis and query at scale. First we walk through typical use cases and then delve into Drill's architecture, the data flow and query languages as well as data sources supported.
[1] http://incubator.apache.org/drill/
Essentials of Automations: Optimizing FME Workflows with Parameters – Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... – UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf – 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Key Trends Shaping the Future of Infrastructure.pdf – Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview – Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio’s cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Transcript: Selling digital books in 2024: Insights from industry leaders - T... – BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
GraphRAG is All You Need? LLM & Knowledge Graph – Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
DevOps and Testing slides at DASA Connect – Kari Kakkonen
Slides from me and Rik Marselis at the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps looks like. We also held a lovely workshop with the participants, exploring different ways to think about quality and testing in different parts of the DevOps infinity loop.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality – Inflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Generating a custom Ruby SDK for your web service or Rails API using Smithy – g2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
15. What is ?
• Brings R’s statistical functionality to the Oracle Database
• Eliminates R’s memory constraints
• Allows R to run on very large data sets
• Oracle R is architected for enterprise production infrastructure
• Automatically exploits database parallelism without requiring parallel R programming
• Oracle R leverages the latest R algorithms and packages
• R is an embedded component of the DBMS server
• Part of Oracle Advanced Analytics (+ODM)
16. Oracle R Architecture
[Diagram: Development (R workspace console) → Production (function push-down to the Oracle statistics engine for data transformation and statistics) → Consumption (OBIEE, Web Services)]
• Leverages SQL for data prep, analysis and an enhanced statistics engine
• R engine runs on database nodes for production enablement of R models
• Leverages Exadata: Oracle R workloads run in-database and can be bound to database nodes for workload isolation
• Enriches OBIEE dashboards with Oracle R statistics and analytics
17. Oracle Data Mining (ODM)
Data mining can answer questions that cannot be addressed through simple query and reporting techniques.
• Data Mining: Insight from discovering relationships
• Knowledge about what happened in the past
• Characterization, segmentation, comparisons, discrimination
• Descriptive models of patterns
• Predictive Analytics: Making better decisions and forecasts
• Knowledge about what is happening right now and in the future
• Classification and prediction of patterns
• Rule- and model-driven
18. Data Mining – Some Definitions
Supervised Learning
• Classification – Predict customer response to an affinity card program
• Regression – Predict a customer’s age
• Attribute Importance – Find the most significant predictors; data preparation
19. Data Mining – Some Definitions
Unsupervised Learning
• Anomaly Detection – Identify customer purchasing behavior that is significantly different from the norm
• Association Rules – Find the items that tend to be purchased together and specify their relationship (market basket analysis)
• Clustering – Segment demographic data into clusters and rank the probability that an individual will belong to a given cluster
• Feature Extraction – Group the attributes into general characteristics of the customers
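The supervised/unsupervised split in the definitions above can be made concrete with a toy sketch (plain Python, not ODM's API): a labelled 1-nearest-neighbour classifier next to an unlabelled two-means clustering over one-dimensional data. The ages and labels are hypothetical.

```python
# Supervised: labels are given, and the model predicts them for new data.
def nearest_neighbor(train, query):
    # 1-NN classification over (value, label) pairs: copy the closest label.
    return min(train, key=lambda vl: abs(vl[0] - query))[1]

train = [(18, "young"), (22, "young"), (55, "senior"), (61, "senior")]
print(nearest_neighbor(train, 25))  # 'young'

# Unsupervised: no labels; k-means discovers segments in the data itself.
def two_means(values, iters=10):
    a, b = min(values), max(values)  # initial centroids at the extremes
    for _ in range(iters):
        ca = [v for v in values if abs(v - a) <= abs(v - b)]
        cb = [v for v in values if abs(v - a) > abs(v - b)]
        a, b = sum(ca) / len(ca), sum(cb) / len(cb)  # recompute centroids
    return sorted([a, b])

print(two_means([18, 22, 25, 55, 61, 58]))  # two age-segment centroids
```

The classifier needs the "young"/"senior" labels up front; the clustering finds the same two age groups without ever seeing a label, which is exactly the distinction the tables draw.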
While it could never be described as a sleepy business, since there have been several profound changes in the course of its evolution, it doesn’t really take an industry pundit to observe that the current Analytics market is marked by an accelerating pace of change. Changes comparable to those now taking place in a matter of a few years took decades to play out in the early days of BI and EPM.
In the 80s we saw database reporting tools rule the roost, and most applications shipped with some sort of hardwired reporting capabilities built in, providing visibility but no subsequent interactivity. You could get your first question answered really well, but if you had a follow-on question, you were out of luck.
Come the 90s, most BI platforms evolved to three-tier architectures, supporting more users and subject-area-specific data marts and BI environments for functional areas such as marketing, sales and supply chain.
The broad-based adoption of the internet saw BI tools in the 2000s increase their footprint to become true analytical platforms deployed on enterprise data warehouses. These data warehouses supported the decision support needs of all users of an extended enterprise, with capabilities that spanned production reporting to highly interactive ad hoc analysis.
Big changes, no doubt, but played out over a 20+ year time horizon. In the last 2-3 years, though, we are seeing technology disruptions opening up new possibilities in Analytics at a pace that is nothing short of breathtaking:
- There is an explosion of business-relevant data now on the internet. It is incredibly varied, generated at great velocity and already enormous in volume. How will it be analyzed?
- Apple and others have revolutionized the tablet as an internet and general content consumption device that is now well ensconced within corporations, certainly at the highest echelons. What will analytics on these smaller and intensely personal devices come to mean?
- The real cost of in-memory technology has declined dramatically. What transformative power could this hold for companies looking to live – and win – “in the moment”?
- The maturity and consequent acceptance of the cloud has introduced a low-friction delivery model for software delivered as a service to enterprises. How will Analytics be transformed by, or how might it transform, the Cloud?
These dramatic changes are sweeping through the enterprise computing landscape now. They each come with their own set of challenges, but for those who view them instead as opportunities, we believe that tremendous competitive advantage can be unlocked. And we believe that Oracle Business Analytics provides you with the tools to do just that.
Enables Map-Reduce style R calculations with the Big Data Appliance and HDFS. Supports compute-intensive parallelism for simulations. ORCH provides optimized R algorithms that are robust, numerically accurate and linearly scalable on Hadoop and the Big Data Appliance: more cores achieve a proportional decrease in run times, and the experience matches that of R users. Algorithms include:
- Linear models and logistic models
- General feed-forward neural networks
- Regression models
- Matrix factorization (algorithms for large-scale matrix problems)
- K-means clustering
- PCA (principal component analysis)
- Correlations
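The Map-Reduce style of calculation mentioned above can be sketched in plain Python (a stand-in for, not the actual, ORCH API): each mapper emits a partial result for its data split and a reducer combines them, shown here for a distributed mean.

```python
from functools import reduce

# Each mapper summarizes its split as (sum, count); splits stand in for HDFS blocks.
def map_partial(split):
    return (sum(split), len(split))

# The reducer combines partials associatively, so it scales out across nodes.
def reduce_partials(p, q):
    return (p[0] + q[0], p[1] + q[1])

splits = [[1.0, 2.0], [3.0, 4.0, 5.0], [6.0]]
total, count = reduce(reduce_partials, map(map_partial, splits))
print(total / count)  # 3.5
```

The key design point is that the per-split partials are small and combine in any order, which is what lets more cores yield a proportional decrease in run time.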