info@rittmanmead.com www.rittmanmead.com @rittmanmead
Using Oracle Big Data SQL to add Hadoop + NoSQL to your Oracle Data Warehouse
Mark Rittman, CTO, Rittman Mead
SQL Celebration Day, Netherlands June 2016
•Many customers and organisations are now running initiatives around “big data”

•Some are IT-led and are looking for cost-savings around data warehouse storage + ETL

•Others are “skunkworks” projects in the marketing department that are now scaling-up

•Projects now emerging from pilot exercises

•And design patterns starting to emerge
Many Organisations are Running Big Data Initiatives
•Gives us an ability to store more data, at more detail, for longer

•Provides a cost-effective way to analyse vast amounts of data

•Hadoop & NoSQL technologies can give us “schema-on-read” capabilities

•There’s vast amounts of innovation in this area we can harness

•And it’s very complementary to Oracle BI & DW
Why is Hadoop of Interest to Us?
•Mark Rittman, Co-Founder of Rittman Mead

‣Oracle ACE Director, specialising in Oracle BI&DW

‣14 Years Experience with Oracle Technology

‣Regular columnist for Oracle Magazine

•Author of two Oracle Press Oracle BI books

‣Oracle Business Intelligence Developers Guide

‣Oracle Exalytics Revealed

‣Writer for Rittman Mead Blog : http://www.rittmanmead.com/blog

•Email : mark.rittman@rittmanmead.com

•Twitter : @markrittman
About the Speaker
Flexible Cheap Storage for Logs, Feeds + Social Data
[Diagram labels: $50k Hadoop node; raw data: voice + chat transcripts, call center logs, chat logs, iBeacon logs, website logs, CRM data, transactions, social feeds, demographics; real-time feeds, batch and API; Customer 360 apps, predictive models, SQL-on-Hadoop, business analytics]
Incorporate Hadoop Data Reservoirs into DW Design
Oracle Big Data Product Architecture
•Oracle Big Data Appliance - Engineered System for running Hadoop alongside Exadata

•Oracle Big Data Connectors - Utility from Oracle for feeding Hadoop data into Oracle

•Oracle Data Integrator EE Big Data Option - Add Spark, Pig data transforms to Oracle ODI

•Oracle BI Enterprise Edition - can connect to Hive, Impala for federated queries

•Oracle Big Data Discovery - data wrangling + visualization tool for Hadoop data reservoirs

•Oracle Big Data SQL - extend Oracle SQL language + processing to Hadoop
Oracle Software Initiatives around Big Data
Hang on though… this is Hadoop, where everything is non-relational.
Isn’t SQL on Hadoop somewhat missing the point?
Where Can SQL Processing Be Useful with Hadoop?
•Hadoop is not a cheap substitute for enterprise DW platforms - don’t use it like this

•But adding SQL processing and abstraction can help in many scenarios:

• Query access to data stored in Hadoop as an archive

• Aggregating, sorting, filtering and transforming data

• Set-based transformation capabilities for other frameworks (e.g. Spark)

• Ad-hoc analysis and data discovery in real time

• Providing tabular abstractions over complex datatypes
SQL!
Though SQL isn’t actually relational
According to Chris Date, SQL is just mappings
Ted Codd used predicate calculus
and there’s never been a mainstream relational DBMS
but it is the standard language for RDBMSs
and it’s great for set-based transforms & queries
so: Yes, SQL!
•Originally developed at Facebook, now foundational within the Hadoop project

•Allows users to query Hadoop data using SQL-like language

•Tabular metadata layer that overlays files, can interpret semi-structured data (e.g. JSON)

•Generates MapReduce code to return required data

•Extensible through SerDes and Storage Handlers

•JDBC and ODBC drivers for most platforms/tools

•Perfect for set-based access + batch ETL work
Apache Hive : SQL Metadata + Engine over Hadoop
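The "tabular metadata layer that overlays files" idea can be sketched in a few lines. This is only an illustration of applying column definitions at read time, not Hive's implementation; the sample log lines, `ACCESS_LOG_COLUMNS` and `read_table` are hypothetical names made up for the sketch.

```python
import json

# Raw "files" as they might sit in HDFS: one JSON document per line (made-up data).
RAW_LINES = [
    '{"host": "10.0.0.1", "path": "/blog", "status": 200}',
    '{"host": "10.0.0.2", "path": "/about", "status": 404, "agent": "curl"}',
]

# Hypothetical table definition: just column names, applied only when reading.
ACCESS_LOG_COLUMNS = ["host", "status"]


def read_table(lines, columns):
    """Project each raw JSON line onto the table's columns (missing fields become None)."""
    for line in lines:
        record = json.loads(line)
        yield tuple(record.get(col) for col in columns)


rows = list(read_table(RAW_LINES, ACCESS_LOG_COLUMNS))
print(rows)
```

The files themselves are never rewritten; a second "table" with different columns can be laid over the same lines.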
•Hive uses an RDBMS metastore to hold table and column definitions in schemas

•Hive tables then map onto HDFS-stored files

‣Managed tables

‣External tables

•Oracle-like query optimizer, compiler, executor

•JDBC and ODBC drivers, plus CLI etc.
How Does Hive Translate SQL into MapReduce?
[Diagram labels: Hue, CLI, JDBC / ODBC, Hive Thrift Server, Parser, Planner, Execution Engine, Metastore, MapReduce, HDFS]
hive> select count(*) from src_customer;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201303171815_0003, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp…
Kill Command = /usr/lib/hadoop-0.20/bin/hadoop job -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201303171815_0003
2013-04-17 04:06:59,867 Stage-1 map = 0%, reduce = 0%
2013-04-17 04:07:03,926 Stage-1 map = 100%, reduce = 0%
2013-04-17 04:07:14,040 Stage-1 map = 100%, reduce = 33%
2013-04-17 04:07:15,049 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201303171815_0003
OK
25
Time taken: 22.21 seconds
[Callouts: HiveQL query; MapReduce job submitted; results returned]
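To make the translation concrete, here is a rough in-memory sketch of the map, shuffle and reduce steps that a `select count(*)` boils down to. It is not the code Hive generates; the function names are hypothetical, and the 25 stand-in rows were chosen only to match the count in the console output.

```python
from collections import defaultdict


def map_phase(rows):
    """Emit the same key with the value 1 for every input row."""
    for _ in rows:
        yield ("count", 1)


def shuffle(pairs):
    """Group mapper output by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key; with one key this needs only one reducer."""
    return {key: sum(values) for key, values in grouped.items()}


# Stand-in for the src_customer table: 25 made-up rows.
src_customer = [{"customer_id": i} for i in range(25)]
result = reduce_phase(shuffle(map_phase(src_customer)))
print(result["count"])
```

Most of the 22 seconds in the real run goes on job submission and scheduling rather than on this arithmetic, which is why the approach suits batch work better than ad-hoc queries.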
But Hive is too slow for ad-hoc queries
•Cloudera’s answer to Hive query response time issues

•MPP SQL query engine running on Hadoop, bypasses MapReduce for direct data access

•Mostly in-memory, but spills to disk if required

•Uses Hive metastore to access Hive table metadata

•Similar SQL dialect to Hive - not as rich though and no support for Hive SerDes, storage handlers etc.
Cloudera Impala - Fast, MPP-style Access to Hadoop Data
•Apache Drill is another SQL-on-Hadoop project that focuses on schema-free data discovery

•Inspired by Google Dremel, innovation is querying raw data with schema optional

•Automatically infers and detects schema from semi-structured datasets and NoSQL DBs

•Join across different silos of data e.g. JSON records, Hive tables and HBase database

•Aimed at different use-cases than Hive - low-latency queries, discovery (think Endeca vs OBIEE)
Apache Drill - SQL for Schema-Free Data Discovery
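As a toy illustration of "schema optional" querying, the sketch below discovers field names and types by looking at the records themselves. It only shows the idea of inferring a schema from semi-structured data; it does not reflect how Drill does this internally, and `infer_schema` and the sample records are hypothetical.

```python
import json


def infer_schema(json_lines):
    """Return {field name: set of Python type names} seen across all records."""
    schema = {}
    for line in json_lines:
        for field, value in json.loads(line).items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema


# Made-up records; the second one carries a field the first lacks.
records = [
    '{"id": 1, "name": "a"}',
    '{"id": 2, "name": "b", "tags": ["x"]}',
]
schema = infer_schema(records)
print(schema)
```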
•A replacement for Hive, but uses Hive concepts and data dictionary (metastore)

•MPP (Massively Parallel Processing) query engine that runs within Hadoop

‣Uses same file formats, security, resource management as Hadoop

•Processes queries in-memory

•Accesses standard HDFS file data

•Option to use Apache Avro, RCFile, LZO or Parquet (column-store)

•Designed for interactive, real-time SQL-like access to Hadoop
How Impala Works
[Diagram labels: BI Server, Presentation Server, Cloudera Impala ODBC Driver; multi-node Hadoop cluster with Impala alongside Hadoop HDFS etc. on each of five nodes]
but sometimes, you need the real thing
•Originally part of Oracle Big Data 4.0 (BDA-only)

‣Also required Oracle Database 12c, Oracle Exadata Database Machine

•Extends Oracle Data Dictionary to cover Hive

•Extends Oracle SQL and SmartScan to Hadoop

•Extends Oracle Security Model over Hadoop

‣Fine-grained access control

‣Data redaction, data masking

‣Uses fast C-based readers where possible (vs. Hive MapReduce generation)

‣Map Hadoop parallelism to Oracle PQ

‣Big Data SQL engine works on top of YARN

‣Like Spark, Tez, MR2
Oracle Big Data SQL
[Diagram labels: SQL Queries, Exadata Database Server, Oracle Big Data SQL, SmartScan (shown twice), Exadata Storage Servers, Hadoop Cluster]
•As with other next-gen SQL access layers, uses common Hive metastore table metadata

•Leverages Hadoop standard APIs for HDFS file access, metadata integration etc.
Leverages Hive Metastore and Hadoop file access APIs
•Brings query-offloading features of Exadata to Oracle Big Data Appliance

•Query across both Oracle and Hadoop sources

•Intelligent query optimisation applies SmartScan close to ALL data

•Use same SQL dialect across both sources

•Apply same security rules, policies, user access rights across both sources
Extending SmartScan, and Oracle SQL, Across All Data
•Read data from HDFS Data Node

‣Direct-path reads

‣C-based readers when possible

‣Use native Hadoop classes otherwise

•Translate bytes to Oracle

•Apply SmartScan to Oracle bytes

‣Apply filters

‣Project columns

‣Parse JSON/XML

‣Score models
How Big Data SQL Accesses Hadoop (HDFS) Data
[Diagram labels: Disks, Data Node, Big Data SQL Server, External Table Services, Smart Scan, RecordReader, SerDe, with steps numbered 1 to 3]
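The read, translate and SmartScan stages can be sketched as a small pipeline running next to the data. This is an illustrative simplification, not Oracle's implementation: `record_reader`, `serde` and `smart_scan` are hypothetical functions named after the diagram's labels, and the comma-delimited block is made-up data.

```python
def record_reader(raw_bytes):
    """Step 1: split a block of newline-delimited bytes into text records."""
    return raw_bytes.decode("utf-8").splitlines()


def serde(record, columns):
    """Step 2: turn one comma-delimited record into a column-name -> value dict."""
    return dict(zip(columns, record.split(",")))


def smart_scan(raw_bytes, columns, predicate, projection):
    """Step 3: apply the filter and keep only the projected columns before returning rows."""
    for record in record_reader(raw_bytes):
        row = serde(record, columns)
        if predicate(row):
            yield tuple(row[col] for col in projection)


block = b"1,Brazil,/home\n2,France,/blog\n3,Brazil,/about\n"
rows = list(
    smart_scan(
        block,
        columns=["sess_id", "source_country", "page"],
        predicate=lambda r: r["source_country"] == "Brazil",
        projection=["sess_id"],
    )
)
print(rows)
```

Only the filtered, projected rows leave the node, which is the point of doing the scan close to the data.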
•“Query Franchising – dispatch of query processing to self-similar compute agents on disparate systems without loss of operational fidelity”
•Contrast with OBIEE which provides a query federation capability over Hadoop

•Sends sub-queries to each data source

•Relies on each data source’s native query engine, and resource management

•Query franchising using Big Data SQL ensures consistent resource management 

•And contrast with SQL translation tools (e.g. Oracle SQL to Impala)

•Either limits Oracle SQL to the subset that Hive, Impala supports

•Or translation engine has to transform each Oracle feature into Hive, Impala SQL
Query Franchising vs. SQL Translation / Federation
•Oracle Database 12c 12.1.0.2.0 with Big Data SQL option can view Hive table metadata

‣Linked by Exadata configuration steps to one or more BDA clusters

•DBA_HIVE_TABLES and USER_HIVE_TABLES expose Hive metadata

•Oracle SQL*Developer 4.0.3, with Cloudera Hive drivers, can connect to Hive metastore
View Hive Table Metadata in the Oracle Data Dictionary
SQL> col database_name for a30
SQL> col table_name for a30
SQL> select database_name, table_name
  2  from dba_hive_tables;

DATABASE_NAME                  TABLE_NAME
------------------------------ ------------------------------
default                        access_per_post
default                        access_per_post_categories
default                        access_per_post_full
default                        apachelog
default                        categories
default                        countries
default                        cust
default                        hive_raw_apache_access_log
•Big Data SQL accesses Hive tables through external table mechanism

‣ORACLE_HIVE external table type imports Hive metastore metadata

‣ORACLE_HDFS requires metadata to be specified

•Access parameters cluster and tablename specify Hive table source and BDA cluster
Hive Access through Oracle External Tables + Hive Driver
CREATE TABLE access_per_post_categories(
  hostname varchar2(100),
  request_date varchar2(100),
  post_id varchar2(10),
  title varchar2(200),
  author varchar2(100),
  category varchar2(100),
  ip_integer number)
organization external
  (type oracle_hive
   default directory default_dir
   access parameters(com.oracle.bigdata.tablename=default.access_per_post_categories));
•Run normal Oracle SQL from the Oracle Database server

•Big Data SQL query franchising then uses agents on Hadoop nodes to query and return data independent of YARN scheduling; Oracle Database combines and returns full results
Running Oracle SQL on Hadoop Data Nodes
SELECT w.sess_id, w.cust_id, c.name
FROM web_logs w, customers c
WHERE w.source_country = 'Brazil'
AND c.customer_id = w.cust_id;
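A simplified model of how such a query might be split: each Hadoop node's agent filters its own slice of web_logs, and the database joins the returned rows to customers. The data, `NODE_BLOCKS`, `agent_scan` and `run_query` are all hypothetical, and the sketch ignores parallelism and the real agent protocol.

```python
# Made-up slices of web_logs, keyed by the Hadoop node that stores them.
NODE_BLOCKS = {
    "node1": [
        {"sess_id": 1, "cust_id": 10, "source_country": "Brazil"},
        {"sess_id": 2, "cust_id": 11, "source_country": "France"},
    ],
    "node2": [
        {"sess_id": 3, "cust_id": 11, "source_country": "Brazil"},
    ],
}

# Made-up customers table held in the database: customer_id -> name.
CUSTOMERS = {10: "Ana", 11: "Luis"}


def agent_scan(rows, country):
    """Runs on a Hadoop node: filter locally and return only the needed columns."""
    return [(r["sess_id"], r["cust_id"]) for r in rows if r["source_country"] == country]


def run_query(country):
    """Runs on the database: gather each agent's rows and join them to customers."""
    result = []
    for node in sorted(NODE_BLOCKS):
        for sess_id, cust_id in agent_scan(NODE_BLOCKS[node], country):
            if cust_id in CUSTOMERS:
                result.append((sess_id, cust_id, CUSTOMERS[cust_id]))
    return result


print(run_query("Brazil"))
```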
•OBIEE can access Hadoop data via Hive, but it’s slow 

•(Impala only has subset of Oracle SQL capabilities)

•Big Data SQL presents all data to OBIEE as Oracle data, with full advanced analytic capabilities across both platforms
Example : Combining Hadoop + Oracle Data for BI
[Screenshot labels: Hive weblog activity table; Oracle dimension lookup tables; combined output in report form]
•Not all functions can be offloaded to Hadoop tier

•Even for non-offloadable operations Big Data SQL will perform column pruning and datatype conversion (which saves a lot of resources)

•Other operations (non-offloadable) will be done on the database side

•Requires Oracle Database 12.1.0.2 + patchset, and per-disk licensing for Big Data SQL

•You need an Oracle Big Data Appliance, and Oracle Exadata, to use Big Data SQL
Restrictions when using Oracle Big Data SQL
SELECT name FROM v$sqlfn_metadata WHERE offloadable = 'YES';
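The column-pruning point can be shown with a small sketch: even when a filter cannot be offloaded, the Hadoop side still returns only the columns the query needs, and the database applies the remaining filter. The names `hadoop_scan`, `database_filter` and `custom_match` are hypothetical, and `custom_match` merely stands in for some non-offloadable function.

```python
def hadoop_scan(rows, needed_columns, offloaded_predicate=None):
    """Hadoop side: always prune columns; filter only if a predicate was offloaded."""
    for row in rows:
        if offloaded_predicate is None or offloaded_predicate(row):
            yield {col: row[col] for col in needed_columns}


def database_filter(rows, predicate):
    """Database side: apply whatever could not be pushed down."""
    return [row for row in rows if predicate(row)]


def custom_match(row):
    """Stand-in for a function that cannot be offloaded (an assumption for this sketch)."""
    return row["url"].lower().startswith("/blog")


# Made-up rows with a wide payload column the query never references.
WEB_LOGS = [
    {"sess_id": 1, "url": "/Blog/a", "payload": "x" * 100},
    {"sess_id": 2, "url": "/about", "payload": "y" * 100},
]

shipped = list(hadoop_scan(WEB_LOGS, ["sess_id", "url"]))
result = database_filter(shipped, custom_match)
print(result)
```

Both rows cross the wire because the filter runs on the database, but the wide `payload` column never does.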
•From Big Data SQL 3.0, commodity hardware can be used instead of BDA and Exadata

•Oracle Database 12.1.0.2 on x86_64 with Jan/Apr Proactive Bundle Patches

•Cloudera CDH 5.5 or Hortonworks HDP 2.3 on RHEL/OEL6

•See MOS Doc ID 2119369.1 - note cannot mix Engineered/Non-Engineered platforms
Running Big Data SQL on Commodity Hardware
•No functional differences when running Big Data SQL on commodity hardware

•External table capability lives with the database, and the performance functionality with the BDS cell software

•All BDS features (SmartScan, offloading, storage indexes etc.) still available

•But hardware can be a factor now, as we’re pushing processing down and data up the wire

•1Gb Ethernet can be too slow; 10Gb is a minimum (i.e. no InfiniBand)

•If you run on an undersized system you may see bottlenecks on the DB side. 
Big Data SQL on Commodity Hardware Considerations
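A back-of-the-envelope calculation shows why the link speed matters. The 1 TB volume is a hypothetical figure, and the sketch assumes ideal line rate with no protocol overhead, so real transfers would be slower.

```python
def transfer_seconds(data_bytes, link_bits_per_second):
    """Time to move data_bytes over a link at its ideal line rate."""
    return data_bytes * 8 / link_bits_per_second


ONE_TB = 10**12  # hypothetical volume of rows returned to the database, in bytes

t_1gb = transfer_seconds(ONE_TB, 1 * 10**9)    # 1Gb Ethernet
t_10gb = transfer_seconds(ONE_TB, 10 * 10**9)  # 10Gb Ethernet

print(f"1Gb:  {t_1gb:.0f} s")
print(f"10Gb: {t_10gb:.0f} s")
```

Under these assumptions that is 8,000 seconds (over two hours) against 800 seconds, so anything SmartScan does not filter out on the Hadoop side is paid for on the wire.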
But Hadoop is more than simple HDFS files
•Subsequent releases of Big Data SQL have extended its Hadoop capabilities

•Support for Hive storage handlers (HBase, MongoDB etc)

•Hive partition elimination

•Better, more efficient access to Hadoop data

•Storage Indexes

•Predicate Push-Down for Parquet, ORC, HBase, Oracle NoSQL

•Bloom Filters

•Coming with Oracle Database 12.2

•Big Data-aware optimizer

•Dense Bloom Filters

•Oracle managed Big Data partitions
Going beyond Fast Unified Query Access to HDFS Data
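For readers unfamiliar with Bloom filters, here is a generic sketch of the technique used to cut down rows before a join: build the filter from the join keys of the smaller table, then discard rows from the large scan that cannot match. This is a textbook Bloom filter, not Oracle's implementation; the class, sizes and sample data are assumptions.

```python
import hashlib


class BloomFilter:
    """Compact set membership test: false positives possible, false negatives not."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array packed into a single integer

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests of the key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


# Build the filter from the small (database-side) table's join keys.
customer_ids = [10, 11]
bloom = BloomFilter()
for cid in customer_ids:
    bloom.add(cid)

# Apply it while scanning the large (Hadoop-side) table; made-up rows.
web_logs = [{"sess_id": s, "cust_id": c} for s, c in [(1, 10), (2, 99), (3, 11)]]
survivors = [row for row in web_logs if bloom.might_contain(row["cust_id"])]
print(len(survivors))
```

Rows that pass may still be false positives, so the join itself is still evaluated afterwards; the filter only reduces how much data reaches it.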
•Hive Storage handlers give Hive the ability to access data from non-HDFS sources

•MongoDB

•HBase

•Oracle NoSQL database

•Run HiveQL queries against NoSQL DBs

•From BDS 1.1, Hive storage handlers can be used with Big Data SQL

•Only MongoDB, HBase and NoSQL currently “supported”

•Others should work but not tested
Big Data SQL and Hive Storage Handlers
•Create Hive table over HBase database as normal

•Typically done to add INSERT and DELETE capabilities to Hive, for DW dimension ETL

•Create Oracle external table as normal, using ORACLE_HIVE driver
Use of Hive Storage Handlers Transparent to BDS
CREATE EXTERNAL TABLE tablename colname coltype[, colname coltype,...]
ROW FORMAT
SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'serialization.format'='1',
  'hbase.columns.mapping'=':key,value:key,value:

CREATE TABLE tablename(colname colType[, colname colType...])
ORGANIZATION EXTERNAL
  (TYPE ORACLE_HIVE DEFAULT DIRECTORY DEFAULT_DIR
   ACCESS PARAMETERS
   (access parameters)
  )
REJECT LIMIT UNLIMITED;
•From Big Data SQL 2.0, Storage Indexes are automatically created in Big Data SQL agents

•Check index before reading blocks – Skip unnecessary I/Os

•An average of 65% faster than BDS 1.x

•Up to 100x faster for highly selective queries

•Columns in SQL are mapped to fields in the HDFS file via External Table Definitions

•Min / max value is recorded for each HDFS Block in a storage index
Big Data SQL Storage Indexes
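The min / max mechanism can be sketched as follows: record the smallest and largest value of a column per block, then read only blocks whose range overlaps the query's range. This is a minimal illustration of the idea, not the BDS agent code; the function names, the `amount` column and the sample blocks are hypothetical.

```python
def build_storage_index(blocks, column):
    """Record (min, max) of the column for each block."""
    return [
        (min(row[column] for row in block), max(row[column] for row in block))
        for block in blocks
    ]


def blocks_to_read(index, low, high):
    """Positions of blocks whose [min, max] range overlaps the range [low, high]."""
    return [i for i, (mn, mx) in enumerate(index) if mx >= low and mn <= high]


# Three made-up HDFS blocks of a table with an "amount" column.
blocks = [
    [{"amount": 5}, {"amount": 40}],
    [{"amount": 100}, {"amount": 180}],
    [{"amount": 200}, {"amount": 950}],
]
index = build_storage_index(blocks, "amount")

# WHERE amount BETWEEN 150 AND 250 only needs blocks 1 and 2.
print(blocks_to_read(index, 150, 250))
```

A highly selective predicate skips most blocks outright, which fits the slide's claim that the biggest gains come on highly selective queries.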
•Hadoop supports predicate push-down through several mechanisms (filetypes, Hive partition pruning etc.)

•Original BDS 1.0 supported Hive predicate push-down as part of SmartScan

•BDS 3.0 extends this by pushing SARGable (Search ARGument ABLE) predicates

•Into Parquet and ORCFile to reduce I/O when reading files from disk

•Into HBase and Oracle NoSQL database to drive subscans of data from remote DB

•Oracle Database 12.2 will add more optimisations

•Columnar caching

•Big Data-Aware Query Optimizer

•Managed Hadoop partitions

•Dense Bloom Filters
Extending Predicate Push-Down Beyond Hive
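As a rough guide to what "SARGable" means in practice: a predicate that compares a bare column with a constant can be handed to the file format or remote store, while one that wraps the column in a function usually cannot. The sketch below applies that general rule of thumb; the operator list, the `Predicate` class and the split are assumptions, not BDS's actual rules.

```python
from dataclasses import dataclass

# Hypothetical set of comparison operators treated as pushable in this sketch.
SARGABLE_OPS = {"=", "<", "<=", ">", ">=", "IN", "BETWEEN"}


@dataclass
class Predicate:
    column: str
    op: str
    value: object
    column_wrapped_in_function: bool = False  # e.g. UPPER(name) = 'ANA'


def is_sargable(pred):
    """A bare column compared with a constant can be used as a search argument."""
    return pred.op in SARGABLE_OPS and not pred.column_wrapped_in_function


def split_predicates(predicates):
    """Return (pushed down to the storage layer, left for later evaluation)."""
    pushed = [p for p in predicates if is_sargable(p)]
    residual = [p for p in predicates if not is_sargable(p)]
    return pushed, residual


preds = [
    Predicate("amount", ">", 100),
    Predicate("name", "=", "ANA", column_wrapped_in_function=True),
]
pushed, residual = split_predicates(preds)
print([p.column for p in pushed], [p.column for p in residual])
```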
•Typically a one-way street - queries run in Hadoop but results delivered through Oracle

•What if you want to load data into Hadoop, update data, do Hadoop>Hadoop transforms?

•Still requires formal Hive metadata, whereas direction is towards Drill & schema-free queries

•What if you have other RDBMSs as well as Oracle RDBMS? 

•Trend is towards moving all high-end analytic workloads into Hadoop - BDS is Oracle-only

•Requires Oracle 12c database, no 11g support

•And cost … BDS is $3k/Hadoop disk drive

•Can cost more than an Oracle BDA

•High-end, high-cost Oracle-centric solution

•of course!
… So What’s the Catch?
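To see how the per-disk price adds up, here is the arithmetic for a hypothetical cluster. The $3k per disk figure is the one quoted on the slide; the node and disk counts are assumptions chosen only for illustration, and actual pricing and discounts will differ.

```python
LIST_PRICE_PER_DISK_USD = 3_000  # figure quoted on the slide


def bds_licence_cost(nodes, disks_per_node, price_per_disk=LIST_PRICE_PER_DISK_USD):
    """Licence cost when every Hadoop disk drive in the cluster is licensed."""
    return nodes * disks_per_node * price_per_disk


# Hypothetical cluster: 6 nodes with 12 data disks each = 72 disks.
cost = bds_licence_cost(nodes=6, disks_per_node=12)
print(f"${cost:,}")
```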
•Rich, comprehensive SQL access to all enterprise data

•Extend Oracle security, advanced analytic features and metadata across Hadoop & NoSQL
Oracle Big Data SQL Vision : Unified Query
http://www.rittmanmead.com
info@rittmanmead.com www.rittmanmead.com @rittmanmead
Using Oracle Big Data SQL to add Hadoop + NoSQL

to your Oracle Data Warehouse
Mark Rittman, CTO, Rittman Mead
SQL Celebration Day, Netherlands June 2016

More Related Content

What's hot

OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle CloudOTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle CloudMark Rittman
 
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...Mark Rittman
 
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...Mark Rittman
 
The Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data PlatformsThe Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data PlatformsMark Rittman
 
Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...Mark Rittman
 
How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017Rittman Analytics
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...Rittman Analytics
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyMark Rittman
 
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...Mark Rittman
 
Deploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle CloudDeploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle CloudMark Rittman
 
Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016StampedeCon
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...StampedeCon
 
Navigating the World of User Data Management and Data Discovery
Navigating the World of User Data Management and Data DiscoveryNavigating the World of User Data Management and Data Discovery
Navigating the World of User Data Management and Data DiscoveryDataWorks Summit/Hadoop Summit
 
Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016StampedeCon
 
Big Data Architecture Workshop - Vahid Amiri
Big Data Architecture Workshop -  Vahid AmiriBig Data Architecture Workshop -  Vahid Amiri
Big Data Architecture Workshop - Vahid Amiridatastack
 
Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wor...
Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wor...Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wor...
Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wor...Kolja Manuel Rödel
 
Hadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data WarehouseHadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data WarehouseDataWorks Summit
 
Building a Data Lake on AWS
Building a Data Lake on AWSBuilding a Data Lake on AWS
Building a Data Lake on AWSGary Stafford
 
SQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle ProfessionalSQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle ProfessionalMichael Rainey
 

What's hot (20)

OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle CloudOTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
 
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
 
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
 
The Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data PlatformsThe Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data Platforms
 
Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...
 
How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017How a Tweet Went Viral - BIWA Summit 2017
How a Tweet Went Viral - BIWA Summit 2017
 
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle...
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
 
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
 
Deploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle CloudDeploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle Cloud
 
Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
 
Navigating the World of User Data Management and Data Discovery
Navigating the World of User Data Management and Data DiscoveryNavigating the World of User Data Management and Data Discovery
Navigating the World of User Data Management and Data Discovery
 
Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016Turn Data Into Actionable Insights - StampedeCon 2016
Turn Data Into Actionable Insights - StampedeCon 2016
 
Big Data Architecture Workshop - Vahid Amiri
Big Data Architecture Workshop -  Vahid AmiriBig Data Architecture Workshop -  Vahid Amiri
Big Data Architecture Workshop - Vahid Amiri
 
Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wor...
Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wor...Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wor...
Hadoop Data Lake vs classical Data Warehouse: How to utilize best of both wor...
 
Hadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data WarehouseHadoop and Enterprise Data Warehouse
Hadoop and Enterprise Data Warehouse
 
Building a Data Lake on AWS
Building a Data Lake on AWSBuilding a Data Lake on AWS
Building a Data Lake on AWS
 
SQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle ProfessionalSQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle Professional
 

Viewers also liked

Rolling stones power point o
Rolling stones power point oRolling stones power point o
Rolling stones power point ochewitt5
 
Where are the slides?
Where are the slides?Where are the slides?
Where are the slides?Robin Moffatt
 
Spark Summit EU talk by Simon Whitear
Spark Summit EU talk by Simon WhitearSpark Summit EU talk by Simon Whitear
Spark Summit EU talk by Simon WhitearSpark Summit
 
Spark Summit EU talk by Sudeep Das and Aish Faenton
Spark Summit EU talk by Sudeep Das and Aish FaentonSpark Summit EU talk by Sudeep Das and Aish Faenton
Spark Summit EU talk by Sudeep Das and Aish FaentonSpark Summit
 
Connecting Hadoop and Oracle
Connecting Hadoop and OracleConnecting Hadoop and Oracle
Connecting Hadoop and OracleTanel Poder
 
Cassandra at eBay - Cassandra Summit 2012
Cassandra at eBay - Cassandra Summit 2012Cassandra at eBay - Cassandra Summit 2012
Cassandra at eBay - Cassandra Summit 2012Jay Patel
 
Realtime Analytical Query Processing and Predictive Model Building on High Di...
Realtime Analytical Query Processing and Predictive Model Building on High Di...Realtime Analytical Query Processing and Predictive Model Building on High Di...
Realtime Analytical Query Processing and Predictive Model Building on High Di...Spark Summit
 
Aws seminar-tokyo dan-jp-final-publish
Aws seminar-tokyo dan-jp-final-publishAws seminar-tokyo dan-jp-final-publish
Aws seminar-tokyo dan-jp-final-publishawsadovantageseminar
 
Leticia alonso inmaculada concepcion
Leticia alonso inmaculada concepcionLeticia alonso inmaculada concepcion
Leticia alonso inmaculada concepcionABNCFIE VALLADOLID
 
Employee of the month
Employee of the monthEmployee of the month
Employee of the monthAdam Chard
 
PwC 2016 Top Issues - The Aging Workforce
PwC 2016 Top Issues - The Aging WorkforcePwC 2016 Top Issues - The Aging Workforce
PwC 2016 Top Issues - The Aging WorkforceTodd DeStefano
 
Barrow_Quarterly_1997_Physical_Aspects_of_Stx_Radiosurgery
Barrow_Quarterly_1997_Physical_Aspects_of_Stx_RadiosurgeryBarrow_Quarterly_1997_Physical_Aspects_of_Stx_Radiosurgery
Barrow_Quarterly_1997_Physical_Aspects_of_Stx_RadiosurgeryJeffrey A. Fiedler
 

Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Warehouse

  • 1. Using Oracle Big Data SQL to add Hadoop + NoSQL to your Oracle Data Warehouse
 Mark Rittman, CTO, Rittman Mead
 SQL Celebration Day, Netherlands, June 2016
  • 2. Many Organisations are Running Big Data Initiatives
 •Many customers and organisations are now running initiatives around “big data”
 •Some are IT-led and are looking for cost-savings around data warehouse storage + ETL
 •Others are “skunkworks” projects in the marketing department that are now scaling up
 •Projects now emerging from pilot exercises
 •And design patterns starting to emerge
  • 3. Why is Hadoop of Interest to Us?
 •Gives us an ability to store more data, at more detail, for longer
 •Provides a cost-effective way to analyse vast amounts of data
 •Hadoop & NoSQL technologies can give us “schema-on-read” capabilities
 •There are vast amounts of innovation in this area we can harness
 •And it’s very complementary to Oracle BI & DW
  • 4. About the Speaker
 •Mark Rittman, Co-Founder of Rittman Mead
 ‣Oracle ACE Director, specialising in Oracle BI&DW
 ‣14 Years Experience with Oracle Technology
 ‣Regular columnist for Oracle Magazine
 •Author of two Oracle Press Oracle BI books
 ‣Oracle Business Intelligence Developers Guide
 ‣Oracle Exalytics Revealed
 ‣Writer for Rittman Mead Blog : http://www.rittmanmead.com/blog
 •Email : mark.rittman@rittmanmead.com
 •Twitter : @markrittman
  • 5. Flexible Cheap Storage for Logs, Feeds + Social Data
 [Diagram labels: $50k Hadoop Node; Raw Data: Voice + Chat Transcripts, Call Center Logs, Chat Logs, iBeacon Logs, Website Logs, CRM Data, Transactions, Social Feeds, Demographics; Real-time Feeds, batch and API; Customer 360 Apps; Predictive Models; SQL-on-Hadoop; Business analytics]
  • 8. Oracle Software Initiatives around Big Data
 •Oracle Big Data Appliance - Engineered System for running Hadoop alongside Exadata
 •Oracle Big Data Connectors - Utility from Oracle for feeding Hadoop data into Oracle
 •Oracle Data Integrator EE Big Data Option - Add Spark, Pig data transforms to Oracle ODI
 •Oracle BI Enterprise Edition - can connect to Hive, Impala for federated queries
 •Oracle Big Data Discovery - data wrangling + visualization tool for Hadoop data reservoirs
 •Oracle Big Data SQL - extend Oracle SQL language + processing to Hadoop
  • 11. Where everything is non-relational
  • 12. isn’t SQL on Hadoop somewhat missing the point?
  • 14. Where Can SQL Processing Be Useful with Hadoop?
 •Hadoop is not a cheap substitute for enterprise DW platforms - don’t use it like this
 •But adding SQL processing and abstraction can help in many scenarios:
 • Query access to data stored in Hadoop as an archive
 • Aggregating, sorting, filtering and transforming data
 • Set-based transformation capabilities for other frameworks (e.g. Spark)
 • Ad-hoc analysis and data discovery in real time
 • Providing tabular abstractions over complex datatypes
 [Slide callouts: “SQL!” / “Though SQL isn’t actually relational, according to Chris Date” / “SQL is just mappings; Ted Codd used Predicate Calculus” / “and there’s never been a mainstream relational DBMS” / “but it is the standard language for RDBMSs” / “and it’s great for set-based transforms & queries” / “so Yes SQL!”]
  • 15. Apache Hive : SQL Metadata + Engine over Hadoop
 •Originally developed at Facebook, now foundational within the Hadoop project
 •Allows users to query Hadoop data using SQL-like language
 •Tabular metadata layer that overlays files, can interpret semi-structured data (e.g. JSON)
 •Generates MapReduce code to return required data
 •Extensible through SerDes and Storage Handlers
 •JDBC and ODBC drivers for most platforms/tools
 •Perfect for set-based access + batch ETL work
  • 16. How Does Hive Translate SQL into MapReduce?
 •Hive uses an RDBMS metastore to hold table and column definitions in schemas
 •Hive tables then map onto HDFS-stored files
 ‣Managed tables
 ‣External tables
 •Oracle-like query optimizer, compiler, executor
 •JDBC and ODBC drivers, plus CLI etc
 [Diagram labels: Hue, CLI, JDBC / ODBC, Hive Thrift Server, Parser, Planner, Execution Engine, Metastore, MapReduce, HDFS]
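 The metastore idea on slide 16 (table and column definitions held separately from the files they describe) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `METASTORE` and `FILES` dictionaries, the file path and the column names are hypothetical stand-ins, not Hive's actual metastore schema or API.

```python
import csv
import io

# Hypothetical stand-in for HDFS: path -> raw file contents.
FILES = {
    "/data/src_customer/part-0000.csv": "1,Alice,NL\n2,Bob,UK\n3,Carol,NL\n",
}

# Hypothetical stand-in for the metastore: table name -> columns + file location.
METASTORE = {
    "src_customer": {
        "columns": [("cust_id", int), ("name", str), ("country", str)],
        "location": "/data/src_customer/part-0000.csv",
    }
}


def scan(table):
    """Yield rows as dicts, applying the table definition only at read time."""
    meta = METASTORE[table]
    reader = csv.reader(io.StringIO(FILES[meta["location"]]))
    for raw in reader:
        yield {
            name: cast(value)
            for (name, cast), value in zip(meta["columns"], raw)
        }


rows = list(scan("src_customer"))
```

 The raw file never changes; only the table definition decides how its bytes are interpreted, which is the "schema-on-read" behaviour mentioned earlier in the deck.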
  • 17. How Does Hive Translate SQL into MapReduce?
 hive> select count(*) from src_customer;
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
 set hive.exec.reducers.bytes.per.reducer=
 In order to limit the maximum number of reducers:
 set hive.exec.reducers.max=
 In order to set a constant number of reducers:
 set mapred.reduce.tasks=
 Starting Job = job_201303171815_0003, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp…
 Kill Command = /usr/lib/hadoop-0.20/bin/hadoop job -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201303171815_0003
 
 2013-04-17 04:06:59,867 Stage-1 map = 0%, reduce = 0%
 2013-04-17 04:07:03,926 Stage-1 map = 100%, reduce = 0%
 2013-04-17 04:07:14,040 Stage-1 map = 100%, reduce = 33%
 2013-04-17 04:07:15,049 Stage-1 map = 100%, reduce = 100%
 Ended Job = job_201303171815_0003
 OK
 25
 Time taken: 22.21 seconds
 [Slide callouts: HiveQL Query; MapReduce Job submitted; Results returned]
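 The session on slide 17 shows a single `count(*)` becoming one MapReduce job with one reducer. As a rough sketch of that shape (not Hive's generated code), the same count can be written as a map step, a shuffle and a reduce step in plain Python. The two input splits and the row values are made up for illustration; only the total of 25 rows is taken from the slide.

```python
from collections import defaultdict
from itertools import chain


def mapper(rows):
    """Emit one ("count", 1) pair per input row."""
    for _ in rows:
        yield ("count", 1)


def shuffle(pairs):
    """Group mapper output by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Sum the partial counts for one key."""
    return key, sum(values)


# Hypothetical input splits: 25 customer rows spread over two file blocks.
splits = [
    [f"customer_{i}" for i in range(0, 10)],
    [f"customer_{i}" for i in range(10, 25)],
]

mapped = chain.from_iterable(mapper(split) for split in splits)
grouped = shuffle(mapped)
results = dict(reducer(key, values) for key, values in grouped.items())
total = results["count"]
```

 Each split can be mapped independently, which is why the log shows the map phase reaching 100% before the single reducer finishes.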
  • 18. But Hive is too slow for ad-hoc queries
  • 19. Cloudera Impala - Fast, MPP-style Access to Hadoop Data
 •Cloudera’s answer to Hive query response time issues
 •MPP SQL query engine running on Hadoop, bypasses MapReduce for direct data access
 •Mostly in-memory, but spills to disk if required
 •Uses Hive metastore to access Hive table metadata
 •Similar SQL dialect to Hive - not as rich though and no support for Hive SerDes, storage handlers etc
  • 20. Apache Drill - SQL for Schema-Free Data Discovery
 •Apache Drill is another SQL-on-Hadoop project that focuses on schema-free data discovery
 •Inspired by Google Dremel, innovation is querying raw data with schema optional
 •Automatically infers and detects schema from semi-structured datasets and NoSQL DBs
 •Join across different silos of data e.g. JSON records, Hive tables and HBase database
 •Aimed at different use-cases than Hive - low-latency queries, discovery (think Endeca vs OBIEE)
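 To make "schema optional" on slide 20 concrete, here is a minimal Python sketch of inferring a schema from semi-structured records as they are read. It is not how Drill is implemented; the sample records and the `infer_schema` helper are hypothetical and only illustrate the general idea of discovering fields and types from the data itself.

```python
import json

# Hypothetical newline-delimited JSON records with differing fields.
RAW = """{"id": 1, "name": "Alice", "tags": ["a"]}
{"id": 2, "name": "Bob", "city": "Utrecht"}
{"id": 3, "score": 4.5}"""


def infer_schema(lines):
    """Collect every field seen and the Python type names observed for it."""
    schema = {}
    for line in lines:
        record = json.loads(line)
        for field, value in record.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema


schema = infer_schema(RAW.splitlines())
```

 No table definition was declared up front; fields such as `city` and `score` simply appear in the schema once a record containing them has been read.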
  • 21. How Impala Works
 •A replacement for Hive, but uses Hive concepts and data dictionary (metastore)
 •MPP (Massively Parallel Processing) query engine that runs within Hadoop
 ‣Uses same file formats, security, resource management as Hadoop
 •Processes queries in-memory
 •Accesses standard HDFS file data
 •Option to use Apache Avro, RCFile, LZO or Parquet (column-store)
 •Designed for interactive, real-time SQL-like access to Hadoop
 [Diagram labels: BI Server, Presentation Svr, Cloudera Impala ODBC Driver; Multi-Node Hadoop Cluster, each node running Impala, Hadoop, HDFS etc]
  • 23. but sometimes, you need the real thing
 • 24. Oracle Big Data SQL
•Originally part of Oracle Big Data 4.0 (BDA-only)
‣Also required Oracle Database 12c, Oracle Exadata Database Machine
•Extends Oracle Data Dictionary to cover Hive
•Extends Oracle SQL and SmartScan to Hadoop
•Extends Oracle Security Model over Hadoop
‣Fine-grained access control
‣Data redaction, data masking
‣Uses fast C-based readers where possible (vs. Hive MapReduce generation)
‣Map Hadoop parallelism to Oracle PQ
‣Big Data SQL engine works on top of YARN
‣Like Spark, Tez, MR2
[Diagram: SQL Queries → Exadata Database Server → Oracle Big Data SQL → SmartScan on Exadata Storage Servers and SmartScan on Hadoop Cluster]
 • 25. Leverages Hive Metastore and Hadoop file access APIs
•As with other next-gen SQL access layers, uses common Hive metastore table metadata
•Leverages Hadoop standard APIs for HDFS file access, metadata integration etc
 • 26. Extending SmartScan, and Oracle SQL, Across All Data
•Brings query-offloading features of Exadata to Oracle Big Data Appliance
•Query across both Oracle and Hadoop sources
•Intelligent query optimisation applies SmartScan close to ALL data
•Use same SQL dialect across both sources
•Apply same security rules, policies, user access rights across both sources
 • 27. How Big Data SQL Accesses Hadoop (HDFS) Data
•Read data from HDFS Data Node
‣Direct-path reads
‣C-based readers when possible
‣Use native Hadoop classes otherwise
•Translate bytes to Oracle
•Apply SmartScan to Oracle bytes
‣Apply filters
‣Project columns
‣Parse JSON/XML
‣Score models
[Diagram: Disks → Data Node → Big Data SQL Server (External Table Services: RecordReader, SerDe; Smart Scan), with the three steps above marked 1, 2, 3]
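The read, translate and SmartScan steps on this slide can be sketched in a few lines of Python. This is a toy illustration only, not how Big Data SQL is implemented: the function names (`read_block`, `to_rows`, `smart_scan`), the CSV block format and the sample data are all assumptions made for the example.

```python
import csv
import io

def read_block(raw_bytes):
    """Step 1: read the raw bytes of a (simulated) HDFS block."""
    return raw_bytes.decode("utf-8")

def to_rows(text, columns):
    """Step 2: translate delimited text into rows keyed by column name."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(columns, record)) for record in reader]

def smart_scan(rows, predicate, projection):
    """Step 3: apply the filter and project columns on the storage side,
    so only matching rows and needed columns travel to the database."""
    return [{col: row[col] for col in projection} for row in rows if predicate(row)]

# Hypothetical web log block: sess_id, source_country, cust_id
block = b"1,Brazil,101\n2,France,102\n3,Brazil,103\n"
columns = ["sess_id", "source_country", "cust_id"]

rows = to_rows(read_block(block), columns)
result = smart_scan(
    rows,
    predicate=lambda r: r["source_country"] == "Brazil",
    projection=["sess_id", "cust_id"],
)
print(result)
```

Of the three rows read from the block, only two rows and two columns are returned, which is the saving that filtering and projecting close to the data is meant to deliver.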
 • 28. Query Franchising vs. SQL Translation / Federation
•“Query Franchising – dispatch of query processing to self-similar compute agents on disparate systems without loss of operational fidelity”
•Contrast with OBIEE, which provides a query federation capability over Hadoop
•Sends sub-queries to each data source
•Relies on each data source’s native query engine, and resource management
•Query franchising using Big Data SQL ensures consistent resource management
•And contrast with SQL translation tools (e.g. Oracle SQL to Impala)
•Either limits Oracle SQL to the subset that Hive and Impala support
•Or translation engine has to transform each Oracle feature into Hive, Impala SQL
 • 29. View Hive Table Metadata in the Oracle Data Dictionary
•Oracle Database 12c 12.1.0.2.0 with Big Data SQL option can view Hive table metadata
‣Linked by Exadata configuration steps to one or more BDA clusters
•DBA_HIVE_TABLES and USER_HIVE_TABLES expose Hive metadata
•Oracle SQL Developer 4.0.3, with Cloudera Hive drivers, can connect to Hive metastore

SQL> col database_name for a30
SQL> col table_name for a30
SQL> select database_name, table_name
  2  from dba_hive_tables;

DATABASE_NAME                  TABLE_NAME
------------------------------ ------------------------------
default                        access_per_post
default                        access_per_post_categories
default                        access_per_post_full
default                        apachelog
default                        categories
default                        countries
default                        cust
default                        hive_raw_apache_access_log
 • 30. Hive Access through Oracle External Tables + Hive Driver
•Big Data SQL accesses Hive tables through external table mechanism
‣ORACLE_HIVE external table type imports Hive metastore metadata
‣ORACLE_HDFS requires metadata to be specified
•Access parameters cluster and tablename specify Hive table source and BDA cluster

CREATE TABLE access_per_post_categories(
  hostname varchar2(100),
  request_date varchar2(100),
  post_id varchar2(10),
  title varchar2(200),
  author varchar2(100),
  category varchar2(100),
  ip_integer number)
organization external
 (type oracle_hive
  default directory default_dir
  access parameters(com.oracle.bigdata.tablename=default.access_per_post_categories));
 • 31. Running Oracle SQL on Hadoop Data Nodes
•Run normal Oracle SQL from the Oracle Database server
•Big Data SQL query franchising then uses agents on Hadoop nodes to query and return data independent of YARN scheduling; Oracle Database combines and returns full results

SELECT w.sess_id, w.cust_id, c.name
FROM web_logs w, customers c
WHERE w.source_country = 'Brazil'
AND c.customer_id = w.cust_id
 • 32. Example : Combining Hadoop + Oracle Data for BI
•OBIEE can access Hadoop data via Hive, but it’s slow
•(Impala only has a subset of Oracle SQL capabilities)
•Big Data SQL presents all data to OBIEE as Oracle data, with full advanced analytic capabilities across both platforms
[Diagram: Hive Weblog Activity table + Oracle Dimension lookup tables → Combined output in report form]
 • 33. Restrictions when using Oracle Big Data SQL
•Not all functions can be offloaded to Hadoop tier
•Even for non-offloadable operations, Big Data SQL will perform column pruning and datatype conversion (which saves a lot of resources)
•Other operations (non-offloadable) will be done on the database side
•Requires Oracle Database 12.1.0.2 + patchset, and per-disk licensing for Big Data SQL
•You need an Oracle Big Data Appliance, and Oracle Exadata, to use Big Data SQL

SELECT NAME FROM v$sqlfn_metadata WHERE offloadable = 'YES'
 • 34. Running Big Data SQL on Commodity Hardware
•From Big Data SQL 3.0, commodity hardware can be used instead of BDA and Exadata
•Oracle Database 12.1.0.2 on x86_64 with Jan/Apr Proactive Bundle Patches
•Cloudera CDH 5.5 or Hortonworks HDP 2.3 on RHEL/OEL6
•See MOS Doc ID 2119369.1 - note cannot mix Engineered/Non-Engineered platforms
 • 35. Big Data SQL on Commodity Hardware Considerations
•No functional differences when running Big Data SQL on commodity hardware
•External table capability lives with the database, and the performance functionality with the BDS cell software
•All BDS features (SmartScan, offloading, storage indexes etc) still available
•But hardware can be a factor now, as we’re pushing processing down and data up the wire
•1Gb Ethernet can be too slow; 10Gb is a minimum (i.e. no InfiniBand)
•If you run on an undersized system you may see bottlenecks on the DB side
 • 36. But Hadoop is more than simple HDFS files
 • 38. Going beyond Fast Unified Query Access to HDFS Data
•Subsequent releases of Big Data SQL have extended its Hadoop capabilities
•Support for Hive storage handlers (HBase, MongoDB etc)
•Hive partition elimination
•Better, more efficient access to Hadoop data
•Storage Indexes
•Predicate Push-Down for Parquet, ORC, HBase, Oracle NoSQL
•Bloom Filters
•Coming with Oracle Database 12.2
•Big Data-aware optimizer
•Dense Bloom Filters
•Oracle managed Big Data partitions
 • 39. Big Data SQL and Hive Storage Handlers
•Hive Storage handlers give Hive the ability to access data from non-HDFS sources
•MongoDB
•HBase
•Oracle NoSQL database
•Run HiveQL queries against NoSQL DBs
•From BDS 1.1, Hive storage handlers can be used with Big Data SQL
•Only MongoDB, HBase and NoSQL currently “supported”
•Others should work but not tested
 • 40. Use of Hive Storage Handlers Transparent to BDS
•Create Hive table over HBase database as normal
•Typically done to add INSERT and DELETE capabilities to Hive, for DW dimension ETL
•Create Oracle external table as normal, using ORACLE_HIVE driver

CREATE EXTERNAL TABLE tablename (colname coltype[, colname coltype,...])
ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'serialization.format'='1',
  'hbase.columns.mapping'=':key,value:key,value:

CREATE TABLE tablename(colname colType[, colname colType...])
ORGANIZATION EXTERNAL
 (TYPE ORACLE_HIVE
  DEFAULT DIRECTORY DEFAULT_DIR
  ACCESS PARAMETERS (access parameters)
 )
REJECT LIMIT UNLIMITED;
 • 41. Big Data SQL Storage Indexes
•From Big Data SQL 2.0, Storage Indexes are automatically created in Big Data SQL agents
•Check index before reading blocks – Skip unnecessary I/Os
•An average of 65% faster than BDS 1.x
•Up to 100x faster for highly selective queries
•Columns in SQL are mapped to fields in the HDFS file via External Table Definitions
•Min / max value is recorded for each HDFS Block in a storage index
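The min / max idea behind storage indexes can be sketched as follows. This is a simplified model under stated assumptions, not Big Data SQL's actual data structure: the `Block` class, the function names and the sample values are hypothetical, and a single numeric column stands in for a mapped field in the HDFS file.

```python
from dataclasses import dataclass

@dataclass
class Block:
    block_id: int
    values: list  # values of one column stored in this (simulated) HDFS block

def build_storage_index(blocks):
    """Record the min and max column value seen in each block."""
    return {b.block_id: (min(b.values), max(b.values)) for b in blocks}

def blocks_to_read(index, lo, hi):
    """Ids of blocks whose [min, max] range overlaps the predicate range [lo, hi].
    Every other block cannot contain a match, so its I/O is skipped."""
    return [bid for bid, (bmin, bmax) in index.items() if bmax >= lo and bmin <= hi]

def scan(blocks, index, lo, hi):
    """Check the index first, then read only the candidate blocks."""
    wanted = set(blocks_to_read(index, lo, hi))
    return [v for b in blocks if b.block_id in wanted
            for v in b.values if lo <= v <= hi]

blocks = [Block(0, [3, 7, 9]), Block(1, [15, 18, 22]), Block(2, [40, 41, 55])]
index = build_storage_index(blocks)

candidates = blocks_to_read(index, 16, 20)   # WHERE col BETWEEN 16 AND 20
matches = scan(blocks, index, 16, 20)
print(candidates, matches)
```

Here two of the three blocks are never read. The more selective the predicate, and the better the data is clustered on the filtered column, the more blocks a min / max index can rule out.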
 • 42. Extending Predicate Push-Down Beyond Hive
•Hadoop supports predicate push-down through several mechanisms (filetypes, Hive partition pruning etc)
•Original BDS 1.0 supported Hive predicate push-down as part of SmartScan
•BDS 3.0 extends this by pushing SARGable (Search ARGument ABLE) predicates
•Into Parquet and ORCFile to reduce I/O when reading files from disk
•Into HBase and Oracle NoSQL database to drive subscans of data from remote DB
•Oracle Database 12.2 will add more optimisations
•Columnar-caching
•Big Data-Aware Query Optimizer
•Managed Hadoop partitions
•Dense Bloom Filters
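To make the "subscan" point concrete, the sketch below contrasts fetching everything and filtering afterwards with letting a key-range predicate bound the scan itself. It is an illustration only: a sorted Python list stands in for a remote key-value store such as HBase, and the names `full_scan`, `sub_scan` and the sample keys are assumptions made for the example.

```python
import bisect

# Sorted (row_key, value) pairs standing in for a remote key-value store.
store = [
    ("cust001", "UK"),
    ("cust002", "Brazil"),
    ("cust010", "US"),
    ("cust015", "Brazil"),
    ("cust020", "NL"),
]
keys = [k for k, _ in store]

def full_scan(lo, hi):
    """No push-down: fetch every row, then filter on the query side."""
    fetched = list(store)
    return [r for r in fetched if lo <= r[0] <= hi], len(fetched)

def sub_scan(lo, hi):
    """Push-down: the key-range predicate sets the start and stop of the scan."""
    start = bisect.bisect_left(keys, lo)
    stop = bisect.bisect_right(keys, hi)
    fetched = store[start:stop]
    return fetched, len(fetched)

rows_full, fetched_full = full_scan("cust002", "cust010")
rows_sub, fetched_sub = sub_scan("cust002", "cust010")
print(rows_full == rows_sub, fetched_full, fetched_sub)
```

Both approaches return the same rows, but the sub-scan fetches only the rows inside the key range. A predicate is SARGable in this sense when it compares a column directly with a constant, so the store can turn it into scan boundaries.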
 • 43. … So What’s the Catch?
•Typically a one-way street - queries run in Hadoop but results delivered through Oracle
•What if you want to load data into Hadoop, update data, do Hadoop>Hadoop transforms?
•Still requires formal Hive metadata, whereas direction is towards Drill & schema-free queries
•What if you have other RDBMSs as well as Oracle RDBMS?
•Trend is towards moving all high-end analytic workloads into Hadoop - BDS is Oracle-only
•Requires Oracle 12c database, no 11g support
•And cost … BDS is $3k/Hadoop disk drive
•Can cost more than an Oracle BDA
•High-end, high-cost Oracle-centric solution
•of course!
 • 44. Oracle Big Data SQL Vision : Unified Query
•Rich, comprehensive SQL access to all enterprise data
•Extend Oracle security, advanced analytic features and metadata across Hadoop & NoSQL
 • 46. Using Oracle Big Data SQL to add Hadoop + NoSQL to your Oracle Data Warehouse
Mark Rittman, CTO, Rittman Mead
SQL Celebration Day, Netherlands June 2016