Copyright © 2015, Oracle and/or its affiliates. All rights reserved. |
Turning Relational Database Tables into
Hadoop Datasources
Oracle Confidential – Internal/Restricted/Highly Restricted
Kuassi Mensah
Director of Product Management
Java & Hadoop Products for the DB
@kmensah – db360.blogspot.com
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle’s products remains at the sole discretion of Oracle.
Speaker Bio
• Director of Product Management at Oracle
(i) Java integration with the Oracle database (JDBC, UCP, Java in the database)
(ii) Oracle Datasource for Hadoop (OD4H), upcoming OD for Spark, OD for Flink and so on
(iii) JavaScript/Nashorn integration with the Oracle database (DB access, JS stored proc, fluent JS)
• MS CS from the Programming Institute of University of Paris
• Frequent speaker
JavaOne, Oracle Open World, Data Summit, Node Summit, Oracle User groups (UKOUG, DOAG, OUGN,
BGOUG, OUGF, GUOB, ArOUG, ORAMEX, Sangam, OTNYathra, China, Thailand, etc.)
• Author: Oracle Database Programming using Java and Web Services
• @kmensah, http://db360.blogspot.com/, https://www.linkedin.com/in/kmensah
Program Agenda
1. Big Data Analytics Requirements
2. Opportunity: Hive Storage Handler
3. Storage Handler Implementation for Oracle
Big Data Analytics
• Goal: furnish actionable information to support business decision-making.
• Example
“Which of our products got a rating of four stars or higher on social media in the last quarter?”
Big Data Analytics and Requirements
• Goal: furnish actionable information to support business decision-making.
• Example
“Which of our products got a rating of four stars or higher on social media in the last quarter?”
Master Data
Big Data
(Weblogs, Facts, Scans, Events, IoT)
Accessing Master Data in RDBMS
• ETL Copy (Apache Sqoop, Oracle CopyToBDA)
– Preplanned/scheduled
• What to copy and when?
• Always behind
– Copy is protected using Hadoop file-level security
• Direct Access from Hadoop (Oracle Datasource for Hadoop, OD4H)
– Ad-hoc queries, always current
– Hive SQL, Spark SQL, Impala*, other SQL engines
– Hadoop APIs
– Database security
Direct Access From Hadoop
Dummy Example
SELECT HiveTab.First_Name, HiveTab.Last_Name, OraTab.bonus
FROM HiveTab JOIN OraTab ON (HiveTab.Emp_ID = OraTab.Emp_ID)
WHERE salary > 70000 AND bonus > 7000;
Program Agenda
1. Big Data Analytics Requirements
2. Opportunity: Hive Storage Handler
3. Hive Storage Handler Implementation for Oracle
Hadoop 2.0 Architecture – Storage Handler
[Diagram: HDFS and NoSQL provide redundant storage; YARN provides compute resources and the scheduler; the engines on top are Batch (MapReduce), Hive SQL, Spark (In-Memory), Big Data SQL and Mahout (ML libs); through HCatalog, a Storage Handler (InputFormat, OutputFormat, SerDe) exposes RDBMS table(s) as an external table.]
Storage Handler Interface
https://cwiki.apache.org/confluence/display/Hive/StorageHandlers
package org.apache.hadoop.hive.ql.metadata;

import java.util.Map;
import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.hive.metastore.HiveMetaHook;
import org.apache.hadoop.hive.ql.plan.TableDesc;
import org.apache.hadoop.hive.serde2.SerDe;
import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.mapred.OutputFormat;

public interface HiveStorageHandler extends Configurable {
  // Classes Hive uses to read, write, and (de)serialize rows of the underlying store
  public Class<? extends InputFormat> getInputFormatClass();
  public Class<? extends OutputFormat> getOutputFormatClass();
  public Class<? extends SerDe> getSerDeClass();

  // Hook notified of metastore operations on the table
  public HiveMetaHook getMetaHook();

  // Lets the handler pass table-level settings on to the job
  public void configureTableJobProperties(
      TableDesc tableDesc,
      Map<String, String> jobProperties);
}
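As a rough illustration of what a handler might do in configureTableJobProperties, the sketch below copies selected table properties into the job properties map. It uses only java.util types so that it runs on its own: a plain Properties object stands in for the table description, and the class name TablePropertyCopier and the prefix filter are assumptions made for this example, not part of Hive or OD4H.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper, not a Hive class: mimics the job of configureTableJobProperties. */
final class TablePropertyCopier {

    /**
     * Copies every table property whose key starts with one of the given
     * prefixes into jobProperties, so that tasks can read it at run time.
     */
    static void copy(Properties tableProperties,
                     Map<String, String> jobProperties,
                     String... prefixes) {
        for (String key : tableProperties.stringPropertyNames()) {
            for (String prefix : prefixes) {
                if (key.startsWith(prefix)) {
                    jobProperties.put(key, tableProperties.getProperty(key));
                    break; // key already copied, no need to test other prefixes
                }
            }
        }
    }

    public static void main(String[] args) {
        Properties table = new Properties();
        table.setProperty("mapreduce.jdbc.input.table.name", "EMPLOYEES");
        table.setProperty("comment", "not needed by tasks");

        Map<String, String> job = new LinkedHashMap<>();
        copy(table, job, "mapreduce.jdbc.");
        System.out.println(job);
    }
}
```

Filtering by prefix is only one possible design; a real handler could just as well copy every table property.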
RDBMS Table as Hive External Table
DDL
CREATE EXTERNAL TABLE Hadoop_employees (
  EMPLOYEE_ID INT, FIRST_NAME STRING, LAST_NAME STRING,
  SALARY DOUBLE, ...)
STORED BY '<RDBMS-specific storage handler class>'
TBLPROPERTIES (
  ...
  'mapreduce.jdbc.input.table.name' = 'EMPLOYEES',
  ...
);
Program Agenda
1. Big Data Analytics Requirements
2. Opportunity: Hive Storage Handler
3. Hive Storage Handler Implementation for Oracle
Oracle Datasource for Hadoop (OD4H)
Direct, parallel, fast, secure and consistent access to master data
[Diagram: Hive, Impala*, Spark SQL, Mahout and other engines run on YARN and reach an Oracle table through HCatalog and the OD4H StorageHandler (InputFormat, OutputFormat, SerDe).]
Parallel Access to Oracle Table: Splitter Patterns
• SINGLE_SPLITTER
• ROW_SPLITTER: number of rows per split set in oracle.hcat.osh.rowsPerSplit
• BLOCK_SPLITTER: maximum number of splits set by oracle.hcat.osh.maxStorageBasedSplits
• PARTITION_SPLITTER
• CUSTOM_SPLITTER: a user-defined SELECT statement, set in oracle.hcat.osh.chunkSQL, that emits the ROWIDs corresponding to the start and end of each split
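To make the idea behind row-based splitting concrete, here is a minimal sketch that cuts a table of a known size into ranges of at most rowsPerSplit rows. It only illustrates the arithmetic; the class RowSplitSketch and its method are hypothetical and do not reflect how OD4H computes its splits internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of row-based splitting; not OD4H code. */
final class RowSplitSketch {

    /** Returns {startRow, endRowExclusive} pairs that together cover totalRows rows. */
    static List<long[]> rowSplits(long totalRows, long rowsPerSplit) {
        if (totalRows < 0 || rowsPerSplit <= 0) {
            throw new IllegalArgumentException(
                "totalRows must be >= 0 and rowsPerSplit must be > 0");
        }
        List<long[]> splits = new ArrayList<>();
        for (long start = 0; start < totalRows; start += rowsPerSplit) {
            // The last split may be shorter than rowsPerSplit.
            long end = Math.min(start + rowsPerSplit, totalRows);
            splits.add(new long[] {start, end});
        }
        return splits;
    }

    public static void main(String[] args) {
        for (long[] split : rowSplits(10, 4)) {
            System.out.println(split[0] + " .. " + split[1]);
        }
    }
}
```

Each resulting range could then be handed to one Hadoop task, which is what allows the table to be read in parallel.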
Split Pattern for Partitioned Oracle Table
CREATE EXTERNAL TABLE Hadoop_employees (
  EMPLOYEE_ID INT, FIRST_NAME STRING, LAST_NAME STRING,
  SALARY DOUBLE, HIRE_DATE TIMESTAMP,
  JOB_ID STRING)
STORED BY 'oracle.hcat.osh.storagehandler.OracleStorageHandler'
TBLPROPERTIES (
  'mapreduce.jdbc.url' = 'jdbc:oracle:thin:@localhost:1521:orcl',
  'mapreduce.jdbc.username' = 'foobar',
  'mapreduce.jdbc.password' = 'dontdothis',
  'mapreduce.jdbc.input.table.name' = 'EMPLOYEES',
  'oracle.hcat.osh.splitterKind' = 'PARTITION_SPLITTER'
);
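Typographic quotes pasted from slides are not valid HiveQL string delimiters, so one way to avoid them is to generate the DDL from a map of properties. The sketch below does that with plain string handling; the class ExternalTableDdl and its methods are hypothetical helpers written for this example, and the property names are the ones used in the DDL above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that renders a Hive external-table DDL string. */
final class ExternalTableDdl {

    /** Wraps a value in straight single quotes, backslash-escaping embedded ones. */
    static String quote(String value) {
        return "'" + value.replace("'", "\\'") + "'";
    }

    /** Builds CREATE EXTERNAL TABLE ... STORED BY ... TBLPROPERTIES (...). */
    static String render(String table, String columns, String handlerClass,
                         Map<String, String> tblProperties) {
        StringJoiner props = new StringJoiner(",\n  ", "TBLPROPERTIES (\n  ", "\n)");
        for (Map.Entry<String, String> e : tblProperties.entrySet()) {
            props.add(quote(e.getKey()) + " = " + quote(e.getValue()));
        }
        return "CREATE EXTERNAL TABLE " + table + " (" + columns + ")\n"
                + "STORED BY " + quote(handlerClass) + "\n"
                + props + ";";
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapreduce.jdbc.input.table.name", "EMPLOYEES");
        props.put("oracle.hcat.osh.splitterKind", "PARTITION_SPLITTER");
        System.out.println(render(
                "Hadoop_employees",
                "EMPLOYEE_ID INT, FIRST_NAME STRING",
                "oracle.hcat.osh.storagehandler.OracleStorageHandler",
                props));
    }
}
```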
How Oracle Datasource for Hadoop Works
1. From TBLPROPERTIES in HCatalog, get a secure connection to the database
2. Generate database splits with SCN, based on the query and split pattern
3. For each split, rewrite the sub-query into Oracle SQL
4. Each split is processed by a Hadoop task
5. Matching rows are returned to the Hadoop query coordinator
[Diagram: the (partial) execution plan of a Hive query on the Hadoop cluster passes through the Oracle Datasource, whose Hadoop tasks read from the Oracle table.]
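The split generation and rewriting steps can be sketched as follows, assuming a partitioned table and one sub-query per partition. The slides do not show the SQL text that OD4H actually generates; the PARTITION (...) and AS OF SCN clauses are standard Oracle SQL used here as an assumption, and the class SplitQuerySketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of per-split query rewriting; not OD4H's real output. */
final class SplitQuerySketch {

    /**
     * Builds one Oracle query per partition. Every query reads as of the same
     * SCN, so all splits see the same consistent snapshot of the table.
     */
    static List<String> rewrite(String table, List<String> columns, String predicate,
                                List<String> partitions, long scn) {
        String projection = String.join(", ", columns); // projection pushdown
        List<String> queries = new ArrayList<>();
        for (String partition : partitions) {
            StringBuilder sql = new StringBuilder()
                    .append("SELECT ").append(projection)
                    .append(" FROM ").append(table)
                    .append(" PARTITION (").append(partition).append(")")
                    .append(" AS OF SCN ").append(scn);
            if (predicate != null && !predicate.isEmpty()) {
                sql.append(" WHERE ").append(predicate); // predicate pushdown
            }
            queries.add(sql.toString());
        }
        return queries;
    }

    public static void main(String[] args) {
        List<String> queries = rewrite(
                "EMPLOYEES",
                Arrays.asList("EMP_ID", "BONUS"),
                "BONUS > 7000",
                Arrays.asList("P1", "P2"),
                1234L);
        for (String query : queries) {
            System.out.println(query);
        }
    }
}
```

Pinning every sub-query to a single SCN is what keeps the result consistent even though the splits run at different times on different tasks.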
Putting Everything Together
[Diagram: (1) a Hive DDL registers the Oracle table in HCatalog; (2) a Hive query goes through the Oracle Storage Handler, which maps granules of the Oracle table to splits; the Job Tracker schedules MapReduce MapTasks, which send the rewritten sub-queries to the database over JDBC connections.]
OD4H Features
• Performance and Scalability
• Resource Management and Consistency
• Security
• High Availability
• Data Types
• OutputFormat: Write back
OD4H - Performance and Scalability
Fully exploit Hadoop clusters and Oracle database servers
• Splitter Patterns
• Optimized JDBC Driver
• Connection Caching
• Integration with Database Resident Connection Pool (DRCP)
• Projection & Predicate Pushdown
• Partition Pruning
OD4H - Resource Management
• MaxSplit
• DRCP: maxconnections
• Hadoop: mapred.tasktracker.map.tasks.maximum in conf/mapred-site.xml
• Spark: spark.dynamicAllocation.enabled
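One way to reason about these settings together is a back-of-the-envelope connection budget. The sketch below assumes that every running map task holds exactly one database connection, which the slides suggest but do not state; the class ConnectionBudgetSketch and the numbers in main are purely illustrative.

```java
/** Hypothetical connection-budget arithmetic; assumes one connection per running map task. */
final class ConnectionBudgetSketch {

    /**
     * Peak number of simultaneous connections: limited both by the number of
     * splits and by how many map tasks the cluster can run at once.
     */
    static long peakConnections(long splits, long workerNodes, long mapSlotsPerNode) {
        return Math.min(splits, workerNodes * mapSlotsPerNode);
    }

    /** True when the peak demand stays within the pool's connection limit. */
    static boolean fitsPool(long peakConnections, long poolMaxConnections) {
        return peakConnections <= poolMaxConnections;
    }

    public static void main(String[] args) {
        long peak = peakConnections(100, 10, 4);
        System.out.println("peak connections: " + peak);
        System.out.println("fits a pool of 32: " + fitsPool(peak, 32));
    }
}
```

If the peak exceeds the pool limit, the options are to lower the number of splits, lower the per-node task maximum, or raise the pool size.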
OD4H - Security
• Simple and Strong Authentication
– Username/password
– Wallet
– Kerberos
• Encryption and Integrity
• JVM System Properties
• Hive/Hadoop/Spark environment variables
OD4H – OutputFormat
OD4H allows writing the result of a Hive query back to an Oracle table
INSERT INTO EmployeeBonusReport
SELECT EmployeeDataSimple.First_Name, EmployeeDataSimple.Last_Name,
  EmployeeBonus.bonus
FROM EmployeeDataSimple
JOIN EmployeeBonus ON (EmployeeDataSimple.Emp_ID = EmployeeBonus.Emp_ID)
WHERE salary > 70000 AND bonus > 7000
Summary
• Support for Hive SQL, Spark-SQL, Impala*
• Support for MapReduce, Pig, etc
• Secure and reliable authentication: Kerberos authentication, SSL, Oracle Wallet
• Efficient translation of HQL to Oracle SQL
• Scalability: splits based on DB meta-data
• Column Projection Pushdown
• Predicate Pushdown
• Partition Pruning
• Connection caching
• Consistent Read (SCN)
• Writing back to Oracle
• Free for Oracle Big Data Appliance (BDA)
• Included in Oracle Big Data Cloud Service & Big Data Cloud Service Compute Edition
• Other Hadoop Cluster: priced as an Oracle Big Data Connector
Resources
• Oracle Datasource for Hadoop (OD4H)
http://bit.ly/2j1kSIT (landing page, white paper, etc)
Download @ http://bit.ly/2v36Wnf
• Oracle Big Data Connectors
https://www.oracle.com/database/big-data-connectors/index.html
• Big Data Cloud Service
https://cloud.oracle.com/en_US/big-data
• Big Data Cloud Service - Compute Edition
https://cloud.oracle.com/en_US/big-data-compute-edition
Turning Relational Database Tables into Hadoop Datasources by Kuassi Mensah

More Related Content

What's hot

Tame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data IntegrationTame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data Integration
Michael Rainey
 
Oracle REST Data Services Best Practices/ Overview
Oracle REST Data Services Best Practices/ OverviewOracle REST Data Services Best Practices/ Overview
Oracle REST Data Services Best Practices/ Overview
Kris Rice
 
A Cloud Journey - Move to the Oracle Cloud
A Cloud Journey - Move to the Oracle CloudA Cloud Journey - Move to the Oracle Cloud
A Cloud Journey - Move to the Oracle Cloud
Markus Michalewicz
 
Oracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overviewOracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overview
Dave Segleau
 
Under The Hood of Pluggable Databases by Alex Gorbachev, Pythian, Oracle OpeW...
Under The Hood of Pluggable Databases by Alex Gorbachev, Pythian, Oracle OpeW...Under The Hood of Pluggable Databases by Alex Gorbachev, Pythian, Oracle OpeW...
Under The Hood of Pluggable Databases by Alex Gorbachev, Pythian, Oracle OpeW...
Alex Gorbachev
 
Best Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c FeaturesBest Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c Features
Markus Michalewicz
 
Oracle MAA (Maximum Availability Architecture) 18c - An Overview
Oracle MAA (Maximum Availability Architecture) 18c - An OverviewOracle MAA (Maximum Availability Architecture) 18c - An Overview
Oracle MAA (Maximum Availability Architecture) 18c - An Overview
Markus Michalewicz
 
Avoid the Oracle SE2 Trap with EnterpriseDB & Palisade Compliance
Avoid the Oracle SE2 Trap with EnterpriseDB & Palisade ComplianceAvoid the Oracle SE2 Trap with EnterpriseDB & Palisade Compliance
Avoid the Oracle SE2 Trap with EnterpriseDB & Palisade Compliance
EDB
 
Oracle Sharding 18c - Technical Overview
Oracle Sharding 18c - Technical OverviewOracle Sharding 18c - Technical Overview
Oracle Sharding 18c - Technical Overview
Markus Michalewicz
 
Meetup Oracle Database MAD_BCN: 1.1 Servicios de Oracle Database en la nube
 Meetup Oracle Database MAD_BCN: 1.1 Servicios de Oracle Database en la nube Meetup Oracle Database MAD_BCN: 1.1 Servicios de Oracle Database en la nube
Meetup Oracle Database MAD_BCN: 1.1 Servicios de Oracle Database en la nube
avanttic Consultoría Tecnológica
 
Why Use an Oracle Database?
Why Use an Oracle Database?Why Use an Oracle Database?
Why Use an Oracle Database?
Markus Michalewicz
 
A practical introduction to Oracle NoSQL Database - OOW2014
A practical introduction to Oracle NoSQL Database - OOW2014A practical introduction to Oracle NoSQL Database - OOW2014
A practical introduction to Oracle NoSQL Database - OOW2014
Anuj Sahni
 
Make Your Application “Oracle RAC Ready” & Test For It
Make Your Application “Oracle RAC Ready” & Test For ItMake Your Application “Oracle RAC Ready” & Test For It
Make Your Application “Oracle RAC Ready” & Test For It
Markus Michalewicz
 
Oracle NoSQL Database -- Big Data Bellevue Meetup - 02-18-15
Oracle NoSQL Database -- Big Data Bellevue Meetup - 02-18-15Oracle NoSQL Database -- Big Data Bellevue Meetup - 02-18-15
Oracle NoSQL Database -- Big Data Bellevue Meetup - 02-18-15
Dave Segleau
 
Pimping SQL Developer and Data Modeler
Pimping SQL Developer and Data ModelerPimping SQL Developer and Data Modeler
Pimping SQL Developer and Data Modeler
Kris Rice
 
HA, Scalability, DR & MAA in Oracle Database 21c - Overview
HA, Scalability, DR & MAA in Oracle Database 21c - OverviewHA, Scalability, DR & MAA in Oracle Database 21c - Overview
HA, Scalability, DR & MAA in Oracle Database 21c - Overview
Markus Michalewicz
 
Oracle database 12c_and_DevOps
Oracle database 12c_and_DevOpsOracle database 12c_and_DevOps
Oracle database 12c_and_DevOps
Maria Colgan
 
Oracle MAA Best Practices - Applications Considerations
Oracle MAA Best Practices - Applications ConsiderationsOracle MAA Best Practices - Applications Considerations
Oracle MAA Best Practices - Applications Considerations
Markus Michalewicz
 
2020 – A Decade of Change
2020 – A Decade of Change2020 – A Decade of Change
2020 – A Decade of Change
Markus Michalewicz
 
Oracle RAC 19c - the Basis for the Autonomous Database
Oracle RAC 19c - the Basis for the Autonomous DatabaseOracle RAC 19c - the Basis for the Autonomous Database
Oracle RAC 19c - the Basis for the Autonomous Database
Markus Michalewicz
 

What's hot (20)

Tame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data IntegrationTame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data Integration
 
Oracle REST Data Services Best Practices/ Overview
Oracle REST Data Services Best Practices/ OverviewOracle REST Data Services Best Practices/ Overview
Oracle REST Data Services Best Practices/ Overview
 
A Cloud Journey - Move to the Oracle Cloud
A Cloud Journey - Move to the Oracle CloudA Cloud Journey - Move to the Oracle Cloud
A Cloud Journey - Move to the Oracle Cloud
 
Oracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overviewOracle NoSQL Database release 3.0 overview
Oracle NoSQL Database release 3.0 overview
 
Under The Hood of Pluggable Databases by Alex Gorbachev, Pythian, Oracle OpeW...
Under The Hood of Pluggable Databases by Alex Gorbachev, Pythian, Oracle OpeW...Under The Hood of Pluggable Databases by Alex Gorbachev, Pythian, Oracle OpeW...
Under The Hood of Pluggable Databases by Alex Gorbachev, Pythian, Oracle OpeW...
 
Best Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c FeaturesBest Practices for the Most Impactful Oracle Database 18c and 19c Features
Best Practices for the Most Impactful Oracle Database 18c and 19c Features
 
Oracle MAA (Maximum Availability Architecture) 18c - An Overview
Oracle MAA (Maximum Availability Architecture) 18c - An OverviewOracle MAA (Maximum Availability Architecture) 18c - An Overview
Oracle MAA (Maximum Availability Architecture) 18c - An Overview
 
Avoid the Oracle SE2 Trap with EnterpriseDB & Palisade Compliance
Avoid the Oracle SE2 Trap with EnterpriseDB & Palisade ComplianceAvoid the Oracle SE2 Trap with EnterpriseDB & Palisade Compliance
Avoid the Oracle SE2 Trap with EnterpriseDB & Palisade Compliance
 
Oracle Sharding 18c - Technical Overview
Oracle Sharding 18c - Technical OverviewOracle Sharding 18c - Technical Overview
Oracle Sharding 18c - Technical Overview
 
Meetup Oracle Database MAD_BCN: 1.1 Servicios de Oracle Database en la nube
 Meetup Oracle Database MAD_BCN: 1.1 Servicios de Oracle Database en la nube Meetup Oracle Database MAD_BCN: 1.1 Servicios de Oracle Database en la nube
Meetup Oracle Database MAD_BCN: 1.1 Servicios de Oracle Database en la nube
 
Why Use an Oracle Database?
Why Use an Oracle Database?Why Use an Oracle Database?
Why Use an Oracle Database?
 
A practical introduction to Oracle NoSQL Database - OOW2014
A practical introduction to Oracle NoSQL Database - OOW2014A practical introduction to Oracle NoSQL Database - OOW2014
A practical introduction to Oracle NoSQL Database - OOW2014
 
Make Your Application “Oracle RAC Ready” & Test For It
Make Your Application “Oracle RAC Ready” & Test For ItMake Your Application “Oracle RAC Ready” & Test For It
Make Your Application “Oracle RAC Ready” & Test For It
 
Oracle NoSQL Database -- Big Data Bellevue Meetup - 02-18-15
Oracle NoSQL Database -- Big Data Bellevue Meetup - 02-18-15Oracle NoSQL Database -- Big Data Bellevue Meetup - 02-18-15
Oracle NoSQL Database -- Big Data Bellevue Meetup - 02-18-15
 
Pimping SQL Developer and Data Modeler
Pimping SQL Developer and Data ModelerPimping SQL Developer and Data Modeler
Pimping SQL Developer and Data Modeler
 
HA, Scalability, DR & MAA in Oracle Database 21c - Overview
HA, Scalability, DR & MAA in Oracle Database 21c - OverviewHA, Scalability, DR & MAA in Oracle Database 21c - Overview
HA, Scalability, DR & MAA in Oracle Database 21c - Overview
 
Oracle database 12c_and_DevOps
Oracle database 12c_and_DevOpsOracle database 12c_and_DevOps
Oracle database 12c_and_DevOps
 
Oracle MAA Best Practices - Applications Considerations
Oracle MAA Best Practices - Applications ConsiderationsOracle MAA Best Practices - Applications Considerations
Oracle MAA Best Practices - Applications Considerations
 
2020 – A Decade of Change
2020 – A Decade of Change2020 – A Decade of Change
2020 – A Decade of Change
 
Oracle RAC 19c - the Basis for the Autonomous Database
Oracle RAC 19c - the Basis for the Autonomous DatabaseOracle RAC 19c - the Basis for the Autonomous Database
Oracle RAC 19c - the Basis for the Autonomous Database
 

Similar to Turning Relational Database Tables into Hadoop Datasources by Kuassi Mensah

Oracle Unified Information Architeture + Analytics by Example
Oracle Unified Information Architeture + Analytics by ExampleOracle Unified Information Architeture + Analytics by Example
Oracle Unified Information Architeture + Analytics by Example
Harald Erb
 
Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)
Jeffrey T. Pollock
 
Big Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseBig Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San Jose
Jeffrey T. Pollock
 
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
Jeffrey T. Pollock
 
Oracle big data discovery 994294
Oracle big data discovery   994294Oracle big data discovery   994294
Oracle big data discovery 994294
Edgar Alejandro Villegas
 
New data dictionary an internal server api that matters
New data dictionary an internal server api that mattersNew data dictionary an internal server api that matters
New data dictionary an internal server api that matters
Alexander Nozdrin
 
Oracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorldOracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorld
Jeffrey T. Pollock
 
OOW-TBE-12c-CON7307-Sharable
OOW-TBE-12c-CON7307-SharableOOW-TBE-12c-CON7307-Sharable
OOW-TBE-12c-CON7307-Sharable
Obaidur (OB) Rashid
 
Unlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQLUnlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQL
Matt Lord
 
Simplify IT: Oracle SuperCluster
Simplify IT: Oracle SuperCluster Simplify IT: Oracle SuperCluster
Simplify IT: Oracle SuperCluster
Fran Navarro
 
Oracle database in cloud, dr in cloud and overview of oracle database 18c
Oracle database in cloud, dr in cloud and overview of oracle database 18cOracle database in cloud, dr in cloud and overview of oracle database 18c
Oracle database in cloud, dr in cloud and overview of oracle database 18c
AiougVizagChapter
 
Oracle SQL Developer for SQL Server?
Oracle SQL Developer for SQL Server?Oracle SQL Developer for SQL Server?
Oracle SQL Developer for SQL Server?
Jeff Smith
 
Oracle Database Cloud Service
Oracle Database Cloud ServiceOracle Database Cloud Service
Oracle Database Cloud Service
Jean-Philippe PINTE
 
CON6492 - Oracle Database Public Cloud Services v1 1
CON6492 - Oracle Database Public Cloud Services v1 1CON6492 - Oracle Database Public Cloud Services v1 1
CON6492 - Oracle Database Public Cloud Services v1 1
David van Schalkwyk
 
Session 203 iouc summit database
Session 203 iouc summit databaseSession 203 iouc summit database
Session 203 iouc summit database
OUGTH Oracle User Group in Thailand
 
Oracle Warehouse Builder to Oracle Data Integrator 12c Migration Utility
Oracle Warehouse Builder to Oracle Data Integrator 12c Migration UtilityOracle Warehouse Builder to Oracle Data Integrator 12c Migration Utility
Oracle Warehouse Builder to Oracle Data Integrator 12c Migration Utility
Noel Sidebotham
 
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
jdijcks
 
Oracle Database Exadata Cloud Service Conference
Oracle Database Exadata Cloud Service ConferenceOracle Database Exadata Cloud Service Conference
Oracle Database Exadata Cloud Service Conference
Okcan Yasin Saygılı
 
Oracle GoldenGate Performance Tuning
Oracle GoldenGate Performance TuningOracle GoldenGate Performance Tuning
Oracle GoldenGate Performance Tuning
Bobby Curtis
 
Extending Hortonworks with Oracle's Big Data Platform
Extending Hortonworks with Oracle's Big Data PlatformExtending Hortonworks with Oracle's Big Data Platform
Extending Hortonworks with Oracle's Big Data Platform
DataWorks Summit/Hadoop Summit
 

Similar to Turning Relational Database Tables into Hadoop Datasources by Kuassi Mensah (20)

Oracle Unified Information Architeture + Analytics by Example
Oracle Unified Information Architeture + Analytics by ExampleOracle Unified Information Architeture + Analytics by Example
Oracle Unified Information Architeture + Analytics by Example
 
Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)
 
Big Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseBig Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San Jose
 
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
 
Oracle big data discovery 994294
Oracle big data discovery   994294Oracle big data discovery   994294
Oracle big data discovery 994294
 
New data dictionary an internal server api that matters
New data dictionary an internal server api that mattersNew data dictionary an internal server api that matters
New data dictionary an internal server api that matters
 
Oracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorldOracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorld
 
OOW-TBE-12c-CON7307-Sharable
OOW-TBE-12c-CON7307-SharableOOW-TBE-12c-CON7307-Sharable
OOW-TBE-12c-CON7307-Sharable
 
Unlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQLUnlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQL
 
Simplify IT: Oracle SuperCluster
Simplify IT: Oracle SuperCluster Simplify IT: Oracle SuperCluster
Simplify IT: Oracle SuperCluster
 
Oracle database in cloud, dr in cloud and overview of oracle database 18c
Oracle database in cloud, dr in cloud and overview of oracle database 18cOracle database in cloud, dr in cloud and overview of oracle database 18c
Oracle database in cloud, dr in cloud and overview of oracle database 18c
 
Oracle SQL Developer for SQL Server?
Oracle SQL Developer for SQL Server?Oracle SQL Developer for SQL Server?
Oracle SQL Developer for SQL Server?
 
Oracle Database Cloud Service
Oracle Database Cloud ServiceOracle Database Cloud Service
Oracle Database Cloud Service
 
CON6492 - Oracle Database Public Cloud Services v1 1
CON6492 - Oracle Database Public Cloud Services v1 1CON6492 - Oracle Database Public Cloud Services v1 1
CON6492 - Oracle Database Public Cloud Services v1 1
 
Session 203 iouc summit database
Session 203 iouc summit databaseSession 203 iouc summit database
Session 203 iouc summit database
 
Oracle Warehouse Builder to Oracle Data Integrator 12c Migration Utility
Oracle Warehouse Builder to Oracle Data Integrator 12c Migration UtilityOracle Warehouse Builder to Oracle Data Integrator 12c Migration Utility
Oracle Warehouse Builder to Oracle Data Integrator 12c Migration Utility
 
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
 
Oracle Database Exadata Cloud Service Conference
Oracle Database Exadata Cloud Service ConferenceOracle Database Exadata Cloud Service Conference
Oracle Database Exadata Cloud Service Conference
 
Oracle GoldenGate Performance Tuning
Oracle GoldenGate Performance TuningOracle GoldenGate Performance Tuning
Oracle GoldenGate Performance Tuning
 
Extending Hortonworks with Oracle's Big Data Platform
Extending Hortonworks with Oracle's Big Data PlatformExtending Hortonworks with Oracle's Big Data Platform
Extending Hortonworks with Oracle's Big Data Platform
 

More from Data Con LA

Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
Data Con LA
 
Data Con LA 2022 Keynotes
Data Con LA 2022 KeynotesData Con LA 2022 Keynotes
Data Con LA 2022 Keynotes
Data Con LA
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
Data Con LA
 
Data Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup ShowcaseData Con LA 2022 - Startup Showcase
Data Con LA 2022 - Startup Showcase
Data Con LA
 
Data Con LA 2022 Keynote
Data Con LA 2022 KeynoteData Con LA 2022 Keynote
Data Con LA 2022 Keynote
Data Con LA
 
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendationsData Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA 2022 - Using Google trends data to build product recommendations
Data Con LA
 
Data Con LA 2022 - AI Ethics
Data Con LA 2022 - AI EthicsData Con LA 2022 - AI Ethics
Data Con LA 2022 - AI Ethics
Data Con LA
 
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learningData Con LA 2022 - Improving disaster response with machine learning
Data Con LA 2022 - Improving disaster response with machine learning
Data Con LA
 
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and AtlasData Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA 2022 - What's new with MongoDB 6.0 and Atlas
Data Con LA
 
Data Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentationData Con LA 2022 - Real world consumer segmentation
Data Con LA 2022 - Real world consumer segmentation
Data Con LA
 
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA
 
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWSData Con LA 2022 - Moving Data at Scale to AWS
Data Con LA 2022 - Moving Data at Scale to AWS
Data Con LA
 
Turning Relational Database Tables into Hadoop Datasources by Kuassi Mensah

  • 1.
  • 2. Copyright © 2015, Oracle and/or its affiliates. All rights reserved. | Turning Relational Database Tables into Hadoop Datasources Oracle Confidential – Internal/Restricted/Highly Restricted Kuassi Mensah Director of Product Management Java & Hadoop Products for the DB @kmensah – db360.blogspot.com
  • 3. Safe Harbor Statement: The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
  • 4. Speaker Bio • Director of Product Management at Oracle: (i) Java integration with the Oracle database (JDBC, UCP, Java in the database); (ii) Oracle Datasource for Hadoop (OD4H), upcoming OD for Spark, OD for Flink and so on; (iii) JavaScript/Nashorn integration with the Oracle database (DB access, JS stored proc, fluent JS) • MS CS from the Programming Institute of University of Paris • Frequent speaker: JavaOne, Oracle Open World, Data Summit, Node Summit, Oracle User groups (UKOUG, DOAG, OUGN, BGOUG, OUGF, GUOB, ArOUG, ORAMEX, Sangam, OTNYathra, China, Thailand, etc.) • Author: Oracle Database Programming using Java and Web Services • @kmensah, http://db360.blogspot.com/, https://www.linkedin.com/in/kmensah
  • 5. Program Agenda: (1) Big Data Analytics Requirements (2) Opportunity: Hive Storage Handler (3) Storage Handler Implementation for Oracle
  • 6. Big Data Analytics • Goal: furnish actionable information to help business decision making. • Example: “Which of our products got a rating of four stars or higher on social media in the last quarter?”
  • 7. Big Data Analytics and Requirements • Goal: furnish actionable information to help business decision making. • Example: “Which of our products got a rating of four stars or higher on social media in the last quarter?” [Diagram labels: Master Data; Big Data (Weblogs, Facts, Scans, Events, IoT)]
  • 8. Accessing Master Data in RDBMS • ETL Copy (Apache Sqoop, Oracle CopyToBDA) – Preplanned/scheduled: what to copy and when? – Always behind – Copy is protected using Hadoop file-level security • Direct Access from Hadoop (Oracle Datasource for Hadoop, OD4H) – Ad-hoc queries, always current – Hive SQL, Spark SQL, Impala*, other SQL engines – Hadoop APIs – Database security
  • 9. Direct Access From Hadoop: Dummy Example
        SELECT HiveTab.First_Name, HiveTab.Last_Name, OraTab.bonus
        FROM HiveTab JOIN OraTab ON (HiveTab.Emp_ID = OraTab.Emp_ID)
        WHERE salary > 70000 AND bonus > 7000;
  • 10. Program Agenda: (1) Big Data Analytics Requirements (2) Opportunity: Hive Storage Handler (3) Hive Storage Handler Implementation for Oracle
  • 11. Hadoop 2.0 Architecture – Storage Handler [Diagram labels: Batch (MapReduce), Hive SQL, Spark (In-Memory), Big Data SQL, Mahout (ML libs); YARN (Compute Resources + Scheduler); HCatalog, InputFormat, OutputFormat, SerDe; Data: HDFS (Redundant Storage), NoSQL, External Table, Storage Handler, RDBMS table(s)]
  • 12. Storage Handler Interface (https://cwiki.apache.org/confluence/display/Hive/StorageHandlers)
        package org.apache.hadoop.hive.ql.metadata;

        import java.util.Map;
        import org.apache.hadoop.conf.Configurable;
        import org.apache.hadoop.hive.metastore.HiveMetaHook;
        import org.apache.hadoop.hive.ql.plan.TableDesc;
        import org.apache.hadoop.hive.serde2.SerDe;
        import org.apache.hadoop.mapred.InputFormat;
        import org.apache.hadoop.mapred.OutputFormat;

        public interface HiveStorageHandler extends Configurable {
          public Class<? extends InputFormat> getInputFormatClass();
          public Class<? extends OutputFormat> getOutputFormatClass();
          public Class<? extends SerDe> getSerDeClass();
          public HiveMetaHook getMetaHook();
          public void configureTableJobProperties(
            TableDesc tableDesc,
            Map<String, String> jobProperties);
        }
  • 13. RDBMS Table as Hive External Table: DDL
        CREATE EXTERNAL TABLE Hadoop_employees (
          EMPLOYEE_ID INT, FIRST_NAME STRING, LAST_NAME STRING, SALARY DOUBLE, ...)
        STORED BY 'RDBMS-specific storage handler class'
        TBLPROPERTIES (
          ...
          'mapreduce.jdbc.input.table.name' = 'EMPLOYEES',
          ...
        );
  • 14. Program Agenda: (1) Big Data Analytics Requirements (2) Opportunity: Hive Storage Handler (3) Hive Storage Handler Implementation for Oracle
  • 15. Oracle Datasource for Hadoop (OD4H): direct, parallel, fast, secure and consistent access to master data [Diagram labels: Hive, Impala*, Spark SQL, Mahout, Other; YARN; HCatalog; StorageHandler, InputFormat, OutputFormat, SerDe; Oracle Table]
  • 16. Parallel Access to Oracle Table: Splitter Patterns • SINGLE_SPLITTER • ROW_SPLITTER: number of rows per split set in oracle.hcat.osh.rowsPerSplit • BLOCK_SPLITTER: maximum number of splits directed by oracle.hcat.osh.maxStorageBasedSplits • PARTITION_SPLITTER • CUSTOM_SPLITTER: a user-defined SELECT statement, set in oracle.hcat.osh.chunkSQL, that emits the ROWIDs corresponding to the start and end of each split
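The ROW_SPLITTER idea above can be pictured with a small sketch. This is only an illustration under assumptions: the class `RowSplitSketch` and its `split` method are hypothetical names, the row count is taken as a known input, and nothing here claims to be how OD4H computes its splits internally. It simply cuts a table of `totalRows` rows into consecutive ranges of at most `rowsPerSplit` rows, one range per Hadoop task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of row-count based splitting; not OD4H's actual code. */
final class RowSplitSketch {

    /**
     * Cuts totalRows rows into half-open offset ranges {start, end},
     * each covering at most rowsPerSplit rows.
     */
    static List<long[]> split(long totalRows, long rowsPerSplit) {
        if (totalRows < 0 || rowsPerSplit <= 0) {
            throw new IllegalArgumentException(
                "totalRows must be >= 0 and rowsPerSplit must be > 0");
        }
        List<long[]> splits = new ArrayList<>();
        for (long start = 0; start < totalRows; start += rowsPerSplit) {
            long end = Math.min(start + rowsPerSplit, totalRows);
            splits.add(new long[] {start, end});
        }
        return splits;
    }

    public static void main(String[] args) {
        // 10 rows at 4 rows per split: two full splits and one short split.
        for (long[] s : split(10, 4)) {
            System.out.println(Arrays.toString(s));
        }
    }
}
```

A smaller rows-per-split value yields more splits and therefore more parallel tasks, at the cost of more database connections.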
  • 17. Split Pattern for Partitioned Oracle Table
        CREATE EXTERNAL TABLE Hadoop_employees (
          EMPLOYEE_ID INT, FIRST_NAME STRING, LAST_NAME STRING,
          SALARY DOUBLE, HIRE_DATE TIMESTAMP, JOB_ID STRING)
        STORED BY 'oracle.hcat.osh.storagehandler.OracleStorageHandler'
        TBLPROPERTIES (
          'mapreduce.jdbc.url' = 'jdbc:oracle:thin:@localhost:1521:orcl',
          'mapreduce.jdbc.username' = 'foobar',
          'mapreduce.jdbc.password' = 'dontdothis',
          'mapreduce.jdbc.input.table.name' = 'EMPLOYEES',
          'oracle.hcat.osh.splitterKind' = 'PARTITION_SPLITTER'
        );
  • 18. How Oracle Datasource for Hadoop Works 1. From TBLPROPERTIES in HCatalog, get a secure connection to the DB 2. Generate database splits with SCN, based on the query and split pattern 3. For each split, rewrite the sub-query into Oracle SQL 4. Each split is processed by a Hadoop task 5. Matching rows returned to Hadoop [Diagram labels: Hive Query; Hadoop Cluster; Execution Plan (partial); Oracle Datasource for Hadoop; Query coordinator; Oracle table]
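Steps 2 and 3 above can be sketched as a string-building exercise. The slides do not show the SQL that OD4H actually generates, so everything below is an assumption made for illustration: the class `SplitQuerySketch`, its `rewrite` method, and the choice of a ROWID range with two bind placeholders are hypothetical. The only borrowed piece is Oracle's flashback query clause `AS OF SCN`, which lets every split read the table as of the same system change number and so see a consistent snapshot.

```java
import java.util.List;

/** Hypothetical sketch of a per-split query rewrite; not OD4H's actual output. */
final class SplitQuerySketch {

    /**
     * Builds one Oracle query per split: only the projected columns,
     * pinned to a single SCN, restricted to the split's ROWID range
     * (bound later through the two '?' placeholders), plus the
     * pushed-down predicate when there is one.
     */
    static String rewrite(String table, List<String> columns,
                          String predicate, long scn) {
        StringBuilder sql = new StringBuilder("SELECT ")
            .append(String.join(", ", columns))
            .append(" FROM ").append(table)
            .append(" AS OF SCN ").append(scn)
            .append(" WHERE ROWID BETWEEN ? AND ?");
        if (predicate != null && !predicate.isEmpty()) {
            sql.append(" AND (").append(predicate).append(")");
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewrite(
            "EMPLOYEES",
            List.of("FIRST_NAME", "LAST_NAME"),
            "SALARY > 70000",
            123456L));
    }
}
```

Because the column list and the predicate travel inside the query, the database filters rows before they cross the network, which is the point of the projection and predicate pushdown listed among the OD4H features.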
  • 19. Putting Everything Together [Diagram labels: (1) Hive DDL, HCatalog; (2) Hive Query, Oracle Storage Handler, splits, Job Tracker, MapReduce MapTasks, Rewritten Sub-Queries, JDBC Connections, Oracle Table granules]
  • 20. OD4H Features • Performance and Scalability • Resource Management and Consistency • Security • High Availability • Data Types • OutputFormat: Write back
  • 21. OD4H - Performance and Scalability: fully exploit Hadoop clusters and Oracle database servers • Splitter Patterns • Optimized JDBC Driver • Connection Caching • Integration with Database Resident Connection Pool (DRCP) • Projection & Predicate Pushdown • Partition Pruning
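Partition pruning, the last item above, can be illustrated with a toy model. The sketch assumes range partitions on a numeric key and a query that asks for keys in a half-open interval; the class `PartitionPruningSketch`, the nested `Partition` type and the `prune` method are hypothetical names, and the slides do not describe how OD4H performs pruning internally. The point is only that partitions whose key range cannot overlap the query range are never scanned.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of range-partition pruning; not OD4H's actual code. */
final class PartitionPruningSketch {

    /** A range partition holding key values in [low, high). */
    static final class Partition {
        final String name;
        final long low;
        final long high;

        Partition(String name, long low, long high) {
            this.name = name;
            this.low = low;
            this.high = high;
        }
    }

    /** Returns the names of partitions that may hold keys in [queryLow, queryHigh). */
    static List<String> prune(List<Partition> partitions, long queryLow, long queryHigh) {
        List<String> kept = new ArrayList<>();
        for (Partition p : partitions) {
            // Two half-open ranges overlap when each starts before the other ends.
            if (p.low < queryHigh && queryLow < p.high) {
                kept.add(p.name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Partition> parts = List.of(
            new Partition("P1", 0, 100),
            new Partition("P2", 100, 200),
            new Partition("P3", 200, 300));
        System.out.println(prune(parts, 150, 250)); // prints [P2, P3]
    }
}
```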
  • 22. OD4H - Resource Management • MaxSplit • DRCP: maxconnections • Hadoop: mapred.tasktracker.map.tasks.maximum in conf/mapred-site.xml • Spark: spark.dynamicAllocation.enabled
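The knobs above interact, and a back-of-the-envelope check can make that visible. The sketch below rests on assumptions that the slides do not state: that each running map task holds one database connection, and that the per-node map slot limit (mapred.tasktracker.map.tasks.maximum) applies uniformly across the cluster. The class `ParallelismSketch` and its methods are hypothetical names used only for this illustration.

```java
/** Hypothetical sizing check under a one-connection-per-task assumption. */
final class ParallelismSketch {

    /** Upper bound on map tasks running at once across the cluster. */
    static long maxConcurrentTasks(long splits, int nodes, int mapSlotsPerNode) {
        return Math.min(splits, (long) nodes * mapSlotsPerNode);
    }

    /** True when the connection pool limit covers every concurrently running task. */
    static boolean fitsPool(long splits, int nodes, int mapSlotsPerNode,
                            int poolMaxConnections) {
        return maxConcurrentTasks(splits, nodes, mapSlotsPerNode) <= poolMaxConnections;
    }

    public static void main(String[] args) {
        // 40 splits on 5 nodes with 4 map slots each: at most 20 tasks run at once.
        System.out.println(maxConcurrentTasks(40, 5, 4));
        // A pool capped at 16 connections would not cover those 20 tasks.
        System.out.println(fitsPool(40, 5, 4, 16));
    }
}
```

Under these assumptions, lowering the number of splits or the map slots per node, or raising the pool limit, brings the task count back within what the database can serve.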
  • 23. OD4H - Security • Simple and Strong Authentication – Username/password – Wallet – Kerberos • Encryption and Integrity • JVM System Properties • Hive/Hadoop/Spark environment variables
  • 24. OD4H – OutputFormat: OD4H allows writing the result of a Hive query back to an Oracle table
        INSERT INTO EmployeeBonusReport
        SELECT EmployeeDataSimple.First_Name, EmployeeDataSimple.Last_Name, EmployeeBonus.bonus
        FROM EmployeeDataSimple JOIN EmployeeBonus ON (EmployeeDataSimple.Emp_ID = EmployeeBonus.Emp_ID)
        WHERE salary > 70000 AND bonus > 7000
  • 25. Summary • Support for Hive SQL, Spark-SQL, Impala* • Support for MapReduce, Pig, etc. • Secure and reliable authentication: Kerberos authentication, SSL, Oracle Wallet • Efficient translation of HQL to Oracle SQL • Scalability: splits based on DB metadata • Column Projection Pushdown • Predicate Pushdown • Partition Pruning • Connection caching • Consistent Read (SCN) • Writing back to Oracle • Free for Oracle Big Data Appliance (BDA) • Included in Oracle Big Data Cloud Service & Big Data Cloud Service Compute Edition • Other Hadoop clusters: priced as an Oracle Big Data Connector
  • 26. Resources • Oracle Datasource for Hadoop (OD4H): http://bit.ly/2j1kSIT (landing page, white paper, etc.); download at http://bit.ly/2v36Wnf • Oracle Big Data Connectors: https://www.oracle.com/database/big-data-connectors/index.html • Big Data Cloud Service: https://cloud.oracle.com/en_US/big-data • Big Data Cloud Service - Compute Edition: https://cloud.oracle.com/en_US/big-data-compute-edition