Oracle In-Database Hadoop:
When MapReduce Meets RDBMS
Kuassi Mensah | db360.blogspot.com @kmensah
Director Product Management | kuassi.mensah@oracle.com
The following is intended to outline our general product direction. It
is intended for information purposes only, and may not be
incorporated into any contract. It is not a commitment to deliver any
material, code, or functionality, and should not be relied upon in
making purchasing decisions. The development, release, and
timing of any features or functionality described for Oracle's
products remains at the sole discretion of Oracle.




Hadoop Summit 2012, June 13-14, San Jose, California, USA                                                               Oracle In-Database Hadoop: When MapReduce Meets RDBMS. Kuassi Mensah


Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
Oracle In-Database Hadoop
             Agenda

            •  In-Database MapReduce
                  •  Why
                  •  Previous Initiatives and Limitations
            •  Oracle In-Database Hadoop
            •  Integration with Oracle’s Big Data solution
            •  Summary




In-Database MapReduce




MapReduce Paradigm
             You All Know This Stuff!


             Map:      <K1,V1> → {<K2,V2>,…}

             Shuffle:  {<K2,V2>,…} → {<K2,{V2,…,V2}>,…}

             Reduce:   <K2,{V2,…,V2}> → {<K3,V3>,…}



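The three signatures on the MapReduce Paradigm slide can be illustrated with a small word count in plain Java. This is only a sketch: the class and method names (`MapReduceSketch`, `map`, `shuffle`, `reduce`, `wordCount`) are hypothetical, it runs in a single JVM, and it uses neither the Hadoop API nor anything from the Oracle prototype.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, single-JVM illustration of Map, Shuffle, and Reduce.
public class MapReduceSketch {

    // Map: <K1,V1> -> {<K2,V2>,...}; here <lineNumber, line> -> {<word, 1>,...}
    static List<Map.Entry<String, Integer>> map(int lineNumber, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Shuffle: {<K2,V2>,...} -> {<K2,{V2,...,V2}>,...}; group values by key
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce: <K2,{V2,...,V2}> -> <K3,V3>; here the sum of the counts for one word
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Runs the three phases in order over an in-memory list of lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            mapped.addAll(map(i, lines.get(i)));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(mapped).entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {be=2, not=1, or=1, to=2}
        System.out.println(wordCount(Arrays.asList("to be or not", "to be")));
    }
}
```

In a real deployment the shuffle is performed by the framework rather than by user code; it is written out here only to make the intermediate `<K2,{V2,…,V2}>` shape visible.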
In-Database MapReduce
             Why?

            •  Avoid shipping data residing in RDBMS to a
               separate infrastructure.
                  •  Many initiatives
            •  Address top two issues preventing broader
               adoption of Hadoop in the enterprise
                  •  Lack of development and/or administration skills
                  •  Lack of enterprise-class security




In-Database MapReduce
             Previous Efforts and Limitations


            •  SQL-MapReduce, HadoopDB (Hadapt), etc.
            •  PL/SQL User-defined pipelined table functions
               and aggregation objects
            •  Limitations
                  •  Lack of compatibility with Hadoop
                  •  Loose integration with Hadoop
                  •  Dependency on Hadoop infrastructure




Oracle In-Database Hadoop




Oracle In-Database Hadoop is a prototype (not a feature of Oracle products), built on current Oracle products.




Oracle In-Database Hadoop
             Goals

            •  Avoid shipping data residing in Oracle database to
               Hadoop clusters.
            •  Preserve Hadoop programming model
            •  Reduce dependency on Hadoop infrastructure
            •  Get enterprise developers up to speed with minimal
               training
            •  Get enterprise administrators (DBAs) up to speed
               with minimal training
            •  Reduce deployment time
            •  Bring enterprise class security to MapReduce
            •  Seamless integration with Oracle’s Big Data
               solution

Oracle In-Database Hadoop
             Compatibility & Minimal Dependency on Hadoop Infra
             [Diagram: Mapping processes run on Node 1, Node 2, and Node 3, each as a
             pipelined table function with a Java implementation. Their output is
             redistributed through the PARTITION BY / CLUSTER BY clause to reducing
             processes on Node 1, Node 2, and Node 3, each also a pipelined table
             function with a Java implementation.]




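The hand-off from mapping processes to reducing processes depends on every row with the same key reaching the same reducing process. The sketch below shows one way to do that, assuming hash partitioning. The slide names only the PARTITION BY / CLUSTER BY clause and does not say which partitioning method the prototype uses, and `PartitionSketch`, `partitionFor`, and `partition` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of routing keys to reducing processes by hash.
public class PartitionSketch {

    // Picks the reducing process for a key. Math.floorMod keeps the result
    // in [0, reducers) even when hashCode() is negative.
    static int partitionFor(String key, int reducers) {
        return Math.floorMod(key.hashCode(), reducers);
    }

    // Places each key in the bucket of its reducing process, so equal keys
    // always end up in the same bucket.
    static List<List<String>> partition(List<String> keys, int reducers) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < reducers; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String k : keys) {
            buckets.get(partitionFor(k, reducers)).add(k);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("to", "be", "or", "not", "to", "be");
        List<List<String>> buckets = partition(keys, 3);
        for (int i = 0; i < buckets.size(); i++) {
            System.out.println("reducer " + i + ": " + buckets.get(i));
        }
    }
}
```

Because the bucket depends only on the key, both occurrences of `to` land in one bucket, which is the property a reducer needs to see all values for its keys.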
Oracle In-Database Hadoop
             Preserve Hadoop Programming Model
            •  Source-compatibility
            •  Job configuration
            •  Invocation through Java interface: job.run()
            •  Direct table access: TableReader and TableWriter




Oracle In-Database Hadoop
             SQL and MapReduce Integration

            •  Mix SQL and MapReduce processing for flexibility and
               efficiency.
            •  MapReduce steps as pipelined table functions.


                  INSERT INTO OutTable
                  SELECT * FROM TABLE
                    (Word_Count_Reduce(:ConfKey,
                      CURSOR(SELECT * FROM TABLE
                        (Word_Count_Map(:ConfKey,
                          CURSOR(SELECT * FROM InTable))))))



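A pipelined table function returns rows to its consumer as they are produced instead of materializing the whole result first, which is what lets the nested query chain a map step into a reduce step. The sketch below imitates that behavior with Java iterators. It is an analogy only: `PipelineSketch`, `cursor`, `mapStep`, and `rowsPulled` are hypothetical names, and nothing here is the prototype's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration of pipelining: the map step pulls one input row
// at a time and emits its output rows before asking for the next input row.
public class PipelineSketch {

    // Number of input rows fetched from the source so far.
    static int rowsPulled = 0;

    // Stand-in for CURSOR(SELECT * FROM InTable); counts each row fetched.
    static Iterator<String> cursor(List<String> rows) {
        final Iterator<String> it = rows.iterator();
        return new Iterator<String>() {
            public boolean hasNext() { return it.hasNext(); }
            public String next() { rowsPulled++; return it.next(); }
        };
    }

    // Stand-in for a pipelined map step: splits each input row into words
    // and hands them out one by one.
    static Iterator<String> mapStep(final Iterator<String> input) {
        return new Iterator<String>() {
            private Iterator<String> buffer = Collections.emptyIterator();

            public boolean hasNext() {
                // Fetch another input row only when the buffer is drained.
                while (!buffer.hasNext() && input.hasNext()) {
                    List<String> words = new ArrayList<>();
                    for (String w : input.next().split("\\s+")) {
                        if (!w.isEmpty()) {
                            words.add(w);
                        }
                    }
                    buffer = words.iterator();
                }
                return buffer.hasNext();
            }

            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return buffer.next();
            }
        };
    }

    public static void main(String[] args) {
        Iterator<String> words = mapStep(cursor(Arrays.asList("a b", "c d", "e f")));
        while (words.hasNext()) {
            System.out.println(words.next() + " (input rows pulled: " + rowsPulled + ")");
        }
    }
}
```

The consumer sees the first word after only one input row has been read, so a downstream step can start working before the upstream step has finished.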
Oracle In-Database Hadoop
               SQL and Java interfaces




	
  	
             SQL interface:

                  SELECT * FROM TABLE
                    (Reduce_VARCHAR2_NUMBER(:ConfKey,
                      CURSOR(SELECT * FROM TABLE
                        (Map_VARCHAR2_NUMBER(:ConfKey,
                          CURSOR(SELECT * FROM InTable))))))

             Java interface:

                  public class WordCount {
                    public static void main() throws Exception {
                      /* Set up the parameters and run the job */
                      ……
                      job.init();
                      job.run();
                    }
                  }
  


Oracle In-Database Hadoop
               Leverage Enterprise Skills
            •  Get database developers up to speed, with minimal
               training, on developing MapReduce jobs by reusing
               Hadoop Mappers and Reducers
            •  Get DBAs up to speed on deploying and managing
               MapReduce jobs with minimal training




Oracle Database Security
             Bringing Enterprise Class Security to MapReduce


            •  Auditing and Monitoring
                  •  Database Activity Auditing
                  •  Database Firewall Monitoring
                  •  Centralized Audit Data Warehouse
            •  Encryption and Masking
                  •  Transparent Data Encryption
                  •  Network Encryption/Strong Auth
                  •  Data Masking for Non-Production
            •  Privileged User Access Control and Contextual
               Authorization
                  •  Separation of Duties for DBAs
                  •  Protection Realms & Rules
                  •  Label Based Access Control



Seamless Integration with Oracle’s Big Data Solution




Oracle’s Big Data solution
             [Diagram: Oracle Big Data Appliance, Oracle Exadata, and Oracle Exalytics,
             linked by InfiniBand, together with Endeca Information Discovery and
             Oracle Real-Time Decisions, across four stages: Acquire, Organize &
             Discover, Analyze, Decide.]




Oracle Direct Connector for HDFS

             [Diagram: Direct access from Oracle Database to HDFS. A SQL query in Oracle
             Database reads an external table, which reaches HDFS through DCH and an
             HDFS client over InfiniBand.]

            •  SQL access to HDFS
            •  External table view
            •  Data query or import




Oracle In-Database Analytics

            •  Oracle Advanced Analytics
            •  Statistical
            •  Data Mining
            •  Text
            •  Graph
            •  Spatial
            •  Semantic




What Have We Done?




Oracle In-Database Hadoop
             Summary
             A prototype:
            •  Apply MapReduce processing to data in Oracle RDBMS without the
               need for a separate infrastructure.
            •  Compatibility with Hadoop while minimizing dependency on the
               Apache Hadoop infrastructure.
            •  Reduce training and deployment time.
            •  Integration with Oracle SQL, allowing MapReduce steps to be mixed
               with sophisticated SQL queries.
            •  Bring enterprise-class security to Hadoop MapReduce.
            •  Seamless integration with Oracle’s Big Data solution.

Demo




Thank You!


             Page 25

More Related Content

What's hot

Bharath Hadoop Resume
Bharath Hadoop ResumeBharath Hadoop Resume
Bharath Hadoop ResumeBharath Kumar
 
Why Every NoSQL Deployment Should Be Paired with Hadoop Webinar
Why Every NoSQL Deployment Should Be Paired with Hadoop WebinarWhy Every NoSQL Deployment Should Be Paired with Hadoop Webinar
Why Every NoSQL Deployment Should Be Paired with Hadoop WebinarCloudera, Inc.
 
Hortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics ApplicationsHortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics Applicationsrussell_jurney
 
Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012Daniel Abadi
 
Best Practices for Deploying Hadoop (BigInsights) in the Cloud
Best Practices for Deploying Hadoop (BigInsights) in the CloudBest Practices for Deploying Hadoop (BigInsights) in the Cloud
Best Practices for Deploying Hadoop (BigInsights) in the CloudLeons Petražickis
 
Big SQL Competitive Summary - Vendor Landscape
Big SQL Competitive Summary - Vendor LandscapeBig SQL Competitive Summary - Vendor Landscape
Big SQL Competitive Summary - Vendor LandscapeNicolas Morales
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaSwiss Big Data User Group
 
Hadoop-DS: Which SQL-on-Hadoop Rules the Herd
Hadoop-DS: Which SQL-on-Hadoop Rules the HerdHadoop-DS: Which SQL-on-Hadoop Rules the Herd
Hadoop-DS: Which SQL-on-Hadoop Rules the HerdIBM Analytics
 
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011Jonathan Seidman
 
Brief Introduction about Hadoop and Core Services.
Brief Introduction about Hadoop and Core Services.Brief Introduction about Hadoop and Core Services.
Brief Introduction about Hadoop and Core Services.Muthu Natarajan
 
Data Science Day New York: The Platform for Big Data
Data Science Day New York: The Platform for Big DataData Science Day New York: The Platform for Big Data
Data Science Day New York: The Platform for Big DataCloudera, Inc.
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHitendra Kumar
 
Flexible In-Situ Indexing for Hadoop via Elephant Twin
Flexible In-Situ Indexing for Hadoop via Elephant TwinFlexible In-Situ Indexing for Hadoop via Elephant Twin
Flexible In-Situ Indexing for Hadoop via Elephant TwinDmitriy Ryaboy
 
Big Data and Hadoop Guide
Big Data and Hadoop GuideBig Data and Hadoop Guide
Big Data and Hadoop GuideSimplilearn
 
Big data processing with apache spark part1
Big data processing with apache spark   part1Big data processing with apache spark   part1
Big data processing with apache spark part1Abbas Maazallahi
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Jonathan Seidman
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to HadoopJoey Jablonski
 
Realtime Analytics with Hadoop and HBase
Realtime Analytics with Hadoop and HBaseRealtime Analytics with Hadoop and HBase
Realtime Analytics with Hadoop and HBaselarsgeorge
 
Big Data: SQL on Hadoop from IBM
Big Data:  SQL on Hadoop from IBM Big Data:  SQL on Hadoop from IBM
Big Data: SQL on Hadoop from IBM Cynthia Saracco
 

What's hot (20)

Bharath Hadoop Resume
Bharath Hadoop ResumeBharath Hadoop Resume
Bharath Hadoop Resume
 
Why Every NoSQL Deployment Should Be Paired with Hadoop Webinar
Why Every NoSQL Deployment Should Be Paired with Hadoop WebinarWhy Every NoSQL Deployment Should Be Paired with Hadoop Webinar
Why Every NoSQL Deployment Should Be Paired with Hadoop Webinar
 
Hortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics ApplicationsHortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics Applications
 
Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012
 
Best Practices for Deploying Hadoop (BigInsights) in the Cloud
Best Practices for Deploying Hadoop (BigInsights) in the CloudBest Practices for Deploying Hadoop (BigInsights) in the Cloud
Best Practices for Deploying Hadoop (BigInsights) in the Cloud
 
Big SQL Competitive Summary - Vendor Landscape
Big SQL Competitive Summary - Vendor LandscapeBig SQL Competitive Summary - Vendor Landscape
Big SQL Competitive Summary - Vendor Landscape
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with Impala
 
Hadoop-DS: Which SQL-on-Hadoop Rules the Herd
Hadoop-DS: Which SQL-on-Hadoop Rules the HerdHadoop-DS: Which SQL-on-Hadoop Rules the Herd
Hadoop-DS: Which SQL-on-Hadoop Rules the Herd
 
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
Data Analysis with Hadoop and Hive, ChicagoDB 2/21/2011
 
Brief Introduction about Hadoop and Core Services.
Brief Introduction about Hadoop and Core Services.Brief Introduction about Hadoop and Core Services.
Brief Introduction about Hadoop and Core Services.
 
Intro to Hadoop
Intro to HadoopIntro to Hadoop
Intro to Hadoop
 
Data Science Day New York: The Platform for Big Data
Data Science Day New York: The Platform for Big DataData Science Day New York: The Platform for Big Data
Data Science Day New York: The Platform for Big Data
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
 
Flexible In-Situ Indexing for Hadoop via Elephant Twin
Flexible In-Situ Indexing for Hadoop via Elephant TwinFlexible In-Situ Indexing for Hadoop via Elephant Twin
Flexible In-Situ Indexing for Hadoop via Elephant Twin
 
Big Data and Hadoop Guide
Big Data and Hadoop GuideBig Data and Hadoop Guide
Big Data and Hadoop Guide
 
Big data processing with apache spark part1
Big data processing with apache spark   part1Big data processing with apache spark   part1
Big data processing with apache spark part1
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Realtime Analytics with Hadoop and HBase
Realtime Analytics with Hadoop and HBaseRealtime Analytics with Hadoop and HBase
Realtime Analytics with Hadoop and HBase
 
Big Data: SQL on Hadoop from IBM
Big Data:  SQL on Hadoop from IBM Big Data:  SQL on Hadoop from IBM
Big Data: SQL on Hadoop from IBM
 

Viewers also liked

Oracle In-database-archiving ~Oracleでの論理削除~
Oracle In-database-archiving ~Oracleでの論理削除~Oracle In-database-archiving ~Oracleでの論理削除~
Oracle In-database-archiving ~Oracleでの論理削除~Daiki Mogmet Ito
 
Best Practices with ODI : Flexibility
Best Practices with ODI : FlexibilityBest Practices with ODI : Flexibility
Best Practices with ODI : FlexibilityGurcan Orhan
 
From oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other toolsFrom oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other toolsGuy Harrison
 
Hadoop and Graph Databases (Neo4j): Winning Combination for Bioanalytics - Jo...
Hadoop and Graph Databases (Neo4j): Winning Combination for Bioanalytics - Jo...Hadoop and Graph Databases (Neo4j): Winning Combination for Bioanalytics - Jo...
Hadoop and Graph Databases (Neo4j): Winning Combination for Bioanalytics - Jo...Neo4j
 
In-Database Analyticsの必要性と可能性
In-Database Analyticsの必要性と可能性In-Database Analyticsの必要性と可能性
In-Database Analyticsの必要性と可能性Satoshi Nagayasu
 

Viewers also liked (7)

Table functions - Planboard Symposium 2013
Table functions - Planboard Symposium 2013Table functions - Planboard Symposium 2013
Table functions - Planboard Symposium 2013
 
Oracle In-database-archiving ~Oracleでの論理削除~
Oracle In-database-archiving ~Oracleでの論理削除~Oracle In-database-archiving ~Oracleでの論理削除~
Oracle In-database-archiving ~Oracleでの論理削除~
 
Best Practices with ODI : Flexibility
Best Practices with ODI : FlexibilityBest Practices with ODI : Flexibility
Best Practices with ODI : Flexibility
 
Kafka for DBAs
Kafka for DBAsKafka for DBAs
Kafka for DBAs
 
From oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other toolsFrom oracle to hadoop with Sqoop and other tools
From oracle to hadoop with Sqoop and other tools
 
Hadoop and Graph Databases (Neo4j): Winning Combination for Bioanalytics - Jo...
Hadoop and Graph Databases (Neo4j): Winning Combination for Bioanalytics - Jo...Hadoop and Graph Databases (Neo4j): Winning Combination for Bioanalytics - Jo...
Hadoop and Graph Databases (Neo4j): Winning Combination for Bioanalytics - Jo...
 
In-Database Analyticsの必要性と可能性
In-Database Analyticsの必要性と可能性In-Database Analyticsの必要性と可能性
In-Database Analyticsの必要性と可能性
 

Similar to Oracle in Database Hadoop

Improving MySQL performance with Hadoop
Improving MySQL performance with HadoopImproving MySQL performance with Hadoop
Improving MySQL performance with HadoopSagar Jauhari
 
Analyzing_Data_with_Spark_and_Cassandra
Analyzing_Data_with_Spark_and_CassandraAnalyzing_Data_with_Spark_and_Cassandra
Analyzing_Data_with_Spark_and_CassandraRich Beaudoin
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce ParadigmDilip Reddy
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce ParadigmDilip Reddy
 
Analytics on Hadoop
Analytics on HadoopAnalytics on Hadoop
Analytics on HadoopEMC
 
Hadoop - Past, Present and Future - v1.2
Hadoop - Past, Present and Future - v1.2Hadoop - Past, Present and Future - v1.2
Hadoop - Past, Present and Future - v1.2Big Data Joe™ Rossi
 
Why Scala Is Taking Over the Big Data World
Why Scala Is Taking Over the Big Data WorldWhy Scala Is Taking Over the Big Data World
Why Scala Is Taking Over the Big Data WorldDean Wampler
 
It takes two to tango! : Is SQL-on-Hadoop the next big step?
It takes two to tango! : Is SQL-on-Hadoop the next big step?It takes two to tango! : Is SQL-on-Hadoop the next big step?
It takes two to tango! : Is SQL-on-Hadoop the next big step?Srihari Srinivasan
 
Hadoop and its Ecosystem Components in Action
Hadoop and its Ecosystem Components in ActionHadoop and its Ecosystem Components in Action
Hadoop and its Ecosystem Components in ActionAndrew Brust
 
Kerry osborne hadoop meets exadata
Kerry osborne hadoop meets exadataKerry osborne hadoop meets exadata
Kerry osborne hadoop meets exadataEnkitec
 
Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...
Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...
Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...DataStax Academy
 
Hadoop Meets Exadata- Kerry Osborne
Hadoop Meets Exadata- Kerry OsborneHadoop Meets Exadata- Kerry Osborne
Hadoop Meets Exadata- Kerry OsborneEnkitec
 
Hadoop vs spark
Hadoop vs sparkHadoop vs spark
Hadoop vs sparkamarkayam
 
Apache Spark Overview part1 (20161107)
Apache Spark Overview part1 (20161107)Apache Spark Overview part1 (20161107)
Apache Spark Overview part1 (20161107)Steve Min
 
JackHare- a framework for SQL to NoSQL translation using MapReduce
JackHare- a framework for SQL to NoSQL translation using MapReduceJackHare- a framework for SQL to NoSQL translation using MapReduce
JackHare- a framework for SQL to NoSQL translation using MapReduce康志強 大人
 
Hadoop by kamran khan
Hadoop by kamran khanHadoop by kamran khan
Hadoop by kamran khanKamranKhan587
 

Similar to Oracle in Database Hadoop (20)

Improving MySQL performance with Hadoop
Improving MySQL performance with HadoopImproving MySQL performance with Hadoop
Improving MySQL performance with Hadoop
 
Analyzing_Data_with_Spark_and_Cassandra
Analyzing_Data_with_Spark_and_CassandraAnalyzing_Data_with_Spark_and_Cassandra
Analyzing_Data_with_Spark_and_Cassandra
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce Paradigm
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce Paradigm
 
Analytics on Hadoop
Analytics on HadoopAnalytics on Hadoop
Analytics on Hadoop
 
Hadoop - Past, Present and Future - v1.2
Hadoop - Past, Present and Future - v1.2Hadoop - Past, Present and Future - v1.2
Hadoop - Past, Present and Future - v1.2
 
Why Scala Is Taking Over the Big Data World
Why Scala Is Taking Over the Big Data WorldWhy Scala Is Taking Over the Big Data World
Why Scala Is Taking Over the Big Data World
 
It takes two to tango! : Is SQL-on-Hadoop the next big step?
It takes two to tango! : Is SQL-on-Hadoop the next big step?It takes two to tango! : Is SQL-on-Hadoop the next big step?
It takes two to tango! : Is SQL-on-Hadoop the next big step?
 
Hadoop and its Ecosystem Components in Action
Hadoop and its Ecosystem Components in ActionHadoop and its Ecosystem Components in Action
Hadoop and its Ecosystem Components in Action
 
Kerry osborne hadoop meets exadata
Kerry osborne hadoop meets exadataKerry osborne hadoop meets exadata
Kerry osborne hadoop meets exadata
 
Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...
Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...
Cassandra Day Denver 2014: Feelin' the Flow: Analyzing Data with Spark and Ca...
 
Hadoop Meets Exadata- Kerry Osborne
Hadoop Meets Exadata- Kerry OsborneHadoop Meets Exadata- Kerry Osborne
Hadoop Meets Exadata- Kerry Osborne
 
big dat ppt
big dat pptbig dat ppt
big dat ppt
 
Why Spark over Hadoop?
Why Spark over Hadoop?Why Spark over Hadoop?
Why Spark over Hadoop?
 
Hadoop vs spark
Hadoop vs sparkHadoop vs spark
Hadoop vs spark
 
Apache Spark Overview part1 (20161107)
Apache Spark Overview part1 (20161107)Apache Spark Overview part1 (20161107)
Apache Spark Overview part1 (20161107)
 
JackHare- a framework for SQL to NoSQL translation using MapReduce
JackHare- a framework for SQL to NoSQL translation using MapReduceJackHare- a framework for SQL to NoSQL translation using MapReduce
JackHare- a framework for SQL to NoSQL translation using MapReduce
 
Hadoop programming
Hadoop programmingHadoop programming
Hadoop programming
 
Hadoop by kamran khan
Hadoop by kamran khanHadoop by kamran khan
Hadoop by kamran khan
 





Oracle in Database Hadoop

  • 1. Oracle In-Database Hadoop: When MapReduce Meets RDBMS. Kuassi Mensah, Director Product Management | db360.blogspot.com | @kmensah | kuassi.mensah@oracle.com
  • 2. The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle. Hadoop Summit 2012, June 13-14, San Jose, California, USA. Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
  • 3. Oracle In-Database Hadoop: Agenda
      •  In-Database MapReduce
          •  Why
          •  Previous Initiatives and Limitations
      •  Oracle In-Database Hadoop
      •  Integration with Oracle’s Big Data solution
      •  Summary
  • 4. In-Database MapReduce
  • 5. MapReduce Paradigm: You All Know This Stuff!
      •  Map: <K1,V1> → {<K2,V2>,…}
      •  Shuffle: {<K2,V2>,…} → {<K2,{V2,…,V2}>,…}
      •  Reduce: <K2,{V2,…,V2}> → {<K3,V3>,…}
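The three signatures on this slide can be made concrete with a small word count. The following is only an illustrative sketch in plain Java, not code from the talk: the class name `MapReduceSketch` and its method names are hypothetical, K1/V1 are taken to be a line number and a line of text, and K2/V2 and K3/V3 are both a word and a count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, in-memory illustration of the Map / Shuffle / Reduce signatures.
public class MapReduceSketch {

    // Map: <K1,V1> -> {<K2,V2>,...}; here <lineNumber, line> -> {<word, 1>,...}
    public static List<Map.Entry<String, Integer>> map(int lineNumber, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Shuffle: {<K2,V2>,...} -> {<K2,{V2,...,V2}>,...}; group all values by key
    public static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce: <K2,{V2,...,V2}> -> {<K3,V3>,...}; here a single <word, total>
    public static Map.Entry<String, Integer> reduce(String word, List<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return Map.entry(word, total);
    }

    // Runs the three phases in sequence over an in-memory list of lines.
    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            mapped.addAll(map(i, lines.get(i)));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(mapped).entrySet()) {
            Map.Entry<String, Integer> reduced = reduce(group.getKey(), group.getValue());
            result.put(reduced.getKey(), reduced.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("to be or not", "to be")));
    }
}
```

In a real deployment the three phases run in parallel across processes; the sketch runs them sequentially only to show how the types line up.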
  • 6. In-Database MapReduce: Why?
      •  Avoid shipping data residing in RDBMS to a separate infrastructure.
      •  Many initiatives
      •  Address top two issues preventing broader adoption of Hadoop in the enterprise
          •  Lack of development and/or administration skills
          •  Lack of enterprise-class security
  • 7. In-Database MapReduce: Previous Efforts and Limitations
      •  SQL-MapReduce, HadoopDB (Hadapt), etc.
      •  PL/SQL user-defined pipelined table functions and aggregation objects
      •  Limitations
          •  Lack of compatibility with Hadoop
          •  Loose integration with Hadoop
          •  Dependency on Hadoop infrastructure
  • 8. Oracle In-Database Hadoop
  • 9. Oracle In-Database Hadoop is a prototype (not a feature of Oracle products), built on current Oracle products.
  • 10. Oracle In-Database Hadoop: Goals
      •  Avoid shipping data residing in Oracle database to Hadoop clusters.
      •  Preserve Hadoop programming model
      •  Reduce dependency on Hadoop infrastructure
      •  Get enterprise developers up to speed with minimal training
      •  Get enterprise administrators (DBAs) up to speed with minimal training
      •  Reduce deployment time
      •  Bring enterprise class security to MapReduce
      •  Seamless integration with Oracle’s Big Data solution
  • 11. Oracle In-Database Hadoop: Compatibility & Minimal Dependency on Hadoop Infra
      •  [Diagram: mapping processes on Node 1, Node 2, and Node 3, each a pipelined table function with a Java implementation; rows redistributed by the PARTITION BY / CLUSTER BY clause; reducing processes on Node 1, Node 2, and Node 3, each a pipelined table function with a Java implementation]
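The redistribution step between the mapping and reducing processes can be pictured as follows. This is a rough sketch in plain Java and not the prototype's implementation: it assumes hash partitioning on the key (the slide names only the PARTITION BY / CLUSTER BY clause), and the class `PartitionSketch` and its methods are hypothetical. The point it illustrates is that every row with a given key lands on the same reducing process, with that key's values grouped together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of partitioning map output across reducing processes.
public class PartitionSketch {

    // Assumed routing rule: hash the key onto one of numReducers processes.
    public static int reducerFor(String key, int numReducers) {
        return Math.floorMod(key.hashCode(), numReducers);
    }

    // Partition rows by key, then cluster each reducer's rows so that all
    // values for one key sit together.
    public static List<Map<String, List<Integer>>> partitionAndCluster(
            List<Map.Entry<String, Integer>> mapOutput, int numReducers) {
        List<Map<String, List<Integer>>> perReducer = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            perReducer.add(new TreeMap<>());
        }
        for (Map.Entry<String, Integer> row : mapOutput) {
            perReducer.get(reducerFor(row.getKey(), numReducers))
                      .computeIfAbsent(row.getKey(), k -> new ArrayList<>())
                      .add(row.getValue());
        }
        return perReducer;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapOutput = List.of(
                Map.entry("to", 1), Map.entry("be", 1), Map.entry("or", 1),
                Map.entry("not", 1), Map.entry("to", 1), Map.entry("be", 1));
        List<Map<String, List<Integer>>> perReducer = partitionAndCluster(mapOutput, 3);
        for (int i = 0; i < perReducer.size(); i++) {
            System.out.println("Reducing process " + (i + 1) + ": " + perReducer.get(i));
        }
    }
}
```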
  • 12. Oracle In-Database Hadoop: Preserve Hadoop Programming Model
      •  Source-compatibility
      •  Job configuration
      •  Invocation thru Java interface: job.run()
      •  Direct table access: TableReader and TableWriter
  • 13. Oracle In-Database Hadoop: SQL and MapReduce Integration
      •  Mix SQL and MapReduce processing for flexibility and efficiency.
      •  MapReduce steps as pipelined table functions.

        INSERT INTO OutTable
        SELECT * FROM TABLE
          (Word_Count_Reduce(:ConfKey,
            CURSOR(SELECT * FROM TABLE
              (Word_Count_Map(:ConfKey,
                CURSOR(SELECT * FROM InTable))))))
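The nesting in this statement, where the reduce function consumes a cursor over the map function, which in turn consumes a cursor over InTable, resembles a chain of lazily evaluated stages. A loose analogy using Java streams is sketched below. It is an assumption-laden illustration, not the prototype's code: `PipelinedStagesSketch`, `wordCountMap`, and `wordCountReduce` are hypothetical stand-ins for the table functions, and an in-memory list stands in for InTable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical analogy: each stage consumes the previous stage's row stream.
public class PipelinedStagesSketch {

    // Stand-in for Word_Count_Map: reads rows from a cursor and emits words lazily.
    public static Stream<String> wordCountMap(Stream<String> inTableCursor) {
        return inTableCursor
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                .filter(word -> !word.isEmpty());
    }

    // Stand-in for Word_Count_Reduce: reads the map stage's rows and emits totals.
    public static Map<String, Long> wordCountReduce(Stream<String> mapCursor) {
        return mapCursor.collect(
                Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> inTable = List.of("to be or not", "to be");
        // Mirrors the nesting of the SQL: reduce(map(cursor over InTable)).
        Map<String, Long> outTable = wordCountReduce(wordCountMap(inTable.stream()));
        System.out.println(outTable);
    }
}
```

As with pipelined table functions, the map stage here produces rows on demand, so the intermediate result is not materialized before the reduce stage starts consuming it.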
  • 14. Oracle In-Database Hadoop: SQL and Java interfaces
      •  SQL interface:

        SELECT * FROM TABLE
          (Reduce_VARCHAR2_NUMBER(:ConfKey,
            CURSOR(SELECT * FROM TABLE
              (Map_VARCHAR2_NUMBER(:ConfKey,
                CURSOR(SELECT * FROM InTable))))))

      •  Java interface:

        public class WordCount {
          public static void main() throws Exception {
            /* Setup the parameters and run the job */
            ……
            job.init();
            job.run();
          }
        }
  • 15. Oracle In-Database Hadoop: Leverage Enterprise Skills
      •  Get database developers up to speed, with minimal training, on developing MapReduce jobs by reusing Hadoop Mappers and Reducers
      •  Get DBAs up to speed on deploying and managing MapReduce jobs with minimal training
  • 16. Oracle Database Security: Bringing Enterprise Class Security to MapReduce
      •  Auditing and Monitoring
          •  Database Activity Auditing
          •  Database Firewall Monitoring
          •  Centralized Audit Data Warehouse
      •  Encryption and Masking
          •  Transparent Data Encryption
          •  Network Encryption/Strong Auth
          •  Data Masking for Non-Production
      •  Privileged User Access Control and Contextual Authorization
          •  Separation of Duties for DBAs
          •  Protection Realms & Rules
          •  Label Based Access Control
  • 17. Seamless integration with Oracle’s Big Data Solution
  • 18. Oracle’s Big Data solution
      •  [Diagram: Oracle Big Data Appliance, Oracle Exadata, and Oracle Exalytics linked by InfiniBand, together with Endeca Information Discovery and Oracle Real-Time Decisions; stages: Acquire, Organize & Discover, Analyze, Decide]
  • 19. Oracle Direct Connector for HDFS: Direct Access from HDFS
      •  SQL access to HDFS
      •  External table view
      •  Data query or import
      •  [Diagram: a SQL query in Oracle Database reads an external table, which reaches HDFS over InfiniBand through the DCH clients]
  • 20. Oracle In-Database Analytics: Oracle Advanced Analytics, Statistical, Data Mining, Text, Graph, Spatial, Semantic
  • 21. What Have We Done?
  • 22. Oracle In-Database Hadoop: Summary
      A prototype:
      •  Apply MapReduce processing to data in Oracle RDBMS without the need of a separate infrastructure.
      •  Compatibility with Hadoop while minimizing dependency on the Apache Hadoop infrastructure.
      •  Reduce training and deployment time.
      •  Integration with Oracle SQL, allowing mixing MapReduce steps with sophisticated SQL queries.
      •  Bring Enterprise Class Security to Hadoop MapReduce
      •  Seamless integration with Oracle’s Big Data solution
  • 23. Demo
  • 25. Thank You!