SlideShare a Scribd company logo
1 of 24
Presentation  slide  for  Sqoop  User  Meetup  (Strata  +  Hadoop  World  NYC  2013)  

Complex  stories  about
Sqooping  PostgreSQL  data
10/28/2013
NTT  DATA  Corporation
Masatake  Iwasaki

Copyright © 2013 NTT DATA Corporation
Introduction

Copyright © 2013 NTT DATA Corporation

2
About  Me
Masatake  Iwasaki:
Software  Engineer  @  NTT  DATA:

NTT(Nippon  Telegraph  and  Telephone  Corporation):  Telecommunication
NTT  DATA:  Systems  Integrator

Developed:
Ludia:  Fulltext  search  index  for  PostgreSQL  using  Senna
Authored:
“A  Complete  Primer  for  Hadoop”  
(no  official  English  title)
Patches  for  Sqoop:
SQOOP-‐‑‒390:  PostgreSQL  connector  for  direct  export  with  pg_̲bulkload
SQOOP-‐‑‒999:  Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM  
SQOOP-‐‑‒1155:  Sqoop  2  documentation  for  connector  development  

Copyright © 2013 NTT DATA Corporation

3
Why  PostgreSQL?

Enterprisy
from  earlier  version
comparing  to  MySQL
Active  community  in  Japan
NTT  DATA  commits  itself  to  development

Copyright © 2013 NTT DATA Corporation

4
Sqooping  PostgreSQL  data

Copyright © 2013 NTT DATA Corporation

5
Working  around  PostgreSQL
Direct  connector  for  PostgreSQL  loader:
    SQOOP-‐‑‒390:  PostgreSQL  connector  for  direct  export  with  pg_̲bulkload
Yet  another  direct  connector  for  PostgreSQL  JDBC:
    SQOOP-‐‑‒999:  Support  bulk  load  from  HDFS  to  PostgreSQL
                                            using  COPY  ...  FROM
Supporting  complex  data  types:
    SQOOP-‐‑‒1149:  Support  Custom  Postgres  Types
  

Copyright © 2013 NTT DATA Corporation

6
Direct  connector  for  PostgreSQL  loader

SQOOP-‐‑‒390:  
    PostgreSQL  connector  for  direct  export  with  pg_̲bulkload

pg_̲bulkload:
Data  loader  for  PostgreSQL
Server  side  plug-‐‑‒in  library  and  client  side  command
Providing  filtering  and  transformation  of  data
http://pgbulkload.projects.pgfoundry.org/

Copyright © 2013 NTT DATA Corporation

7
SQOOP-‐‑‒390:  
    PostgreSQL  connector  for  direct  export  with  pg_̲bulkload	
HDFS

File  Split

File  Split

File  Split
CREATE  TABLE
    tmp3(LIKE  dest  INCLUDING  CONSTRAINTS)

Mapper

Mapper

Mapper

pg_̲bulkload

pg_̲bulkoad

pg_̲bulkload

tmp1

PostgreSQL

tmp2

Reducer

Destination  Table

Copyright © 2013 NTT DATA Corporation

tmp3

external  process
staging  table  per  mapper  is  must
due  to  table  level  locks
BEGIN
INSERT  INTO  dest  (  SELECT  *  FROM  tmp1  )
DROP  TABLE  tmp1
INSERT  INTO  dest  (  SELECT  *  FROM  tmp2  )
DROP  TABLE  tmp2
INSERT  INTO  dest  (  SELECT  *  FROM  tmp3  )
DROP  TABLE  tmp3
COMMIT
8
Direct  connector  for  PostgreSQL  loader

Pros:
Fast

by  short-‐‑‒circuitting  server  functionality

Flexible

filtering  error  records

Cons:
Not  so  fast

Bottleneck  is  not  in  client  side  but  in  DB  side
Built-‐‑‒in  COPY  functionality  is  fast  enough

Not  General

pg_̲bulkload  supports  only  export

Requiring  setup  on  all  slave  nodes  and  client  node
Possible  to  Require  recovery  on  failure
Copyright © 2013 NTT DATA Corporation

9
Yet  another  direct  connector  for  PostgreSQL  JDBC
PostgreSQL  provides  custom  SQL  command  for  data  import/export
COPY table_name [ ( column_name [, ...] ) ]
FROM { 'filename' | STDIN }
[ [ WITH ] ( option [, ...] ) ]
COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
TO { 'filename' | STDOUT }
[ [ WITH ] ( option [, ...] ) ]
where option can be one of:
FORMAT format_name
OIDS [ boolean ]
DELIMITER 'delimiter_character'
NULL 'null_string'
HEADER [ boolean ]
QUOTE 'quote_character'
ESCAPE 'escape_character'
FORCE_QUOTE { ( column_name [, ...] ) | * }
FORCE_NOT_NULL ( column_name [, ...] )
ENCODING 'encoding_name‘

AND  JDBC  API
org.postgresql.copy.*
Copyright © 2013 NTT DATA Corporation

10
SQOOP-‐‑‒999:
    Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM	

File  Split

File  Split

File  Split

Mapper

Mapper

Mapper

PostgreSQL
JDBC

HDFS

PostgreSQL
JDBC

PostgreSQL
JDBC

Using  custom  SQL  command  
via  JDBC  API
only  available  in  PostgreSQL
COPY  FROM  STDIN  WITH  ...

staging  tagle
PostgreSQL
Destination  Table

Copyright © 2013 NTT DATA Corporation

11
SQOOP-‐‑‒999:
    Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM  
import org.postgresql.copy.CopyManager;
import org.postgresql.copy.CopyIn;
...

Requiring  PostgreSQL  
specific  interface.

protected void setup(Context context)
...
dbConf = new DBConfiguration(conf);
CopyManager cm = null;
...
public void map(LongWritable key, Writable value, Context context)
...
if (value instanceof Text) {
line.append(System.getProperty("line.separator"));
}
try {
byte[]data = line.toString().getBytes("UTF-8"); Just  feeding  lines  of  text
copyin.writeToCopy(data, 0, data.length);
Copyright © 2013 NTT DATA Corporation

12
Yet  another  direct  connector  for  PostgreSQL  JDBC

Pros:
Fast  enough
Ease  of  use

JDBC  driver  jar  is  distributed  automatically  by  MR  framework

Cons:
Dependency  on  not  general  JDBC

possible  licensing  issue  (PostgreSQL  is  OK,  itʼ’s  BSD  Lisence)
build  time  requirement  (PostgreSQL  JDBC  is  available  in  Maven  repo.)
<dependency org="org.postgresql" name="postgresql"
rev="${postgresql.version}" conf="common->default" />

Error  record  causes  rollback  of  whole  transaction
Still  difficult  to  implement  custom  connector  for  IMPORT
because  of  code  generation  part

Copyright © 2013 NTT DATA Corporation

13
Supporting  complex  data  types
PostgreSQL  supports  lot  of  complex  data  types  
Geometric  Types
Points
Line  Segments
Boxes
Paths
Polygons
Circles

Network  Address  Types
inet
cidr
macaddr

XML  Type
JSON  Type

Supporting  complex  data  types:
    SQOOP-‐‑‒1149:  Support  Custom  Postgres  Types
Copyright © 2013 NTT DATA Corporation

not  me
14
Constraints  on  JDBC  data  types  in  Sqoop  framework
protected Map<String, Integer> getColumnTypesForRawQuery(String stmt) {
...
results = execute(stmt);
...
ResultSetMetaData metadata = results.getMetaData();
for (int i = 1; i < cols + 1; i++) {
int typeId = metadata.getColumnType(i);

returns  java.sql.Types.OTHER  for  types  
{ not  mappable  to  basic    Java  data  types
=>  Losing  type  imformation    

public String toJavaType(int sqlType)
// Mappings taken from:
// http://java.sun.com/j2se/1.3/docs/guide/jdbc/getstart/mapping.html
if (sqlType == Types.INTEGER) {
return "Integer";
} else if (sqlType == Types.VARCHAR) {
return "String";
....
reaches  here
} else {
// TODO(aaron): Support DISTINCT, ARRAY, STRUCT, REF, JAVA_OBJECT.
// Return null indicating database-specific manager should return a
// java data type if it can find one for any nonstandard type.
return null;
Copyright © 2013 NTT DATA Corporation

15
Sqoop  1  Summary

Pros:
Simple  Standalone  MapReduce  Driver
Easy  to  understand  for  MR  application  developpers

                                                                            except  for  ORM  (SqoopRecord)  code  generation  part.

Variety  of  connectors
Lot  of  information
Cons:
Complex  command  line  and  inconsistent  options
meaning  of  options  is  according  to  connectors

Not  enough  modular
Dependency  on  JDBC  data  model
Security
Copyright © 2013 NTT DATA Corporation

16
Sqooping  PostgreSQL  Data  2

Copyright © 2013 NTT DATA Corporation

17
Sqoop  2
Everything  are  rewritten
Working  on  server  side
More  modular
Not  compatible  with  Sqoop  1  at  all
(Almost)  Only  generic  connector
Black  box  comparing  to  Sqoop  1
Needs  more  documentation

SQOOP-‐‑‒1155:  Sqoop  2  documentation  for  connector  development
Internal of Sqoop2 MapReduce Job
++++++++++++++++++++++++++++++++
...
- OutputFormat invokes Loader's load method (via SqoopOutputFor
.. todo: sequence diagram like figure.
Copyright © 2013 NTT DATA Corporation

18
Sqoop2:  Initialization  phase  of  IMPORT  job
	
 
Implement  this
	
 	
 	
 	
 	
 	
 	
 ,----------------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 ,-----------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 |SqoopInputFormat|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |Partitioner|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 `-------+--------'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `-----+-----'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 getSplits	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 ----------->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 getPartitions	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |------------------------>|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 ,---------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |------->	
 |Partition|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 `----+----'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |<-	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 ,----------.	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |-------------------------------------------------->|SqoopSplit|	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `----+-----'	
 

Copyright © 2013 NTT DATA Corporation

19
Sqoop2:  Map  phase  of  IMPORT  job
	
 
	
 	
 	
 	
 	
 	
 	
 ,-----------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 |SqoopMapper|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 `-----+-----'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 run	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 --------->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 ,-------------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |---------------------------------->|MapDataWriter|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `------+------'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
Implement  this
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 ,---------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |-------------->	
 |Extractor|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `----+----'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 extract	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |-------------------->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
Conversion  to  Sqoop  
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
internal  data  format
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 read	
 from	
 DB	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 <-------------------------------|	
 	
 	
 	
 	
 	
 write*	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |------------------->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 ,----.	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |---------->|Data|	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `-+--'	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 context.write	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |-------------------------->	
 
	
 
Copyright © 2013 NTT DATA Corporation

20
Sqoop2:  Reduce  phase  of  EXPORT  job

	
 	
 ,-------.	
 	
 ,---------------------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 |Reducer|	
 	
 |SqoopNullOutputFormat|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 `---+---'	
 	
 `----------+----------'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 ,-----------------------------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
Implement  this
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |->	
 |SqoopOutputFormatLoadExecutor|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 `--------------+--------------'	
 	
 	
 	
 	
 	
 	
 	
 ,----.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |--------------------->	
 |Data|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 `-+--'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 ,-----------------.	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |->	
 |SqoopRecordWriter|	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 getRecordWriter	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 `--------+--------'	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
----------------------->|	
 getRecordWriter	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |----------------->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 ,--------------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |----------------------------->	
 |ConsumerThread|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 `------+-------'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |<-	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 ,------.	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
<-	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -	
 -|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |--->|Loader|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 `--+---'	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 load	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 run	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |------>|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
----->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 write	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |------------------------------------------------>|	
 setContent	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 read*	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |----------->|	
 getContent	
 |<------|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |<-----------|	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 -	
 -	
 ->|	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 |	
 write	
 into	
 DB	
 	
 
	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 	
 |	
 	
 	
 	
 	
 	
 	
 |-------------->	
 

Copyright © 2013 NTT DATA Corporation

21
Summary

Copyright © 2013 NTT DATA Corporation

22
My  interest  for  popularizing  Sqoop  2

Complex  data  type  support  in  Sqoop  2
Bridge  to  use  Sqoop  1  connectors  on  Sqoop  2
Bridge  to  use  Sqoop  2  connectors  from  Sqoop  1  CLI

Copyright © 2013 NTT DATA Corporation

23
Copyright © 2011 NTT DATA Corporation

Copyright © 2013 NTT DATA Corporation

More Related Content

What's hot

SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)Kohei KaiGai
 
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀EXEM
 
A git workflow for Drupal Core development
A git workflow for Drupal Core developmentA git workflow for Drupal Core development
A git workflow for Drupal Core developmentCameron Tod
 
In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)Naoto MATSUMOTO
 
Advanced RAC troubleshooting: Network
Advanced RAC troubleshooting: NetworkAdvanced RAC troubleshooting: Network
Advanced RAC troubleshooting: NetworkRiyaj Shamsudeen
 
A kind and gentle introducton to rac
A kind and gentle introducton to racA kind and gentle introducton to rac
A kind and gentle introducton to racRiyaj Shamsudeen
 
Athenticated smaba server config with open vpn
Athenticated smaba server  config with open vpnAthenticated smaba server  config with open vpn
Athenticated smaba server config with open vpnChanaka Lasantha
 
Mvcc in postgreSQL 권건우
Mvcc in postgreSQL 권건우Mvcc in postgreSQL 권건우
Mvcc in postgreSQL 권건우PgDay.Seoul
 
Introduction to Parallel Execution
Introduction to Parallel ExecutionIntroduction to Parallel Execution
Introduction to Parallel ExecutionDoug Burns
 
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015Kohei KaiGai
 
Cisco vs. huawei CLI Commands
Cisco vs. huawei CLI CommandsCisco vs. huawei CLI Commands
Cisco vs. huawei CLI CommandsBootcamp SCL
 
Ims06 change you can believe in
Ims06 change you can believe inIms06 change you can believe in
Ims06 change you can believe inRobert Hain
 
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to AnalyticsReplicate Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to AnalyticsLinas Virbalas
 
OpenWorld Sep14 12c for_developers
OpenWorld Sep14 12c for_developersOpenWorld Sep14 12c for_developers
OpenWorld Sep14 12c for_developersConnor McDonald
 
Amazon aurora 1
Amazon aurora 1Amazon aurora 1
Amazon aurora 1EXEM
 
OakTable World Sep14 clonedb
OakTable World Sep14 clonedb OakTable World Sep14 clonedb
OakTable World Sep14 clonedb Connor McDonald
 

What's hot (20)

SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)SQL+GPU+SSD=∞ (English)
SQL+GPU+SSD=∞ (English)
 
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀
제 8회 엑셈 수요 세미나 자료 연구컨텐츠팀
 
A git workflow for Drupal Core development
A git workflow for Drupal Core developmentA git workflow for Drupal Core development
A git workflow for Drupal Core development
 
Apend. networking linux
Apend. networking linuxApend. networking linux
Apend. networking linux
 
In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)
 
Advanced RAC troubleshooting: Network
Advanced RAC troubleshooting: NetworkAdvanced RAC troubleshooting: Network
Advanced RAC troubleshooting: Network
 
A kind and gentle introducton to rac
A kind and gentle introducton to racA kind and gentle introducton to rac
A kind and gentle introducton to rac
 
Athenticated smaba server config with open vpn
Athenticated smaba server  config with open vpnAthenticated smaba server  config with open vpn
Athenticated smaba server config with open vpn
 
Mvcc in postgreSQL 권건우
Mvcc in postgreSQL 권건우Mvcc in postgreSQL 권건우
Mvcc in postgreSQL 권건우
 
Introduction to Parallel Execution
Introduction to Parallel ExecutionIntroduction to Parallel Execution
Introduction to Parallel Execution
 
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015PG-Strom - GPGPU meets PostgreSQL, PGcon2015
PG-Strom - GPGPU meets PostgreSQL, PGcon2015
 
Cisco vs. huawei CLI Commands
Cisco vs. huawei CLI CommandsCisco vs. huawei CLI Commands
Cisco vs. huawei CLI Commands
 
Ims06 change you can believe in
Ims06 change you can believe inIms06 change you can believe in
Ims06 change you can believe in
 
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to AnalyticsReplicate Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
Replicate Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
 
OpenWorld Sep14 12c for_developers
OpenWorld Sep14 12c for_developersOpenWorld Sep14 12c for_developers
OpenWorld Sep14 12c for_developers
 
Ccnp3 lab 3_3_en
Ccnp3 lab 3_3_enCcnp3 lab 3_3_en
Ccnp3 lab 3_3_en
 
Amazon aurora 1
Amazon aurora 1Amazon aurora 1
Amazon aurora 1
 
Memcache as udp traffic reflector
Memcache as udp traffic reflectorMemcache as udp traffic reflector
Memcache as udp traffic reflector
 
Ccnp3 lab 3_4_en
Ccnp3 lab 3_4_enCcnp3 lab 3_4_en
Ccnp3 lab 3_4_en
 
OakTable World Sep14 clonedb
OakTable World Sep14 clonedb OakTable World Sep14 clonedb
OakTable World Sep14 clonedb
 

Viewers also liked

Intellectual property in social media and online resources. Dmytro Gadomsky
Intellectual property in social media and online resources. Dmytro GadomskyIntellectual property in social media and online resources. Dmytro Gadomsky
Intellectual property in social media and online resources. Dmytro GadomskyConstantine Zerov
 
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 201014 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010Sierra Francisco Justo
 
Fine motor
Fine motorFine motor
Fine motorjeh20717
 
Fine Motor 2
Fine Motor 2Fine Motor 2
Fine Motor 2jeh20717
 
Israel dop curvas horizontales eua&amp;europa r
Israel dop curvas horizontales eua&amp;europa rIsrael dop curvas horizontales eua&amp;europa r
Israel dop curvas horizontales eua&amp;europa rSierra Francisco Justo
 
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)IT Arena
 
D drops Vitamin D za stare i mlade
D drops Vitamin D za stare i mladeD drops Vitamin D za stare i mlade
D drops Vitamin D za stare i mladeZoran Stojcevski
 
What is it to be a senior engineer?
What is it to be a senior engineer?What is it to be a senior engineer?
What is it to be a senior engineer?Vladimir Miguro
 
До Дня Незалежності України
До Дня Незалежності УкраїниДо Дня Незалежності України
До Дня Незалежності УкраїниMargoshenka
 
Захист прав інтелектуальної власності. Носік Ю.В.
Захист прав інтелектуальної власності. Носік Ю.В.Захист прав інтелектуальної власності. Носік Ю.В.
Захист прав інтелектуальної власності. Носік Ю.В.Constantine Zerov
 
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)IT Arena
 
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google Analitycs
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google AnalitycsСтроим собственную BI в MS Excel на данных из Яндекс.Метрики и Google Analitycs
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google AnalitycsMaxim Uvarov
 
Reworker предпочтовая подготовка
Reworker предпочтовая подготовкаReworker предпочтовая подготовка
Reworker предпочтовая подготовкаNikita Florinskiy
 
PwC Sports Outlook 2016
PwC Sports Outlook 2016PwC Sports Outlook 2016
PwC Sports Outlook 2016Jonesy033
 
Failure Analysis of Gas Turbine Blade
Failure Analysis of Gas Turbine BladeFailure Analysis of Gas Turbine Blade
Failure Analysis of Gas Turbine BladeOluwaseyi Adeniyan
 
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.GIDEONE Moura Santos Ferreira
 
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...Constantine Zerov
 

Viewers also liked (20)

Retail crm
Retail crmRetail crm
Retail crm
 
Intellectual property in social media and online resources. Dmytro Gadomsky
Intellectual property in social media and online resources. Dmytro GadomskyIntellectual property in social media and online resources. Dmytro Gadomsky
Intellectual property in social media and online resources. Dmytro Gadomsky
 
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 201014 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010
14 albrieu&amp;baruzzi 2016 comparación normasdnv1967 2010
 
VidaEconomica_21NOV
VidaEconomica_21NOVVidaEconomica_21NOV
VidaEconomica_21NOV
 
Fine motor
Fine motorFine motor
Fine motor
 
Fine Motor 2
Fine Motor 2Fine Motor 2
Fine Motor 2
 
Israel dop curvas horizontales eua&amp;europa r
Israel dop curvas horizontales eua&amp;europa rIsrael dop curvas horizontales eua&amp;europa r
Israel dop curvas horizontales eua&amp;europa r
 
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)
Cryptocurrencies for Everyone (Dmytro Pershyn Technology Stream)
 
D drops Vitamin D za stare i mlade
D drops Vitamin D za stare i mladeD drops Vitamin D za stare i mlade
D drops Vitamin D za stare i mlade
 
What is it to be a senior engineer?
What is it to be a senior engineer?What is it to be a senior engineer?
What is it to be a senior engineer?
 
До Дня Незалежності України
До Дня Незалежності УкраїниДо Дня Незалежності України
До Дня Незалежності України
 
Захист прав інтелектуальної власності. Носік Ю.В.
Захист прав інтелектуальної власності. Носік Ю.В.Захист прав інтелектуальної власності. Носік Ю.В.
Захист прав інтелектуальної власності. Носік Ю.В.
 
E05912226
E05912226E05912226
E05912226
 
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)
How to Create Digital Disruption (Alejandro Danylyszyn Business Stream)
 
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google Analitycs
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google AnalitycsСтроим собственную BI в MS Excel на данных из Яндекс.Метрики и Google Analitycs
Строим собственную BI в MS Excel на данных из Яндекс.Метрики и Google Analitycs
 
Reworker предпочтовая подготовка
Reworker предпочтовая подготовкаReworker предпочтовая подготовка
Reworker предпочтовая подготовка
 
PwC Sports Outlook 2016
PwC Sports Outlook 2016PwC Sports Outlook 2016
PwC Sports Outlook 2016
 
Failure Analysis of Gas Turbine Blade
Failure Analysis of Gas Turbine BladeFailure Analysis of Gas Turbine Blade
Failure Analysis of Gas Turbine Blade
 
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.
EBD CPAD Lições bíblicas 1° trimestre 2016 lição 3 Esperando a volta de Jesus.
 
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...
Сучасні форми фінансування комерціалізації інтелектуальної власності. Володим...
 

Similar to Sqoop 2 Initialization Phase of IMPORT Job

PostgreSQL 13 New Features
PostgreSQL 13 New FeaturesPostgreSQL 13 New Features
PostgreSQL 13 New FeaturesJosé Lin
 
Compute 101 - OpenStack Summit Vancouver 2015
Compute 101 - OpenStack Summit Vancouver 2015Compute 101 - OpenStack Summit Vancouver 2015
Compute 101 - OpenStack Summit Vancouver 2015Stephen Gordon
 
Couch to OpenStack: Glance - July, 23, 2013
Couch to OpenStack: Glance - July, 23, 2013Couch to OpenStack: Glance - July, 23, 2013
Couch to OpenStack: Glance - July, 23, 2013Trevor Roberts Jr.
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~Kohei KaiGai
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_PlaceKohei KaiGai
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda enKohei KaiGai
 
TrinityCore server install guide
TrinityCore server install guideTrinityCore server install guide
TrinityCore server install guideSeungmin Shin
 
Common schema my sql uc 2012
Common schema   my sql uc 2012Common schema   my sql uc 2012
Common schema my sql uc 2012Roland Bouman
 
Common schema my sql uc 2012
Common schema   my sql uc 2012Common schema   my sql uc 2012
Common schema my sql uc 2012Roland Bouman
 
How to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraHow to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraSveta Smirnova
 
Oracle Basics and Architecture
Oracle Basics and ArchitectureOracle Basics and Architecture
Oracle Basics and ArchitectureSidney Chen
 
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스PgDay.Seoul
 
NetDevOps 202: Life After Configuration
NetDevOps 202: Life After ConfigurationNetDevOps 202: Life After Configuration
NetDevOps 202: Life After ConfigurationCumulus Networks
 
20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multiKohei KaiGai
 
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PROIDEA
 
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...Continuent
 

Similar to Sqoop 2 Initialization Phase of IMPORT Job (20)

PostgreSQL 13 New Features
PostgreSQL 13 New FeaturesPostgreSQL 13 New Features
PostgreSQL 13 New Features
 
Compute 101 - OpenStack Summit Vancouver 2015
Compute 101 - OpenStack Summit Vancouver 2015Compute 101 - OpenStack Summit Vancouver 2015
Compute 101 - OpenStack Summit Vancouver 2015
 
MySQLinsanity
MySQLinsanityMySQLinsanity
MySQLinsanity
 
Couch to OpenStack: Glance - July, 23, 2013
Couch to OpenStack: Glance - July, 23, 2013Couch to OpenStack: Glance - July, 23, 2013
Couch to OpenStack: Glance - July, 23, 2013
 
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
GPGPU Accelerates PostgreSQL ~Unlock the power of multi-thousand cores~
 
20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place20160407_GTC2016_PgSQL_In_Place
20160407_GTC2016_PgSQL_In_Place
 
pgconfasia2016 plcuda en
pgconfasia2016 plcuda enpgconfasia2016 plcuda en
pgconfasia2016 plcuda en
 
TrinityCore server install guide
TrinityCore server install guideTrinityCore server install guide
TrinityCore server install guide
 
Common schema my sql uc 2012
Common schema   my sql uc 2012Common schema   my sql uc 2012
Common schema my sql uc 2012
 
Common schema my sql uc 2012
Common schema   my sql uc 2012Common schema   my sql uc 2012
Common schema my sql uc 2012
 
How to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with GaleraHow to Avoid Pitfalls in Schema Upgrade with Galera
How to Avoid Pitfalls in Schema Upgrade with Galera
 
Pro PostgreSQL
Pro PostgreSQLPro PostgreSQL
Pro PostgreSQL
 
Oracle Basics and Architecture
Oracle Basics and ArchitectureOracle Basics and Architecture
Oracle Basics and Architecture
 
Curso de MySQL 5.7
Curso de MySQL 5.7Curso de MySQL 5.7
Curso de MySQL 5.7
 
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
 
NetDevOps 202: Life After Configuration
NetDevOps 202: Life After ConfigurationNetDevOps 202: Life After Configuration
NetDevOps 202: Life After Configuration
 
20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi20181016_pgconfeu_ssd2gpu_multi
20181016_pgconfeu_ssd2gpu_multi
 
Stress your DUT
Stress your DUTStress your DUT
Stress your DUT
 
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
PLNOG20 - Paweł Małachowski - Stress your DUT–wykorzystanie narzędzi open sou...
 
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...
Training Slides: Advanced 304: Upgrading From Native MySQL Replication To Tun...
 

More from NTT DATA OSS Professional Services

Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力NTT DATA OSS Professional Services
 
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~NTT DATA OSS Professional Services
 
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイントPostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイントNTT DATA OSS Professional Services
 
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~NTT DATA OSS Professional Services
 
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~NTT DATA OSS Professional Services
 
商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのこと商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのことNTT DATA OSS Professional Services
 

More from NTT DATA OSS Professional Services (20)

Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力
 
Spark SQL - The internal -
Spark SQL - The internal -Spark SQL - The internal -
Spark SQL - The internal -
 
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
 
Hadoopエコシステムのデータストア振り返り
Hadoopエコシステムのデータストア振り返りHadoopエコシステムのデータストア振り返り
Hadoopエコシステムのデータストア振り返り
 
HDFS Router-based federation
HDFS Router-based federationHDFS Router-based federation
HDFS Router-based federation
 
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイントPostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
 
Apache Hadoopの新機能Ozoneの現状
Apache Hadoopの新機能Ozoneの現状Apache Hadoopの新機能Ozoneの現状
Apache Hadoopの新機能Ozoneの現状
 
Distributed data stores in Hadoop ecosystem
Distributed data stores in Hadoop ecosystemDistributed data stores in Hadoop ecosystem
Distributed data stores in Hadoop ecosystem
 
Structured Streaming - The Internal -
Structured Streaming - The Internal -Structured Streaming - The Internal -
Structured Streaming - The Internal -
 
Apache Hadoopの未来 3系になって何が変わるのか?
Apache Hadoopの未来 3系になって何が変わるのか?Apache Hadoopの未来 3系になって何が変わるのか?
Apache Hadoopの未来 3系になって何が変わるのか?
 
Apache Hadoop and YARN, current development status
Apache Hadoop and YARN, current development statusApache Hadoop and YARN, current development status
Apache Hadoop and YARN, current development status
 
HDFS basics from API perspective
HDFS basics from API perspectiveHDFS basics from API perspective
HDFS basics from API perspective
 
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
 
20170303 java9 hadoop
20170303 java9 hadoop20170303 java9 hadoop
20170303 java9 hadoop
 
ブロックチェーンの仕組みと動向(入門編)
ブロックチェーンの仕組みと動向(入門編)ブロックチェーンの仕組みと動向(入門編)
ブロックチェーンの仕組みと動向(入門編)
 
Application of postgre sql to large social infrastructure jp
Application of postgre sql to large social infrastructure jpApplication of postgre sql to large social infrastructure jp
Application of postgre sql to large social infrastructure jp
 
Application of postgre sql to large social infrastructure
Application of postgre sql to large social infrastructureApplication of postgre sql to large social infrastructure
Application of postgre sql to large social infrastructure
 
Apache Hadoop 2.8.0 の新機能 (抜粋)
Apache Hadoop 2.8.0 の新機能 (抜粋)Apache Hadoop 2.8.0 の新機能 (抜粋)
Apache Hadoop 2.8.0 の新機能 (抜粋)
 
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~
 
商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのこと商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのこと
 

Recently uploaded

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetEnjoy Anytime
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 

Recently uploaded (20)

Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 

Sqoop 2 Initialization Phase of IMPORT Job

  • 1. Presentation  slide  for  Sqoop  User  Meetup  (Strata  +  Hadoop  World  NYC  2013)   Complex  stories  about Sqooping  PostgreSQL  data 10/28/2013 NTT  DATA  Corporation Masatake  Iwasaki Copyright © 2013 NTT DATA Corporation
  • 2. Introduction Copyright © 2013 NTT DATA Corporation 2
  • 3. About  Me Masatake  Iwasaki: Software  Engineer  @  NTT  DATA: NTT(Nippon  Telegraph  and  Telephone  Corporation):  Telecommunication NTT  DATA:  Systems  Integrator Developed: Ludia:  Fulltext  search  index  for  PostgreSQL  using  Senna Authored: “A  Complete  Primer  for  Hadoop”   (no  official  English  title) Patches  for  Sqoop: SQOOP-‐‑‒390:  PostgreSQL  connector  for  direct  export  with  pg_̲bulkload SQOOP-‐‑‒999:  Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM   SQOOP-‐‑‒1155:  Sqoop  2  documentation  for  connector  development   Copyright © 2013 NTT DATA Corporation 3
  • 4. Why  PostgreSQL? Enterprisy from  earlier  version comparing  to  MySQL Active  community  in  Japan NTT  DATA  commits  itself  to  development Copyright © 2013 NTT DATA Corporation 4
  • 5. Sqooping  PostgreSQL  data Copyright © 2013 NTT DATA Corporation 5
  • 6. Working  around  PostgreSQL Direct  connector  for  PostgreSQL  loader:    SQOOP-‐‑‒390:  PostgreSQL  connector  for  direct  export  with  pg_̲bulkload Yet  another  direct  connector  for  PostgreSQL  JDBC:    SQOOP-‐‑‒999:  Support  bulk  load  from  HDFS  to  PostgreSQL                                            using  COPY  ...  FROM Supporting  complex  data  types:    SQOOP-‐‑‒1149:  Support  Custom  Postgres  Types   Copyright © 2013 NTT DATA Corporation 6
  • 7. Direct  connector  for  PostgreSQL  loader SQOOP-‐‑‒390:      PostgreSQL  connector  for  direct  export  with  pg_̲bulkload pg_̲bulkload: Data  loader  for  PostgreSQL Server  side  plug-‐‑‒in  library  and  client  side  command Providing  filtering  and  transformation  of  data http://pgbulkload.projects.pgfoundry.org/ Copyright © 2013 NTT DATA Corporation 7
  • 8. SQOOP-‐‑‒390:      PostgreSQL  connector  for  direct  export  with  pg_̲bulkload HDFS File  Split File  Split File  Split CREATE  TABLE    tmp3(LIKE  dest  INCLUDING  CONSTRAINTS) Mapper Mapper Mapper pg_̲bulkload pg_̲bulkoad pg_̲bulkload tmp1 PostgreSQL tmp2 Reducer Destination  Table Copyright © 2013 NTT DATA Corporation tmp3 external  process staging  table  per  mapper  is  must due  to  table  level  locks BEGIN INSERT  INTO  dest  (  SELECT  *  FROM  tmp1  ) DROP  TABLE  tmp1 INSERT  INTO  dest  (  SELECT  *  FROM  tmp2  ) DROP  TABLE  tmp2 INSERT  INTO  dest  (  SELECT  *  FROM  tmp3  ) DROP  TABLE  tmp3 COMMIT 8
  • 9. Direct  connector  for  PostgreSQL  loader Pros: Fast by  short-‐‑‒circuitting  server  functionality Flexible filtering  error  records Cons: Not  so  fast Bottleneck  is  not  in  client  side  but  in  DB  side Built-‐‑‒in  COPY  functionality  is  fast  enough Not  General pg_̲bulkload  supports  only  export Requiring  setup  on  all  slave  nodes  and  client  node Possible  to  Require  recovery  on  failure Copyright © 2013 NTT DATA Corporation 9
  • 10. Yet  another  direct  connector  for  PostgreSQL  JDBC PostgreSQL  provides  custom  SQL  command  for  data  import/export COPY table_name [ ( column_name [, ...] ) ] FROM { 'filename' | STDIN } [ [ WITH ] ( option [, ...] ) ] COPY { table_name [ ( column_name [, ...] ) ] | ( query ) } TO { 'filename' | STDOUT } [ [ WITH ] ( option [, ...] ) ] where option can be one of: FORMAT format_name OIDS [ boolean ] DELIMITER 'delimiter_character' NULL 'null_string' HEADER [ boolean ] QUOTE 'quote_character' ESCAPE 'escape_character' FORCE_QUOTE { ( column_name [, ...] ) | * } FORCE_NOT_NULL ( column_name [, ...] ) ENCODING 'encoding_name‘ AND  JDBC  API org.postgresql.copy.* Copyright © 2013 NTT DATA Corporation 10
  • 11. SQOOP-‐‑‒999:    Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM File  Split File  Split File  Split Mapper Mapper Mapper PostgreSQL JDBC HDFS PostgreSQL JDBC PostgreSQL JDBC Using  custom  SQL  command   via  JDBC  API only  available  in  PostgreSQL COPY  FROM  STDIN  WITH  ... staging  tagle PostgreSQL Destination  Table Copyright © 2013 NTT DATA Corporation 11
  • 12. SQOOP-‐‑‒999:    Support  bulk  load  from  HDFS  to  PostgreSQL  using  COPY  ...  FROM   import org.postgresql.copy.CopyManager; import org.postgresql.copy.CopyIn; ... Requiring  PostgreSQL   specific  interface. protected void setup(Context context) ... dbConf = new DBConfiguration(conf); CopyManager cm = null; ... public void map(LongWritable key, Writable value, Context context) ... if (value instanceof Text) { line.append(System.getProperty("line.separator")); } try { byte[]data = line.toString().getBytes("UTF-8"); Just  feeding  lines  of  text copyin.writeToCopy(data, 0, data.length); Copyright © 2013 NTT DATA Corporation 12
  • 13. Yet  another  direct  connector  for  PostgreSQL  JDBC Pros: Fast  enough Ease  of  use JDBC  driver  jar  is  distributed  automatically  by  MR  framework Cons: Dependency  on  not  general  JDBC possible  licensing  issue  (PostgreSQL  is  OK,  itʼ’s  BSD  Lisence) build  time  requirement  (PostgreSQL  JDBC  is  available  in  Maven  repo.) <dependency org="org.postgresql" name="postgresql" rev="${postgresql.version}" conf="common->default" /> Error  record  causes  rollback  of  whole  transaction Still  difficult  to  implement  custom  connector  for  IMPORT because  of  code  generation  part Copyright © 2013 NTT DATA Corporation 13
  • 14. Supporting  complex  data  types PostgreSQL  supports  lot  of  complex  data  types   Geometric  Types Points Line  Segments Boxes Paths Polygons Circles Network  Address  Types inet cidr macaddr XML  Type JSON  Type Supporting  complex  data  types:    SQOOP-‐‑‒1149:  Support  Custom  Postgres  Types Copyright © 2013 NTT DATA Corporation not  me 14
  • 15. Constraints  on  JDBC  data  types  in  Sqoop  framework protected Map<String, Integer> getColumnTypesForRawQuery(String stmt) { ... results = execute(stmt); ... ResultSetMetaData metadata = results.getMetaData(); for (int i = 1; i < cols + 1; i++) { int typeId = metadata.getColumnType(i); returns  java.sql.Types.OTHER  for  types   { not  mappable  to  basic    Java  data  types =>  Losing  type  imformation     public String toJavaType(int sqlType) // Mappings taken from: // http://java.sun.com/j2se/1.3/docs/guide/jdbc/getstart/mapping.html if (sqlType == Types.INTEGER) { return "Integer"; } else if (sqlType == Types.VARCHAR) { return "String"; .... reaches  here } else { // TODO(aaron): Support DISTINCT, ARRAY, STRUCT, REF, JAVA_OBJECT. // Return null indicating database-specific manager should return a // java data type if it can find one for any nonstandard type. return null; Copyright © 2013 NTT DATA Corporation 15
  • 16. Sqoop  1  Summary Pros: Simple  Standalone  MapReduce  Driver Easy  to  understand  for  MR  application  developpers                                                                            except  for  ORM  (SqoopRecord)  code  generation  part. Variety  of  connectors Lot  of  information Cons: Complex  command  line  and  inconsistent  options meaning  of  options  is  according  to  connectors Not  enough  modular Dependency  on  JDBC  data  model Security Copyright © 2013 NTT DATA Corporation 16
  • 17. Sqooping  PostgreSQL  Data  2 Copyright © 2013 NTT DATA Corporation 17
  • 18. Sqoop  2 Everything  are  rewritten Working  on  server  side More  modular Not  compatible  with  Sqoop  1  at  all (Almost)  Only  generic  connector Black  box  comparing  to  Sqoop  1 Needs  more  documentation SQOOP-‐‑‒1155:  Sqoop  2  documentation  for  connector  development Internal of Sqoop2 MapReduce Job ++++++++++++++++++++++++++++++++ ... - OutputFormat invokes Loader's load method (via SqoopOutputFor .. todo: sequence diagram like figure. Copyright © 2013 NTT DATA Corporation 18
  • 19. Sqoop2:  Initialization  phase  of  IMPORT  job Implement  this ,----------------. ,-----------. |SqoopInputFormat| |Partitioner| `-------+--------' `-----+-----' getSplits | | ----------->| | | getPartitions | |------------------------>| | | ,---------. | |-------> |Partition| | | `----+----' |<- - - - - - - - - - - - | | | | | ,----------. |-------------------------------------------------->|SqoopSplit| | | | `----+-----' Copyright © 2013 NTT DATA Corporation 19
  • 20. Sqoop2:  Map  phase  of  IMPORT  job ,-----------. |SqoopMapper| `-----+-----' run | --------->| ,-------------. |---------------------------------->|MapDataWriter| | `------+------' Implement  this | ,---------. | |--------------> |Extractor| | | `----+----' | | extract | | |-------------------->| | Conversion  to  Sqoop   | | | internal  data  format read from DB | | <-------------------------------| write* | | |------------------->| | | | ,----. | | |---------->|Data| | | | `-+--' | | | | | | context.write | | |--------------------------> Copyright © 2013 NTT DATA Corporation 20
  • 21. Sqoop2:  Reduce  phase  of  EXPORT  job ,-------. ,---------------------. |Reducer| |SqoopNullOutputFormat| `---+---' `----------+----------' | | ,-----------------------------. Implement  this | |-> |SqoopOutputFormatLoadExecutor| | | `--------------+--------------' ,----. | | |---------------------> |Data| | | | `-+--' | | | ,-----------------. | | | |-> |SqoopRecordWriter| | getRecordWriter | | `--------+--------' | ----------------------->| getRecordWriter | | | | |----------------->| | | ,--------------. | | |-----------------------------> |ConsumerThread| | | | | | `------+-------' | |<- - - - - - - - -| | | | ,------. <- - - - - - - - - - - -| | | | |--->|Loader| | | | | | | `--+---' | | | | | | | | | | | | | load | run | | | | | |------>| ----->| | write | | | | | |------------------------------------------------>| setContent | | read* | | | | |----------->| getContent |<------| | | | | |<-----------| | | | | | | | - - ->| | | | | | | | write into DB | | | | | | |--------------> Copyright © 2013 NTT DATA Corporation 21
  • 22. Summary Copyright © 2013 NTT DATA Corporation 22
  • 23. My  interest  for  popularizing  Sqoop  2 Complex  data  type  support  in  Sqoop  2 Bridge  to  use  Sqoop  1  connectors  on  Sqoop  2 Bridge  to  use  Sqoop  2  connectors  from  Sqoop  1  CLI Copyright © 2013 NTT DATA Corporation 23
  • 24. Copyright © 2011 NTT DATA Corporation Copyright © 2013 NTT DATA Corporation