An introduction to Apache Sqoop

An introduction to Apache Sqoop: what is it, and how does it
assist with large-volume data transfer between Hadoop and
external sources?

  1. Apache Sqoop
     ● What is it?
     ● How does it work?
     ● Interfaces
     ● Example
     ● Architecture
     www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  2. Sqoop – What is it?
     ● A command line interface (plus a web interface in Sqoop 2)
     ● For data import / export between Hadoop and external stores
     ● Runs transfers as map-only MapReduce jobs
     ● Supports incremental loads
     ● Written in Java
     ● Open source under the Apache License
     ● Uses plugins (connectors) for new types of data source
  3. Sqoop – How does it work?
     ● Data is sliced into partitions
     ● Mappers transfer the data, one partition each
     ● Data types are determined from the source metadata
     ● Many data transfer formats are supported, e.g. CSV, Avro
     ● Can import into
       – Hive (use the --hive-import flag)
       – HBase (use the --hbase-* flags)
  4. Sqoop – Interfaces
     ● Get data from
       – Relational databases
       – Data warehouses
       – NoSQL databases
     ● Load to Hive and HBase
     ● Integrates with Oozie for scheduling
  5. Sqoop – Example
     An example Sqoop command to load data from MySQL into Hive:
       bin/sqoop-import \
         --connect jdbc:mysql://<mysql host>:<mysql port>/db3 \
         --username <username> --password <password> \
         --table <tableName> \
         --hive-table <Hive tableName> --create-hive-table \
         --hive-import --hive-home <hive path>
  6. Sqoop – Architecture
     Sqoop has moved from Sqoop 1 to Sqoop 2:
     ● Changed from a client install to a server install
     ● Now has web and command line access
     ● The server now accesses Hive and HBase
     ● Oozie uses the REST API
  7. Sqoop – Architecture – Sqoop 1 [architecture diagram]
  8. Sqoop – Architecture – Sqoop 2 [architecture diagram]
  9. Contact Us
     ● Feel free to contact us at
       – www.semtech-solutions.co.nz
       – info@semtech-solutions.co.nz
     ● We offer IT project consultancy
     ● We are happy to hear about your problems
     ● You pay only for the hours you need to solve your problems
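
The "How does it work?" slide says data is sliced into partitions and that mappers transfer the data. The sketch below illustrates that idea only: it divides the range of a numeric key evenly across a number of mappers and prints the WHERE condition each mapper would read. The function name, the column name `id`, and the sample bounds are assumptions made for illustration; this is not Sqoop's actual splitter code, which finds the bounds by querying the minimum and maximum of the split column.

```shell
# Simplified illustration of range partitioning (not Sqoop's real splitter).
# Usage: split_ranges COLUMN MIN MAX NUM_MAPPERS
# Prints one inclusive range condition per mapper.
split_ranges() {
  col=$1; min=$2; max=$3; mappers=$4
  total=$(( max - min + 1 ))        # number of key values to cover
  base=$(( total / mappers ))       # minimum keys per mapper
  extra=$(( total % mappers ))      # leftover keys, spread one per mapper
  lo=$min
  i=0
  while [ "$i" -lt "$mappers" ]; do
    size=$base
    # The first "extra" mappers each take one additional key.
    if [ "$i" -lt "$extra" ]; then size=$(( base + 1 )); fi
    # Skip empty partitions (more mappers than key values).
    if [ "$size" -gt 0 ]; then
      hi=$(( lo + size - 1 ))
      echo "$col >= $lo AND $col <= $hi"
      lo=$(( hi + 1 ))
    fi
    i=$(( i + 1 ))
  done
}

# Hypothetical example: keys 1..10 shared between 4 mappers.
split_ranges id 1 10 4
```

With these sample values the four mappers receive the ranges 1 to 3, 4 to 6, 7 to 8, and 9 to 10. An evenly distributed key gives each mapper a similar amount of work; a skewed key would not.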

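
The example slide's MySQL-to-Hive import, and the incremental loads mentioned on the "What is it?" slide, can be sketched as a dry run that prints the commands instead of executing them. The host, port, user, table names, the `id` column, the HDFS target directory, and both function names are placeholders assumed for illustration; only the database name `db3` comes from the slide. The flags are standard Sqoop 1 import options, with `-P` (prompt for the password) used in place of a password on the command line.

```shell
# Placeholder connection details; all hypothetical, replace before use.
MYSQL_HOST="mysql.example.com"
MYSQL_PORT="3306"
DB_USER="etl_user"
SRC_TABLE="orders"
HIVE_TABLE="orders_hive"

# Print (do not run) a full import from MySQL into a new Hive table.
# --split-by and --num-mappers control how the table is partitioned
# across mappers; "id" is an assumed numeric key column.
build_import_cmd() {
  printf '%s ' sqoop import \
    --connect "jdbc:mysql://${MYSQL_HOST}:${MYSQL_PORT}/db3" \
    --username "$DB_USER" -P \
    --table "$SRC_TABLE" \
    --split-by id --num-mappers 4 \
    --hive-import --hive-table "$HIVE_TABLE" --create-hive-table
  printf '\n'
}

# Print (do not run) an incremental import into HDFS that fetches only
# rows whose "id" is greater than the last value already imported.
# Usage: build_incremental_cmd LAST_VALUE
build_incremental_cmd() {
  printf '%s ' sqoop import \
    --connect "jdbc:mysql://${MYSQL_HOST}:${MYSQL_PORT}/db3" \
    --username "$DB_USER" -P \
    --table "$SRC_TABLE" \
    --target-dir "/data/${SRC_TABLE}" \
    --incremental append --check-column id --last-value "$1"
  printf '\n'
}

build_import_cmd
build_incremental_cmd 1000
```

Printing the command first makes it easy to review the flags before anything touches the database; once the values are right, the printed line can be run on a host where Sqoop is installed.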