An introduction to Apache Sqoop

An introduction to Apache Sqoop: what is it, and how does it assist
in large-volume data transfer between Hadoop and external sources?

  1. Apache Sqoop ● What is it? ● How does it work? ● Interfaces ● Example ● Architecture www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  2. Sqoop – What is it? ● A command line interface (plus a web interface in Sqoop 2) ● For data import / export to Hadoop ● Uses Map jobs from MapReduce ● Supports incremental loads ● Written in Java ● Licensed under the Apache License ● Uses plugins for new types of data source
  3. Sqoop – How does it work? ● Data is sliced into partitions ● Mappers transfer the data ● Data types are determined via metadata ● Many data transfer formats are supported, e.g. CSV, Avro ● Can import into – Hive ( use the --hive-import flag ) – HBase ( use the --hbase-* flags )
  4. Sqoop – Interfaces ● Get data from – Relational databases – Data warehouses – NoSQL databases ● Load to Hive and HBase ● Integrates with Oozie for scheduling
  5. Sqoop – Example An example Sqoop command to load data from MySQL into Hive: bin/sqoop-import --connect jdbc:mysql://<mysql host>:<mysql port>/db3 --username <username> --password <password> --table <tableName> --hive-table <Hive tableName> --create-hive-table --hive-import --hive-home <hive path>
  6. Sqoop – Architecture Sqoop has moved from Sqoop 1 to Sqoop 2 ● Changed from client to server install ● Now has web and command line access ● Server now accesses Hive & HBase ● Oozie uses the REST API
  7. Sqoop – Architecture – Sqoop 1
  8. Sqoop – Architecture – Sqoop 2
  9. Contact Us ● Feel free to contact us at – www.semtech-solutions.co.nz – info@semtech-solutions.co.nz ● We offer IT project consultancy ● We are happy to hear about your problems ● You pay only for the hours you need to solve your problems
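
The "data sliced into partitions" and "mappers transfer data" points from slide 3 can be illustrated with a small sketch. This is not Sqoop's own code: it assumes a numeric split column whose minimum and maximum values are already known, and the function name `split_ranges`, the column name `id`, and the sample values are all hypothetical. Sqoop's real splitter may choose its boundaries differently.

```python
def split_ranges(min_id, max_id, num_mappers):
    """Divide the inclusive key range [min_id, max_id] into at most
    num_mappers contiguous, non-overlapping inclusive ranges.

    Illustrative only: a hypothetical helper, not part of Sqoop.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if max_id < min_id:
        raise ValueError("max_id must not be smaller than min_id")

    total = max_id - min_id + 1
    # Never create more partitions than there are key values.
    parts = min(num_mappers, total)
    # Spread any remainder over the first `extra` partitions.
    base, extra = divmod(total, parts)

    ranges = []
    start = min_id
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    # Each range corresponds to the slice one mapper would read.
    for lo, hi in split_ranges(1, 1000, 4):
        print(f"mapper reads WHERE id >= {lo} AND id <= {hi}")
```

In Sqoop itself, the number of parallel map tasks is set with `--num-mappers` (or `-m`) and the column used for slicing with `--split-by`.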
