An introduction to Apache Sqoop
 

An introduction to Apache Sqoop: what is it, and how does it assist in
large-volume data transfer between Hadoop and external sources?

    An introduction to Apache Sqoop: Presentation Transcript

    • Apache Sqoop
      ● What is it?
      ● How does it work?
      ● Interfaces
      ● Example
      ● Architecture
      www.semtech-solutions.co.nz info@semtech-solutions.co.nz
    • Sqoop – What is it?
      ● A command-line interface (plus a web interface in Sqoop2)
      ● For data import / export to and from Hadoop
      ● Uses map jobs from MapReduce
      ● Supports incremental loads
      ● Written in Java
      ● Released under the Apache License
      ● Uses plugins for new types of data source
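The "supports incremental loads" point can be made concrete with a small sketch. It assembles, but does not run, a Sqoop 1 incremental import command using the `--incremental`, `--check-column`, and `--last-value` options. The host `dbhost`, user `etl_user`, table `orders`, and column `id` are placeholders assumed for illustration; they do not come from the slides.

```shell
#!/usr/bin/env bash
# Assemble (but do not execute) a Sqoop 1 incremental import command.
# dbhost, etl_user, orders, and id are hypothetical placeholder names.
build_incremental_import() {
  local last_value=$1
  local cmd=(
    sqoop import
    --connect "jdbc:mysql://dbhost:3306/db3"
    --username "etl_user"
    -P                          # prompt for the password at run time
    --table "orders"
    --incremental append        # only fetch rows added since the last run
    --check-column "id"         # column used to detect new rows
    --last-value "$last_value"  # highest value seen by the previous import
  )
  # Print the command on one line so it can be reviewed before running.
  printf '%s ' "${cmd[@]}"
  echo
}

build_incremental_import 1000
```

Printing the command instead of running it keeps the sketch safe to try on a machine without Sqoop installed; the value passed as the argument would normally be the highest `id` recorded by the previous import.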
    • Sqoop – How does it work?
      ● Data is sliced into partitions
      ● Mappers transfer the data
      ● Data types are determined from metadata
      ● Many data transfer formats are supported, e.g. CSV, Avro
      ● Can import into
        – Hive (use the --hive-import flag)
        – HBase (use the --hbase-* flags)
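To illustrate "data is sliced into partitions" and "mappers transfer the data", here is a simplified sketch of how a numeric key range could be divided into one slice per mapper. It is not Sqoop's actual splitter, only the general idea; in Sqoop 1 the split column and mapper count are chosen with `--split-by` and `--num-mappers`. The column name `id` is an assumption made for the example.

```shell
#!/usr/bin/env bash
# Simplified illustration only: divide the inclusive key range [min, max]
# into n contiguous slices, one per mapper. Not Sqoop's real splitter.
split_ranges() {
  local min=$1 max=$2 n=$3
  local total=$(( max - min + 1 ))
  local base=$(( total / n ))    # keys every slice receives
  local extra=$(( total % n ))   # the first 'extra' slices get one more key
  local lo=$min hi size i
  for (( i = 0; i < n; i++ )); do
    size=$(( base + (i < extra ? 1 : 0) ))
    (( size == 0 )) && continue  # more mappers than keys: skip empty slices
    hi=$(( lo + size - 1 ))
    # Each mapper would read only the rows matching its own condition.
    echo "mapper $i: id >= $lo AND id <= $hi"
    lo=$(( hi + 1 ))
  done
}

split_ranges 1 100 4
```

Because each slice covers a disjoint range, the mappers can read from the source database in parallel without fetching the same row twice.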
    • Sqoop – Interfaces
      ● Get data from
        – Relational databases
        – Data warehouses
        – NoSQL databases
      ● Load to Hive and HBase
      ● Integrates with Oozie for scheduling
    • Sqoop – Example
      An example Sqoop command to load data from MySQL into Hive:

        bin/sqoop-import \
          --connect jdbc:mysql://<mysql host>:<mysql port>/db3 \
          --username <username> \
          --password <password> \
          --table <tableName> \
          --hive-table <Hive tableName> \
          --create-hive-table \
          --hive-import \
          --hive-home <hive path>
    • Sqoop – Architecture
      Sqoop has moved from Sqoop1 to Sqoop2:
      ● Changed from a client install to a server install
      ● Now has web and command-line access
      ● The server now accesses Hive and HBase
      ● Oozie uses the REST API
    • Sqoop – Architecture – Sqoop1 (diagram)
    • Sqoop – Architecture – Sqoop2 (diagram)
    • Contact Us
      ● Feel free to contact us at
        – www.semtech-solutions.co.nz
        – info@semtech-solutions.co.nz
      ● We offer IT project consultancy
      ● We are happy to hear about your problems
      ● You pay only for the hours you need to solve your problems