
Big data: Loading your data with Flume and Sqoop

While studying the Hortonworks stack, I created this 10-minute presentation. http://hortonworks.com

Published in: Technology

  1. 1. Loading data in Hadoop 2 with SQOOP and Flume. Christophe Marchal | Software Architect
  2. 2. Problem to solve
  3. 3. Hortonworks stack
  4. 4. Batch Loading vs Stream Loading
  5. 5. SQOOP HCatalog
  6. 6. SQOOP 1: Import
  7. 7. SQOOP 1: Export
  8. 8. SQOOP 2
  9. 9. Flume (diagram: web servers, agents each made of a source, a channel and a sink, HDFS)
  10. 10. Multi agent flow
  11. 11. Consolidation flow
  12. 12. Flume vs SQOOP. Flume: distributed; reliable (transaction); available (backup routes); collecting data; aggregating data. SQOOP: data imports; parallelizes data transfer; copies data quickly
  13. 13. Flume example
  14. 14. Flume example
  15. 15. Flume example
  16. 16. SQOOP: import HDFS
  17. 17. SQOOP: import HDFS
  18. 18. SQOOP: import HDFS
  19. 19. SQOOP: import Hive
  20. 20. SQOOP: import Hive
  21. 21. SQOOP: import Hive
  22. 22. Thanks. Christophe Marchal | Software Architect | @toff63
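
Slides 16 to 21 cover SQOOP imports into HDFS and into Hive, but this transcript keeps only the slide titles, not the commands shown on them. As a rough illustration of what such an import can look like, here is a minimal Python sketch that assembles a Sqoop 1 `import` command line. It is not the presenter's example. The helper name `sqoop_import_command`, the JDBC URL, the user name, the table name and the paths are all hypothetical. The flags (`--connect`, `--username`, `-P`, `--table`, `--num-mappers`, `--target-dir`, `--hive-import`, `--hive-table`) are standard Sqoop 1 import options.

```python
import shlex
from typing import List, Optional


def sqoop_import_command(
    jdbc_url: str,
    username: str,
    table: str,
    target_dir: Optional[str] = None,
    hive_table: Optional[str] = None,
    num_mappers: int = 4,
) -> List[str]:
    """Build the argument list for a Sqoop 1 `import` run (hypothetical helper)."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,               # JDBC connection string of the source database
        "--username", username,
        "-P",                                # ask for the password on the console
        "--table", table,                    # source table to import
        "--num-mappers", str(num_mappers),   # number of parallel map tasks
    ]
    if target_dir is not None:
        # Plain HDFS import: files are written under this directory.
        args += ["--target-dir", target_dir]
    if hive_table is not None:
        # Hive import: load the imported data into a Hive table.
        args += ["--hive-import", "--hive-table", hive_table]
    return args


# Placeholder values only; replace them with a real database and real paths.
hdfs_cmd = sqoop_import_command(
    "jdbc:mysql://db.example.com/shop", "etl", "orders",
    target_dir="/data/raw/orders",
)
hive_cmd = sqoop_import_command(
    "jdbc:mysql://db.example.com/shop", "etl", "orders",
    hive_table="orders",
)

# Print shell-quoted command lines that could be pasted into a terminal.
print(" ".join(shlex.quote(a) for a in hdfs_cmd))
print(" ".join(shlex.quote(a) for a in hive_cmd))
```

The sketch only builds and prints the command lines. Running them requires a Hadoop cluster with Sqoop 1 installed and a reachable database, for example by passing the list to `subprocess.run`. The `--num-mappers` value corresponds to the "parallelizes data transfer" point on slide 12: Sqoop splits the import across that many map tasks.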
