Migrating from Another Distribution
  7/6/2012

© 2012 MapR Technologies   Migration 1
Migration
   Agenda
   • Migration Roadmap
   • Planning the Migration
   • MapR Deployment
   • Component Migration
   • Application Migration
   • Data Migration
   • Node Migration



Migration
   Objectives
   At the end of this module you will be able to:
   • List the important steps in migrating to MapR from another Hadoop
     distribution
   • Plan for a migration to MapR
   • Describe the process for migrating components to MapR
   • Identify the steps for migrating applications to MapR
   • Explain how to migrate data to MapR




Migration Roadmap

   • Planning
   • Initial MapR Deployment
   • Component Migration
   • Application Migration
   • Data Migration
   • Node Migration




Planning the Migration

   • Requirements and goals
   • Potential issues
   • Data migration
      –   Move datasets individually or all at once?
   • Downtime
      –   How much can you tolerate, if any?
   • Customization
   • Storage




MapR Deployment

   • Install, configure, and test the MapR cluster
   • Add the ecosystem components and applications you need
   • Test MapR functionality
      –   Your custom patches and one-off code might no longer be needed




Component Migration

   • Hive migration
      –   Import table schemas
      –   Import the Hive metastore
      –   Use Hive with volumes
   • HBase migration
      –   Use the /hbase volume
      –   Turn off compression for the /hbase volume
   • Migrating customized components
      –   Reapply your patches to the MapR versions
      –   Do not hardcode paths; use fs.default.name instead
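The advice about not hardcoding paths can be illustrated with a small sketch. It assumes the value of fs.default.name has already been read from the cluster configuration; the helper name qualify and the example host names are hypothetical and not part of MapR or Hadoop.

```python
from urllib.parse import urlsplit


def qualify(path, default_fs):
    """Resolve a file system path against the configured default file system.

    `default_fs` stands in for the value of fs.default.name, for example
    "maprfs:///" on MapR or "hdfs://namenode:8020" on another distribution.
    Paths that already carry a scheme are returned unchanged.
    """
    if "://" in path:
        return path
    parts = urlsplit(default_fs)
    # Rebuild scheme://authority/ so the empty authority in maprfs:/// survives.
    return f"{parts.scheme}://{parts.netloc}/{path.lstrip('/')}"


if __name__ == "__main__":
    # The same component code works before and after the migration;
    # only the configured default file system changes.
    print(qualify("/user/hive/warehouse", "hdfs://namenode:8020"))
    print(qualify("/user/hive/warehouse", "maprfs:///"))
```

A component written this way needs only a configuration change when it moves to the MapR cluster, not a code change.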




Application Migration

   • Make sure the Java classpath includes the path to maprfs.jar
   • Make sure java.library.path includes the directory containing libMapRClient.so
   • Use fs.default.name
      –   Point it to maprfs:///
      –   maprfs:/// uses the cluster specified on the first line of mapr-clusters.conf
   • The distcp command does not copy file permissions
   • Remove explicit memory settings from your application
   • Test the application with a small amount of data
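One way to confirm that an application's configuration points at MapR before a test run is to read fs.default.name from its core-site.xml. The sketch below assumes the standard Hadoop configuration layout of property elements with name and value children; the function names default_fs and uses_maprfs are hypothetical helpers, not part of any MapR tool.

```python
import xml.etree.ElementTree as ET


def default_fs(core_site_xml):
    """Return the fs.default.name value from core-site.xml text, or None."""
    root = ET.fromstring(core_site_xml)
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        if name == "fs.default.name":
            return (prop.findtext("value") or "").strip()
    return None


def uses_maprfs(core_site_xml):
    """True when the default file system uses the maprfs scheme."""
    value = default_fs(core_site_xml)
    return value is not None and value.startswith("maprfs:")


if __name__ == "__main__":
    sample = """
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>maprfs:///</value>
      </property>
    </configuration>
    """
    print(default_fs(sample), uses_maprfs(sample))
```

A check like this could run as the first step of the small-data test, so a leftover reference to the old cluster is caught before any job starts.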




Data Migration

   • Distributed copy
      –   Use the hadoop distcp command
      –   Sets up a MapReduce job to copy files and directories
   • Push data
      –   Use the MapR File Client to push data from the other cluster
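The distributed copy option can be sketched as a script that builds one hadoop distcp command per dataset, which suits a migration that moves datasets individually. The NameNode address and dataset paths below are hypothetical, and the hdfs:// source scheme is an assumption that depends on the other cluster. The sketch only prints the commands; it does not run them.

```python
import shlex


def distcp_command(namenode, path):
    """Build the argument list for copying one dataset to the MapR cluster.

    `namenode` is the host:port of the other cluster's NameNode (hypothetical
    here). The target keeps the same path under maprfs:///.
    """
    clean = path.lstrip("/")
    source = f"hdfs://{namenode}/{clean}"
    target = f"maprfs:///{clean}"
    return ["hadoop", "distcp", source, target]


def migration_plan(namenode, datasets):
    """Return one printable distcp command line per dataset."""
    return [shlex.join(distcp_command(namenode, d)) for d in datasets]


if __name__ == "__main__":
    # Dry run: review the commands before executing them on the cluster.
    # distcp does not copy file permissions, so plan to reset them afterwards.
    for line in migration_plan("namenode.example.com:8020",
                               ["/data/logs", "/user/hive/warehouse"]):
        print(line)
```

Generating the commands separately from running them makes it easy to review the plan and to migrate datasets in stages.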




Node Migration

   • Reclaim nodes
      –   Decommission nodes from the other cluster
      –   Make sure the nodes meet the MapR installation requirements
      –   Add the nodes to the MapR cluster




Questions



