Clogeny Technologies http://www.clogeny.com
(US) 408-556-9645
(India) +91 20 661 43 482
HDFS
HDFS is designed to store very large amounts of data, on the order of
terabytes and petabytes.
Reliability is part of the core design of HDFS.
HDFS provides horizontal scalability.
The default HDFS block size is 64 MB in Hadoop 1.x; later releases default to 128 MB.
HDFS files are not stored as ordinary files on the local filesystem, so the
usual OS commands cannot list them or read their metadata.
To work with HDFS from the shell, use the hadoop fs command.
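As a rough illustration of what the block size means in practice, the sketch below computes how many blocks a file of a given size would occupy. The class name BlockCount and the sample file sizes are made up for this example; only the 64 MB block size comes from the slide.

```java
// Hypothetical helper, not part of Hadoop: plain block arithmetic.
class BlockCount {
    static final long MB = 1024L * 1024L;

    // Number of blocks needed to hold fileBytes when each block holds blockBytes.
    // The last block may be only partially filled.
    static long blocksFor(long fileBytes, long blockBytes) {
        if (fileBytes <= 0) {
            return 0; // an empty file occupies no blocks
        }
        return (fileBytes + blockBytes - 1) / blockBytes; // ceiling division
    }

    public static void main(String[] args) {
        long blockSize = 64 * MB; // block size quoted in the slide
        System.out.println(blocksFor(1 * MB, blockSize));    // 1
        System.out.println(blocksFor(200 * MB, blockSize));  // 4
        System.out.println(blocksFor(1024 * MB, blockSize)); // 16
    }
}
```

A 200 MB file therefore spans four blocks, three full ones and a partial one.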
What is HDFS not good at?
Because of its architecture choices, HDFS is good at certain workloads.
The same choices impose the following limitations:
• Random-seek I/O performance is poor. HDFS is optimized for streaming,
that is, long sequential reads.
• It is optimized for write-once, read-many workloads.
• It is not ideal for a large number of small files, because the NameNode
keeps the metadata for every file and block in memory.
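To make the small-files limitation concrete, the sketch below counts the namespace entries (one per file plus one per block) needed to hold the same 1 GB of data as a single file and as 1024 small files. It assumes a 64 MB block size and equal-sized files; the class and method names are hypothetical, and the count is a simplification of what the NameNode actually tracks.

```java
// Hypothetical illustration: counts file and block entries only.
class SmallFilesCost {
    static final long MB = 1024L * 1024L;

    // Blocks needed for one file (ceiling division; empty files need none).
    static long blocksFor(long fileBytes, long blockBytes) {
        return fileBytes <= 0 ? 0 : (fileBytes + blockBytes - 1) / blockBytes;
    }

    // One entry for each file plus one entry for each of its blocks.
    static long namespaceObjects(long fileCount, long bytesPerFile, long blockBytes) {
        return fileCount * (1 + blocksFor(bytesPerFile, blockBytes));
    }

    public static void main(String[] args) {
        long block = 64 * MB; // assumed block size
        // 1 GB stored as one large file: 1 file + 16 blocks
        System.out.println(namespaceObjects(1, 1024 * MB, block));  // 17
        // The same 1 GB stored as 1024 files of 1 MB: 1024 files + 1024 blocks
        System.out.println(namespaceObjects(1024, 1 * MB, block));  // 2048
    }
}
```

The data volume is identical in both cases, but the small-file layout needs roughly 120 times as many entries.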
Configuring HDFS
core-site.xml and hdfs-site.xml hold the site-specific settings that override
Hadoop's built-in defaults (core-default.xml and hdfs-default.xml).
These files are found in the /etc/hadoop/conf directory.
Configuration settings are a set of key-value pairs.
Some of the main properties are:
Key                     Value                        Example
fs.default.name         protocol://servername:port   hdfs://localhost:8020
dfs.replication         Number of replicas           1
dfs.namenode.name.dir   Pathname                     /var/lib/hadoop-hdfs/cache/${user.name}/dfs/name
dfs.datanode.data.dir   Pathname                     /var/lib/hadoop-hdfs/cache/${user.name}/dfs/data
(fs.default.name is the older name of this key; newer releases call it fs.defaultFS.)
Basic Hadoop FS Commands
Check your home directory
• $ hadoop fs -ls
• The command lists your home directory if it has been created; otherwise
it fails with a "No such file or directory" error.
Creating your home directory
• $ sudo -u hdfs hadoop fs -mkdir /user/root
• $ sudo -u hdfs hadoop fs -chown root /user/root
Create a directory as your workspace
• $ hadoop fs -mkdir workspace
• $ hadoop fs -ls
Different ways to create a file in HDFS
• Create a zero-byte file using touchz
   $ hadoop fs -touchz <path>
• Copy a file from the local filesystem using put
   $ hadoop fs -put <localsrc> <dst>
Display the contents of a file
• $ hadoop fs -cat workspace/sample.txt
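Copying multiple files from the local filesystem
• $ hadoop fs -put <localsrc1> <localsrc2> ... <dst>
• With more than one source, <dst> must be a directory.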
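chmod, chown, chgrp
• $ hadoop fs -chmod [-R] <mode> <path>
• $ hadoop fs -chown [-R] [owner][:[group]] <path>
• $ hadoop fs -chgrp [-R] <group> <path>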
copyFromLocal, copyToLocal
• $ hadoop fs -copyFromLocal <localsrc> <dst>
• $ hadoop fs -copyToLocal <src> <localdst>
du
• $ hadoop fs -du <path>
mv, cp, rm
• $ hadoop fs -mv <src> <dst>
• $ hadoop fs -cp <src> <dst>
• $ hadoop fs -rm <path>
Need any help?
$ hadoop fs -help
$ hadoop fs -help <command_name>
Use it like any Linux command
Archive
Creates a Hadoop archive.
Usage: hadoop archive -archiveName NAME -p <parent path> <src>* <dest>
An archive name must always have a .har extension.
Archives are immutable.
Create Archive:
• $ hadoop archive -archiveName myarchive.har -p /user/root/workspace /user/root/archive
List Files in Archive:
• $ hadoop fs -ls -R har:///user/root/archive/myarchive.har
Distcp
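Copies files or directories recursively.
Used for large inter-cluster and intra-cluster copies.
Uses MapReduce for distribution, error handling and recovery, and reporting.
Options:
• -i: ignore failures
• -log <logdir>: write logs to <logdir>
• -overwrite: overwrite files at the destination
• -update: overwrite only if the source and destination sizes differ
• -m <num_maps>: maximum number of simultaneous copies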
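The difference between the default behaviour, -update and -overwrite can be sketched as a small decision function. This is a simplified model rather than DistCp's own code: it assumes that existing destination files are skipped by default and that -update compares file sizes only. The class name, enum and sample file names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified model of which files a copy would transfer in each mode.
class DistcpModes {
    enum Mode { DEFAULT, UPDATE, OVERWRITE }

    // src and dst map relative file names to sizes in bytes.
    static List<String> filesToCopy(Map<String, Long> src, Map<String, Long> dst, Mode mode) {
        List<String> result = new ArrayList<>();
        // TreeMap gives a stable, sorted iteration order.
        for (Map.Entry<String, Long> e : new TreeMap<>(src).entrySet()) {
            Long dstSize = dst.get(e.getKey());
            boolean copy;
            if (dstSize == null) {
                copy = true;                           // missing at the destination
            } else if (mode == Mode.OVERWRITE) {
                copy = true;                           // always replace
            } else if (mode == Mode.UPDATE) {
                copy = !dstSize.equals(e.getValue());  // replace only if sizes differ
            } else {
                copy = false;                          // default: skip existing files
            }
            if (copy) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> src = new HashMap<>();
        src.put("a.txt", 10L);
        src.put("b.txt", 20L);
        src.put("c.txt", 30L);
        Map<String, Long> dst = new HashMap<>();
        dst.put("a.txt", 10L); // same size as the source
        dst.put("b.txt", 5L);  // different size from the source
        System.out.println(filesToCopy(src, dst, Mode.DEFAULT));   // [c.txt]
        System.out.println(filesToCopy(src, dst, Mode.UPDATE));    // [b.txt, c.txt]
        System.out.println(filesToCopy(src, dst, Mode.OVERWRITE)); // [a.txt, b.txt, c.txt]
    }
}
```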
Usage: hadoop distcp <srcurl> <desturl>
• $ hadoop fs -mkdir workspace_cp
• $ hadoop distcp workspace workspace_cp
• $ hadoop fs -mkdir workspace/test
• $ hadoop distcp -overwrite /user/root/workspace /user/root/workspace_cp
fsck
Runs the HDFS filesystem checking utility.
Usage:
• hadoop fsck <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
Options:
Option          Description
<path>          Start checking from this path
-move           Move corrupted files to /lost+found
-delete         Delete corrupted files
-openforwrite   Print files opened for write
-files          Print the files being checked
-blocks         Print the block report
-locations      Print the locations of every block
-racks          Print the network topology for DataNode locations
$ hadoop fsck /user/root/workspace -files -blocks -locations
Quota
Name Quota:
• A hard limit on the number of file and directory names in the tree
• File and directory creation fails if the quota would be exceeded
• A newly created directory has no associated quota
• To set a quota
   $ hadoop dfsadmin -setQuota <N> <dir>...<dir>
• To remove a quota
   $ hadoop dfsadmin -clrQuota <dir>...<dir>
• To see the quota of a directory
   $ hadoop fs -count -q <dir>...<dir>
   The first four columns of the output are the name quota, the remaining name quota, the space quota, and the remaining space quota.
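When the quota columns have to be read by a script, a line of hadoop fs -count -q output can be split on whitespace. The sketch below keeps the first four columns as strings, on the assumption that unset quotas are printed as words such as "none" and "inf" rather than numbers. The class name and the sample line are made up for illustration.

```java
// Hypothetical parser for one line of "hadoop fs -count -q" output.
class CountQuotaLine {
    final String nameQuota;
    final String remainingNameQuota;
    final String spaceQuota;
    final String remainingSpaceQuota;

    CountQuotaLine(String line) {
        // Columns are separated by runs of spaces.
        String[] cols = line.trim().split("\\s+");
        if (cols.length < 4) {
            throw new IllegalArgumentException("expected at least 4 columns: " + line);
        }
        nameQuota = cols[0];
        remainingNameQuota = cols[1];
        spaceQuota = cols[2];
        remainingSpaceQuota = cols[3];
    }

    public static void main(String[] args) {
        // Made-up sample line; real output carries further columns after the first four.
        String sample = "    10     7    none     inf     2     1   1024 /user/root/workspace";
        CountQuotaLine q = new CountQuotaLine(sample);
        System.out.println("name quota: " + q.nameQuota
                + ", remaining: " + q.remainingNameQuota);
        System.out.println("space quota: " + q.spaceQuota
                + ", remaining: " + q.remainingSpaceQuota);
    }
}
```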
Space Quota:
• A hard limit on the number of bytes used by files in the tree
• Block allocation fails if the quota would not allow a full block to be written
• A newly created directory has no associated quota
• The space quota takes replication into account: every replica of a block counts against the quota
• To set a quota
   $ hadoop dfsadmin -setSpaceQuota <N> <dir>...<dir>
   N is in bytes by default, but a suffix can be used, as in 10m, 10g, or 10t
• To remove a quota
   $ hadoop dfsadmin -clrSpaceQuota <dir>...<dir>
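A short worked example may help show how the full-block rule and replication interact. The sketch below assumes that a new block is admitted only when the remaining quota covers one full block for every replica, and that size suffixes are binary multiples. The class and method names are hypothetical, not Hadoop APIs.

```java
// Hypothetical arithmetic sketch of the space-quota rules listed above.
class SpaceQuotaCheck {
    // Converts "10m", "10g", "10t" or a plain byte count into bytes
    // (binary multiples assumed: 1m = 2^20 bytes).
    static long parseSize(String s) {
        String t = s.trim().toLowerCase();
        char last = t.charAt(t.length() - 1);
        long multiplier;
        switch (last) {
            case 'm': multiplier = 1L << 20; break;
            case 'g': multiplier = 1L << 30; break;
            case 't': multiplier = 1L << 40; break;
            default:  return Long.parseLong(t); // no suffix: plain bytes
        }
        return Long.parseLong(t.substring(0, t.length() - 1)) * multiplier;
    }

    // True when the remaining quota can hold one full block for each replica.
    static boolean canAllocateBlock(long quotaBytes, long usedBytes,
                                    long blockBytes, int replication) {
        return quotaBytes - usedBytes >= blockBytes * replication;
    }

    public static void main(String[] args) {
        long block = parseSize("64m");
        // 3 replicas of a 64 MB block need 192 MB of quota.
        System.out.println(canAllocateBlock(parseSize("10g"), 0, block, 3));  // true
        // A 100 MB quota cannot admit even the first block at replication 3.
        System.out.println(canAllocateBlock(parseSize("100m"), 0, block, 3)); // false
    }
}
```

Under these assumptions a quota smaller than block size times replication rejects every write, however small the file.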
Using HDFS Programmatically
Creating a Configuration object
• To read from or write to HDFS, create a Configuration object and load the
Hadoop configuration files into it.
   // Requires org.apache.hadoop.conf.Configuration and org.apache.hadoop.fs.Path
   Configuration conf = new Configuration();
   conf.addResource(new Path("/opt/hadoop-0.20.0/conf/core-site.xml"));
   conf.addResource(new Path("/opt/hadoop-0.20.0/conf/hdfs-site.xml"));
• If the Hadoop XML files are not loaded into the conf object, file operations
run against the local filesystem instead of HDFS.
Adding Files in HDFS
   // source: local file path; dest: target path in HDFS (both supplied by the caller)
   FileSystem fileSystem = FileSystem.get(conf);
   Path path = new Path(dest);
   if (fileSystem.exists(path)) {
       System.out.println("File " + dest + " already exists");
       return;
   }
   // Copy the local file into HDFS in 1 KB chunks
   FSDataOutputStream out = fileSystem.create(path);
   InputStream in = new BufferedInputStream(new FileInputStream(new File(source)));
   byte[] b = new byte[1024];
   int numBytes = 0;
   while ((numBytes = in.read(b)) > 0) {
       out.write(b, 0, numBytes);
   }
   in.close();
   out.close();
   fileSystem.close();
Reading a file from HDFS
   // file: path of the HDFS file to read (supplied by the caller)
   FileSystem fileSystem = FileSystem.get(conf);
   Path path = new Path(file);
   if (!fileSystem.exists(path)) {
       System.out.println("File " + file + " does not exist");
       return;
   }
   FSDataInputStream in = fileSystem.open(path);
   // Write to a local file with the same base name
   String filename = file.substring(file.lastIndexOf('/') + 1);
   OutputStream out = new BufferedOutputStream(new FileOutputStream(new File(filename)));
   byte[] b = new byte[1024];
   int numBytes = 0;
   while ((numBytes = in.read(b)) > 0) {
       out.write(b, 0, numBytes);
   }
   // Close the streams so that buffered output is flushed to disk
   in.close();
   out.close();
   fileSystem.close();
Creating a Directory in HDFS
   // dir: HDFS path of the directory to create (supplied by the caller)
   FileSystem fileSystem = FileSystem.get(conf);
   Path path = new Path(dir);
   if (fileSystem.exists(path)) {
       System.out.println("Dir " + dir + " already exists");
       return;
   }
   fileSystem.mkdirs(path);
   fileSystem.close();
Deleting a file from HDFS
   // file: HDFS path to delete (supplied by the caller)
   FileSystem fileSystem = FileSystem.get(conf);
   Path path = new Path(file);
   if (!fileSystem.exists(path)) {
       System.out.println("File " + file + " does not exist");
       return;
   }
   // The second argument enables recursive deletion when the path is a directory
   fileSystem.delete(path, true);
   fileSystem.close();
HDFS Web Interface
HDFS provides a web interface, served by the NameNode on port 50070
(Hadoop 3.x moved the default port to 9870):
http://<virtual machine ip>:50070/dfshealth.jsp
It provides the ability to browse the filesystem.
It provides the ability to view the NameNode logs.
It provides an overall cluster summary.

HDFS Commands Guide for Managing Files and Directories

  • 1. Clogeny Technologies http://www.clogeny.com (US) 408-556-9645 (India) +91 20 661 43 482 HDFS As we know, HDFS is designed to store very large amounts of data – terabytes and petabytes. Reliability is part of the core design of HDFS. HDFS provides horizontal scalability. Default block size of HDFS is 64MB. HDFS does not store files in the traditional filesystem and requires different commands to access the metadata. In order to work with HDFS you need to use hadoop fs command.
  • 2. Clogeny Technologies http://www.clogeny.com (US) 408-556-9645 (India) +91 20 661 43 482 What HDFS is not good at? Due to the architecture choices, HDFS is good at certain workloads. It also has the following limitations due to these choices: • Random seek IO performance is bad. HDFS is optimized for streaming read performance – leading to long sequential read performance optimizations. • It is optimized for write-once, read-many workloads. • Not ideal for large number of small files.
  • 3. Clogeny Technologies http://www.clogeny.com (US) 408-556-9645 (India) +91 20 661 43 482 Configuring HDFS core-site.xml and hdfs-site.xml contains default values for every parameter of Hadoop. These files can be found in /etc/hadoop/conf directory Configuration settings are a set of key-value pairs Some of the main properties are Key Value Example fs.default.name Protocol://servername:port hdfs://localhost:8020 dfs.replication No of replication 1 dfs.namenode.name.dir Pathname var/lib/hadoophdfs/cache/${u ser.name}/dfs/name dfs.datanode.data.dir Pathname var/lib/hadoophdfs/cache/${u ser.name}/dfs/data
  • 4. Clogeny Technologies http://www.clogeny.com (US) 408-556-9645 (India) +91 20 661 43 482 Basic Hadoop FS Commands Check your home directory • $ hadoop fs –ls • Above command will run if home directory is created else it will give output as:
  • 5. Clogeny Technologies http://www.clogeny.com (US) 408-556-9645 (India) +91 20 661 43 482 Basic Hadoop FS Commands Creating your home directory • $ sudo –u hdfs hadoop fs –mkdir /user/root • $ sudo -u hdfs hadoop fs –chown root /user/root
  • 6. Clogeny Technologies http://www.clogeny.com (US) 408-556-9645 (India) +91 20 661 43 482 Basic Hadoop FS Commands Create a directory as your workspace • $ hadoop fs –mkdir workspace • $ hadoop fs -ls
  • 7. Clogeny Technologies http://www.clogeny.com (US) 408-556-9645 (India) +91 20 661 43 482 Basic Hadoop FS Commands Different ways to create a file in HDFS • Create a zero byte file using touchz • Copying a file from local fs
  • 8. Basic Hadoop FS Commands
      Display the contents of a file
      • $ hadoop fs -cat workspace/sample.txt
  • 9. Basic Hadoop FS Commands Copying multiple files from local fs
  • 10. Basic Hadoop FS Commands chmod, chown, chgrp
  • 11. Basic Hadoop FS Commands copyFromLocal, copyToLocal
  • 12. Basic Hadoop FS Commands du
  • 13. Basic Hadoop FS Commands mv, cp, rm
  • 14. Need any help?
      • $ hadoop fs -help
      • $ hadoop fs -help <command_name>
  • 15. Use it like any Linux command
  • 16. Archive
      Creates a Hadoop archive.
      Usage: hadoop archive -archiveName NAME -p <parent path> <src>* <dest>
      The archive name always has a .har extension.
      Archives are immutable.
      Create an archive:
      • $ hadoop archive -archiveName myarchive.har -p /user/root/workspace /user/root/archive
      List the files in an archive:
      • $ hadoop fs -ls -R har:///user/root/archive/myarchive.har
  • 17. Distcp
      Copies files or directories recursively.
      Used for large inter-cluster and intra-cluster copies.
      Uses MapReduce for distribution, error handling and recovery, and reporting.
      Options:
      • -i: ignore failures
      • -log <logdir>: write logs to <logdir>
      • -overwrite: overwrite the destination
      • -update: overwrite only if the source size differs from the destination size
      • -m <num_maps>: maximum number of simultaneous copies
  • 18. Distcp
      Usage: hadoop distcp <srcurl> <desturl>
      • $ hadoop fs -mkdir workspace_cp
      • $ hadoop distcp workspace workspace_cp
      • $ hadoop fs -mkdir workspace/test
      • $ hadoop distcp -overwrite /user/root/workspace /user/root/workspace_cp
  • 19. fsck
      Runs the HDFS filesystem checking utility.
      Usage:
      • hadoop fsck <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
      Options:

      Option      Description
      <path>      Start checking from this path
      -move       Move corrupted files to /lost+found
      -delete     Delete corrupted files
      -files      Print the files being checked
      -blocks     Print the block report
      -locations  Print the location of each block
  • 20. fsck $ hadoop fsck -blocks -locations /user/root/workspace
  • 21. Quota
      Name Quota:
      • A hard limit on the number of file and directory names in the tree
      • File and directory creations fail if the quota would be exceeded
      • A newly created directory has no associated quota
      • To set a quota
          $ hadoop dfsadmin -setQuota <N> <dir>...<dir>
      • To remove a quota
          $ hadoop dfsadmin -clrQuota <dir>...<dir>
      • To see the quota of a directory
          $ hadoop fs -count -q <dir>...<dir>
          The first four columns of the output are the name quota, the remaining name quota, the space quota and the remaining space quota.
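The name-quota rule can be sketched as a toy counter. This is a model of the behaviour described on the slide, not HDFS code, and the class and method names are hypothetical. It assumes, as the HDFS quota documentation states, that a directory counts against its own quota, so a quota of 1 forces the directory to stay empty.

```java
// Toy model of an HDFS name quota: a hard cap on names in a directory tree.
// Hypothetical names; this does not call any Hadoop API.
final class NameQuotaSketch {
    private final long quota;
    private long used = 1; // the directory itself counts against its own quota

    NameQuotaSketch(long quota) {
        if (quota < 1) {
            throw new IllegalArgumentException("quota must be at least 1");
        }
        this.quota = quota;
    }

    // Try to create one more file or directory under the tree.
    boolean tryCreate() {
        if (used + 1 > quota) {
            return false; // creation fails: the quota would be exceeded
        }
        used++;
        return true;
    }

    long remaining() {
        return quota - used;
    }

    public static void main(String[] args) {
        NameQuotaSketch dir = new NameQuotaSketch(3);
        System.out.println("create #1: " + dir.tryCreate());
        System.out.println("create #2: " + dir.tryCreate());
        System.out.println("create #3: " + dir.tryCreate());
        System.out.println("remaining: " + dir.remaining());
    }
}
```

With a quota of 3, the first two creations succeed and the third is rejected, which mirrors the "hard limit" wording above.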
  • 22. Quota
  • 23. Quota
      Space Quota:
      • A hard limit on the number of bytes used by files in the tree
      • Block allocations fail if the quota would not allow a full block to be written
      • A newly created directory has no associated quota
      • The space quota takes replication into account
      • To set a quota
          $ hadoop dfsadmin -setSpaceQuota <N> <dir>...<dir>
          N is in bytes by default, but suffixes such as 10m, 10g or 10t are also accepted.
      • To remove a quota
          $ hadoop dfsadmin -clrSpaceQuota <dir>...<dir>
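Two of the space-quota rules above interact in a way that surprises people: replication is charged against the quota, and a write needs room for a full block. The arithmetic can be sketched as follows. This is an illustrative calculation rather than HDFS code; the class and method names are hypothetical, and the 64 MB block size and replication factor of 3 are assumed values.

```java
// Sketch of space-quota arithmetic; names and numbers are illustrative.
final class SpaceQuotaSketch {
    static final long MB = 1024L * 1024;

    // Bytes charged against the quota: every replica counts.
    static long charged(long fileBytes, int replication) {
        return fileBytes * replication;
    }

    // A block can be allocated only if a full block, times the
    // replication factor, still fits in the remaining quota.
    static boolean canAllocateBlock(long remainingQuota, long blockSize, int replication) {
        return remainingQuota >= blockSize * replication;
    }

    public static void main(String[] args) {
        long blockSize = 64 * MB;
        int replication = 3;
        System.out.println("100 MB file charges "
            + charged(100 * MB, replication) / MB + " MB");
        System.out.println("100 MB quota allows a block: "
            + canAllocateBlock(100 * MB, blockSize, replication));
        System.out.println("192 MB quota allows a block: "
            + canAllocateBlock(192 * MB, blockSize, replication));
    }
}
```

Under these assumptions a 100 MB file consumes 300 MB of quota, and a directory with only 100 MB of quota left cannot accept even a tiny file, because the first block allocation needs 192 MB of headroom.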
  • 24. Quota
  • 25. Using HDFS Programmatically
      Creating a configuration object
      • To read from or write to HDFS, create a Configuration object (org.apache.hadoop.conf.Configuration) and load the Hadoop configuration files into it:
          Configuration conf = new Configuration();
          conf.addResource(new Path("/opt/hadoop-0.20.0/conf/core-site.xml"));
          conf.addResource(new Path("/opt/hadoop-0.20.0/conf/hdfs-site.xml"));
      • If conf is not given these settings (from the Hadoop XML files), fs.default.name keeps its built-in default of file:/// and your operations run against the local file system instead of HDFS.
  • 26. Using HDFS Programmatically
      Adding files to HDFS
          // source: path of the local file to upload, as a String (e.g. a method parameter)
          FileSystem fileSystem = FileSystem.get(conf);
          Path path = new Path("/path/to/file.ext");
          if (fileSystem.exists(path)) {
              System.out.println("File " + path + " already exists");
              return;
          }
          FSDataOutputStream out = fileSystem.create(path);
          InputStream in = new BufferedInputStream(new FileInputStream(new File(source)));
          byte[] b = new byte[1024];
          int numBytes = 0;
          // read() returns -1 at end of stream
          while ((numBytes = in.read(b)) != -1) {
              out.write(b, 0, numBytes);
          }
          in.close();
          out.close();
          fileSystem.close();
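The buffered copy loop on this slide is plain java.io and works the same way whichever stream types sit on either end. The following is a minimal, self-contained sketch of that loop using in-memory streams as stand-ins for the local file and the HDFS output stream, so it runs without a cluster. The class and method names are hypothetical. It also uses try-with-resources, one way to make sure the streams are closed even if the copy throws.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Sketch of the slide's copy loop, with in-memory streams standing in
// for FileInputStream and FSDataOutputStream.
final class StreamCopySketch {
    // Copy everything from in to out; returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) { // -1 signals end of stream
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Push data through the loop and return what arrived on the other side.
    static byte[] roundTrip(byte[] data) {
        try (InputStream in = new ByteArrayInputStream(data);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            copy(in, out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] copied = roundTrip(new byte[5000]);
        System.out.println("copied " + copied.length + " bytes");
    }
}
```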
  • 27. Using HDFS Programmatically
      Reading a file from HDFS
          String file = "/path/to/file.ext";
          FileSystem fileSystem = FileSystem.get(conf);
          Path path = new Path(file);
          if (!fileSystem.exists(path)) {
              System.out.println("File does not exist");
              return;
          }
          FSDataInputStream in = fileSystem.open(path);
          // local file name: everything after the last '/'
          String filename = file.substring(file.lastIndexOf('/') + 1, file.length());
          OutputStream out = new BufferedOutputStream(new FileOutputStream(new File(filename)));
          byte[] b = new byte[1024];
          int numBytes = 0;
          while ((numBytes = in.read(b)) != -1) {
              out.write(b, 0, numBytes);
          }
          in.close();
          out.close();
          fileSystem.close();
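The read example derives the local file name by taking everything after the last '/' in the HDFS path. A small sketch of that step in isolation, with a hypothetical class and method name, shows how it behaves on a few edge cases:

```java
// Sketch: derive a local file name from a slash-separated path,
// using the same substring/lastIndexOf approach as the slide.
final class FileNameSketch {
    static String baseName(String path) {
        // lastIndexOf returns -1 when there is no '/', so the
        // substring then starts at 0 and the whole string is returned.
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        System.out.println(baseName("/path/to/file.ext"));
        System.out.println(baseName("file.ext"));
    }
}
```

Note that a path ending in '/' yields an empty name with this approach, so a caller may want to reject that case before opening the local output file.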
  • 28. Using HDFS Programmatically
      Creating a directory in HDFS
          // dir: HDFS path of the directory to create, as a String (e.g. a method parameter)
          FileSystem fileSystem = FileSystem.get(conf);
          Path path = new Path(dir);
          if (fileSystem.exists(path)) {
              System.out.println("Dir " + dir + " already exists");
              return;
          }
          fileSystem.mkdirs(path);
          fileSystem.close();
  • 29. Using HDFS Programmatically
      Deleting a file from HDFS
          FileSystem fileSystem = FileSystem.get(conf);
          Path path = new Path("/path/to/file.ext");
          if (!fileSystem.exists(path)) {
              System.out.println("File does not exist");
              return;
          }
          // second argument: delete recursively if the path is a directory
          fileSystem.delete(path, true);
          fileSystem.close();
  • 30. HDFS Web Interface
      HDFS provides a web interface, which can be accessed on port 50070:
      http://<virtual machine ip>:50070/dfshealth.jsp
      • Lets you browse the filesystem
      • Lets you view the NameNode logs
      • Shows an overall cluster summary