The Nutch distribution is overkill if you already have a Hadoop cluster. It's also not how you would integrate with Hadoop these days, but there is some history to consider; the Nutch wiki has a page on the distributed setup.
Why orchestrate your crawl?
Create the seed file and copy it into a "urls" directory, then copy that directory up to HDFS.
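A minimal sketch of this step, assuming `hadoop` is on the PATH and using a placeholder seed URL:

```shell
# Create a local "urls" directory holding a plain-text seed file,
# one URL per line (the URL below is a placeholder).
mkdir -p urls
echo "http://example.com/" > urls/seed.txt

# Copy the whole directory into HDFS; guard the call so the sketch
# degrades gracefully on a machine without a Hadoop client.
if command -v hadoop >/dev/null 2>&1; then
  hadoop fs -put urls urls
fi
```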
Edit the conf/crawl-urlfilter.txt regexes to constrain the crawl (usually to a domain).
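For example, to restrict the crawl to a single domain, the filter file might look like this (`example.com` is a placeholder for your own domain):

```text
# skip file:, ftp:, and mailto: urls
-^(file|ftp|mailto):
# accept hosts within the example.com domain
+^http://([a-z0-9]*\.)*example.com/
# skip everything else
-.
```

Rules are applied top to bottom; the first matching `+` (include) or `-` (exclude) pattern wins.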
Copy conf/nutch-site.xml, conf/nutch-default.xml, conf/nutch-conf.xml, and conf/crawl-urlfilter.txt to the Hadoop conf directory.
Restart Hadoop so the new files are picked up on the classpath.
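The last two steps can be sketched as a small shell function. `NUTCH_HOME` and `HADOOP_HOME` are assumed environment variables (not set by either install), and the restart scripts are the classic `stop-all.sh`/`start-all.sh` pair from older Hadoop releases:

```shell
# Copy the Nutch config files into Hadoop's conf directory so they
# land on the daemons' classpath, then bounce the cluster.
deploy_nutch_conf() {
  nutch_home=$1
  hadoop_home=$2
  cp "$nutch_home"/conf/nutch-site.xml \
     "$nutch_home"/conf/nutch-default.xml \
     "$nutch_home"/conf/nutch-conf.xml \
     "$nutch_home"/conf/crawl-urlfilter.txt \
     "$hadoop_home"/conf/
  # Restart Hadoop so the daemons reload their classpath.
  if [ -x "$hadoop_home"/bin/stop-all.sh ]; then
    "$hadoop_home"/bin/stop-all.sh
    "$hadoop_home"/bin/start-all.sh
  fi
}

# Only run for real when both locations are known.
if [ -n "${NUTCH_HOME:-}" ] && [ -n "${HADOOP_HOME:-}" ]; then
  deploy_nutch_conf "$NUTCH_HOME" "$HADOOP_HOME"
fi
```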