The growing popularity of Hadoop has led to an increasing number of clusters worldwide. Priming these clusters with data from existing client repositories is difficult because of data size, network constraints, security requirements, and a lack of domain knowledge. In this talk, we present techniques and best practices for uploading large amounts of data to remote Hadoop clusters.