This document discusses issues with analyzing sub-datasets in a distributed manner using Hadoop, such as imbalanced computational loads and inefficient data scanning. It proposes a new approach called Data-Net that uses metadata about sub-dataset distributions stored in an Elastic-Map structure to optimize storage placement and queries. Experimental results on a 128-node cluster show that Data-Net provides better load balancing and performance for various sub-dataset analysis applications compared to the default Hadoop implementation.