Big data repositories are receiving a growing number of small, niche datasets as more researchers contribute data. This "long tail of data" poses challenges for discovery, access, and attribution. The authors propose a centralized data repository that would make any dataset discoverable and accessible, regardless of size or topic. It would automate metadata generation and attribution so that researchers can find and share relevant data more easily.