This document discusses challenges and opportunities in identifying small molecules from big data sources. There are large compound databases but still many potential molecules not represented. Mass spectral libraries also have gaps and do not represent all predicted or generated molecules. Initiatives aim to improve exchange of information between resources to help close these gaps. Collaborative efforts worldwide can greatly help cross-annotation of known and unknown molecules.