This document discusses using natural language processing techniques to analyze over 32,000 open datasets from NASA. It describes preprocessing text data, generating word clouds, calculating term frequency-inverse document frequency, performing topic modeling with Latent Dirichlet Allocation and K-means clustering. The topic modeling helps evaluate how accurately descriptions and keywords capture dataset topics. Connections between NASA datasets and other US government collections are also examined.