In an ideal world, all data would be deposited directly into a database by the scientist who produced it; in the real world, most chemical data is instead presented in a form designed for human rather than machine consumption. Text mining has the potential to extract this data back into a machine-readable form. Because all United States patents are available free of charge, they make an ideal corpus for extracting large numbers of experimental properties of compounds and chemical reactions. We report on our text-mining activities to extract millions of textual NMR spectra, hundreds of thousands of physicochemical properties (with their associated compounds) and over a million chemical reactions. All extracted results are to be deposited in online databases so that the community can benefit from this work. Using Mestrelab Research’s MNova product, we have converted the textual NMR spectra to graphical spectra and validated each spectrum against its associated chemical structure, so as to detect cases where the NMR spectrum could not have been produced by that structure. In the case of melting points, the resultant dataset of over a quarter of a million compound/melting point pairs is the largest public dataset the authors are aware of. We have used this dataset to produce a predictive model with results comparable to those obtained from manually curated datasets. Our experience with modelling this data has demonstrated that we are working at the edge of current algorithmic and computing capabilities for predictive model building, with the resultant matrix containing over 200 billion descriptors. The melting point model and the data it was derived from are freely available from http://www.ochem.eu.
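To give a flavour of what extracting a physicochemical property from patent text can involve, the following is a minimal sketch of pulling melting points out of free text with a regular expression. It is an illustration only and does not represent the extraction pipeline reported here: the pattern `MP_PATTERN`, the function `extract_melting_points` and the example sentences are all hypothetical, and a real system would also need to handle other units, decomposition notes and the association of each value with its compound.

```python
import re

# Hypothetical pattern (an assumption for illustration, not the authors' grammar):
# matches phrases such as "mp 120-122 °C", "m.p. 101 to 103 °C" or
# "melting point: 85.5 °C".
MP_PATTERN = re.compile(
    r"(?:\bm\.?\s?p\.?|\bmelting point)\s*[:=]?\s*"          # property keyword
    r"(?P<low>\d+(?:\.\d+)?)"                                 # single value or lower bound
    r"(?:\s*(?:[-\u2013]|to)\s*(?P<high>\d+(?:\.\d+)?))?"     # optional upper bound
    r"\s*°\s*C",                                              # unit: degrees Celsius only
    re.IGNORECASE,
)


def extract_melting_points(text):
    """Return a list of (low, high) melting point ranges in degrees Celsius.

    A single reported value is returned as a range with low == high.
    """
    results = []
    for match in MP_PATTERN.finditer(text):
        low = float(match.group("low"))
        high = float(match.group("high")) if match.group("high") else low
        results.append((low, high))
    return results


if __name__ == "__main__":
    sentence = "The title compound was obtained as a white solid (mp 120-122 °C)."
    print(extract_melting_points(sentence))
```

The keyword is required before the number so that reaction temperatures such as "stirred at 80 °C" are not mistaken for melting points; this trades some recall for precision.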