This document discusses modeling chemical datasets, with a focus on regression-based methods. It examines how the dynamic range and experimental error of a dataset affect model performance, and how to determine whether a model can be applied to new data. Several solubility datasets are analyzed using descriptors together with statistical and machine learning models. The challenges of classification and regression models are presented, along with methods for evaluating model performance and the steps for building a predictive model. The document also covers the model applicability domain and the effect of dataset properties on correlations.
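To make the link between dynamic range, experimental error, and correlation concrete, the following is a minimal simulation sketch, not the document's own procedure. It draws hypothetical "true" values uniformly over a chosen range, adds Gaussian noise twice to mimic two replicate measurements, and reports the Pearson correlation between the replicates. The function names, the uniform distribution, the noise level of 0.6, and the two ranges are all illustrative assumptions.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def replicate_correlation(value_range, error_sd, n=2000, seed=0):
    """Correlation between two simulated noisy measurements of the same values.

    value_range: (low, high) span of the hypothetical true values (assumption).
    error_sd:    standard deviation of the simulated experimental error (assumption).
    """
    rng = random.Random(seed)
    low, high = value_range
    true_vals = [rng.uniform(low, high) for _ in range(n)]
    # Two independent "experiments" measuring the same true values.
    measured_a = [v + rng.gauss(0.0, error_sd) for v in true_vals]
    measured_b = [v + rng.gauss(0.0, error_sd) for v in true_vals]
    return pearson_r(measured_a, measured_b)


# Same simulated error, two different dynamic ranges (illustrative values only).
r_wide = replicate_correlation(value_range=(-10.0, 0.0), error_sd=0.6)
r_narrow = replicate_correlation(value_range=(-4.0, -2.0), error_sd=0.6)
print(f"wide range:   r = {r_wide:.2f}")
print(f"narrow range: r = {r_narrow:.2f}")
```

With the error held fixed, the narrower range yields a markedly lower correlation in this simulation, which is why a correlation statistic is hard to interpret without knowing the dataset's dynamic range and experimental error.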