This document discusses the large-scale integration of biological data and text. It describes how heterogeneous data from many databases, covering topics such as protein interactions, disease associations, tissue expression, and subcellular localization, can be combined. It also discusses text mining of over 10 km of text to extract entities, relationships, and annotations, which supplements incomplete experimental data and makes updates and predictions easier.
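To make the text-mining idea concrete, here is a minimal sketch of one common approach: dictionary-based entity tagging followed by sentence-level co-occurrence counting. The summary does not say which method is used, so this technique is an assumption, and the dictionary, the identifiers, and the function names `tag_entities` and `cooccurrences` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical dictionary mapping surface names to made-up entity identifiers.
DICTIONARY = {
    "tp53": "GENE:TP53",
    "p53": "GENE:TP53",
    "mdm2": "GENE:MDM2",
    "li-fraumeni syndrome": "DISEASE:LFS",
}


def tag_entities(sentence):
    """Return the set of identifiers whose names occur in the sentence."""
    lowered = sentence.lower()
    found = set()
    for name, identifier in DICTIONARY.items():
        # Require a non-alphanumeric boundary on both sides so that
        # "p53" does not match inside "tp53".
        pattern = r"(?<![a-z0-9])" + re.escape(name) + r"(?![a-z0-9])"
        if re.search(pattern, lowered):
            found.add(identifier)
    return found


def cooccurrences(text):
    """Count how often each pair of entities shares a sentence."""
    counts = Counter()
    # Naive sentence splitting on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        entities = sorted(tag_entities(sentence))
        counts.update(combinations(entities, 2))
    return counts


TEXT = (
    "MDM2 binds p53 and targets it for degradation. "
    "Mutations in TP53 cause Li-Fraumeni syndrome. "
    "TP53 is regulated by MDM2."
)
PAIRS = cooccurrences(TEXT)
for pair, count in PAIRS.most_common():
    print(pair, count)
```

Synonyms such as "p53" and "TP53" resolve to the same identifier, so evidence from different sentences accumulates on a single entity pair. A real pipeline would need a much larger dictionary, better sentence splitting, and some scoring of the raw counts.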