Text Mining for Chemistry and Building a Public Platform for Document Markup
The identification of chemical names in documents has given rise to platforms that enable structure-based searching of patents and the markup of chemistry publications. A natural extension is to make chemistry articles, blog pages, wiki pages and other documents searchable by the chemical structures extracted from them. The ChemSpider database contains over 21 million unique chemical entities drawn from close to 200 data sources and provides a rich resource of information for chemists. We will report on our efforts to integrate chemical name extraction with the ChemSpider platform to enable structure searching of Open Access chemistry articles and other online chemistry materials. We will also unveil our online document markup platform, which allows chemists to make both their open- and closed-access publications searchable by the language of chemistry: the structure.