Speaker: Thierry Hamon, Associate Professor in Computer Science at Université Paris, Member of the LIMSI-CNRS research lab.
Summary: Among the large amounts of unstructured data generated across the world and available today, textual data represent an important source of information. This is particularly true in the biomedical domain, where demand for access to textual content is constantly increasing: this applies to accessing and processing Electronic Health Records, online discussion forums, and scientific literature. Indeed, dealing with biomedical texts requires us to take into account a great variety of texts, languages and users.
For several years now, much NLP research has focused on mining and retrieving information (i.e., medical entities and domain-specific relations) that is relevant to biologists, physicians, terminologists, epidemiologists, and patients. We will give an overview of the NLP methods used for tackling several such research problems through text mining applications. First, we will present the resources and rule-based approaches we designed for extracting drug-related information from clinical texts, and for acquiring domain-specific semantic relations from digital libraries. Then we will present the cross-lingual approach we are developing for building multilingual terminologies from a patient-centered Ukrainian corpus.