Named Entity Recognition problems in biomedical literature


Named Entity Recognition (NER) is one of the major techniques in Natural Language Processing. NER techniques can be applied to biomedical data, but doing so raises several problems, which this presentation describes.

Published in: Education

  1. Aisha Kalsoom
  2. The Medline database contained approx. 15 million scientific abstracts, with a growth rate of about 400,000 articles per year. Tools and techniques to help researchers cope with this information overload are therefore needed. NER tools can be applied to find all kinds of entities, such as gene or protein names, diseases and drugs, mutations, or properties of protein structures. Identifying proteins or genes is important for discovering protein interaction networks.
  3. Concepts, meaning and representation
       • Names in text represent real-life concepts in our mind
       • The concept denoted by a gene name is usually not clearly defined
       • There is no community-wide agreement on how to name a particular gene
       • Slide labels: Supermarket; Sonic Hedgehog gene in human; p53; 2WRU
  4. Many genes and proteins have more than one name
       • A clone during the mapping phase of the Human Genome Project had up to 15 different names
       • FLT4 has four names: PCL; FLT41; LMPH1A; VEGFR3
     Inconsistent use of variations of names
       • Cbp/p300-interactive transactivator
       • CCAAT/enhancer binding protein, C/EBP alpha
     Multi-word names
       • In the BioCreative corpus of expert-tagged gene names, 53% of all names consist of more than one token
       • Human T-cell leukaemia lymphotropic virus type 1 Tax protein
     Acronyms are homonyms
       • SEC stands for
           • surface epithelial cell
           • size exclusion chromatography
           • selenocysteine
  5. Leser, U. and Hakenberg, J. (2005), ‘What makes a gene name? Named entity recognition in the biomedical literature’, Briefings in Bioinformatics, Vol. 6(4), pp. 357-369.
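
The naming problems listed on slide 4 (synonyms, ambiguous acronyms, multi-word names) can be illustrated with a small Python sketch. This is not a method from the presentation or from the cited paper: the toy lexicon holds only the examples named on the slide, and the helper names (`build_index`, `normalize`, `senses`, `is_multi_token`) are hypothetical, chosen for illustration. It assumes that the names listed for FLT4 can be treated as aliases of one canonical symbol and that tokens are separated by whitespace.

```python
# Toy lexicon (assumption): contains only the examples from slide 4.
SYNONYMS = {
    "FLT4": ["PCL", "FLT41", "LMPH1A", "VEGFR3"],
}

# One acronym, several unrelated meanings (homonymy).
ACRONYM_SENSES = {
    "SEC": [
        "surface epithelial cell",
        "size exclusion chromatography",
        "selenocysteine",
    ],
}


def build_index(synonyms):
    """Map every known name (upper-cased) to its canonical symbol."""
    index = {}
    for canonical, aliases in synonyms.items():
        for name in [canonical, *aliases]:
            index[name.upper()] = canonical
    return index


def normalize(name, index):
    """Return the canonical symbol for a name, or None if it is unknown."""
    return index.get(name.upper())


def senses(acronym):
    """Return all recorded meanings of an acronym (may be more than one)."""
    return ACRONYM_SENSES.get(acronym.upper(), [])


def is_multi_token(name):
    """True if the name has more than one whitespace-separated token."""
    return len(name.split()) > 1
```

With `index = build_index(SYNONYMS)`, a call such as `normalize("vegfr3", index)` resolves an alias to `FLT4`, while `senses("SEC")` returns three candidate meanings that a lookup alone cannot choose between. A plain dictionary like this also misses spelling variations and unseen multi-word names, which is why dictionary lookup by itself does not solve the problems the slides describe.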