Nathan from Imperial College London gave a presentation at London Biogeeks on Thursday 24 Feb, 6 to 6.30pm, at King’s College London, Rm 1.20, Franklin Wilkins Building, Waterloo Campus, Stamford Street, London, SE1 9NH. See: biogeeks.wordpress.com/2011/02/16/february-tech-meet-24th-kcl/
His presentation was titled "Identifying genes and proteins in text: a short review of available tools and resources".
The ever-increasing publication rate means that manually extracting information from biological papers is now intractable. This situation has led to sustained interest in applying text mining (TM) methods to the biological literature. The first stage in any text-mining pipeline is to recognise named entities in text, a process called Named Entity Recognition (NER). I will discuss the core concepts behind these methods and provide a basic evaluation of some of the freely available software (standalone and web services).
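As a rough illustration of what NER involves, here is a minimal dictionary-lookup sketch in Python. It is not one of the tools reviewed in the talk: the lexicon contents and the function name `find_entities` are hypothetical, chosen only for this example, and a real tagger would also need to handle synonyms, spelling variants and ambiguous names.

```python
import re

# Hypothetical toy lexicon (an assumption for this sketch); a real system
# would load gene and protein names from a curated resource.
GENE_LEXICON = {"BRCA1", "TP53", "insulin"}

def find_entities(text, lexicon=GENE_LEXICON):
    """Return (start, end, surface form) for each lexicon name found in text."""
    # Try longer names first so a long name is preferred over its own prefix.
    names = sorted(lexicon, key=len, reverse=True)
    # \b restricts matches to whole words, so "TP53X" does not match "TP53".
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(text)]

sentence = "Mutations in BRCA1 and TP53 are linked to breast cancer."
mentions = find_entities(sentence)
for start, end, name in mentions:
    print(start, end, name)
```

Each result carries character offsets as well as the matched name, since later pipeline stages usually need to know where in the text an entity was mentioned.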