Talk at the CBM

Transcript

  • 1. Past, present and future of scientific literature search. Ramón Alonso-Allende
  • 2. Past, present and future of scientific literature search. Ramón Alonso-Allende
  • 3. Science cycle: search, read, experiment, write. Search value system, by period (1990's, 2000's, Today, Future). Labels: Relevance; + Complete; + Easy; - Time; Search = Integration + Meaning + Social.
  • 4. Information systems [timeline: 1995, 2000, 2005, 2010]
  • 5. Searches in PubMed [chart: searches (1000s), axis from 0 to 1,000,000, by year, 1997 to 2008]
  • 6. Challenges
  • 7. Challenges ‣ Handling huge amounts of information. ‣ Language ambiguity. ‣ Time. ‣ Keeping up to date. (Photo: jordinho_dp)
  • 8. A lot of heterogeneous information [chart: axis from 0 to 80,000,000, by year, 1992 to 2007; series: GB, PDB, Medline, SwissProt]
  • 9. 43% of human genes have ambiguous names
  • 10. Some data ‣ 5,892 terms can be genes or diseases. ‣ 3,963 names refer to 2 different genes. ‣ One term refers to 114 genes. [chart: number of terms (0 to 4,000) by number of concepts (2 to 9); series: Disease, Genes, Drugs]
  • 11. Some examples ‣ sps: stiff-man syndrome (Diseases or Syndromes); polystyrene sulfonate (Pharmacological substances); systolic blood pressure (Biological functions); spermine synthase (Genes and Proteins). ‣ AAt1: annuloaortic ectasia (Diseases or Syndromes); alanine aminotransferase (Genes and Proteins).
  • 12. Language ambiguity ‣ Synonyms: different words for the same biomedical entity. In humans there are at least 5,418 genes with synonyms (38% of the total genome). Drugs have a commercial name and a chemical name. ‣ Homonyms: same name for different biomedical entities. The symbol PAP is an alias for: PAP (pancreatitis-associated protein); MRPS30 (mitochondrial ribosomal protein S30); PAPOLA (poly(A) polymerase alpha). ‣ Acronym: reduced word representing a biomedical entity. SCT stands for: Stem Cell Transplant; Secretin; Salmon calcitonin.
  • 13. Unmanageable ‣ More than 25 million documents, counting scientific articles, grants, biomedical patents and other sources of information relevant to biomedical researchers. ‣ 2,000 new scientific papers published every day. ‣ 5 years to read the new scientific material produced every 24 hours. ‣ Scan 130 journals and read 27 articles per day to follow a single disease, like breast cancer.
  • 14. Keeping up to date ‣ Search engine alerts ‣ Emailed eTOCs ‣ RSS feeds
  • 15. Search tasks & lab work by discipline [chart: % of time (0% to 80%) by discipline: All, Biochemistry, Mol. & Cell Biol., Genetics, Biotechnology, Bioinformatics, Medicine, Other; series: Searching literature, Searching data from DB, Working in the lab]. Source: Roos, A., Kumpulainen, S., Järvelin, K. and Hedlund, T. (2008). "The information environment of researchers in molecular medicine". Information Research, 13(3), paper 353. [Available at http://InformationR.net/ir/13-3/paper353.html]
  • 16. How we tackle the challenges
  • 17. We tackle the challenges by: ‣ Integrating information for the user. ‣ Analyzing the text (text mining). ‣ Useful functionality. ‣ Technology + simple interface = - Time
  • 18. Data integration ‣ Sequence DBs: UniProt, GenBank, RefSeq, PIR, EMBL, Entrez Protein, UniSTS. ‣ Gene DBs: GDB, Ensembl, Entrez Gene, UniGene, H-InvDB, MGC, HGNC. ‣ Pathway DBs: KEGG, EC, Reactome. ‣ Domain DBs: Pfam, PROSITE, SMART, ProDom, InterPro. ‣ Other DBs: Affymetrix, GO, PDB, MIM, CCDS, HPRD, HGNC.
  • 19. Text mining ‣ Gene: GH1, Growth Hormone 1. GeneID: 2688. Synonyms: GHN, GH. Weighted terms: adenoma (0.300), adipocyte (0.418), adipose (0.324), age-related (0.442), genotropin (19.368). ‣ Gene: GG1, Gamma Glutamyl Hydrolase. GeneID: 8836. Synonyms: conjugasa, GH. Weighted terms: antifolate (2.850), carboxypeptidase (12.618), folate (0.674), gamma-glu-x (15.452), antifolylpoly-gamma-glutamate (12.054).
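A minimal sketch of how weighted context terms like those on slide 19 could separate two genes that share the synonym GH. The weights and gene symbols are copied from the slide; the scoring rule (sum of the weights of profile terms found in the text) and the names `score` and `disambiguate` are illustrative assumptions, not a description of how novo|seek actually works.

```python
import re

# Term weights as printed on slide 19 (gene symbols as shown there).
PROFILES = {
    "GH1": {"adenoma": 0.300, "adipocyte": 0.418, "adipose": 0.324,
            "age-related": 0.442, "genotropin": 19.368},
    "GG1": {"antifolate": 2.850, "carboxypeptidase": 12.618, "folate": 0.674,
            "gamma-glu-x": 15.452, "antifolylpoly-gamma-glutamate": 12.054},
}

def score(text, profile):
    """Assumed rule: add up the weights of profile terms present as whole tokens."""
    tokens = set(re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower()))
    return sum(weight for term, weight in profile.items() if term in tokens)

def disambiguate(text):
    """Return the gene whose profile best matches the text, or None if nothing matches."""
    scores = {gene: score(text, profile) for gene, profile in PROFILES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

For example, `disambiguate("GH levels and adipose tissue in age-related decline")` picks the growth hormone profile, because only that profile contains "adipose" and "age-related". A sentence that mentions GH with no profile term at all stays unresolved.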
  • 20. Indexed data ‣ Medline abstracts: 18 M ‣ Open access full text: 145,000 ‣ R&D project abstracts: 1.5 M ‣ > 200 M relations ‣ > 4 M concepts
  • 21. Comparison. Use case: looking for the gene SCT ‣ PubMed: SCT is Solid-Cystic tumor. ‣ Google Scholar: SCT is the name of an author. ‣ novo|seek: SCT is the meaning you are looking for: Secretin; Stem Cell Transplantation.
  • 22. novo|seek vs. Google Scholar ‣ Google Scholar: no way to focus the search beyond reading: time-consuming.
  • 23. Technology ‣ Search more efficiently. ‣ Extract more information. ‣ Relate different sources of information. ‣ Save time. Components: Semantic Search, Discovery, Concept Relations, Knowledge Extraction. by L cornide
  • 24. Semantic Search ‣ Conceptual search, e.g. a search for breast cancer: "Detection of breast carcinoma cells in effusions is associated with rapidly fatal outcome"; "Women who do not receive regular mammograms are more likely than others to have breast cancer diagnosed at an advanced stage"; "[…] thereby providing higher cytotoxicity against the 4T1 mouse mammary carcinoma cell line". All of these keywords refer to the same biomedical concept, so a search for breast cancer retrieves all three documents. ‣ Use of context and semantic information to identify the relevant information, e.g. a search for CAT, which could refer to the enzyme catalase or to the animal "cat": "[…] activity of antioxidant enzymes (GSH-Px, SOD, CAT) and content of malondialdehyde (MDA) were determined"; "[…] 26 free-living lynx, 53 domestic cats, 28 dogs, 33 red foxes (Vulpes vulpes) […]". The same keyword refers to different biomedical concepts. Using the context, we can tell that only the first sentence is about an enzyme. by L cornide
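The conceptual search on slide 24 can be sketched with a small synonym dictionary that maps surface terms to one concept identifier, so that documents are matched by concept instead of by keyword. The dictionary, the identifier `BREAST_CANCER` and the substring matching are hypothetical simplifications for illustration; the example sentences are the ones quoted on the slide.

```python
# Hypothetical synonym dictionary: surface term -> concept identifier.
SYNONYMS = {
    "breast cancer": "BREAST_CANCER",
    "breast carcinoma": "BREAST_CANCER",
    "mammary carcinoma": "BREAST_CANCER",
}

# Example sentences quoted on slide 24.
DOCS = [
    "Detection of breast carcinoma cells in effusions is associated with rapidly fatal outcome",
    "Women who do not receive regular mammograms are more likely than others "
    "to have breast cancer diagnosed at an advanced stage",
    "thereby providing higher cytotoxicity against the 4T1 mouse mammary carcinoma cell line",
    "26 free-living lynx, 53 domestic cats, 28 dogs, 33 red foxes (Vulpes vulpes)",
]

def concepts_in(text):
    """Return the set of concept identifiers whose surface terms occur in the text."""
    lowered = text.lower()
    return {concept for term, concept in SYNONYMS.items() if term in lowered}

def concept_search(query, docs):
    """Return the documents that share at least one concept with the query."""
    wanted = concepts_in(query)
    return [doc for doc in docs if wanted & concepts_in(doc)]
```

With this toy dictionary, `concept_search("breast cancer", DOCS)` returns the first three sentences even though only one of them contains the literal words "breast cancer". Resolving an ambiguous keyword such as CAT would additionally need context, which this sketch does not attempt.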
  • 25. Concept Relations, e.g. a search for Alzheimer's disease: "The apolipoprotein E gene (APOE) polymorphism genotyping has an allegedly important predictive value for coronary heart disorders and Alzheimer's disease."; "Apolipoprotein E (apoE), a ligand for the low-density lipoprotein receptor family, has been implicated in modulating glial inflammatory responses and the risk of neurodegeneration associated with Alzheimer's disease."; "Although many genes have been suggested to be associated with AD, with the exception of APOE, most polymorphic variants of potential risk exhibit a very weak association with AD". The protein apolipoprotein E and Alzheimer's disease are related with a relevance of 36%. by L cornide
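The slide does not say how the 36% relevance is computed. As one possible reading, the sketch below scores a relation by co-occurrence: the share of documents mentioning one concept that also mention the other. The formula, the function name `cooccurrence_relevance` and the toy corpus are all assumptions made for illustration.

```python
def cooccurrence_relevance(docs, a, b):
    """Share of documents mentioning concept `a` that also mention `b` (0.0 to 1.0)."""
    with_a = [doc for doc in docs if a in doc]
    if not with_a:
        return 0.0
    return sum(1 for doc in with_a if b in doc) / len(with_a)

# Toy corpus: each document is reduced to the set of concepts detected in it.
docs = [
    {"Alzheimer disease", "APOE"},
    {"Alzheimer disease", "APOE", "coronary heart disease"},
    {"Alzheimer disease"},
    {"Alzheimer disease"},
    {"APOE"},
]
```

In this toy corpus two of the four documents about Alzheimer's disease also mention APOE, so the score from that side is 0.5, while from the APOE side it is two out of three. A production system would more likely use a symmetric or statistically corrected measure.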
  • 26. Knowledge Extraction ‣ Based on the relations detected between concepts, we can automatically extract knowledge from text, e.g. the knowledge about breast cancer found in the literature: "[…] BRCA1 or BRCA2 […] Information was recorded on prophylactic mastectomy, prophylactic oophorectomy, use of tamoxifen […] had a bilateral prophylactic oophorectomy. […] breast cancer, 248 (18.0%) had had a prophylactic bilateral mastectomy. Among those who did not have a prophylactic mastectomy, only 76 women (5.5%) took tamoxifen and 40 women (2.9%) took raloxifene for breast cancer prevention. […]". The genes BRCA1 and BRCA2 are related to breast cancer. Tamoxifen and raloxifene are drugs used in its treatment, and mastectomy and oophorectomy are common procedures to treat it. by L cornide
  • 27. Make New Discoveries ‣ Discover hidden relations between concepts that have not been described before in the scientific literature, e.g. relating fish oil to Raynaud's syndrome through the literature: "[…] meal fatty acids appear to be an important determinant of vascular reactivity, with fish oils significantly improving postprandial endothelium-independent vasodilation"; "Numerous studies have documented longer bleeding times and decreased platelet aggregation in subjects ingesting omega-3 fatty acids"; "vasomotor pain, in particular the fact of reactional vasodilation during Raynaud's syndrome, inflammation in the region surrounding zones of ischemic necrosis, and infection of ulcers"; "Objective judgement on effects of medicine in patients with Raynaud's phenomenon: measurement of cutaneous blood flow using laser Doppler flowmeter and platelet aggregation activity". By finding evidence of a relation between fish oils and both vasodilation and platelet aggregation, and evidence linking these two functions to Raynaud's syndrome, we can uncover a relation not previously described in the literature: the possible treatment of Raynaud's syndrome with fish oil. by L cornide
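The reasoning on slide 27 (A is linked to B, B is linked to C, but A and C never appear together) can be sketched as a two-step walk over a graph of known relations. The four relation pairs are taken from the slide's example; the function name `hidden_relations` and the plain, unweighted graph are illustrative assumptions.

```python
from collections import defaultdict

# Relations already reported in the literature (pairs from the slide's example).
KNOWN = [
    ("fish oil", "vasodilation"),
    ("fish oil", "platelet aggregation"),
    ("vasodilation", "Raynaud's syndrome"),
    ("platelet aggregation", "Raynaud's syndrome"),
]

def hidden_relations(known, start):
    """Map each concept two steps away from `start` to the intermediates linking them.

    Concepts already directly related to `start` are excluded, so only
    relations absent from `known` are proposed.
    """
    graph = defaultdict(set)
    for a, b in known:
        graph[a].add(b)
        graph[b].add(a)
    direct = graph[start]
    candidates = defaultdict(set)
    for middle in direct:
        for end in graph[middle]:
            if end != start and end not in direct:
                candidates[end].add(middle)
    return dict(candidates)
```

Calling `hidden_relations(KNOWN, "fish oil")` proposes Raynaud's syndrome, supported by both vasodilation and platelet aggregation as intermediates. If a direct fish oil to Raynaud's pair were added to `KNOWN`, nothing would be proposed, since the relation would no longer be hidden.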
  • 28. The future ‣ Structured information. ‣ User identifier. ‣ The article of the future. ‣ Social search.
  • 29. http://beta.cell.com/erickson/
  • 30. Social Search: Collective, Collaborative Q&A, Friend-Filtered. http://www.readwriteweb.com/archives/3_flavors_of_social_search_what_to_expect.php
  • 31. Beta testers ‣ Collaborate in the development of one of the leading biomedical search engines on the market. ‣ Access to the latest updates of our search engine. ‣ A guaranteed gift. www.novoseek.com/betatesters.html
  • 32. Contact: Ramón Alonso-Allende, Marketing & Business Development. allende@bioalma.com. Phone: +34 91 141 71 50