Presentation at IDEAL 2008

  1. Building a Spanish MMTx by using Automatic Translation and Biomedical Ontologies. Francisco Carrero (1,2); José Carlos Cortizo (1,2); José Mª Gómez (3). Affiliations: (1) Wipley, Social Gaming Platform, http://www.wipley.com; (2) Universidad Europea de Madrid, http://www.esp.uem.es/gsi; (3) Optenet, http://www.esp.uem.es/gsi
  2. Outline: The MIRCAT project; The challenge; English MetaMap, a big effort; Approaching a Spanish MetaMap; Experiments; Discussion of the Results and Future Work. Francisco Carrero Garcia
  3. The MIRCAT Project: The Interface
  4. The MIRCAT Project: System’s Architecture
  5. The Challenge: Our Goal. Diagram labels: English docs; Medical record; Spanish docs.
  6. The Challenge: The Problem. We can extract UMLS concepts from English texts using MetaMap, but there is no Spanish version of MetaMap. Is it difficult to construct a tool like MetaMap?
  7. English MetaMap: A Big Effort. ∼3 years!!
  8. Approaching Spanish MetaMap: Two Main Approaches Considered
  9. Approaching Spanish MetaMap: Our Approach, Translation and Reuse. Diagram label: Optional.
  10. Experimental Design: Text Collections. MedLine Plus medical news (http://www.nlm.nih.gov/medlineplus/newsbydate.html), an excellent online resource. 2000 news items, some in English, some in Spanish; 600 are available in both languages.
  11. Experiments: Experimental Design. MetaMap extracts concepts, allowing multiple representations. A => uses compound concepts; B => uses simple concepts; 1 => resolves ambiguity by adding all the concepts; 2 => ignores ambiguity by choosing the first possibility. This gives 4 representations: A1, A2, B1, B2.
  12. Experiments: Filtering. Data representations containing many features do not usually perform well in text tasks. Many classifiers degrade in prediction accuracy when faced with many irrelevant or redundant/correlated features (the “curse of dimensionality”). We apply Zipf’s Law to filter the attributes.
  13. Experiments Results: Number of concepts for each representation
  14. Experiments Results: Average Similarities
  15. Experiments Results: Last Experiments (not in IDEAL paper)
  16. Discussion of the Results: Translation. The worst results (similarity) are achieved with the most complex (closest to human) representation: A1. B1 is less complex and produces the best results => our model seems to be more suitable as a plain bag-of-concepts representation, similar to the bag-of-words representation widely used in text processing tasks.
  17. Discussion of the Results: Classification. All results are comparable to classification on the original English texts; in some cases they are even better. Best results using A2+Zipf: +7.8% in AUC. UNMKD representations never achieve worse classification than English.
  18. Conclusions and Future Work. The “easy way” to construct a Spanish MetaMap is promising. Google Translation seems a good tool for adapting English resources to other languages (like Spanish). We should try other translation tools. We are working on applying this approach to other text tasks (like Information Retrieval and Filtering).
  19. Ending... Thank you very much for your attention
  20. Any Questions?
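
The four representations on slide 11 and the filtering step on slide 12 can be illustrated with a small sketch. This is only one possible reading of the slides, not the authors' implementation. The input shape (a document as a list of phrases, each with an ordered list of candidate mappings) is a hypothetical stand-in rather than MetaMap's real output format, the concept identifiers are placeholders, and the slides give no cutoffs for the Zipf's Law filter, so `min_df` and `max_df_ratio` are assumed parameters for a simple frequency-band filter.

```python
from collections import Counter


def represent(doc, compound, keep_all):
    """Build a bag of features for one document.

    doc: list of phrases; each phrase is a list of candidate mappings
         (best first); each mapping is a tuple of concept identifiers.
         This shape is an assumption made for illustration.
    compound=True  -> "A": each candidate mapping is one compound feature
    compound=False -> "B": each concept in a mapping is its own feature
    keep_all=True  -> "1": use every candidate of an ambiguous phrase
    keep_all=False -> "2": use only the first candidate
    """
    bag = Counter()
    for candidates in doc:
        chosen = candidates if keep_all else candidates[:1]
        for mapping in chosen:
            if compound:
                bag["+".join(mapping)] += 1
            else:
                bag.update(mapping)
    return bag


def zipf_filter(bags, min_df=2, max_df_ratio=0.5):
    """Drop features that occur in too few or too many documents.

    The thresholds are assumed values, not taken from the slides.
    """
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())
    n_docs = len(bags)
    keep = {
        feature
        for feature, count in doc_freq.items()
        if count >= min_df and count / n_docs <= max_df_ratio
    }
    return [
        Counter({f: c for f, c in bag.items() if f in keep}) for bag in bags
    ]


# Placeholder identifiers, not real UMLS concept IDs.
example_doc = [
    [("C_heart", "C_attack"), ("C_infarction",)],  # ambiguous phrase
    [("C_aspirin",)],
]
a1 = represent(example_doc, compound=True, keep_all=True)
a2 = represent(example_doc, compound=True, keep_all=False)
b1 = represent(example_doc, compound=False, keep_all=True)
b2 = represent(example_doc, compound=False, keep_all=False)
```

Under this reading, B1 produces the largest plain bag of individual concepts and A2 the smallest set of compound features. The filter would be applied to all document bags of one representation before classification.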

Presentation of our translation+MetaMap paper at IDEAL 2008

Views

Total views

1,811

On Slideshare

0

From embeds

0

Number of embeds

517

Actions

Downloads

2

Shares

0

Comments

0

Likes

0

×