Automatic extraction of genes and diseases

Transcript

  • 1. Extraction of Relations from Unstructured Text (Jamsandekar Mugdha, Prabhu Priyanka, Sugandh Neha)
  • 2. Introduction
    • Goal - to automatically extract groups of entities of a specific relation from unstructured text.
    • HuGE – Human Gene Expression, a medical database which contains about 20,000 abstracts
    • (gene, disease) relations
  • 3. Approach
    • (gene, disease) seed values
    • MMTX – medical ontology
    • Pattern <order, prefix, gene/disease, middle, disease/gene, suffix>
    • Weights for prefix, middle and suffix; stop words
    • Threshold for deciding a match
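
The pattern tuple and the weighted match from slide 3 can be sketched as follows. The slides do not say how the weights are combined or what the threshold is, so everything here is an assumption made for illustration: the `Pattern` class, the stop-word list, the weight values, the token-overlap (Jaccard) score and the default threshold are all hypothetical.

```python
from dataclasses import dataclass

# Assumed stop-word list; the slides do not give the one actually used.
STOP_WORDS = {"the", "a", "an", "of", "in", "with", "is", "are", "to"}

# Assumed weights for the three context parts; they sum to 1.
WEIGHTS = {"prefix": 0.2, "middle": 0.6, "suffix": 0.2}


@dataclass(frozen=True)
class Pattern:
    gene_first: bool  # the "order" field: True when the gene precedes the disease
    prefix: str
    first: str        # first entity (gene if gene_first, else disease)
    middle: str
    second: str       # second entity
    suffix: str


def tokens(text):
    """Lower-cased words of text with stop words dropped."""
    return {w for w in text.lower().split() if w not in STOP_WORDS}


def overlap(a, b):
    """Jaccard overlap of the two token sets (1.0 when both are empty)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def similarity(p, q):
    """Weighted context similarity; patterns with different order score 0."""
    if p.gene_first != q.gene_first:
        return 0.0
    return (WEIGHTS["prefix"] * overlap(p.prefix, q.prefix)
            + WEIGHTS["middle"] * overlap(p.middle, q.middle)
            + WEIGHTS["suffix"] * overlap(p.suffix, q.suffix))


def matches(p, q, threshold=0.5):
    """A candidate matches a known pattern when similarity reaches the threshold."""
    return similarity(p, q) >= threshold


seed = Pattern(True, "polymorphisms of", "LTA gene",
               "are associated with risk of", "MI", "in Japanese")
candidate = Pattern(True, "allele of", "ACE gene",
                    "is associated with", "hypoxemia", "in sars")
print(similarity(seed, candidate), matches(seed, candidate, threshold=0.25))
```

Under these assumed weights only the shared word "associated" in the middle context contributes to the score, so whether the candidate counts as a match depends entirely on where the threshold is set.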
  • 4. Patterns
    • <true| polymorphisms of|LTA gene| are associated with risk of|MI| in Japanese>
    • <true| allele of|angiotensin converting enzyme (ACE) gene| is associated with|hypoxemia| in sars p>
    • <false| developing ovarian|cancer| borderline tumours in presence of|BRCA1| mutations>
    • <true| apolipoprotein|epsilon4 allele| with progression in|AD| pkd may not>
    • <true| mutations in|BRCA1| brca2 genes explain at least 10 % of breast|cancer| cases diagnosed>
    • <true| mutation of|BRCA1| contributes little occurrence of breast|cancer| in taiwanese>
    • <true| between|factor V Leiden gene variant| carotid|atherosclerosis| in cross - sectional>
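
The pipe-delimited notation above can be split back into its six fields with a few lines of Python. This is a minimal sketch: the function name `parse_pattern` and the returned dictionary keys are hypothetical, and reading `true` as "gene comes first" is inferred from the examples rather than stated on the slides.

```python
def parse_pattern(text):
    """Split '<order| prefix|entity1| middle|entity2| suffix>' into named fields."""
    body = text.strip()
    if body.startswith("<") and body.endswith(">"):
        body = body[1:-1]
    fields = [f.strip() for f in body.split("|")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {text!r}")
    order, prefix, first, middle, second, suffix = fields
    # Assumption: order 'true' means the gene appears before the disease.
    gene_first = order.lower() == "true"
    gene, disease = (first, second) if gene_first else (second, first)
    return {
        "gene_first": gene_first,
        "prefix": prefix,
        "middle": middle,
        "suffix": suffix,
        "gene": gene,
        "disease": disease,
    }


example = "<false| developing ovarian|cancer| borderline tumours in presence of|BRCA1| mutations>"
parsed = parse_pattern(example)
print(parsed["gene"], "/", parsed["disease"])
```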
  • 5. 30 Relations Obtained – 4% of abstracts
  • 6. Statistics
    • 4% of abstracts → 30 pair results
    • All 20,000 abstracts → ≈ 750 pair results
    • 30 results → 1 error pair
    • Error = 1/30 ≈ 3.33%
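
The figures on this slide follow from simple proportional scaling, reproduced below. The sketch assumes the yield per abstract stays the same across the whole corpus, which the slides imply but do not verify; the variable names are illustrative.

```python
total_abstracts = 20_000   # approximate size of the corpus (slide 2)
sample_percent = 4         # share of abstracts processed so far
pairs_found = 30           # (gene, disease) pairs extracted from the sample
error_pairs = 1            # incorrect pairs among those 30

# Number of abstracts in the 4% sample.
sample_abstracts = total_abstracts * sample_percent // 100

# Linear extrapolation of the pair count to the full corpus.
estimated_pairs = pairs_found * 100 / sample_percent

# Share of extracted pairs that were wrong, in percent.
error_rate_percent = round(error_pairs / pairs_found * 100, 2)

print(sample_abstracts, estimated_pairs, error_rate_percent)
```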
  • 7. Advantages
    • Generic, domain-independent method for extracting relations
    • Gives the patterns in which the entities occur; these can be used in pattern analysis
    • Better results with each iterative step
    • More efficient than the traditional ontology-based approach
  • 8. Extraction of Relations from Unstructured Text
    • Thank You
    • Any questions?
    • Jamsandekar Mugdha
    • Prabhu Priyanka
    • Sugandh Neha