Automatic Semantic Annotation of the Cyttron Database
David Graus
@dvdgrs
http://graus.nu
Media Technology
Part I

What?
What » How » Why




What?




   Automatic Semantic Annotation of the Cyttron Database




What?




        Semantic Annotation




Example

“Company XYZ announced profits in Q3, planning to build a $120M plant in Bulgaria.”




It is like tagging

“Company XYZ announced profits in Q3, planning to build a $120M plant in Bulgaria.”




It is not tagging

“Company XYZ announced profits in Q3, planning to build a $120M plant in Bulgaria.”

Tags:
   Company XYZ
   Plant
   Bulgaria




It is not tagging

“Company XYZ announced profits in Q3, planning to build a $120M plant in Bulgaria.”

Meaning:
   What is Company XYZ?
   What is a Plant?
   What is Bulgaria?
   How do they relate?
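
The contrast between tags and semantic annotations can be sketched in a few lines of Python. Tags are bare strings, while annotations say what each mention is and how the mentions relate. The type and relation names below (Company, Country, "located in" and so on) are illustrative assumptions, not taken from the slides or from any particular ontology.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Tagging: flat strings with no meaning attached.
tags = ["Company XYZ", "Plant", "Bulgaria"]

# Semantic annotation: each mention gets a type ("what is it?") and the
# mentions are linked to each other ("how do they relate?").
# Every type and relation name here is hypothetical.
annotations = [
    Triple("Company XYZ", "is a", "Company"),
    Triple("Plant", "is a", "Industrial Facility"),
    Triple("Bulgaria", "is a", "Country"),
    Triple("Company XYZ", "plans to build", "Plant"),
    Triple("Plant", "located in", "Bulgaria"),
]


def describe(entity: str) -> list[str]:
    """Return every statement in which the entity is the subject."""
    return [f"{t.subject} {t.predicate} {t.obj}"
            for t in annotations if t.subject == entity]
```

Calling `describe("Plant")` answers both questions from the slide at once: what the plant is, and where it sits relative to the other mentions.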




It adds context!




                             source: ontotext.com




What?




        Cyttron Database




Cyttron Database

                   "The volume of the brain evaluated in this
                   study. The color scale represents the
                   number of 4-mm voxels with data in at
                   least 7 subjects along a 3-cm deep line
                   into the brain. A three-dimensional
                   rendering of a brain is shown in regions
                   where insufficient data were obtained. The
                   most superior regions of the frontal and
                   parietal lobes and the most inferior
                   regions of the temporal lobes were not
                   evaluated. Imaging artifacts may also
                   compromise the significance of results in
                   the most inferior portions of the frontal
                   lobe."




NCI Thesaurus




NCI Thesaurus
Concept:        http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#Brain




NCI Thesaurus

Label:          Brain




NCI Thesaurus


Definition:     An organ composed of grey and white matter
                containing billions of neurons that is the center for
                intelligence and reasoning. It is protected by the
                bony cranium.




NCI Thesaurus




Context:

           Brain               is a      Central Nervous System Part
           Brain               is a      Organ
           Brain               part of   Central Nervous System
           Basal Ganglia       part of   Brain
           Base of the Brain   part of   Brain
           Brain Nucleus       part of   Brain
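
The context relations above can be held as plain (subject, relation, object) triples and queried directly. The following is a minimal sketch using only the six relations shown on the slide; the helper names (`objects`, `subjects`, `part_of_ancestors`) are made up for this example and are not part of any NCI Thesaurus API.

```python
# Relations copied from the "Context" slide as (subject, relation, object).
CONTEXT = [
    ("Brain", "is a", "Central Nervous System Part"),
    ("Brain", "is a", "Organ"),
    ("Brain", "part of", "Central Nervous System"),
    ("Basal Ganglia", "part of", "Brain"),
    ("Base of the Brain", "part of", "Brain"),
    ("Brain Nucleus", "part of", "Brain"),
]


def objects(subject, relation):
    """Everything the subject points to through the given relation."""
    return [o for s, r, o in CONTEXT if s == subject and r == relation]


def subjects(relation, obj):
    """Everything that points to obj through the given relation."""
    return [s for s, r, o in CONTEXT if r == relation and o == obj]


def part_of_ancestors(concept):
    """Follow "part of" links upward and collect every enclosing concept."""
    seen = []
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for parent in objects(current, "part of"):
            if parent not in seen:
                seen.append(parent)
                frontier.append(parent)
    return seen
```

With these triples, `part_of_ancestors("Basal Ganglia")` walks from Basal Ganglia to Brain and on to Central Nervous System, which is the kind of context a flat tag cannot provide.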




NCI Thesaurus
Part II

How?

Approach I

Keyword Extraction

Keyword Extraction   x6

Semantic Annotation
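
The slides do not say which keyword extraction methods were used or how keywords were mapped to concepts. As a rough illustration of the general pipeline only, the sketch below assumes the simplest possible variant: rank words and word pairs by raw frequency, then keep those that exactly match a concept label. The stopword list, the label table and the `concept:` identifiers are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical lookup from lower-cased label to concept identifier.
# A real table would be built from the ontology's labels and synonyms.
LABEL_TO_CONCEPT = {
    "brain": "concept:Brain",
    "frontal lobe": "concept:Frontal_Lobe",
    "temporal lobe": "concept:Temporal_Lobe",
}

# Deliberately tiny stopword list, for illustration only.
STOPWORDS = {"a", "and", "in", "is", "of", "the", "to", "were", "with"}


def extract_keywords(text, top_n=10):
    """Rank single words and word pairs by raw frequency."""
    words = re.findall(r"[a-z]+", text.lower())
    unigrams = [w for w in words if w not in STOPWORDS]
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])
               if a not in STOPWORDS and b not in STOPWORDS]
    counts = Counter(unigrams + bigrams)
    return [term for term, _ in counts.most_common(top_n)]


def annotate(text, top_n=10):
    """Keep only the keywords that exactly match a known concept label."""
    return {kw: LABEL_TO_CONCEPT[kw]
            for kw in extract_keywords(text, top_n)
            if kw in LABEL_TO_CONCEPT}
```

Exact label matching is the weakest link in this sketch: it misses synonyms, plurals and spelling variants, so a practical system would normalise both the keywords and the labels before comparing them.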




Approach II




Topic Classification




Evaluation

[Figure: evaluation grid, built up over three slides. Rows 1, 2 and 3 and one unnumbered row each hold six "I" marks; a final row holds five "II" marks. The last build adds a "?" above the grid.]




Evaluation I

Confusion Matrix
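
The slide names a confusion matrix but does not show how it was filled in. One possible reading, sketched here purely as an assumption, is a pairwise table that counts direct matches: concepts that two annotators (human or automatic) both assigned to the same item. The annotator names and annotations below are invented for the example.

```python
from itertools import product

# Hypothetical data: annotator -> item id -> set of concept labels.
annotations = {
    "expert_1": {"item_1": {"Brain", "Human"}, "item_2": {"Thalamus"}},
    "expert_2": {"item_1": {"Brain"}, "item_2": {"White Matter"}},
    "method_I": {"item_1": {"Brain", "Magnetic Resonance Imaging"},
                 "item_2": {"Thalamus"}},
}


def direct_matches(a, b):
    """Count concepts that annotators a and b both gave to the same item."""
    shared_items = annotations[a].keys() & annotations[b].keys()
    return sum(len(annotations[a][item] & annotations[b][item])
               for item in shared_items)


def match_matrix():
    """Pairwise direct-match counts for every ordered pair of annotators."""
    names = sorted(annotations)
    return {(a, b): direct_matches(a, b) for a, b in product(names, repeat=2)}
```

A table like this only rewards identical concepts. Two annotators who pick closely related but different concepts score zero, which is one motivation for a softer, similarity-based comparison.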




Evaluation II

Semantic Similarity
    Human                        Sagittal Plane
    Brain                        Magnetic Resonance Imaging
    Magnetic Resonance Imaging   Cingulate Gyrus
    Cingulate Gyrus              Corpus Callosum
                                 Lateral Ventricle
                                 Thalamus
                                 Mamillary Body
                                 Cerebral Fornix
                                 White Matter
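
The slide shows two sets of concepts but not the measure used to compare them. One common family of measures scores two concepts by how far apart they are in the ontology graph. The sketch below assumes that approach: similarity is 1 / (1 + shortest path length), and two sets are compared by averaging each concept's best match in the other set. The scoring formula and the edges marked hypothetical are assumptions made for this example, not the method from the talk.

```python
from collections import deque

# Toy undirected concept graph. The first three edges come from the
# NCI Thesaurus "Context" slide; the rest are hypothetical.
EDGES = [
    ("Brain", "Central Nervous System"),
    ("Basal Ganglia", "Brain"),
    ("Brain Nucleus", "Brain"),
    ("Thalamus", "Brain"),          # hypothetical
    ("Cingulate Gyrus", "Brain"),   # hypothetical
    ("Corpus Callosum", "Brain"),   # hypothetical
]

ADJACENCY = {}
for left, right in EDGES:
    ADJACENCY.setdefault(left, set()).add(right)
    ADJACENCY.setdefault(right, set()).add(left)


def distance(start, goal):
    """Shortest path length by breadth-first search; None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in ADJACENCY.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def concept_similarity(a, b):
    """1 for identical concepts, falling towards 0 with graph distance."""
    d = distance(a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)


def set_similarity(set_a, set_b):
    """Average best-match similarity, taken in both directions."""
    if not set_a or not set_b:
        return 0.0

    def one_way(xs, ys):
        return sum(max(concept_similarity(x, y) for y in ys) for x in xs) / len(xs)

    return (one_way(set_a, set_b) + one_way(set_b, set_a)) / 2
```

Unlike a direct-match count, this gives partial credit when two annotations name different but nearby concepts, and scores concepts missing from the graph as 0.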




Visualization   DEMO
Part III

Why?




Evaluation




Results

1. No ‘agreement’ between experts

2. Annotation method I best approach

3. Both Annotation II & Random had no direct matches




What is it good for? / Future Work

1. Domain independent method

2. Clustering topic identification

3. Subgraph similarity
Fin

Questions?
