Lecture semantic augmentation
Published in: Technology

  • It is not just tagging.
  • Upon their return, Lennon and McCartney went to New York to announce the formation of Apple Corps.
  • Transcript

    • 1. COMP3725 Knowledge Enriched Information Systems. Lecture 13: Semantic Augmentation. Dhavalkumar Thakker (Dhaval), School of Computing, University of Leeds
    • 2. Outline
        • Semantic Augmentation
            – What
            – Why
            – How
        • Existing systems & services for Semantic Augmentation
        • Challenges
    • 3. Semantic Augmentation
        • From: (…) Upon their return, Lennon and McCartney went to New York to announce the formation of Apple Corps.
        • To: (…) Upon their return, Lennon and McCartney went to New York to announce the formation of Apple Corps.
            – New York: http://dbpedia.org/Ontology/New_York_City
            – Apple Corps: http://dbpedia.org/Ontology/Apple_Corps
    • 4. Semantic Augmentation
        • Semantic augmentation is the process of attaching semantics to a selected part of a text to assist automatic interpretation of the meaning the text conveys.
        • Also called semantic annotation or semantic tagging.
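To make the definition concrete, here is a minimal Python sketch of "attaching semantics to a selected part of a text". It is not how GATE does it: the `GAZETTEER` dictionary, the `Annotation` class and the `annotate` function are hypothetical names introduced for illustration, and matching is plain substring search. The two URIs are the ones shown in the lecture example.

```python
from dataclasses import dataclass

# Hypothetical mini-gazetteer: surface form -> URI (URIs copied from the lecture example).
GAZETTEER = {
    "New York": "http://dbpedia.org/Ontology/New_York_City",
    "Apple Corps": "http://dbpedia.org/Ontology/Apple_Corps",
}


@dataclass
class Annotation:
    start: int    # character offset where the mention begins
    end: int      # character offset just past the mention
    surface: str  # the text that was matched
    uri: str      # the concept attached to that text


def annotate(text, gazetteer=GAZETTEER):
    """Return one Annotation per exact occurrence of a gazetteer entry."""
    annotations = []
    for surface, uri in gazetteer.items():
        pos = text.find(surface)
        while pos != -1:
            annotations.append(Annotation(pos, pos + len(surface), surface, uri))
            pos = text.find(surface, pos + len(surface))
    return sorted(annotations, key=lambda a: a.start)


TEXT = ("Upon their return, Lennon and McCartney went to New York "
        "to announce the formation of Apple Corps.")

for ann in annotate(TEXT):
    print(ann.surface, "->", ann.uri)
```

The text itself is left untouched; the semantics live in stand-off annotations that point into it by character offset, which is also how GATE stores annotations.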
    • 5. It provides additional information about an existing piece of data.
    • 6. Why Semantic Augmentation?
        • Links to complementary information
            – “More about this”
        • Shows related or similar information
        • Reasoning and inferencing offered by semantics
        • Semantic annotation is the glue that ties ontologies into document spaces
            – remember, the existing web is a web of documents
        • The cost of producing metadata manually is too high
    • 7. GATE for Semantic Augmentation
        • GATE (General Architecture for Text Engineering)
            – see gate.ac.uk
        • GATE Developer is a development environment that provides a rich set of graphical interactive tools for the creation, measurement and maintenance of software components for processing human language.
        • See: http://gate.ac.uk/family/developer.html
    • 8. Overview of GATE Developer
        • GATE Developer
        • Resources Pane
            – applications: groups of processes to run on a document or corpus
            – language resources: corpus, ontologies, schemas
            – processing resources: tools that operate on unstructured text
            – datastores: saved documents and resources
        • Display Pane: whatever you’re currently working with.
        • See next slide
    • 9. GATE: Interface (figure labels: Resources Pane, Display Pane)
    • 10. Processing Resources: ANNIE
        • A family of Processing Resources for language analysis included with GATE
        • Stands for A Nearly-New Information Extraction system.
        • Uses finite-state techniques to implement various tasks: tokenization, semantic tagging, verb phrase chunking, and so on.
    • 11. ANNIE IE Modules: http://gate.ac.uk/sale/tao/splitch6.html#chap:annie
    • 12. Some ANNIE Components
        • Tokenizer
            – splits text into tokens of type word, number, symbol, punctuation, and SpaceToken.
        • Sentence Splitter
            – segments text into sentences
        • Part of Speech Tagger
            – produces a part-of-speech tag as an annotation on each word or symbol
            – nouns, verbs etc.
        • GATE Morphological Analyser
            – detects morphemes in a piece of text (e.g. car, caring)
        • OntoGazetteer
            – semantic tagging component
            – uses an ontology
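As a rough illustration of what the first two components produce, the sketch below tokenises text into the five token kinds named above and splits it into sentences. It is a regex stand-in written for this transcript, not ANNIE's rule-based implementation; `TOKEN_RE`, `tokenize` and `split_sentences` are hypothetical names, and the sentence rule (split after `.`, `!` or `?` followed by whitespace) is deliberately naive.

```python
import re

# One named alternative per token kind; the first alternative that matches wins.
TOKEN_RE = re.compile(
    r"(?P<word>[A-Za-z]+)"
    r"|(?P<number>\d+)"
    r"|(?P<SpaceToken>\s+)"
    r"|(?P<punctuation>[.,;:!?'\"()-])"
    r"|(?P<symbol>.)",
    re.DOTALL,
)


def tokenize(text):
    """Return (kind, string) pairs covering every character of the text."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when whitespace follows."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


print(tokenize("Apple Corps, 1968."))
print(split_sentences("They returned. They went to New York."))
```

A splitter this simple breaks on abbreviations such as "Inc." or "Dr.", which is one reason real systems use gazetteers of abbreviations and dedicated rules.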
    • 13. Demo:
        • From: (…) Upon their return, Lennon and McCartney went to New York to announce the formation of Apple Corps.
        • To: (…) Upon their return, Lennon and McCartney went to New York to announce the formation of Apple Corps.
            – New York: http://dbpedia.org/Ontology/New_York_City
            – Apple Corps: http://dbpedia.org/Ontology/Apple_Corps
    • 14. Step: Download & start the GATE application
        • Download GATE from: http://gate.ac.uk/download/
        • Note: the demonstration uses GATE 6.0
    • 15. Step: From Language Resources, select
        • GATE Document: make sure that String content is selected in the last field (see screenshot). Name the file “Test”.
    • 16. Paste the following text into the file
        • Upon their return, Lennon and McCartney went to New York to announce the formation of Apple Corps.
    • 17. Step: From Processing Resources, select the following resources
        • ANNIE English Tokeniser
        • ANNIE Sentence Splitter
        • ANNIE POS Tagger
        • GATE Morphological Analyser
        • Note: for all of the above, leave the “Name” field empty
    • 18. Step: From Processing Resources, select the following resources
    • 19. Step: From Language Resources, select
        • OWLIM Ontology
            – Specify the location of the ontology you would like to use for semantic augmentation
            – For example, we are using the DBpedia ontology
    • 20. OWLIM Ontology window
    • 21. From Processing Resources, select
        • Onto Root Gazetteer
        • and specify parameters as follows:
    • 22. Final steps: Create Corpus
        • Go to Language Resources, click on GATE Corpus, and add the “Test” document created earlier
    • 23. Final steps: Create Corpus Pipeline
        • From Applications, create a Corpus Pipeline
        • Add the processing resources in the order shown below and press “run this application”
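The idea of a corpus pipeline can be sketched in a few lines of Python: an ordered list of processing steps is run over every document in a corpus, and later steps rely on what earlier steps produced. This is a toy model, not GATE code. The names `tokenise`, `lookup`, `run_pipeline` and `GAZETTEER` are hypothetical, only two stages are modelled, and the dictionary stands in for an ontology-backed gazetteer.

```python
# Hypothetical stand-in for an ontology-backed gazetteer (URIs from the lecture example).
GAZETTEER = {
    "New York": "http://dbpedia.org/Ontology/New_York_City",
    "Apple Corps": "http://dbpedia.org/Ontology/Apple_Corps",
}


def tokenise(doc):
    """Stage 1: split the text into word and punctuation tokens."""
    text = doc["text"].replace(",", " ,").replace(".", " .")
    doc["tokens"] = text.split()


def lookup(doc):
    """Stage 2: match gazetteer entries against runs of tokens."""
    if "tokens" not in doc:
        raise RuntimeError("lookup needs tokens: run the tokeniser first")
    tokens = doc["tokens"]
    found = []
    for entry, uri in GAZETTEER.items():
        words = entry.split()
        for i in range(len(tokens) - len(words) + 1):
            if tokens[i:i + len(words)] == words:
                found.append((entry, uri))
    doc["lookups"] = found


def run_pipeline(corpus, stages):
    """Apply every stage, in order, to every document in the corpus."""
    for doc in corpus:
        for stage in stages:
            stage(doc)
    return corpus


corpus = [{
    "name": "Test",
    "text": ("Upon their return, Lennon and McCartney went to New York "
             "to announce the formation of Apple Corps."),
}]
run_pipeline(corpus, [tokenise, lookup])
print(corpus[0]["lookups"])
```

Swapping the two stages makes `lookup` fail because no tokens exist yet, which is why the order of resources in the pipeline matters.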
    • 24. Results: Semantic Augmentation
        • Open the document, click on Annotation Sets and Annotations List, then select Lookup
    • 25. Other features
        • JAPE (Java Annotation Patterns Engine)
            – provides regular-expression-based pattern/action rules over annotations.
            – grammars to detect entities, validate detected entities, and do pre- and post-processing
            – Example: “at the Carnegie Stadium”, “at the Emirates Stadium”, “at the O2 Arena”
            – See tutorial: http://gate.ac.uk/sale/thakker-jape-tutorial/index.html
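The venue example can be approximated with an ordinary regular expression to show the kind of pattern a grammar rule captures. This is not JAPE syntax: JAPE rules match over annotations (tokens, lookups and their features) rather than raw characters, and `VENUE_RE` and `find_venues` are hypothetical names for this sketch. The pattern assumes a venue is "at the", then one or more capitalised words, then "Stadium" or "Arena".

```python
import re

# "at the" + one or more capitalised words + a venue keyword.
VENUE_RE = re.compile(r"\bat the ((?:[A-Z]\w* )+(?:Stadium|Arena))\b")


def find_venues(text):
    """Return the venue names that follow 'at the' in the text."""
    return VENUE_RE.findall(text)


sentence = ("The match was played at the Emirates Stadium, "
            "and the concert was held at the O2 Arena.")
print(find_venues(sentence))
```

A pattern/action rule would go one step further than this sketch: on a match, its action part adds a new annotation (for example a venue annotation) over the matched span.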
    • 26. Some Links
        • Home page: http://gate.ac.uk/
        • Some good short tutorial videos for getting started: http://gate.ac.uk/demos/developer-videos/ . These are only a few minutes each, so they’re quick to watch.
        • User Guide: http://gate.ac.uk/sale/tao/index.html . This is apparently for version 7.1, which is a development build, but it seems to be fine.
        • Lots of documentation: http://gate.ac.uk/documentation.html
        • The wiki: http://gate.ac.uk/wiki/
        • JAPE grammar tutorial by Dhaval Thakker et al.: http://gate.ac.uk/sale/thakker-jape-tutorial/index.html
    • 27. Challenge: Term Ambiguity
        • ...this apple on the palm of my hand...
        • ...Apple tried to acquire Palm Inc....
        • ...eating an apple sitting by a palm tree...
        • What do “apple” and “palm” mean in each case?
        • The objective is to recognize entities and disambiguate their meaning.
        • DBpedia Spotlight: Shedding Light on the Web of Documents. Pablo Mendes, Max Jakob, Andrés García-Silva, and Christian Bizer. In: Proceedings of the 7th International Conference on Semantic Systems (I-Semantics), 2011.
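One simple way to see how context can resolve such ambiguity is to score each candidate sense by how many of its cue words appear in the sentence. The sketch below does exactly that. It is a toy overlap heuristic, not the method used by DBpedia Spotlight; the `SENSES` table, its sense labels and its hand-picked cue words are all assumptions made for illustration.

```python
import re

# Hypothetical sense inventory: term -> {sense label: hand-picked cue words}.
SENSES = {
    "apple": {
        "Apple (fruit)": {"eating", "fruit", "tree", "hand"},
        "Apple Inc.": {"acquire", "inc", "company", "tried"},
    },
    "palm": {
        "Palm (hand)": {"hand", "my"},
        "Palm Inc.": {"acquire", "inc", "company"},
        "Palm (tree)": {"tree", "coconut"},
    },
}


def disambiguate(sentence, term):
    """Pick the sense whose cue words overlap most with the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower())) - {term}
    scores = {
        sense: len(cues & words)
        for sense, cues in SENSES[term].items()
    }
    # On a tie, max() keeps the first sense listed for the term.
    return max(scores, key=scores.get)


for sentence in (
    "this apple on the palm of my hand",
    "Apple tried to acquire Palm Inc.",
    "eating an apple sitting by a palm tree",
):
    print(sentence, "|", disambiguate(sentence, "apple"), "|",
          disambiguate(sentence, "palm"))
```

Real systems replace the hand-written cue lists with context models learned from large corpora, but the principle of comparing the surrounding words against each candidate is the same.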
    • 28. Challenges
        • Disambiguation
        • Unknown entities
        • Ontology learning
        • Scale and speed
        • Co-referencing
    • 29. Existing Services for Semantic Augmentation
    • 30. Existing Services for Semantic Augmentation
    • 31. DBpedia Spotlight
        • DBpedia is a collection of entity descriptions extracted from Wikipedia and shared as linked data
        • DBpedia Spotlight uses data from DBpedia and text from the associated Wikipedia pages
        • It learns how to recognize that a DBpedia resource was mentioned
        • Given plain text as input, it generates annotated text
        • http://dbpedia-spotlight.github.com/demo/
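For readers who want to try the "plain text in, annotated text out" idea programmatically, here is a hedged Python sketch of calling a Spotlight web service. Everything specific in it is an assumption that does not come from the slides: the `ENDPOINT` URL of the public hosted service, the `text` and `confidence` query parameters, and the JSON field names (`Resources`, `@surfaceForm`, `@URI`). Check the current service documentation before relying on any of them. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: address of the public hosted service; not taken from the lecture.
ENDPOINT = "https://api.dbpedia-spotlight.org/en/annotate"


def build_request(text, confidence=0.5):
    """Build (but do not send) a GET request asking for JSON output."""
    query = urllib.parse.urlencode({"text": text, "confidence": confidence})
    return urllib.request.Request(
        f"{ENDPOINT}?{query}",
        headers={"Accept": "application/json"},
    )


def extract_links(payload):
    """Turn a decoded JSON response into (surface form, URI) pairs."""
    return [(r["@surfaceForm"], r["@URI"]) for r in payload.get("Resources", [])]


def annotate(text, confidence=0.5):
    """Send the request; needs network access and a reachable service."""
    with urllib.request.urlopen(build_request(text, confidence), timeout=30) as resp:
        return extract_links(json.load(resp))


# Hand-written sample shaped like the assumed response; not a recorded result.
SAMPLE = {
    "Resources": [
        {"@surfaceForm": "New York",
         "@URI": "http://dbpedia.org/resource/New_York_City"},
    ]
}
print(extract_links(SAMPLE))
```

Calling `annotate("Lennon and McCartney went to New York")` would perform the actual request; the sketch keeps that step separate so the parsing logic can be tried offline.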
    • 32. DBpedia Spotlight
    • 33. DBpedia Spotlight
    • 34. References
        • DBpedia Spotlight: Shedding Light on the Web of Documents. Pablo Mendes, Max Jakob, Andrés García-Silva, and Christian Bizer. In: Proceedings of the 7th International Conference on Semantic Systems (I-Semantics), 2011.
        • Introduction to GATE, Dr. Paula Matuszek
        • Various resources from gate.ac.uk