1
The NIF format (hands on)
Annotating Strings and Documents using the
NLP Interchange Format
2
Practical session outcomes
• Participants will learn to use NIF API to
annotate strings and documents using
the followin...
3
NIF Example
4
Snowball Stemmer Wrapper
• Stemming algorithm is a process
for removing suffixes from words.
–CONNECT
• CONNECTED
• CONN...
5
Snowball Stemmer Wrapper
java -jar snowball.jar -f text -i 'I am
connected.'
• -f is used to define the format
• -i is u...
6
Snowball Stemmer Wrapper
7
Snowball Stemmer Wrapper
8
Snowball Stemmer Wrapper
NIF Standard Annotations
NIF Offset
9
Snowball Stemmer Wrapper
NIF Standard Annotations
Snowball Stemmer
NIF Offset
10
Annotating Strings: Step-by-step
• 1. Open the USB stick folder
• 2. Decompress the “session-nif.zip” folder
• 3. Open ...
11
Available Wrappers
• To annotate documents, use the local wrappers (USB Stick)
java -jar opennlp.jar -f text -i 'This i...
12
Reading and Writing Files
• Write results in a file:
“--outfile myAnnotatedFile.ttl“
• Read a document as input
“--inty...
13
POS tagger for multiple languages
• The -modelFolder parameter set the folder
that contains the POS tagging OpenNLP
tra...
14
Example 2: Query a Corpus
15
Querying with Twinkle
Open the “/twinkle” folder and run
the command:
java -jar twinkle.jar
16
Querying a Corpus
17
Querying a Corpus
18
Querying a Corpus
19
Querying a Corpus
20
Querying a Corpus
21
Querying a Corpus
22
Querying a Corpus
23
Querying a Corpus
24
Querying a Corpus
25
Querying a Corpus
26
Querying a Corpus
27
Querying a Corpus
28
Querying a Corpus
29
Exercise 3: Querying your own NIF
annotated corpus
30
Querying your own NIF annotated
corpus
1. Annotate your string using one of the
wrappers
2. Save your annotated sentenc...
31
• Query your annotated corpus:
– nif:Context
– nif:Sentence
– nif:anchorOf
– nif:oliaCategory
– nif:oliaLink
… or pract...
32
33
Thank you!
http://site.nlp2rdf.org/
Nif practical

LIDER Datathon NIF Practical Session

Published in: Technology

  1. The NIF format (hands on): Annotating Strings and Documents using the NLP Interchange Format
  2. Practical session outcomes
     • Participants will learn to use the NIF API to annotate strings and documents using the following wrappers:
       – OpenNLP
       – Stanford Core NLP
       – Snowball Stemmer
       – DBpedia Spotlight
     • Query your corpus using SPARQL
  3. NIF Example
  4. Snowball Stemmer Wrapper
     • A stemming algorithm is a process for removing suffixes from words. For example, the following forms all reduce to the stem CONNECT:
       – CONNECT
         • CONNECTED
         • CONNECTION
         • CONNECTING
         • CONNECTIONS
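
The CONNECT example can be illustrated with a toy suffix stripper. This is only a sketch of the idea: it is not the Snowball algorithm, and the suffix list and the `toy_stem` name are made up for this illustration.

```python
# Toy suffix stripper for illustration only; NOT the Snowball/Porter algorithm.
# The suffix list is a hypothetical, hand-picked set that covers the slide's example.
SUFFIXES = ("IONS", "ING", "ION", "ED")  # longest suffixes are checked first


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three leading characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


for form in ("CONNECT", "CONNECTED", "CONNECTION", "CONNECTING", "CONNECTIONS"):
    print(form, "->", toy_stem(form))
```

A real stemmer applies ordered rule sets with conditions on the remaining stem, which is why the wrapper delegates to Snowball rather than to a fixed suffix list like this one.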
  5. Snowball Stemmer Wrapper
     java -jar snowball.jar -f text -i 'I am connected.'
     • -f is used to define the format
     • -i is used to define the input
  6. Snowball Stemmer Wrapper
  7. Snowball Stemmer Wrapper
  8. Snowball Stemmer Wrapper (slide labels: NIF Standard Annotations, NIF Offset)
  9. Snowball Stemmer Wrapper (slide labels: NIF Standard Annotations, Snowball Stemmer, NIF Offset)
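
The "NIF Offset" label refers to addressing a piece of text by its character offsets. The sketch below assumes the RFC 5147 style `#char=begin,end` fragment used by NIF 2.0, with a zero-based begin and an exclusive end; the wrapper's actual URI scheme may differ, and `offset_uri` and the prefix are placeholders chosen for this example.

```python
def offset_uri(prefix: str, text: str, anchor: str) -> str:
    """Build an offset-based URI for the first occurrence of `anchor` in `text`.

    Assumption: RFC 5147 style fragment (#char=begin,end), end index exclusive.
    """
    begin = text.find(anchor)
    if begin < 0:
        raise ValueError(f"{anchor!r} does not occur in the text")
    end = begin + len(anchor)
    return f"{prefix}#char={begin},{end}"


# Placeholder prefix; any document URI could be used here.
print(offset_uri("http://yourDomain.org/", "I am connected.", "connected"))
```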
  10. Annotating Strings: Step-by-step
      1. Open the USB stick folder.
      2. Decompress the “session-nif.zip” archive.
      3. Open the “NIF_DATATHON” folder and decompress “NIF_tutorial_hands_on_jars.zip”.
      4. Open the command prompt and, in the “jar” folder, use the commands from the next slide.
  11. Available Wrappers
      • To annotate documents, use the local wrappers (USB stick):
        java -jar opennlp.jar -f text -i 'This is a test.' -modelFolder ../model/
        java -jar stanford.jar -f text -i 'This is a test.'
        java -jar snowball.jar -f text -i 'This is my favorite test.'
        java -jar spotlight.jar -f text -i 'Welcome to Germany.' -confidence 0.2
      • To annotate small strings, you can try the on-line services:
        http://spotlight.nlp2rdf.aksw.org/spotlight?f=text&i=Welcome+to+Germany.&t=direct&confidence=0.3&prefix=http://yourDomain.org/
        http://snowball.nlp2rdf.aksw.org/snowball?f=text&i=This+is+my+favorite+test.&t=direct&prefix=http://yourDomain.org/
        http://stanford.nlp2rdf.aksw.org/stanfordcorenlpn?f=text&i=This+is+a+test.&t=direct&prefix=http://yourDomain.org/
        http://opennlp.nlp2rdf.aksw.org/opennlp?f=text&i=This+is+a+test.&t=direct&modelFolder=model&prefix=http://yourDomain.org
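
The on-line service URLs all follow the same pattern of query parameters (`f`, `i`, `t`, `prefix`, plus service-specific ones such as `confidence`). As a rough sketch, such a URL could be assembled with Python's standard library; `build_service_url` is a hypothetical helper, and whether the services are still reachable is not something this example checks.

```python
from urllib.parse import urlencode


def build_service_url(base: str, text: str, prefix: str, **extra: str) -> str:
    """Assemble a wrapper service URL from the parameters shown on the slide.

    Hypothetical helper: it only builds the string and performs no request.
    """
    params = {"f": "text", "i": text, "t": "direct"}
    params.update(extra)          # e.g. confidence="0.3" or modelFolder="model"
    params["prefix"] = prefix
    return f"{base}?{urlencode(params)}"


url = build_service_url(
    "http://spotlight.nlp2rdf.aksw.org/spotlight",
    "Welcome to Germany.",
    "http://yourDomain.org/",
    confidence="0.3",
)
print(url)
```

Note that `urlencode` percent-encodes the `prefix` value, whereas the slide shows it unencoded; both forms decode to the same parameter value.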
  12. Reading and Writing Files
      • Write results to a file: “--outfile myAnnotatedFile.ttl”
      • Read a document as input: “--intype file -i /path/myDoc”
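
For scripting several annotation runs, the wrapper command line can be assembled programmatically. The sketch below only builds the argument list from the flags shown on the slides; `build_command` is a hypothetical helper, and combining `-f text` with `--intype file` and `--outfile` in a single call is an assumption that the slides do not confirm.

```python
from typing import List, Optional


def build_command(jar: str, source: str, from_file: bool = False,
                  outfile: Optional[str] = None) -> List[str]:
    """Build the argument list for one of the local wrapper jars.

    Assumption: the flags from the slides can be combined in one invocation.
    """
    cmd = ["java", "-jar", jar, "-f", "text"]
    if from_file:
        cmd += ["--intype", "file"]   # `source` is then a path, not a literal string
    cmd += ["-i", source]
    if outfile:
        cmd += ["--outfile", outfile]
    return cmd


print(build_command("snowball.jar", "I am connected.", outfile="myAnnotatedFile.ttl"))
print(build_command("snowball.jar", "/path/myDoc", from_file=True))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from inside the “jar” folder; passing a list avoids shell quoting problems with input strings that contain spaces or apostrophes.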
  13. POS tagger for multiple languages
      • The -modelFolder parameter sets the folder that contains the trained OpenNLP models for POS tagging and tokenization.
      • Models for different languages can be found on the OpenNLP website: http://opennlp.sourceforge.net/models-1.5/
  14. Example 2: Query a Corpus
  15. Querying with Twinkle: open the “/twinkle” folder and run the command:
      java -jar twinkle.jar
  16. Querying a Corpus
  17. Querying a Corpus
  18. Querying a Corpus
  19. Querying a Corpus
  20. Querying a Corpus
  21. Querying a Corpus
  22. Querying a Corpus
  23. Querying a Corpus
  24. Querying a Corpus
  25. Querying a Corpus
  26. Querying a Corpus
  27. Querying a Corpus
  28. Querying a Corpus
  29. Exercise 3: Querying your own NIF annotated corpus
  30. Querying your own NIF annotated corpus
      1. Annotate your string using one of the wrappers.
      2. Save your annotated sentence to a file (using “--outfile”).
      3. Open Twinkle.
      4. Query your corpus using Twinkle.
  31. Query your annotated corpus:
      – nif:Context
      – nif:Sentence
      – nif:anchorOf
      – nif:oliaCategory
      – nif:oliaLink
      … or practice with the Brown Corpus!
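
As a starting point for the exercise, the terms listed above can be combined into a simple SPARQL query. The sketch below just generates the query text so it can be pasted into Twinkle; it assumes the NIF 2.0 core namespace for the `nif:` prefix and that the annotated units carry `nif:anchorOf`, and `anchor_query` is a hypothetical helper, not part of any wrapper.

```python
# Assumption: NIF 2.0 core namespace; check the prefix used in your own .ttl output.
NIF_NS = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"


def anchor_query(nif_class: str = "Sentence", limit: int = 10) -> str:
    """Return a SPARQL query listing units of a NIF class with their anchored text."""
    return (
        f"PREFIX nif: <{NIF_NS}>\n"
        "SELECT ?unit ?text\n"
        "WHERE {\n"
        f"  ?unit a nif:{nif_class} ;\n"
        "        nif:anchorOf ?text .\n"
        "}\n"
        f"LIMIT {limit}"
    )


print(anchor_query())
```

Swapping `nif:anchorOf` for `nif:oliaCategory` or `nif:oliaLink` in the generated text is one way to explore the part-of-speech annotations, provided the chosen wrapper emits them.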
  33. Thank you! http://site.nlp2rdf.org/
