Transcript of "isd312-05-wordnet"

  1. Session 5: WordNet, 8 November 2011
  2. WordNet, Exercises, Open Problems
     WordNet – ISD312, NLTK and Python
  3. Dictionary and thesaurus; English; manually developed.
     Relations between words:
     - synonym
     - hyponym, hypernym
     - meronym, holonym
     - antonym
  4. Using the nltk package:
     from nltk.corpus import wordnet as wn
     Accessing the synonym set (synset) of a word:
     wn.synsets('motorcar')
     wn.synset('car.n.01').lemma_names
     wn.synset('car.n.01').lemmas
     wn.synset('car.n.01').definition
     wn.synset('car.n.01').examples
     (In NLTK 3 and later, lemma_names, lemmas, definition, and examples are methods and need parentheses, for example wn.synset('car.n.01').lemma_names().)
  5. >>> from nltk.corpus import wordnet as wn
     >>> wn.synsets('motorcar')
     [Synset('car.n.01')]
     motorcar is a member of the synonym set car.n.01. Other members of car.n.01:
     >>> wn.synset('car.n.01').lemma_names
     ['car', 'auto', 'automobile', 'machine', 'motorcar']
  6. Lemma: a synset name paired with a word
     - synset: car.n.01
     - word: car
     - lemma: car.n.01.car
     Getting all lemmas of the synonym set car.n.01:
     >>> wn.synset('car.n.01').lemmas
     [Lemma('car.n.01.car'), Lemma('car.n.01.auto'), Lemma('car.n.01.automobile'),
      Lemma('car.n.01.machine'), Lemma('car.n.01.motorcar')]
  7. >>> wn.synset('car.n.01').definition
     'a motor vehicle with four wheels; usually propelled by an internal combustion engine'
     >>> wn.synset('car.n.01').examples
     ['he needs a car to get to work']
     Or, to get every lemma of the word car:
     >>> wn.lemmas('car')
     [Lemma('car.n.01.car'), Lemma('car.n.02.car'), Lemma('car.n.03.car'),
      Lemma('car.n.04.car'), Lemma('cable_car.n.01.car')]
  8. The word car belongs to several different synsets:
     >>> wn.synsets('car')
     [Synset('car.n.01'), Synset('car.n.02'), Synset('car.n.03'),
      Synset('car.n.04'), Synset('cable_car.n.01')]
  9. >>> for synset in wn.synsets('car'):
     ...     print synset.lemma_names
     ...
     ['car', 'auto', 'automobile', 'machine', 'motorcar']
     ['car', 'railcar', 'railway_car', 'railroad_car']
     ['car', 'gondola']
     ['car', 'elevator_car']
     ['cable_car', 'car']
  10. Generic-specific relations (hypernym and hyponym), for example animal and cat, or vehicle and bicycle:
      >>> motorcar = wn.synset('car.n.01')
      >>> types_of_motorcar = motorcar.hyponyms()
      >>> len(types_of_motorcar)
      31
      >>> types_of_motorcar[26]
      Synset('ambulance.n.01')
  11. >>> sorted([lemma.name for synset in types_of_motorcar for lemma in synset.lemmas])
      ['Model_T', 'S.U.V.', 'SUV', 'Stanley_Steamer', 'ambulance', 'beach_waggon', ...]
  12. >>> motorcar.hypernyms()
      [Synset('motor_vehicle.n.01')]
      >>> paths = motorcar.hypernym_paths()
      >>> len(paths)
      2
      >>> [synset.name for synset in paths[0]]
      ['entity.n.01', 'physical_entity.n.01', 'object.n.01', 'whole.n.02', 'artifact.n.01',
       'instrumentality.n.03', 'container.n.01', 'wheeled_vehicle.n.01',
       'self-propelled_vehicle.n.01', 'motor_vehicle.n.01', 'car.n.01']
  13. >>> [synset.name for synset in paths[1]]
      ['entity.n.01', 'physical_entity.n.01', 'object.n.01', 'whole.n.02', 'artifact.n.01',
       'instrumentality.n.03', 'conveyance.n.03', 'vehicle.n.01', 'wheeled_vehicle.n.01',
       'self-propelled_vehicle.n.01', 'motor_vehicle.n.01', 'car.n.01']
      Root hypernym:
      >>> motorcar.root_hypernyms()
      [Synset('entity.n.01')]
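One way to see why car.n.01 has two hypernym paths is to model the hierarchy as a small graph in which one node has two parents. The sketch below is plain Python and does not call NLTK: the `PARENTS` table is hand-copied from the two paths listed on slides 12 and 13, and the function name `hypernym_paths` is borrowed only for illustration.

```python
# Toy hypernym graph, hand-copied from the two paths shown on the slides.
# This does not use NLTK; PARENTS and hypernym_paths are illustrative names.
PARENTS = {
    "car.n.01": ["motor_vehicle.n.01"],
    "motor_vehicle.n.01": ["self-propelled_vehicle.n.01"],
    "self-propelled_vehicle.n.01": ["wheeled_vehicle.n.01"],
    # wheeled_vehicle has two hypernyms, which is why two paths exist.
    "wheeled_vehicle.n.01": ["container.n.01", "vehicle.n.01"],
    "container.n.01": ["instrumentality.n.03"],
    "vehicle.n.01": ["conveyance.n.03"],
    "conveyance.n.03": ["instrumentality.n.03"],
    "instrumentality.n.03": ["artifact.n.01"],
    "artifact.n.01": ["whole.n.02"],
    "whole.n.02": ["object.n.01"],
    "object.n.01": ["physical_entity.n.01"],
    "physical_entity.n.01": ["entity.n.01"],
    "entity.n.01": [],  # root: no hypernyms
}


def hypernym_paths(node):
    """Return every path from a root down to node, root first."""
    parents = PARENTS[node]
    if not parents:
        return [[node]]
    paths = []
    for parent in parents:
        for path in hypernym_paths(parent):
            paths.append(path + [node])
    return paths


paths = hypernym_paths("car.n.01")
for path in paths:
    print(len(path), " > ".join(path))
```

Both paths start at entity.n.01 and end at car.n.01; they differ only in whether they pass through container.n.01 or through conveyance.n.03 and vehicle.n.01.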
  14. A trunk is part of a tree:
      >>> wn.synset('tree.n.01').part_meronyms()
      [Synset('burl.n.02'), Synset('crown.n.07'), Synset('stump.n.01'),
       Synset('trunk.n.01'), Synset('limb.n.02')]
      >>> wn.synset('tree.n.01').substance_meronyms()
      [Synset('heartwood.n.01'), Synset('sapwood.n.01')]
      >>> wn.synset('tree.n.01').member_holonyms()
      [Synset('forest.n.01')]
  15. >>> for synset in wn.synsets('mint', wn.NOUN):
      ...     print synset.name + ':', synset.definition
      ...
      batch.n.02: (often followed by `of') a large number or amount or extent
      mint.n.02: any north temperate plant of the genus Mentha with aromatic leaves and small mauve flowers
      mint.n.03: any member of the mint family of plants
      mint.n.04: the leaves of a mint plant used fresh or candied
      mint.n.05: a candy that is flavored with a mint oil
      mint.n.06: a plant where money is coined by authority of the government
  16. >>> wn.synset('mint.n.04').part_holonyms()
      [Synset('mint.n.02')]
      >>> wn.synset('mint.n.04').substance_holonyms()
      [Synset('mint.n.05')]
  17. >>> wn.synset('walk.v.01').entailments()
      [Synset('step.v.01')]
      >>> wn.synset('eat.v.01').entailments()
      [Synset('swallow.v.01'), Synset('chew.v.01')]
      >>> wn.synset('tease.v.03').entailments()
      [Synset('arouse.v.07'), Synset('disappoint.v.01')]
  18. >>> wn.lemma('supply.n.02.supply').antonyms()
      [Lemma('demand.n.02.demand')]
      >>> wn.lemma('rush.v.01.rush').antonyms()
      [Lemma('linger.v.04.linger')]
      >>> wn.lemma('horizontal.a.01.horizontal').antonyms()
      [Lemma('vertical.a.01.vertical'), Lemma('inclined.a.02.inclined')]
      >>> wn.lemma('staccato.r.01.staccato').antonyms()
      [Lemma('legato.r.01.legato')]
  19. Synsets are linked by lexical relations.
      Given a synset, traverse WordNet to find semantically similar synsets.
      - Important for building an index
      - Important for processing queries
      - Example: a query for vehicle also retrieves documents about limousine
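The query example on this slide can be sketched as hyponym-based query expansion: collect every term below the query term in the hierarchy, then match documents against the whole set. The code below is a minimal illustration in plain Python; the `HYPONYMS` table, the sample `DOCUMENTS`, and the function names `expand_query` and `search` are all hypothetical, and a real system would read the hyponym links from WordNet instead.

```python
# Toy hyponym table; a real system would read these links from WordNet.
# HYPONYMS, DOCUMENTS, expand_query and search are hypothetical names.
HYPONYMS = {
    "vehicle": ["wheeled_vehicle"],
    "wheeled_vehicle": ["self-propelled_vehicle", "bicycle"],
    "self-propelled_vehicle": ["motor_vehicle"],
    "motor_vehicle": ["car"],
    "car": ["limousine", "ambulance"],
}

DOCUMENTS = {
    1: "the limousine waited outside the hotel",
    2: "a recipe for mint tea",
    3: "the ambulance arrived quickly",
}


def expand_query(term):
    """Return the term together with all of its transitive hyponyms."""
    expanded = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for child in HYPONYMS.get(current, []):
            if child not in expanded:
                expanded.add(child)
                stack.append(child)
    return expanded


def search(term):
    """Return ids of documents containing the term or any of its hyponyms."""
    terms = expand_query(term)
    return sorted(
        doc_id
        for doc_id, text in DOCUMENTS.items()
        if terms & set(text.split())
    )


print(search("vehicle"))  # [1, 3]
```

Neither matching document contains the word vehicle itself; they are found only because limousine and ambulance sit below vehicle in the toy hierarchy.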
  20. The shorter the path between two synsets in the hypernym hierarchy, the more similar their meanings.
      >>> right = wn.synset('right_whale.n.01')
      >>> orca = wn.synset('orca.n.01')
      >>> minke = wn.synset('minke_whale.n.01')
      >>> tortoise = wn.synset('tortoise.n.01')
      >>> novel = wn.synset('novel.n.01')
  21. >>> right.lowest_common_hypernyms(minke)
      [Synset('baleen_whale.n.01')]
      >>> right.lowest_common_hypernyms(orca)
      [Synset('whale.n.02')]
      >>> right.lowest_common_hypernyms(tortoise)
      [Synset('vertebrate.n.01')]
      >>> right.lowest_common_hypernyms(novel)
      [Synset('entity.n.01')]
  22. >>> wn.synset('baleen_whale.n.01').min_depth()
      14
      >>> wn.synset('whale.n.02').min_depth()
      13
      >>> wn.synset('vertebrate.n.01').min_depth()
      8
      >>> wn.synset('entity.n.01').min_depth()
      0
  23. >>> right.path_similarity(minke)
      0.25
      >>> right.path_similarity(orca)
      0.16666666666666666
      >>> right.path_similarity(tortoise)
      0.076923076923076927
      >>> right.path_similarity(novel)
      0.043478260869565216
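The scores on this slide are consistent with path similarity being 1 / (1 + d), where d is the number of edges on the shortest path between the two synsets: 0.25 corresponds to d = 3 and 0.1666... to d = 5. The sketch below reproduces those two values on a hand-built tree without calling NLTK. The intermediate nodes (rorqual, dolphin, toothed_whale) are assumptions chosen to give those path lengths, and `PARENT`, `ancestors`, and this standalone `path_similarity` are illustrative names only.

```python
# Simplified whale hierarchy (child -> parent). Intermediate nodes are
# assumptions chosen to reproduce the path lengths implied by the slide;
# nothing here is read from WordNet.
PARENT = {
    "right_whale": "baleen_whale",
    "minke_whale": "rorqual",
    "rorqual": "baleen_whale",
    "baleen_whale": "whale",
    "orca": "dolphin",
    "dolphin": "toothed_whale",
    "toothed_whale": "whale",
}


def ancestors(node):
    """Map node and each of its ancestors to its distance from node."""
    distances = {}
    steps = 0
    while node is not None:
        distances[node] = steps
        node = PARENT.get(node)
        steps += 1
    return distances


def path_similarity(a, b):
    """Return 1 / (1 + shortest path length) between a and b in the toy tree."""
    up_a = ancestors(a)
    up_b = ancestors(b)
    shared = set(up_a) & set(up_b)
    if not shared:
        return None  # no common ancestor, so no path
    distance = min(up_a[n] + up_b[n] for n in shared)
    return 1.0 / (1 + distance)


print(path_similarity("right_whale", "minke_whale"))  # 0.25
print(path_similarity("right_whale", "orca"))
```

The shortest path always runs through the lowest common ancestor, which is why the pairs with a deeper common hypernym on slide 21 get the higher scores here.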
  24. WordNet for Indonesian (Bahasa Indonesia)
      Indonesian thesaurus: kateglo
      Building a WordNet automatically
      - identifying occurrences of slang: sesuatu banget, rempong, jablay, lebay
      VerbNet: nltk.corpus.verbnet
  25. Find all senses of the word dish:
      - based on your own command of English
      - using the method discussed in class
  26. Exercise 27: The polysemy of a word is the number of senses it has. Using WordNet, we can determine that the noun dog has seven senses with len(wn.synsets('dog', 'n')). Compute the average polysemy of nouns, verbs, adjectives, and adverbs according to WordNet.
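As a starting point for this exercise, average polysemy is simply the mean number of senses over a set of words. The sketch below keeps the sense lookup as a parameter so it runs without WordNet installed; `average_polysemy` and `TOY_NOUN_SENSES` are hypothetical names, and the commented-out lines show one way it might be connected to NLTK, assuming the WordNet corpus has been downloaded.

```python
def average_polysemy(words, count_senses):
    """Mean number of senses per word; count_senses maps a word to an int."""
    words = list(words)
    if not words:
        return 0.0
    return sum(count_senses(w) for w in words) / float(len(words))


# Stand-in sense counts so the example runs without WordNet.
# dog=7 comes from the exercise text and mint=6 from the six noun synsets
# listed on slide 15; limousine=1 is a made-up placeholder.
TOY_NOUN_SENSES = {"dog": 7, "limousine": 1, "mint": 6}

print(average_polysemy(TOY_NOUN_SENSES, TOY_NOUN_SENSES.get))

# With NLTK and its WordNet corpus installed, the same function could be
# driven by real data, for example (not executed here):
#   from nltk.corpus import wordnet as wn
#   nouns = set(wn.all_lemma_names(pos='n'))
#   average_polysemy(nouns, lambda w: len(wn.synsets(w, 'n')))
```

Repeating the last call with 'v', 'a', and 'r' in place of 'n' would cover verbs, adjectives, and adverbs.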
  27. References:
      - http://www.nltk.org/book
      - KAmus, TEsaurus, dan GLOsarium (Kateglo: dictionary, thesaurus, and glossary), http://bahtera.org/kateglo
      - http://www.sinonimkata.com/
      - http://tjerdastangkas.blogspot.com/search/label/isd312
  28. Lexical richness: the ratio of the number of tokens to the number of unique words
      - len(text1) / len(set(text1))
      - Integer division: in Python 2, / between two integers truncates the result, so enable true division with
        from __future__ import division
      Number of occurrences of a token:
      - text1.count('whale')
      - 100 * text1.count('whale') / len(text1)
  29. >>> from __future__ import division
      >>> from nltk.book import *
      >>> def lexicalDiversity(text):
      ...     return len(text) / len(set(text))
      ...
      >>> def percentage(count, total):
      ...     return 100 * count / total
      ...
      >>> lexicalDiversity(text5)
      >>> percentage(text1.count('whale'), len(text1))
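In Python 3 the `/` operator already performs true division, so the `__future__` import is not needed there. The sketch below restates the two helpers for Python 3 and runs them on a made-up token list instead of an NLTK text, so it works without any corpus; the snake_case function name and the sample sentence are assumptions for illustration.

```python
def lexical_diversity(tokens):
    """Average number of times each distinct token is used."""
    return len(tokens) / len(set(tokens))


def percentage(count, total):
    """Share of total that count represents, in percent."""
    return 100 * count / total


# Made-up token list standing in for an NLTK text such as text1.
tokens = "the whale saw the other whale near the ship".split()

print(lexical_diversity(tokens))  # 9 tokens / 6 distinct tokens = 1.5
print(percentage(tokens.count("whale"), len(tokens)))
```

The same two functions accept an NLTK text object unchanged, since they only rely on `len`, `set`, and `count`.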