Upcoming SlideShare
×

# isd312-05-wordnet

813

Published on

Published in: Technology
0 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

• Be the first to like this

Views
Total Views
813
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
36
0
Likes
0
Embeds 0
No embeds

No notes for slide

### Transcript of "isd312-05-wordnet"

1. 1. Pertemuan 5: WordNet, 8 November 2011
2. 2.  WordNet Latihan Open Problems WordNet – ISD312 NLTK dan Python 2
3. 3.  Kamus, Tesaurus English Manually developed Hubungan antara kata  synonym  hyponym, hypernym  meronym, holonym  antonym WordNet – ISD312 NLTK dan Python 3
4. 4.  Menggunakan package nltk  from nltk.corpus import wordnet as wn Mengakses synonym set (synset) sebuah kata  wn.synsets(motorcar)  wn.synset(car.n.01).lemma_names  wn.synset(car.n.01).lemmas  wn.synset(car.n.01).definition  wn.synset(car.n.01).examples WordNet – ISD312 NLTK dan Python 4
5. 5. >>> from nltk.corpus import wordnet as wn >>> wn.synsets(motorcar) [Synset(car.n.01)] motorcar adalah anggota himpunan sinonim car.n.01 Anggota lain dari himpunan sinonim car.n.01 >>> wn.synset(car.n.01).lemma_names [car, auto, automobile, machine, motorcar] WordNet – ISD312 NLTK dan Python 5
6. 6.  Lemma: nama synset dan kata  synset: car.n.01  kata: car  lemma: car.n.01.car Mendapatkan semua lemma dari himpunan sinonim car.n.01: >>> wn.synset(car.n.01).lemmas [Lemma(car.n.01.car), Lemma(car.n.01.auto), Lemma(car.n.01.automobile), Lemma(car.n.01.machine), Lemma(car.n.01.motorcar)] WordNet – ISD312 NLTK dan Python 6
7. 7. >>> wn.synset(car.n.01).definition a motor vehicle with four wheels; usually propelled by an internal combustion engine >>> wn.synset(car.n.01).examples [he needs a car to get to work] atau >>> wn.lemmas(car) [Lemma(car.n.01.car), Lemma(car.n.02.car), Lemma(car.n.03.car), Lemma(car.n.04.car), Lemma(cable_car.n.01.car)] WordNet – ISD312 NLTK dan Python 7
8. 8.  Kata car berada dalam synset berbeda >>> wn.synsets(car) [Synset(car.n.01), Synset(car.n.02), Synset(car.n.03), Synset(car.n.04), Synset(cable_car.n.01)] WordNet – ISD312 NLTK dan Python 8
9. 9. >>> for synset in wn.synsets(car):... print synset.lemma_names...[car, auto, automobile, machine, motorcar][car, railcar, railway_car, railroad_car][car, gondola][car, elevator_car][cable_car, car] WordNet – ISD312 NLTK dan Python 9
10. 10.  Hubungan generic-specific  animal, cat  vehicle, bicycle >>> motorcar = wn.synset(car.n.01) >>> types_of_motorcar = motorcar.hyponyms() >>> len(types_of_motorcar) 31 >>> types_of_motorcar[26] Synset(ambulance.n.01) WordNet – ISD312 NLTK dan Python 10
11. 11. >>> sorted([lemma.name for synset in types_of_motorcar for lemma in synset.lemmas])[Model_T, S.U.V., SUV, Stanley_Steamer, ambulance, beach_waggon,...] WordNet – ISD312 NLTK dan Python 11
12. 12. >>> motorcar.hypernyms()[Synset(motor_vehicle.n.01)]>>> paths = motorcar.hypernym_paths()>>> len(paths)2>>> [synset.name for synset in paths[0]][entity.n.01, physical_entity.n.01, object.n.01, whole.n.02, artifact.n.01,instrumentality.n.03, container.n.01, wheeled_vehicle.n.01,self-propelled_vehicle.n.01, motor_vehicle.n.01, car.n.01] WordNet – ISD312 NLTK dan Python 12
13. 13. >>> [synset.name for synset in paths[1]] [entity.n.01, physical_entity.n.01, object.n.01, whole.n.02, artifact.n.01, instrumentality.n.03, conveyance.n.03, vehicle.n.01, wheeled_vehicle.n.01, self-propelled_vehicle.n.01, motor_vehicle.n.01, car.n.01] Root Hypernym >>> motorcar.root_hypernyms() [Synset(entity.n.01)] WordNet – ISD312 NLTK dan Python 13
14. 14.  Bagian dari tree adalah trunk >>> wn.synset(tree.n.01).part_meronyms() [Synset(burl.n.02), Synset(crown.n.07), Synset(stump.n.01), Synset(trunk.n.01), Synset(limb.n.02)] >>> wn.synset(tree.n.01).substance_meronyms() [Synset(heartwood.n.01), Synset(sapwood.n.01)] >>> wn.synset(tree.n.01).member_holonyms() [Synset(forest.n.01)] WordNet – ISD312 NLTK dan Python 14
15. 15. >>> for synset in wn.synsets(mint, wn.NOUN):... print synset.name + :, synset.definition...batch.n.02: (often followed by `of) a large number or amount or extentmint.n.02: any north temperate plant of the genus Mentha with aromatic leaves andsmall mauve flowersmint.n.03: any member of the mint family of plantsmint.n.04: the leaves of a mint plant used fresh or candiedmint.n.05: a candy that is flavored with a mint oilmint.n.06: a plant where money is coined by authority of the government WordNet – ISD312 NLTK dan Python 15
16. 16. >>> wn.synset(mint.n.04).part_holonyms()[Synset(mint.n.02)]>>> wn.synset(mint.n.04).substance_holonyms()[Synset(mint.n.05)] WordNet – ISD312 NLTK dan Python 16
17. 17. >>> wn.synset(walk.v.01).entailments()[Synset(step.v.01)]>>> wn.synset(eat.v.01).entailments()[Synset(swallow.v.01), Synset(chew.v.01)]>>> wn.synset(tease.v.03).entailments()[Synset(arouse.v.07), Synset(disappoint.v.01)] WordNet – ISD312 NLTK dan Python 17
18. 18. >>> wn.lemma(supply.n.02.supply).antonyms()[Lemma(demand.n.02.demand)]>>> wn.lemma(rush.v.01.rush).antonyms()[Lemma(linger.v.04.linger)]>>> wn.lemma(horizontal.a.01.horizontal).antonyms( )[Lemma(vertical.a.01.vertical), Lemma(inclined.a.02.inclined)]>>> wn.lemma(staccato.r.01.staccato).antonyms()[Lemma(legato.r.01.legato)] WordNet – ISD312 NLTK dan Python 18
19. 19.  Synsets dihubungkan oleh lexical relations Diberikan sebuah synset, telusuri WordNet untuk menemukan synset yang mirip secara semantic Penting untuk membangun index Penting untuk mengolah kueri Kueri vehicle mengambil juga dokumen tentang limousine WordNet – ISD312 NLTK dan Python 19
20. 20.  Semakin dekat path antara dua lemma, semakin mirip makna semantik kedua lemma tersebut >>> right = wn.synset(right_whale.n.01) >>> orca = wn.synset(orca.n.01) >>> minke = wn.synset(minke_whale.n.01) >>> tortoise = wn.synset(tortoise.n.01) >>> novel = wn.synset(novel.n.01) WordNet – ISD312 NLTK dan Python 20
21. 21. >>> right.lowest_common_hypernyms(minke)[Synset(baleen_whale.n.01)]>>> right.lowest_common_hypernyms(orca)[Synset(whale.n.02)]>>> right.lowest_common_hypernyms(tortoise)[Synset(vertebrate.n.01)]>>> right.lowest_common_hypernyms(novel)[Synset(entity.n.01)] WordNet – ISD312 NLTK dan Python 21
22. 22. >>> wn.synset(baleen_whale.n.01).min_depth()14>>> wn.synset(whale.n.02).min_depth()13>>> wn.synset(vertebrate.n.01).min_depth()8>>> wn.synset(entity.n.01).min_depth()0 WordNet – ISD312 NLTK dan Python 22
23. 23. >>> right.path_similarity(minke)0.25>>> right.path_similarity(orca)0.16666666666666666>>> right.path_similarity(tortoise)0.076923076923076927>>> right.path_similarity(novel)0.043478260869565216 WordNet – ISD312 NLTK dan Python 23
24. 24.  WordNet Bahasa Indonesia Thesaurus Bahasa Indonesia  kateglo Membuat WordNet secara otomatis  mengidentifikasi kemunculan bahasa gaul  sesuatu banget, rempong, jablay, lebay VerbNet  nltk.corpus.verbnet WordNet – ISD312 NLTK dan Python 24
25. 25.  Temukan semua senses dari kata dish  menurut penguasaan bahasa Inggris anda  menurut cara yang dibahas di kelas WordNet – ISD312 NLTK dan Python 25
26. 26.  Soal nomor 27: The polysemy of a word is the number of senses it has. Using WordNet, we can determine that the noun dog has seven senses with len(wn.synsets(dog, n)). Compute the average polysemy of nouns, verbs, adjectives, and adverbs according to WordNet. WordNet – ISD312 NLTK dan Python 26
27. 27.  http://www.nltk.org/book KAmus, TEsaurus, dan GLOsarium, http://bahtera.org/kateglo http://www.sinonimkata.com/ http://tjerdastangkas.blogspot.com/search/label/isd312 WordNet – ISD312 NLTK dan Python 27
28. 28.  Lexical richness Perbandingan jumlah tokens dengan jumlah kata unik  len(text1) / len(set(text1))  Integer division  from __future__ import division Jumlah kemunculan sebuah token  text1.count(whale)  100 * text1.count(whale) / len(text1) WordNet – ISD312 NLTK dan Python 28
29. 29. >>> def lexicalDiversity(text):... return len(text) / len(set(text))>>> def percentage(count, total):... return 100 * count / totallexicalDiversity(text5)percentage(text1.count(whale), len(text1)) WordNet – ISD312 NLTK dan Python 29
1. #### A particular slide catching your eye?

Clipping is a handy way to collect important slides you want to go back to later.