Semantic video classification based on subtitles and domain terminologies
Paper discussion


Published in: Technology

Transcript

  • 1. SEMANTIC VIDEO CLASSIFICATION BASED ON SUBTITLES AND DOMAIN TERMINOLOGIES (Chinese title on the slide: "Semantic clustering of videos based on subtitles and domain terminology"). FROM: KAMC '07, 1ST INTERNATIONAL WORKSHOP ON KNOWLEDGE ACQUISITION FROM MULTIMEDIA CONTENT. AUTHORS: POLYXENI KATSIOULI, VASSILEIOS TSETSOS, STATHES HADJIEFTHYMIADES. Presenter: 蘇鼎文. Advisor: 林熙禎
  • 2. MOTIVATION
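  • 3. A new education revolution: when students in China can take courses from well-known American universities at home without spending a cent
  • 4. MOOCs: a new education revolution. The free online education service Coursera already has 7 million registered students, more than the combined number of university students in the UK and France. One third of Coursera's users come from developing economies.
  • 5. What is a MOOC? Massive Open Online Course: a large-scale, free, open online course. Grew out of the educational philosophy of open educational resources. Focuses on making e-learning easier for students to access and more sustainable to run. Resources are freely accessible. No limit on the number of students.
  • 6. Advantages of MOOCs: online learning needs only a network connection; free sharing, free criticism and free browsing; flexible courses; free!
  • 7. Challenges of MOOCs: learners easily become confused or lose direction; a self-managed attitude to learning is required
  • 8. Guided Learning: the paper "Video-sharing educational tool applied to the teaching in renewable energy subjects" shows experimentally that a video learning system can improve students' learning ability and motivation. However, the videos are added manually by experts, which is time-consuming and cannot be automated. Could YouTube's massive video library be used to help?
  • 9. Methods for automatically classifying videos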
  • 10. Text metadata: title, description, tags; entity extraction from consistent text
  • 11. A/V features: audio and video signal classification; ideal for games, less ideal for general content
  • 12. Video context: entities from context; comments; web embeds; user engagement
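  • 13. Problem: for educational videos on YouTube, the text metadata is usually too sparse. Image and audio processing is harder and more costly. Is there another text-based approach that offers a better solution?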
  • 14. Subtitle
  • 15. Subtitle
  • 16. Abstract: an unsupervised approach to classifying video content by analyzing the corresponding subtitles; based on WordNet and WordNet domains; applies natural language processing techniques to video subtitles
  • 17. INTRODUCTION
  • 18. Semantic information from multimedia content: multimedia databases gain more and more popularity; a critical and challenging topic is to explore efficient ways to index their content based on its features and semantics
  • 19. Subtitles carry information through natural language sentences. They may not be able to detect all video semantics, but have several benefits: a more lightweight process than video and audio processing; high-level semantics are more closely related to human language
  • 20. RELATED WORK
  • 21. "Semantic Video Indexing and Summarization Using Subtitles": partitions the script into segments and represents each one as a term frequency-inverse document frequency (TF-IDF) vector; video retrieval and summarization are described through the application of machine learning techniques
  • 22. MUMIS project: uses natural language processing techniques for indexing and searching multimedia content; processing based on an XML-encoded ontology is applied separately to textual sources of different types and in different languages; combines the annotations extracted from such sources into one integrated, formal description of their content
  • 23. "Semantic principal video shot classification via mixture Gaussian": a framework for semantic classification of educational surgery videos, in two phases: (1) video content characterization via principal video shots; (2) video classification through a mixture Gaussian model
  • 24. "Content-based Video Classification Using Support Vector Machines": based on low-level features such as color, shape and motion; uses a Support Vector Machine (SVM) classifier to assign one of the following class labels: "cartoons", "commercials", "cricket", "football" and "tennis"
  • 25. Text classification: decision trees are one of the most important and successful machine learning techniques; leaves represent classifications, and branches correspond to the combinations of attributes that lead to those classifications. In this paper, we compare the proposed classification method with a decision tree classifier
  • 26. WORDNET AND WORDNET DOMAINS
  • 27. WordNet: a large dictionary (or lexical database). English nouns, verbs, adjectives and adverbs are grouped into sets called "synsets". A synset contains a group of synonymous words or collocations
  • 28. WordNet vs. traditional dictionaries: traditional dictionaries are arranged alphabetically; WordNet is arranged semantically. Example: noun synset {base, alkali}; noun synset {basis, base, foundation, fundament, groundwork, cornerstone}; verb synset {establish, base, ground, found}
  • 29. Semantic relations: most synsets are connected to other synsets through a number of semantic relations; noun synsets are related through hypernymy (generalization), hyponymy (specialization), holonymy (whole of), and meronymy (part of) relations
  • 30. Semantic relations, example: artefact is the root synset; motor vehicle is a hypernym of motorcar, and motorcar is a hyponym of motor vehicle
  • 31. WordNet domains: augments WordNet with domain labels; approximately 200 domain labels enhance the WordNet synsets. If none of the domain labels is adequate for a specific synset, the label Factotum is assigned to it (almost 35% of synsets)
  • 32. Example Fig. 1. Some senses of the word "plant" with their corresponding domains
  • 33. SCHEME
  • 34. Step 1: Text Preprocessing: subtitles are segmented into sentences; a POS tagger is applied to the words of each phrase; stop words are removed, as they carry no semantics and do not contribute to the understanding of the main text concepts
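The preprocessing step can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the sentence splitter is a plain regular expression, `STOP_WORDS` is a tiny hypothetical sample list, and POS tagging is left out because it needs an external tagger.

```python
import re

# Tiny illustrative stop-word list (an assumption; a real system would use a fuller one).
STOP_WORDS = {"a", "an", "the", "of", "on", "in", "and", "is", "are", "to", "he", "she", "it"}

def split_sentences(text):
    """Split subtitle text into sentences at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def remove_stop_words(sentence):
    """Lowercase the sentence, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def preprocess(text):
    """Return one list of content words per sentence."""
    return [remove_stop_words(s) for s in split_sentences(text)]

print(preprocess("He sat on the bank of the river. The river is wide!"))
```

The output keeps one token list per sentence, so that later steps can still work at sentence level if they need to.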
  • 35. Step 2: Keywords Extraction: identify and select only the most important and relevant subtitle words for further classifying the video; implemented with the TextRank algorithm. The number of keywords extracted is based on the size of the text
  • 36. TextRank: a completely unsupervised graph-based ranking model for keywords extraction or text summarization. It works by voting: each word casts a vote for its neighbors, and the weight of a vote depends on how many votes the voter itself has received. Derived from Google's PageRank algorithm
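The voting idea behind TextRank can be sketched as below. This is a simplified illustration under stated assumptions, not the implementation used in the paper: the graph is unweighted and built from a co-occurrence window, and the function name and the parameters `window`, `damping`, `iterations` and `top_k` are chosen here for illustration.

```python
from collections import defaultdict

def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words with a PageRank-style vote over a co-occurrence graph."""
    # Undirected graph: words that co-occur within `window` positions are linked.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != word:
                neighbors[word].add(tokens[j])
                neighbors[tokens[j]].add(word)

    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        new_scores = {}
        for w in neighbors:
            # Each neighbor splits its current score evenly among the words it votes for.
            vote = sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            new_scores[w] = (1 - damping) + damping * vote
        scores = new_scores

    # Highest score first; ties are broken alphabetically for a deterministic order.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]

tokens = ["river", "bank", "river", "water", "river", "boat"]
print(textrank_keywords(tokens, window=1, top_k=1))
```

In this toy input every other word is linked only to "river", so "river" collects the most votes and is ranked first.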
  • 37. Step 3: Word Sense Disambiguation: most words in natural language are characterized by polysemy. Example: BANK
  • 38. Step 3: Word Sense Disambiguation: most words in natural language are characterized by polysemy. Example: BANK (financial institution)
  • 39. Step 3: Word Sense Disambiguation: most words in natural language are characterized by polysemy. Example: BANK (financial institution; river bank; slope)
  • 40. WSD algorithm: an adaptation of Lesk's algorithm for WSD. Lesk's algorithm is based on glosses found in traditional dictionaries: a word is assigned the sense whose gloss shares the largest number of words with the glosses of the other words in the context
  • 41. Extended Lesk's algorithm: uses WordNet to include the glosses of related words reached through semantic relations, e.g. hyponyms and hypernyms. Related words are easier to find among the hypernyms or hyponyms
  • 42. Example: "he sat on the bank of the river"
  • 43. Example: "he sat on the bank of the river". Lesk's algorithm: sit, river
  • 44. Example: "he sat on the bank of the river". Lesk's algorithm: sit, river. Extended version: stream, watercourse, lounge, sprawl
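The effect of extending the glosses can be sketched with a toy example. The code below is a simplified Lesk variant that compares each sense's gloss directly with the context words, and the sense inventory `BANK_SENSES` with its glosses is hypothetical (written for this illustration, not taken from WordNet or from the paper).

```python
STOP = {"a", "an", "the", "of", "or", "to", "in", "on", "that", "and", "for"}

def words(text):
    """Lowercased content words of a text, as a set."""
    return {w for w in text.lower().split() if w not in STOP}

# Hypothetical mini sense inventory: sense id -> (gloss, glosses of related synsets).
BANK_SENSES = {
    "bank#finance": ("an institution that accepts deposits and lends money",
                     ["a financial institution holding money"]),
    "bank#river": ("sloping land beside a body of water",
                   ["a natural stream of water such as a river"]),
}

def lesk(senses, context, extended=False):
    """Pick the sense whose gloss (plus related glosses if extended) overlaps the context most."""
    context_words = words(context)
    best_sense, best_overlap = None, -1
    for sense, (gloss, related) in senses.items():
        bag = words(gloss)
        if extended:
            for related_gloss in related:
                bag |= words(related_gloss)
        overlap = len(bag & context_words)
        if overlap > best_overlap:  # on a tie, the first sense listed wins
            best_sense, best_overlap = sense, overlap
    return best_sense

context = "he sat on the bank of the river"
print(lesk(BANK_SENSES, context, extended=False))
print(lesk(BANK_SENSES, context, extended=True))
```

With these toy glosses, the plain version finds no overlap for either sense and falls back to the first one listed, while the extended version picks up "river" from a related gloss and selects the river sense.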
  • 45. Step 4: WordNet Domains Extraction: derive the domains to which these synsets correspond; calculate the occurrence score of each domain label and sort the labels in decreasing order; extract the WordNet domains with the highest occurrence scores
  • 46-49. Illustration (diagram built up over four slides): each keyword maps to a synset and each synset to a domain (Domain X, Domain X, Domain Y, Domain Z); the resulting domain list for the video is Wv = (Dx, Dy, Dz)
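Step 4 can be sketched as a frequency count over the domain labels of the disambiguated synsets. This is a rough sketch: it assumes the occurrence score is a plain count, and the optional `ignore` parameter (for example to skip the generic Factotum label) is an assumption added here, not something the slides state.

```python
from collections import Counter

def rank_domains(synset_domains, ignore=()):
    """Count how often each domain label occurs and sort labels in decreasing order."""
    counts = Counter(
        label
        for labels in synset_domains
        for label in labels
        if label not in ignore
    )
    # Descending count; ties are broken alphabetically for a deterministic order.
    return sorted(counts, key=lambda label: (-counts[label], label))

# One list of domain labels per disambiguated keyword synset (toy data).
video_synsets = [["geography"], ["geography", "factotum"], ["history"], ["factotum"]]
print(rank_domains(video_synsets))
print(rank_domains(video_synsets, ignore={"factotum"}))
```

The ranked list plays the role of Wv in the diagrams: its first element is the domain that occurs most often among the video's keywords.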
  • 50. Step 5: Definition of correspondences between category labels and WordNet domains, in order to choose the most appropriate class label. First, we looked up in WordNet the senses related to each category label; then obtained the WordNet domains that correspond to the senses of each category; then calculated, for each category, the occurrence score of each of the derived domains
  • 51-54. [Diagram built up over four slides: a category label c, its WordNet senses, the domains Dc of those senses (Dx, Dx, Dy, Dz), and the resulting list Dc' (Dx, Dy, Dz)]
  • 55. Step 6: Category label assignment: compares the top-ranked WordNet domains of each category (Step 5) with the video's set of WordNet domains (Step 4)
  • 56. The proposed method assigns a category label to the video entity
  • 57. Equation (1): let C be the set of all category labels and D the set of all the WordNet domains that correspond to each category label: $D = \bigcup_{c \in C} \{D_c'\}$
  • 58-62. [Diagram built up over five slides: D drawn as the domain lists of the categories c1, c2, c3, …, cN; domain labels shown, in order: Dx, Dy, Dz, Da, Dc, Db, Dx, Dy, Db, Dc, Dy]
  • 63. Equation (2): check which category c ∈ C satisfies $D_c'[0] = W_v[0]$ and classify video v under that category c. If there is more than one candidate, compare the second elements, and so on
  • 64-74. [Diagram built up over eleven slides: the category domain lists in D (c1, c2, c3, …, cN) are compared element by element with the video's domain list Wv = (Dx, Dy, Dz); the candidate set Cv is first {c1, c3} and is finally narrowed to c1]
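The matching rule of Equation (2) can be sketched as follows. This is only a sketch under assumptions: the function name is made up here, the behavior when no category matches (an empty result) and when a tie cannot be broken (all remaining candidates are returned) is not specified on the slides, and the grouping of domain labels into c1, c2, c3 and cN is a guess at how the flattened diagram was laid out.

```python
def assign_category(category_domains, video_domains):
    """Compare ranked domain lists position by position, as in D_c'[0] = W_v[0]."""
    if not video_domains:
        return []
    candidates = list(category_domains)
    for i, domain in enumerate(video_domains):
        matching = [
            c for c in candidates
            if len(category_domains[c]) > i and category_domains[c][i] == domain
        ]
        if not matching:
            # Nothing matches at rank 0: unclassified. Deeper down: keep the unresolved tie.
            return [] if i == 0 else candidates
        candidates = matching
        if len(candidates) == 1:
            break
    return candidates

# Guessed grouping of the labels shown in the diagram.
categories = {
    "c1": ["Dx", "Dy", "Dz"],
    "c2": ["Da", "Dc", "Db"],
    "c3": ["Dx", "Dy", "Db"],
    "cN": ["Dc", "Dy"],
}
print(assign_category(categories, ["Dx", "Dy", "Dz"]))
```

With this data, c1 and c3 both match on the first two domains, and the third domain leaves only c1, which mirrors the narrowing of Cv from {c1, c3} to c1 in the diagram.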
  • 75. EXPERIMENT
  • 76. Experiment on documentaries: 36 documentaries, with general documentary categories: Geography, History, Animals, Politics… Documentaries are easier to classify: they are usually restricted to a specific domain and contain narrative
  • 77. Statistical information: approximately 44% of all the WordNet domains extracted from each video are assigned the label 'Factotum'
  • 78. Evaluation: Classification Accuracy reflects the proportion of the classifier's category assignments that agree with the user's assignments; the Recall and F-measure performance measures were used to evaluate the classification results for each individual category
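The evaluation measures can be sketched with their standard definitions. The function names and the toy labels below are made up for illustration and are not data from the paper; F-measure is taken here as the harmonic mean of precision and recall.

```python
def accuracy(gold, predicted):
    """Proportion of videos whose predicted category agrees with the user's label."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)

def recall_and_f_measure(gold, predicted, category):
    """Recall and F-measure for one category."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == category and p == category for g, p in pairs)
    fp = sum(g != category and p == category for g, p in pairs)
    fn = sum(g == category and p != category for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return recall, f_measure

gold = ["geography", "geography", "history", "history"]
predicted = ["geography", "history", "history", "history"]
print(accuracy(gold, predicted))
print(recall_and_f_measure(gold, predicted, "geography"))
```

Accuracy is computed once over all videos, while recall and F-measure are computed separately for each category.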
  • 79. Domains and category
  • 80. Comparison: results were compared to those obtained from the decision tree classifier J4.8 of the WEKA tool. The results obtained are very promising, with an accuracy value of 69.4%. Some distance from J4.8 is to be expected, since the proposed method is unsupervised
  • 81. POLYSEMA Platform: the experiments have been carried out in the context of the POLYSEMA project, which develops an end-to-end platform for interactive TV services by exploiting the metadata of the broadcast transmission
  • 82. POLYSEMA Platform: the present work is part of the activity "Development of semantics extraction techniques for automatic annotation of audiovisual content". Three kinds of techniques are currently investigated: video summarization, domain ontology learning, and video classification
  • 83. CONCLUSION
  • 84. Look back: an innovative method for unsupervised classification of video content, applying natural language processing techniques to the subtitles; promising experimental results using documentaries, especially given the fact that no training phase is required
  • 85. Improvements: video segments and subtitle segments; compare with other text classification algorithms (mainly unsupervised approaches); define more knowledge domains, closer to movie classification; keywords extraction algorithm
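  • 86. Comment: subtitle-based text mining mostly relies on entity extraction; more recently, MWH (multi-wing harmoniums) and analysis of entities' temporal features have also been used. As an unsupervised method, the mapping between categories and domain labels is constructed statically, so adjusting it dynamically is probably not easy. The current approach assumes a single topic per video, but a video may cover more than one topic; automating video segmentation may not be easy. Is there a way to detect topic shifting?
  • 87. Comment: online educational resources keep appearing, but they are usually hard for ordinary people to reach, and an integrated system is missing. If we can understand the semantics of videos, we may be able to build some useful applications, for example helping students find supplementary teaching materials.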