<ul><ul><li>第二步驟 : 由第一個維度開始檢查,並將容許的錯誤減掉之前已用到的額度 </li></ul></ul><ul><ul><ul><li>第一維可用的誤差額度為  :0.02 </li></ul></ul></ul><ul>...
<ul><li>Case 3: 假設我們要找到與查詢圖片 Q 3 相似度在 0.05  內的圖片 </li></ul><ul><ul><li>Q 3 =(0.302, 0.223, 0.161), r=0.05 </li></ul></ul><...
<ul><ul><li>搜尋 G 上的索引 , 找出在 0.223+-0.05 =[0.173,0.273] 範圍內的圖 </li></ul></ul><ul><ul><ul><li>{P 7 , P 6 , P 2 } </li></ul><...
Algorithm
 
 
 
 
 
文件資料庫 <ul><li>導論 </li></ul><ul><ul><li>文件內容的分析 </li></ul></ul><ul><ul><ul><li>同意字 (Synonymy) </li></ul></ul></ul><ul><ul><...
文件資料庫 <ul><ul><ul><li>Precision= </li></ul></ul></ul><ul><ul><ul><li>Recall= </li></ul></ul></ul>所有的文件 相關的文件 搜尋所得之結果
文件資料庫 <ul><ul><ul><li>Precision/recall  之計算範例 </li></ul></ul></ul><ul><ul><ul><ul><li>請探討 precision/recall 之關係 </li></ul><...
文件資料庫 <ul><li>文件內容之描述 </li></ul><ul><ul><li>Stop lists </li></ul></ul><ul><ul><ul><li>文件內可被忽略的字,如 : a, the, he… </li></ul>...
文件資料庫 <ul><li>查詢處理 </li></ul><ul><ul><li>文件相關性之計算 </li></ul></ul><ul><ul><ul><li>字詞距離 </li></ul></ul></ul><ul><ul><ul><li>...
文件資料庫 <ul><ul><li>查詢型態 </li></ul></ul><ul><ul><ul><li>找出包含某些字詞的文件 </li></ul></ul></ul><ul><ul><ul><li>找出包含某些字詞但不包含另一些字詞的文件...
文件資料庫 <ul><ul><li>使用索引 </li></ul></ul><ul><ul><ul><li>R-tree </li></ul></ul></ul><ul><ul><ul><ul><li>不適用於高維索引結構 </li></ul>...
文件資料庫 <ul><li>Inverted list( 反轉串列 ) </li></ul><ul><ul><li>以字詞為主所形成的反轉表 </li></ul></ul><ul><ul><li>以  table   為例 </li></ul>...
文件資料庫 <ul><li>Signature files </li></ul><ul><ul><li>每個關鍵字有它所對應的 code </li></ul></ul><ul><ul><li>對一文件而言 , 該文件的 signature 即為...
文件資料庫 <ul><ul><ul><li>討論 </li></ul></ul></ul><ul><ul><ul><ul><li>R-tree, TV-tree 可處理相似度的查詢 </li></ul></ul></ul></ul><ul><u...
影像資料庫 <ul><li>查詢範例 </li></ul><ul><ul><li>範例一 : 找出與這張圖相像的圖片 </li></ul></ul><ul><ul><li>範例二 : 找出左上角有一個紅色方形,而圖形的下方為藍色的所有圖片 </...
<ul><ul><li>與影像內涵資訊相關的特徵 </li></ul></ul><ul><ul><ul><li>顏色分佈 </li></ul></ul></ul><ul><ul><ul><ul><li>可以 color histogram 表示...
影像資料庫搜尋  <ul><li>由關鍵字查詢 (Query By Keyword) </li></ul><ul><ul><li>以文字屬性描述每張影像,可以對個個屬性建構索引,並可以 SQL 的方式下查詢 </li></ul></ul><ul...
影像距離與相似度 <ul><li>Color  Similarity </li></ul><ul><li>Texture  Similarity </li></ul><ul><li>Shape  Similarity </li></ul><ul...
顏色相似度  (Color Similarity) <ul><li>顏色佔的比例 </li></ul><ul><ul><li>Ex:  R:20%, G:50%, B:30% </li></ul></ul><ul><li>顏色分布圖 (Colo...
顏色配置 <ul><li>Color layout matching : compares each grid square of the query to the corresponding grid square of a potentia...
材質相似度  (Texture Similarity) <ul><li>Pick and click </li></ul><ul><ul><li>Suppose  T(I)  is a  texture description vector  ...
形狀相似度  (Shape Similarity) <ul><li>Shape Histogram </li></ul><ul><li>Boundary Matching  </li></ul><ul><li>Sketch Matching <...
<ul><li>以內容來看,三張圖相像嗎 ? </li></ul><ul><ul><li>顏色資訊 </li></ul></ul><ul><ul><li>位置資訊 </li></ul></ul>實作範例 Image-A  Image-B  Im...
Color model RGB color space  v.s.  HSV color space
<ul><li>顏色與位置資訊的取得 </li></ul><ul><ul><li>將圖切成一個一個的格子 </li></ul></ul><ul><ul><li>找出每個格子的代表色 </li></ul></ul><ul><ul><li>相鄰格子...
 
相似度比較 <ul><li>兩張圖要相似有哪些因素是可能被使用者考慮的 ? </li></ul><ul><ul><li>顏色配置 </li></ul></ul><ul><ul><li>顏色分布 </li></ul></ul><ul><ul><l...
 
 
範例 SIZE = 13 + 10 + 5 = 28 Query image A 13 B 10 C 5
<ul><li>聽看看下面幾首音樂或音樂片段,你知道歌名是什麼嗎?   Music  1   2   3   4   5   6   7   8   9   10 </li></ul><ul><li>你是怎麼辦識出這首歌的呢?若要讓電腦幫我們做...
音樂的特徵 <ul><li>Static Music Information 如 調號、拍號等 </li></ul><ul><li>Acoustical Feature 如 loudness 、 pitch 等 </li></ul><ul><l...
特徵的取樣 <ul><li>相對音感 vs 絕對音感—旋律的位移 </li></ul><ul><ul><li>考慮以絕對音感比對所會造成的問題 </li></ul></ul><ul><ul><ul><li>升 key, 降 key 所發生的問題...
特徵的編碼 <ul><li>將特徵取出後,依適當的編碼方式將特徵標碼 </li></ul><ul><ul><li>能應付音調的升降 </li></ul></ul><ul><ul><li>能應付節拍的快慢 </li></ul></ul><ul><...
範例 <ul><li>利用重複出現的重要音調代表某首歌 </li></ul><ul><ul><li>Hierarchical rule music object->movements->Sentences-> phrases->figures ...
重複出現的式樣—定義 <ul><li>For a substring X of a sequence of notes S, if X appears more than once in S, we call X a repeating pat...
重複出現的式樣—實例 <ul><li>“ C-D-E-F-C-D-E-C-D-E-F” RP:Repeating Pattern RPF:Repeating Pattern Frequency </li></ul>2 3 3 3 2 RPF F...
重複出現的式樣 <ul><li>nontrival 的定義 A repeating pattern X is nontrivial if and only if there does not exist another repeating pa...
The Correlative-Matrix(1) <ul><li>Phrase </li></ul><ul><li>Melody string S  = “C6-Ab5-Ab5-C6-C6-Ab5-Ab5-C6-Db5-c6-Bb5-C6” ...
The Correlative-Matrix(2) Construction of correlative matrix T 12,12 -- C6 -- Bb5 1 -- C6 -- Db5 1 1 -- C6 -- Ab5 1 -- Ab5...
The Correlative-Matrix(3) <ul><li>Find all RPs and their RFs. </li></ul><ul><ul><li>定義 candidate set CS  其格式為 (pattern,rep...
The Correlative-Matrix(4) <ul><ul><li>Case 1: (T i,j =1) and (T (i+1),(j+1) =0) 例 T 1,4 =1,T 2,5 =0 insert(“C6”,1,0)into C...
The Correlative-Matrix(5) <ul><ul><li>計算 RF rep_count=0.5f(f-1)  即   f=((1+SQRT(1+8*rep_count))/2 例如本例中  (“C6”,15,1), 即 C6...
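The relation rep_count = 0.5·f·(f-1) on this slide counts the pairs of occurrences of a pattern, so it can be inverted to recover the repeating frequency f. A minimal Python sketch, assuming rep_count is an exact pair count (the function name `frequency_from_rep_count` is illustrative and does not come from the slides):

```python
from math import isqrt

def frequency_from_rep_count(rep_count):
    """Invert rep_count = f * (f - 1) / 2 to get the frequency f."""
    discriminant = 1 + 8 * rep_count
    root = isqrt(discriminant)
    if root * root != discriminant:
        # Not a valid pair count: no integer f satisfies the relation.
        raise ValueError("rep_count is not of the form f*(f-1)/2")
    return (1 + root) // 2

print(frequency_from_rep_count(15))  # the ("C6", 15, 1) entry gives f = 6
```

For rep_count = 15 the discriminant is 121, its square root is 11, and f = (1 + 11) / 2 = 6.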
The String-Join Approach(1) <ul><li>Melody string  “C-D-E-F-C-D-E-C-D-E-F” </li></ul><ul><li>第一步 : 找出所有長度為 1 的 RPs, 並記為 {X...
The String-Join Approach(2) <ul><li>接下來長度為 2 的 RPs 可由上面的 RPs 經 joining( 記為“∞” ) 而得 例如若要找“ C-D”,  已知 {“C”,3,(1,5,8)},{“D”,3...
The String-Join Approach(3) <ul><li>同理 {“D”,3,(2,6,9)}∞{“E”,3,(3,7,10)}   ={“D-E”,3,(2,6,9)} {“E”,3,(3,7,10)}∞{“F”,2,(4,11...
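The join step shown on these slides can be sketched in a few lines. This is a minimal Python illustration, assuming each repeating pattern is a tuple (pattern, frequency, start positions) with 1-based positions and notes separated by "-"; the function name `join` and the handling of patterns longer than one note are assumptions, not taken from the slides.

```python
def join(left, right):
    """Join two repeating patterns: keep the starts of `left` that are
    immediately followed by an occurrence of `right`."""
    pattern_l, _, positions_l = left
    pattern_r, _, positions_r = right
    length_l = pattern_l.count("-") + 1      # number of notes in the left pattern
    right_starts = set(positions_r)
    positions = tuple(p for p in positions_l if p + length_l in right_starts)
    return (f"{pattern_l}-{pattern_r}", len(positions), positions)

# Melody string "C-D-E-F-C-D-E-C-D-E-F"
print(join(("C", 3, (1, 5, 8)), ("D", 3, (2, 6, 9))))    # ('C-D', 3, (1, 5, 8))
print(join(("D", 3, (2, 6, 9)), ("E", 3, (3, 7, 10))))   # ('D-E', 3, (2, 6, 9))
```

A joined pattern whose frequency drops below 2 is no longer a repeating pattern and would be discarded.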
The String-Join Approach(4) <ul><li>長度為 3 的 , 因為 freq(“C-D-E-F”)=freq(“E-F”)=2, 可知不只“ E-F” 是 trivial,”D-E-F” 也是  ( 否則 freq...
討論 <ul><li>相對音感 vs 絕對音感—旋律的位移 </li></ul><ul><li>依完整段落取 pattern </li></ul><ul><li>不同音樂格式的轉換 </li></ul><ul><li>問題—重要卻沒重覆的 fe...
視訊資料庫 <ul><li>內容組織 </li></ul><ul><ul><li>使用者會對哪一部分的內容感興趣 </li></ul></ul><ul><ul><li>如何儲存這部分的內容,使得查詢處理能很有效率的被執行 </li></ul><...
影片內涵資訊 <ul><li>物件 </li></ul><ul><ul><li>單純形狀的描述 </li></ul></ul><ul><ul><ul><li>可做到自動化 </li></ul></ul></ul><ul><ul><li>有意義的...
<ul><li>活動 </li></ul><ul><ul><li>單純描述 </li></ul></ul><ul><ul><ul><li>物件移動軌跡 </li></ul></ul></ul><ul><ul><ul><ul><li>如何將軌跡編...
視訊內涵資訊之建構 <ul><li>兩種資訊 </li></ul><ul><ul><li>靜態 </li></ul></ul><ul><ul><ul><li>將一個 frame 視為一張圖片 </li></ul></ul></ul><ul><u...
<ul><li>Static information </li></ul>
<ul><li>Dynamic information </li></ul>
Preface (Cont'd) <ul><li>Motion trajectories </li></ul>
<ul><li>影片分析 </li></ul><ul><ul><li>Shot </li></ul></ul><ul><ul><ul><li>單一連續的鏡頭所拍攝之影片段落 </li></ul></ul></ul><ul><ul><ul><li...
<ul><ul><li>場景 (SCENE) </li></ul></ul><ul><ul><ul><li>由多個描述相同事件的 shot 所組成 </li></ul></ul></ul><ul><ul><ul><li>可當作查詢的單位 </l...
涵義概念式查詢 <ul><li>可下達語意式的查詢 </li></ul><ul><ul><li>找出包含天空以及海的圖片 </li></ul></ul><ul><ul><li>找出有飛機飛過天空的影片 </li></ul></ul><ul><l...
Classification <ul><li>目標 </li></ul><ul><ul><li>預測資料之類別 </li></ul></ul><ul><li>步驟 </li></ul><ul><ul><li>建立資料分類模型 </li></ul...
Training Data Classification Algorithms IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’  Classifier (Model)
Classifier Testing Data Unseen Data (Jeff, Professor, 4) Tenured?
Classification <ul><li>演算法 </li></ul><ul><ul><li>決策樹 (decision tree) </li></ul></ul><ul><ul><li>Bayesian Belief Networks <...
Training data set
決策樹 age? overcast student? credit rating? no yes fair excellent <=30 >40 no no yes yes yes 30..40
Naïve bayesian Network :example P(n) = 5/14 P(p) = 9/14
P(true|n) = 3/5 P(true|p) = 3/9 P(false|n) = 2/5 P(false|p) = 6/9 P(high|n) = 4/5 P(high|p) = 3/9 P(normal|n) = 2/5 P(norm...
Play-tennis example: classifying X <ul><li>An unseen sample X = <rain, hot, high, false> </li></ul><ul><li>P(X|p)·P(p) =  ...
Bayesian Belief Networks Family History LungCancer PositiveXRay Smoker Emphysema Dyspnea LC ~LC (FH, S) (FH, ~S) (~FH, S) ...
Bayesian Belief Networks
The  k -Nearest Neighbor Algorithm .  _ + _ x q + _ _ + _ _ + . . . . .
Rough Set Approach <ul><li>Rough sets are used to approximately or “roughly” define equivalent classes  </li></ul><ul><li>...
Fuzzy set approach
Association pattern mining <ul><li>目標 </li></ul><ul><ul><li>尋找項目 (item) 或物件間的關聯性 </li></ul></ul><ul><ul><li>關聯性 </li></ul>...
探勘關聯式法則 : 範例 <ul><li>For rule  A      C : </li></ul><ul><ul><li>support = support({ A   C }) = 50% </li></ul></ul><ul><ul...
Apriori  演算法 <ul><li>Join Step :  C k   is generated by joining L k-1 with itself </li></ul><ul><li>Prune Step :  Any (k-1...
範例 Database D Scan D C 1 L 1 L 2 C 2 C 2 Scan D C 3 L 3 Scan D
FP-tree  演算法 <ul><li>把一大型資料庫壓縮至一緊實的資料結構 </li></ul><ul><ul><li>FP-tree </li></ul></ul><ul><ul><ul><li>只包含探勘關聯式樣式所需之相關資料 </l...
FP-tree  建置過程 min_support = 0.5 TID Items bought   (ordered) frequent items 100 { f, a, c, d, g, i, m, p } { f, c, a, m, p...
{} f:4 c:1 b:1 p:1 b:1 c:3 a:3 b:1 m:2 p:2 m:1 Header Table Item  frequency  head  f 4 c 4 a 3 b 3 m 3 p 3
FP-tree  主要探勘過程 <ul><li>對 FP-tree 內的每個 node, 建置  conditional pattern base </li></ul><ul><li>對每一個 conditional pattern-base ...
Step 1: 對 FP-tree 內的每個 node, 建置  conditional pattern base Conditional  pattern bases item cond. pattern base c f:3 a fc:3 ...
Step 2: 對每一個 conditional pattern-base  建置 conditional FP-tree All frequent patterns concerning  m m,  fm, cm, am,  fcm, fa...
Mining Frequent Patterns by Creating Conditional Pattern-Bases Empty Empty f {(f:3)}|c {(f:3)} c {(f:3, c:3)}|a {(fc:3)} a...
Step 3: Recursively mine the conditional FP-tree Cond. pattern base of “am”: (fc:3) Cond. pattern base of “cm”: (f:3) {} f...
效能分析 Data set T25I20D10K
Association pattern mining <ul><li>傳統 Association pattern mining 幾乎都是找出項目和項目間的關聯性 </li></ul><ul><li>在多媒體應用中 </li></ul><ul>...
Concepts與Semantic network <ul><li>概念 (concepts) </li></ul><ul><ul><li>知識表達之基本觀念 </li></ul></ul><ul><ul><li>Semantic notion...
Concepts與Semantic network <ul><ul><li>概念之間的關係 </li></ul></ul><ul><ul><ul><li>多重解析度 (multi-resolution) </li></ul></ul></ul>
Concepts與Semantic network <ul><li>Semantic network </li></ul><ul><ul><li>節點 </li></ul></ul><ul><ul><ul><li>物件 , 觀念或狀態 </li...
 
<ul><li>參考資料 </li></ul><ul><ul><li>V.S. Subrahmanian, Principles of Multimedia Database Systems, Morgan Kaufmann. </li></u...
Content-Based Interactivity
 
Paper study : topic 1 A Semantic Modeling Approach for Video Retrieval by Content Edoardo Ardizzone Mohand-Said Hacid ICMC...
Introduction <ul><li>Using keywords or free text to describe the necessary semantic objects is  not  sufficient. </li></ul...
Introduction(cont.) <ul><li>We exploit the 2 languages </li></ul><ul><li>One for defining the  schema  (i.e. the  </li></u...
2 Layers for video`s conceptual content <ul><li>Object layer: collect objects of interest, their  description and relation...
Schema Language—example1
Query Language (QL) <ul><li>Querying   a DB means  retrieving stored   objects  that satisfy certain conditions or  qualif...
QL cont. <ul><li>Queries  are represented as   concepts  in our  abstract language . </li></ul><ul><li>The  syntax  and   ...
QL- Example  <ul><li>“ Sequences of movies directed by  Kevin Costner  in which he is also an actor” </li></ul>
QL- Example <ul><li>“ the set of movies whose directors are also producers of  some films” </li></ul>
Semantic Annotation of Sports Video <ul><li>Videos isn`t just a sequence of images. It add the temporal dimension. </li></...
Introduction  --Typical sequence of shots in sports video
 
Classifying visual shot features
Implementation --Classifying visual shot features (cont.)
Conclusion <ul><li>There is a growing  interest  in video database and for dealing with  access problems . </li></ul><ul><...
Conclusion cont. <ul><li>This framework is appropriate for supporting  conceptual  and  intensional queries </li></ul><ul>...
Paper study: topic 2 Indexing methods for approximate string matching IEEE data engineering bulletin,2000  Gonzalo Navarro...
outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>...
Introduction <ul><li>Definition </li></ul><ul><ul><li>given a long text T 1…..n  of length n and a comparatively short pat...
outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>...
Suffix trees 1  g a  a c c g a c c t 2  a  a c c g a c c t 3  a  c c g a c c t 4  c  c g a c c t 5  c  g a c c t 6  g a  c...
Suffix array Require less space,about  4  times of text size a $ a  a  a  a  b  b  c  d  r  a b  b  c  d  r  r  a  a r  r ...
Q-grams,Q-samples TEXT 1  2  3  4  5  6  7  8  9  10 11 1 2 3 4 5 INDEX a b r a b r a c r a c a a c a d c a d a 1 8 2 3 4 ...
Edit distance ed(“SURVEY”,”SURGERY”) Final result 2 2 2 3 3 4 5 6 Y 3 2 1 2 2 3 4 5 E 4 3 2 1 1 2 3 4 V 4 3 2 1 0 1 2 3 R ...
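The edit distance on this slide can be reproduced with the standard dynamic-programming recurrence. A minimal Python sketch, assuming unit cost for insertion, deletion, and substitution (the function name `edit_distance` is illustrative and not from the paper):

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b with unit costs."""
    # previous[j] is the distance between the prefix of a seen so far and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]

print(edit_distance("SURVEY", "SURGERY"))  # 2: substitute V with G, insert R
```

Keeping only two rows of the matrix is enough when just the final value is needed; the full matrix on the slide is only required to trace the alignment.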
outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>...
Neighborhood generation Pattern :abc with 1 error { * bc, a * c,ab * } U {ab,ac,bc} U{ * abc,a * bc,abc * } Text  a b r   ...
outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>...
Partitioning into exact search Pattern :abr with 1 error {a},{br} Text  a   b r   a  c  a  d  a   b r   a {abra},{abra}.. ...
outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>...
Intermediate Partitioning Pattern :abr with 1 error {a},{br} Text  a   b r   a  c  a  d  a   b r   a {abra},{abra}.. resul...
Intermediate Partitioning Pattern :abr with 1 error {abr} Text  a   b r   a  c a d  a b r a {abra},{abra}.. results Partit...
outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>...
summarization
Paper study: topic 3 Lazy Users and automatic  Video  Retrieval Tools in (the) Lowlands The Lowlands Team CWI 1 , TNO 2 , ...
Outline <ul><li>Introduction </li></ul><ul><li>Detector-base processing </li></ul><ul><li>Probabilistic multimedia retriev...
Basic key subject of    Multimedia   database <ul><li>Indexing </li></ul><ul><ul><li>K-d tree, point quadtree, MX-quadtree...
User is always Lazy! <ul><li>Facts 1: </li></ul><ul><ul><li>Almost of end user know nothing about “Query”. </li></ul></ul>...
Introduction  <ul><li>Use two complementary automatic approach: </li></ul><ul><ul><li>Visual content </li></ul></ul><ul><u...
Introduction  Combined 1-4, interactive, by a lazy user 5 Query articulation, interactive  4 Transcript-base, automatic 3 ...
Detector-based processing <ul><li>Architecture of the automatic system </li></ul>
Detector-base processing  (cont) <ul><li>Detector for  exact queries  that yield yes/no answer depending if a set of predi...
Detector-base processing  (cont) Selected detector Analysis of  the topic description Query by  example Filter-out  irrele...
Detectors  <ul><li>Camera technique  detection </li></ul><ul><ul><li>zoom, pan, tilt……. </li></ul></ul><ul><li>Face  detec...
Probabilistic multimedia retrieval <ul><li>We assume our documents are shots from video. </li></ul><ul><li>Models of  disc...
<ul><li>Using Bayes’ rule: </li></ul><ul><li>If a query consists of several independent parts (e.g. a  textual Q t  and vi...
Probabilistic multimedia retrieval <ul><li>Hierarchical  data model  of video </li></ul>video shots scenes scenes shots fr...
Probabilistic multimedia retrieval <ul><li>Text  retrieval </li></ul><ul><ul><li>Using  Sphinx3  speech recognition system...
Probabilistic multimedia retrieval <ul><li>Image  retrieval </li></ul><ul><ul><li>Retrieving the key frames of shots </li>...
Interactive experiments <ul><li>Topic lists. </li></ul><ul><ul><li>http://www-nlpir.nist.gov/projects/t01v/topicsoverview....
Topic 33: White fort Using Run 1:  Any color-based technique worked  out well for this query   Example  known-item keyframe
Topic 19: Lunar rover Color-histogram Example Known-item keyframe <ul><li>Color-based retrieval technique is not useful in...
Topic 8: Jupiter Example Some correct answers keyframes <ul><li>At first though, this query may seem to be easy to solve. ...
Topic 25: Starwar Example Some correct answers keyframes <ul><li>Text retrieval:  (if you know the name) </li></ul><ul><li...
Lazy users <ul><li>Lazy users identify result sets instead of correct answer.  (so our interactive results are not 100% pr...
Discussion <ul><li>How video retrieval systems should be evaluated. </li></ul><ul><li>The inhomogeneity of the topics </li...
Conclusion <ul><li>Our evaluation demonstrates the importance of combining various techniques to analyze the  multiple mod...
Paper study : topic 4 VIDEO INDEXING BY MOTION ACTIVITY MAPS Wei Zeng; Wen Gao; Debin Zhao; Image Processing. 2002. Procee...
Outline <ul><li>Introduction </li></ul><ul><ul><li>motion indexing </li></ul></ul><ul><li>Motion Activity Map---MAM </li><...
Introduction   <ul><li>To find a video indexing technique which could extract crucial information from videos for efficien...
Motion indexing <ul><li>Those techniques and systems about  motion indexing  can be categorized into four types. </li></ul...
Feature-based approach : <ul><li>Computes the motion parameters of predefined motion model. </li></ul><ul><li>Has been ado...
 
Trajectory-based approach : <ul><li>This approach is often chosen by object-based system for indexing video. </li></ul>
 
Semantic-based approach <ul><li>Provides semantic events or actions of motion. </li></ul><ul><li>Reference paper  “A Seman...
 
Image-based approach <ul><li>Gives synthesized pictures generated from motion of video. </li></ul><ul><li>MAM  is the imag...
Concepts of  MAM(1) <ul><li>Motion activity map is an image that accumulates the motion activity on the specific grids alo...
Concepts of MAM(2) <ul><li>It is an image-based representation about the magnitude and spatial distribution of motion. </l...
Definition of MAM(1) <ul><li>Motion activity map is an image synthesized from motion vector field, and motion vector field...
Definition of MAM(2) <ul><li>Base on the motion vector field X(t), the motion activity map (MAM) is computed as </li></ul>...
Generation of MAM Demo video segmentation Hall    shall Motion  vector field Video Video Video Temporal video Segmentation...
Organization of MAMs <ul><li>Video can be segmented into different shot levels such as shots and sub-shots, so there are a...
Organization of MAMs Interactive Video Retrieval Video Video Video Temporal Segmentation MAM Computing Layered spatial seg...
Experimental results: (a) key-frame-based MAM; (b) MAM; (c-f) region representation of MAM
Conclusion <ul><li>Video  shot  MAM  sub-shot1  MAM1  sub-shot2  MAM2 </li></ul><ul><li>All the MAM could be segment...
Paper study : topic 5 SOM-Base R*-Tree for  Similarity Retrieval Database Systems for Advanced Applications, 2001. Proceed...
Outline <ul><li>Self-Organizing Maps (SOM) </li></ul><ul><li>R*-Tree </li></ul><ul><li>SOM-Based R*-Tree </li></ul><ul><li...
Self-Organizing Maps (SOM) What is SOM 1.SOM provide mapping from high-demensional feature vectors onto a two-dismensional...
<ul><li>We simulate 100 neurons arranged as a 10×10 two-dimensional grid. The test input vectors are also two-dimensional and are uniformly distributed. </li></ul>Self-Organizing Maps (SOM)
圖:均勻分佈之資料的自我組織特徵映射圖: (a) 隨機設定之初始鍵結值向量 ; (b) 經過 50 次疊代後之鍵結值向量 ;(c)  經過 1,000 次疊代 後之鍵結值向量 ;(d)  經過 10,000 次疊代後之鍵結值向量 ; Self-...
<ul><li>類神經元在特徵映射圖中的機率分佈,的確可以反應出輸入向量的機率分佈。這裏要強調一點的是,資料的機率分佈特性並非是線性地反應於映射圖中。   </li></ul>三群高斯分佈之資料。   Self-Organizing Maps ...
Self-Organizing Maps (SOM) SOM  Algorithm 1.Init Map neuron. 2.input feature vector x. 3.find winner neuron  (BMN:Beat-Mat...
R*-Tree <ul><li>The R*-tree improves the performance of the R-tree by modifying the insertion and split algorithms by intr...
R*-Tree <ul><li>Each internal node contais an array of (p,  ) entries.Where p is a pointer to in child node of this inter...
R*-Tree (cont.) Space of point data
R*-Tree (cont.) Tree access structure
SOM-Based R*-Tree <ul><li>1 、 Clustering similar images </li></ul><ul><ul><li>We first generate the topological feature ma...
SOM-Based R*-Tree (cont.)
SOM-Based R*-Tree (cont.) <ul><li>2 、 Construction </li></ul><ul><ul><li>In order to construct the R*-tree, we select a CB...
Experiments <ul><li>We preformed experiments to compare the Som-base with SOM and R*-tree. </li></ul><ul><li>Image databas...
Experiments (cont.) <ul><li>Feature Extraction: </li></ul><ul><ul><li>use Haar waveletes to compute feature vector </li></...
Experiments (cont.) <ul><li>Construcion of SOM-based R*-tree </li></ul>
Experiments (cont.)
Experiments (cont.) <ul><li>We experimented with four type of searches:  </li></ul><ul><li>(I) normal SOM including empty ...
Experiments (cont.) <ul><li>Retrieval from SOM with empty nodes </li></ul><ul><li>Retrieval from SOM without empty nodes <...
Experiments (cont.)
Conclusion <ul><li>For high-dimensional data ,we using a topological feature map and a best-matching-image-list (BMIL) obt...
多媒體資料庫(New)3rd

Published in: Technology, Business
  • This chapter covers techniques for storing, processing, and searching multimedia data after it has been collected.
  • Transcript of "多媒體資料庫(New)3rd"

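    1. 1. Multimedia Databases
    2. 2. Outline <ul><li>Introduction </li></ul><ul><li>Challenges of multimedia databases </li></ul><ul><li>Multidimensional indexing techniques </li></ul><ul><li>Document databases </li></ul><ul><li>Image databases </li></ul><ul><li>Audio databases </li></ul><ul><li>Video databases </li></ul>
    3. 3. Introduction <ul><li>Multimedia data versus traditional databases </li></ul><ul><ul><li>Data content </li></ul></ul><ul><ul><ul><li>Traditional databases </li></ul></ul></ul><ul><ul><ul><ul><li>Stored as text; an entity or object is usually described by several attributes </li></ul></ul></ul></ul><ul><ul><ul><li>Multimedia databases </li></ul></ul></ul><ul><ul><ul><ul><li>Semantically rich media whose content cannot be described by a set of attributes alone </li></ul></ul></ul></ul><ul><ul><li>Data presentation </li></ul></ul><ul><ul><ul><li>Traditional databases </li></ul></ul></ul><ul><ul><ul><ul><li>Text and forms </li></ul></ul></ul></ul><ul><ul><ul><li>Multimedia databases </li></ul></ul></ul><ul><ul><ul><ul><li>Require richer visual and audio presentation </li></ul></ul></ul></ul>
    4. 4. Introduction <ul><li>Example </li></ul><ul><ul><li>Storing and handling images the way a traditional database would </li></ul></ul><ul><ul><ul><li>Queries that can be issued </li></ul></ul></ul><ul><ul><ul><ul><li>Find the pictures painted by XXX </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Find the pictures painted by OOO between 1945 and 1955 </li></ul></ul></ul></ul><ul><ul><ul><li>Queries that cannot be handled </li></ul></ul></ul><ul><ul><ul><ul><li>Find pictures similar to this picture </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Find pictures with a red car in the upper-left corner </li></ul></ul></ul></ul>
    5. 5. Introduction <ul><li>A multimedia database must provide </li></ul><ul><ul><li>Efficient storage of multimedia data </li></ul></ul><ul><ul><li>Content-based queries </li></ul></ul><ul><ul><ul><li>Queries about the content of the media itself </li></ul></ul></ul><ul><ul><li>Varied presentation of multimedia data </li></ul></ul>
    6. 6. Challenges of multimedia databases <ul><li>Handling large volumes of data </li></ul><ul><ul><li>Multimedia data needs far more storage than conventional data </li></ul></ul><ul><li>Indexing multidimensional data </li></ul><ul><ul><li>Fast search techniques </li></ul></ul><ul><li>Computing similarity </li></ul><ul><ul><li>Error-tolerant queries </li></ul></ul><ul><li>Presenting data </li></ul>
    7. 7. Multidimensional indexing techniques <ul><li>Returning query results quickly and correctly is an important problem </li></ul><ul><ul><li>With large data volumes, comparing records one by one takes too long </li></ul></ul><ul><ul><li>Avoid record-by-record comparison </li></ul></ul><ul><ul><ul><li>Build an index over the data to speed up queries </li></ul></ul></ul><ul><ul><ul><li>An index acts as a classification guide: following it leads to the data relevant to the query. </li></ul></ul></ul>
    8. 8. Multidimensional indexing techniques <ul><li>Common index structures in traditional databases </li></ul><ul><ul><li>B+-tree </li></ul></ul><ul><ul><ul><li>The most widely used index structure </li></ul></ul></ul><ul><ul><li>Hash </li></ul></ul><ul><ul><ul><li>Static hashing </li></ul></ul></ul><ul><ul><ul><li>Dynamic hashing </li></ul></ul></ul><ul><ul><li>Grid file </li></ul></ul><ul><ul><li>Bitmap index </li></ul></ul>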
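    9. 9. Introduction to the B+-tree <ul><li>A B+-tree is a tree structure with the following properties </li></ul><ul><ul><li>It is balanced: every path from a leaf node to the root has the same length </li></ul></ul><ul><ul><li>Every node that is neither the root nor a leaf has between ⌈n/2⌉ and n children </li></ul></ul><ul><ul><li>Every leaf node holds between ⌈(n-1)/2⌉ and n-1 values </li></ul></ul>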
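    10. 10. B+-tree node structure <ul><ul><li>K_i is a search-key value </li></ul></ul><ul><ul><li>P_i is a pointer to a child node (for nonleaf nodes) or a pointer to data (for leaf nodes). </li></ul></ul><ul><li>The search keys within a node are sorted: K_1 < K_2 < K_3 < ... < K_{n-1} </li></ul>
    11. 11. Leaf node structure <ul><li>Properties of a leaf node </li></ul><ul><li>For i = 1, 2, ..., n-1, pointer P_i points either to a data record with search-key value K_i or to a bucket that contains only records with search-key value K_i </li></ul><ul><li>P_n points to the next leaf node </li></ul>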
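    12. 12. Nonleaf node structure <ul><li>All search-key values in the subtree pointed to by P_i are less than K_i </li></ul><ul><li>All search-key values in the subtree pointed to by P_i (for i ≥ 2) are greater than or equal to K_{i-1} </li></ul>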
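Under the usual B+-tree convention that the subtree at P_i holds keys greater than or equal to K_{i-1} and less than K_i, choosing the child to descend into is a binary search over the node's keys. A minimal Python sketch, assuming a nonleaf node is stored as a sorted key list plus a pointer list that is one entry longer (the function name `choose_child` and this storage layout are illustrative assumptions):

```python
from bisect import bisect_right

def choose_child(keys, pointers, search_key):
    """Return the child pointer to follow in a nonleaf B+-tree node.

    keys     -- sorted search keys K_1 < K_2 < ... (length m)
    pointers -- child pointers P_1 ... P_{m+1} (length m + 1)
    """
    if len(pointers) != len(keys) + 1:
        raise ValueError("a nonleaf node needs one more pointer than keys")
    # bisect_right counts the keys that are <= search_key, so the selected
    # subtree holds values >= K_{i-1} and < K_i.
    return pointers[bisect_right(keys, search_key)]

print(choose_child([10, 20, 30], ["P1", "P2", "P3", "P4"], 25))  # P3
```

Repeating this step from the root down to a leaf takes one node access per level, which is why the balanced height matters for lookup cost.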
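    13. 13. Example
    14. 14. Discussion <ul><li>The B+-tree is very efficient for searching traditional tabular databases and is widely used </li></ul><ul><li>However </li></ul><ul><ul><li>The B+-tree is a one-dimensional index structure </li></ul></ul><ul><ul><li>Characteristics of multimedia data </li></ul></ul>
    15. 15. <ul><li>Features of multimedia data </li></ul><ul><ul><li>Documents </li></ul></ul><ul><ul><ul><li>Content </li></ul></ul></ul><ul><ul><ul><li>Keywords </li></ul></ul></ul><ul><ul><li>Images </li></ul></ul><ul><ul><ul><li>Dominant colors </li></ul></ul></ul><ul><ul><ul><li>Objects contained </li></ul></ul></ul><ul><ul><ul><li>Object sizes </li></ul></ul></ul><ul><ul><ul><li>Color distribution </li></ul></ul></ul><ul><ul><ul><li>Texture features, ... </li></ul></ul></ul>
    16. 16. <ul><ul><li>Music </li></ul></ul><ul><ul><ul><li>Rhythm </li></ul></ul></ul><ul><ul><ul><li>Chords </li></ul></ul></ul><ul><ul><ul><li>Pitch, ... </li></ul></ul></ul><ul><ul><li>Video </li></ul></ul><ul><ul><ul><li>Object motion trajectories </li></ul></ul></ul><ul><ul><ul><li>Objects contained </li></ul></ul></ul><ul><ul><ul><li>Color, ... </li></ul></ul></ul><ul><li>Content information can serve as the query condition </li></ul><ul><ul><li>Find images similar to a given image </li></ul></ul><ul><ul><li>Find songs containing a melody similar to a given passage </li></ul></ul><ul><ul><li>Find video clips in which a motorcycle jumps over a train </li></ul></ul>
    17. 17. <ul><li>A multimedia object is described by several features and can be represented as multidimensional data </li></ul><ul><li>However, the B-tree and B+-tree </li></ul><ul><ul><li>Can index only one-dimensional data </li></ul></ul><ul><ul><li>Are unsuitable for multidimensional data </li></ul></ul><ul><li>Building indexes over multidimensional data to speed up queries is therefore essential for searching multimedia data. </li></ul>
    18. 18. Multidimensional index structures <ul><li>k-d tree </li></ul><ul><ul><li>Stores k-dimensional data </li></ul></ul><ul><ul><li>Each level compares only one dimension </li></ul></ul><ul><ul><li>On the dimension compared at node N's level, every point in N's left subtree has a smaller value than N on that dimension, and every point in N's right subtree has a value greater than or equal to N's </li></ul></ul>
    19. 19. <ul><ul><ul><li>Example </li></ul></ul></ul>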
    20. 21. <ul><ul><li>In-class exercise </li></ul></ul><ul><ul><ul><li>Consider a k-d tree with k>2 </li></ul></ul></ul><ul><ul><ul><ul><li>Try it yourself </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Insert A(30,24,58), B(46,78,33), C(20,33,15), D(58,40,50), E(40,88,56), F(38,54,44) into a k-d tree </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Using the k-d tree you built, find the points within distance 15 of X(34,50,46) </li></ul></ul></ul></ul><ul><ul><ul><li>How should deletion be handled? </li></ul></ul></ul>
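One way to check the exercise is to build the tree in code. The sketch below is a minimal Python version, assuming the dimension compared at each level cycles through 0, 1, 2 and that "distance" means Euclidean distance (the slide does not name the metric); the names `Node`, `insert`, and `range_search` are illustrative.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    point: tuple
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def insert(root, point, depth=0):
    """Insert a point; smaller values go left, greater or equal go right."""
    if root is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root

def range_search(root, center, radius, depth=0, found=None):
    """Collect every stored point within `radius` of `center`."""
    if found is None:
        found = []
    if root is None:
        return found
    axis = depth % len(center)
    dist2 = sum((a - b) ** 2 for a, b in zip(root.point, center))
    if dist2 <= radius ** 2:
        found.append(root.point)
    # Left subtree values are < node value, so skip it when the query ball
    # lies entirely at or above the node value on this axis.
    if center[axis] - radius < root.point[axis]:
        range_search(root.left, center, radius, depth + 1, found)
    # Right subtree values are >= node value.
    if center[axis] + radius >= root.point[axis]:
        range_search(root.right, center, radius, depth + 1, found)
    return found

root = None
for p in [(30, 24, 58), (46, 78, 33), (20, 33, 15),
          (58, 40, 50), (40, 88, 56), (38, 54, 44)]:
    root = insert(root, p)
print(range_search(root, (34, 50, 46), 15))  # [(38, 54, 44)], i.e. point F
```

Under these assumptions only F qualifies: its squared distance from X is 16 + 16 + 4 = 36, while every other point is more than 15 away.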
    21. 22. <ul><ul><li>Advantages </li></ul></ul><ul><ul><ul><li>Simple </li></ul></ul></ul><ul><ul><li>Disadvantages </li></ul></ul><ul><ul><ul><li>The height of the tree depends on the order in which the data is inserted </li></ul></ul></ul><ul><ul><ul><li>The tree can easily become skewed </li></ul></ul></ul><ul><ul><ul><ul><li>Search efficiency then becomes very poor </li></ul></ul></ul></ul><ul><ul><ul><li>Deletion is relatively complicated </li></ul></ul></ul>
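    22. 23. Multidimensional index structures <ul><li>MX-quadtree </li></ul><ul><ul><li>The shape of the tree does not depend on the number of points inserted or on their insertion order. </li></ul></ul><ul><ul><li>The designer must choose a value k, and once chosen, k cannot be changed. </li></ul></ul><ul><ul><li>The whole map is divided into a 2^k × 2^k grid of cells </li></ul></ul><ul><ul><li>Deletion and insertion are very simple </li></ul></ul>
    23. 24. <ul><ul><ul><li>Example </li></ul></ul></ul><ul><ul><ul><ul><li>Assume k=2 </li></ul></ul></ul></ul><ul><ul><ul><ul><li>The map is divided into a 2^2 × 2^2 = 4 × 4 grid of cells </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Place the four points A, B, C, D into the MX-quadtree </li></ul></ul></ul></ul>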
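    24. 26. Multidimensional index structures <ul><li>R-tree </li></ul><ul><ul><li>A balanced tree </li></ul></ul><ul><ul><li>Very useful for storing large amounts of data </li></ul></ul><ul><ul><li>Can greatly reduce the number of disk accesses </li></ul></ul><ul><ul><li>An R-tree node has k pointers </li></ul></ul><ul><ul><li>Except for the root and the leaf nodes, every node must contain between ⌈k/2⌉ and k non-null pointers </li></ul></ul><ul><ul><ul><li>This controls the number of disk accesses </li></ul></ul></ul>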
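    25. 27. <ul><ul><li>Leaf nodes contain the actual data </li></ul></ul><ul><ul><li>Internal nodes contain the bounding outline of a group of data, represented as a rectangle </li></ul></ul><ul><ul><ul><li>Upper-left and lower-right corners </li></ul></ul></ul><ul><ul><ul><li>May be multidimensional </li></ul></ul></ul><ul><ul><li>Insertion and deletion involve splitting and merging nodes and are relatively complicated. </li></ul></ul>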
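The bounding rectangle kept in an internal node can be computed from the points beneath it, and a search only descends into rectangles that overlap the query window. A minimal two-dimensional Python sketch, assuming a rectangle is stored as (x_min, y_min, x_max, y_max) rather than as the upper-left and lower-right corners (the two forms carry the same information); the function names and sample coordinates are hypothetical.

```python
def mbr(points):
    """Minimum bounding rectangle of 2-D points as (x_min, y_min, x_max, y_max)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

def intersects(a, b):
    """True if rectangles a and b overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

group = [(1, 5), (3, 2), (2, 4)]          # hypothetical points in one leaf
box = mbr(group)
print(box)                                 # (1, 2, 3, 5)
print(intersects(box, (4, 0, 6, 1)))       # False: this subtree can be skipped
```

After a deletion or a merge, recomputing `mbr` over the remaining entries gives the updated corner values for the affected node.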
    26. 28. <ul><ul><li>Example </li></ul></ul><ul><ul><ul><li>Eight objects in total </li></ul></ul></ul><ul><ul><ul><li>Two-dimensional space </li></ul></ul></ul><ul><ul><ul><li>Assume k=3 </li></ul></ul></ul>
    27. 29. Inserting p14 [Figure: R-tree with bounding rectangles R1 to R7 over points p1 to p13; leaf entries are pointers to data tuples]
    28. 30. Inserting p14 [Figure: the same R-tree with p14 added]
    29. 31. <ul><ul><li>Deleting p2 </li></ul></ul>[Figure: R-tree with bounding rectangles R1 to R7 over points p1 to p14]
    30. 32. <ul><ul><ul><li>Find the MBR (minimum bounding rectangle) that contains p2 </li></ul></ul></ul>[Figure: the same R-tree]
    31. 33. <ul><ul><li>R3 no longer satisfies the R-tree definition (underflow) </li></ul></ul>[Figure: the R-tree with p2 removed]
    32. 34. <ul><ul><li>Merge with a neighboring bounding rectangle </li></ul></ul>[Figure: the R-tree with p2 removed]
    33. 35. <ul><ul><ul><li>Reorganize R3 and R4, updating the upper-left and lower-right corner values of each </li></ul></ul></ul>[Figure: the reorganized R-tree]
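    34. 36. Another simple multidimensional index structure that combines single-dimension indexes <ul><li>The features of a multimedia object are multidimensional </li></ul><ul><ul><li>Suppose we use the average R (red), G (green), and B (blue) values of an image as its features </li></ul></ul><ul><ul><li>The feature space is three-dimensional </li></ul></ul><ul><ul><li>Example </li></ul></ul>
    35. 37. <ul><li>Assume the database contains ten images </li></ul>P1 P3 P2 P5 P4
    36. 38. P6 P10 P9 P8 P7
    37. 39. <ul><li>Consider the following three queries </li></ul>
    38. 40. <ul><li>We obtain the following average R, G, B values </li></ul>
    39. 41. <ul><li>Case 1: Suppose we want to find the images within distance 0.15 of the query image Q<sub>1</sub> </li></ul><ul><ul><li>Q<sub>1</sub> = (0.478, 0.541, 0.753), r = 0.15 </li></ul></ul><ul><ul><li>For each dimension, find the value in the database that is nearest to the query value </li></ul></ul><ul><ul><ul><li>(0.451, 0.447, 0.561) </li></ul></ul></ul><ul><ul><ul><li>(|0.478-0.451|, |0.541-0.447|, |0.753-0.561|) </li></ul></ul></ul><ul><ul><ul><li>=> (0.027, 0.094, 0.192) </li></ul></ul></ul><ul><ul><ul><li>The third difference, 0.192, is greater than 0.15, so no data matches the query </li></ul></ul></ul>
    40. 42. <ul><li>Case 2: Suppose we want to find the images within distance 0.02 of the query image Q<sub>2</sub> </li></ul><ul><ul><li>Q<sub>2</sub> = (0.302, 0.310, 0.416), r = 0.02 </li></ul></ul><ul><ul><li>Step 1: For each dimension, find the value in the database that is nearest to the query value </li></ul></ul><ul><ul><ul><li>(0.318, 0.302, 0.400) </li></ul></ul></ul><ul><ul><ul><li>(|0.302-0.318|, |0.310-0.302|, |0.416-0.400|) </li></ul></ul></ul>All three differences are smaller than 0.02, so Step 2 is required
    41. 43. <ul><ul><li>Step 2: Check the dimensions one at a time, starting with the first, and subtract from the allowed error the amount already used by earlier dimensions </li></ul></ul><ul><ul><ul><li>Error budget available to the first dimension: 0.02 </li></ul></ul></ul><ul><ul><ul><li>Error budget available to the second dimension: (0.02<sup>2</sup> - 0.016<sup>2</sup>)<sup>1/2</sup> = 0.012 </li></ul></ul></ul><ul><ul><ul><li>Error budget available to the third dimension: (0.02<sup>2</sup> - 0.016<sup>2</sup> - 0.008<sup>2</sup>)<sup>1/2</sup>, about 0.009 </li></ul></ul></ul>The known minimum error in the third dimension, 0.016, is greater than this budget, so no data in the database matches the query
    42. 44. <ul><li>Case 3: Suppose we want to find the images within distance 0.05 of the query image Q<sub>3</sub> </li></ul><ul><ul><li>Q<sub>3</sub> = (0.302, 0.223, 0.161), r = 0.05 </li></ul></ul><ul><ul><li>(|0.302-0.318|, |0.223-0.200|, |0.161-0.161|) </li></ul></ul><ul><ul><li>=> (0.016, 0.023, 0), so the minimum errors are ordered G > R > B </li></ul></ul><ul><ul><li>Step 1 cannot rule out that the database contains matching data </li></ul></ul>
    43. 45. <ul><ul><li>Search the index on G for the images in the range 0.223 ± 0.05 = [0.173, 0.273] </li></ul></ul><ul><ul><ul><li>{P2, P6, P7} </li></ul></ul></ul><ul><ul><li>Search the index on R for the images in the range 0.302 ± 0.044 = [0.258, 0.346] </li></ul></ul><ul><ul><ul><li>{P2, P8} </li></ul></ul></ul><ul><ul><li>Search the index on B for the images in the range 0.161 ± 0.041 = [0.120, 0.202] </li></ul></ul><ul><ul><ul><li>{P2, P5, P9} </li></ul></ul></ul><ul><ul><li>Combining the results gives P2 as the image that matches the query </li></ul></ul>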
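The error budgets used in Cases 1 to 3 can be reproduced with a few lines of code. This is a minimal sketch, assuming Euclidean distance and that the dimensions are processed in the order given; the function name error_budgets is hypothetical.

```python
import math

def error_budgets(radius, min_errors):
    """Per-dimension error budget when dimensions are checked in order.

    The budget of dimension i is sqrt(radius^2 - sum of the squared minimum
    errors of the dimensions before i). Returns None as soon as a budget is
    smaller than that dimension's minimum error, which means that no image
    can lie within `radius` of the query.
    """
    budgets, used = [], 0.0
    for err in min_errors:
        remaining = radius ** 2 - used
        budget = math.sqrt(remaining) if remaining > 0 else 0.0
        if budget < err:
            return None
        budgets.append(budget)
        used += err ** 2
    return budgets

case1 = error_budgets(0.15, [0.027, 0.094, 0.192])  # R, G, B order
case2 = error_budgets(0.02, [0.016, 0.008, 0.016])  # R, G, B order
case3 = error_budgets(0.05, [0.023, 0.016, 0.0])    # G, R, B order
print(case1, case2, [round(b, 3) for b in case3])
```

Cases 1 and 2 return None (no match is possible), while Case 3 yields budgets of 0.05, 0.044, and 0.041, which are the half-widths of the three index range searches on G, R, and B.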
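    44. 46. Algorithm
    45. 52. Document databases <ul><li>Introduction </li></ul><ul><ul><li>Analysis of document content </li></ul></ul><ul><ul><ul><li>Synonymy </li></ul></ul></ul><ul><ul><ul><li>Polysemy </li></ul></ul></ul><ul><ul><li>Evaluating search results </li></ul></ul><ul><ul><ul><li>Precision </li></ul></ul></ul><ul><ul><ul><ul><li>The probability that a retrieved document is relevant </li></ul></ul></ul></ul><ul><ul><ul><li>Recall </li></ul></ul></ul><ul><ul><ul><ul><li>The probability that a relevant document is retrieved </li></ul></ul></ul></ul>
    46. 53. Document databases <ul><ul><ul><li>Precision = |relevant documents ∩ search results| / |search results| </li></ul></ul></ul><ul><ul><ul><li>Recall = |relevant documents ∩ search results| / |relevant documents| </li></ul></ul></ul>[Figure: Venn diagram of all documents, relevant documents, and search results]
    47. 54. Document databases <ul><ul><ul><li>Example of computing precision and recall </li></ul></ul></ul><ul><ul><ul><ul><li>Discuss the relationship between precision and recall </li></ul></ul></ul></ul>[Figure: Venn diagram of all documents, relevant documents, and search results, with region sizes 50, 150, 20]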
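The two measures follow directly from set sizes. A minimal sketch is shown below; the document IDs are made up for illustration and are not the numbers in the slide's figure.

```python
def precision_recall(relevant, retrieved):
    """Return (precision, recall) for two sets of document IDs."""
    hits = len(relevant & retrieved)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical data: 5 relevant documents, 4 retrieved, 2 in common.
relevant = {1, 2, 3, 4, 5}
retrieved = {4, 5, 6, 7}
precision, recall = precision_recall(relevant, retrieved)
print(precision, recall)  # 0.5 0.4
```

Retrieving more documents can only keep or raise recall, but it tends to lower precision, which is the trade-off the slide asks the reader to discuss.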
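    48. 55. Document databases <ul><li>Describing document content </li></ul><ul><ul><li>Stop lists </li></ul></ul><ul><ul><ul><li>Words in a document that can be ignored, such as: a, the, he… </li></ul></ul></ul><ul><ul><li>Word stems </li></ul></ul><ul><ul><ul><li>The different tenses, singular and plural forms, and so on of the same word </li></ul></ul></ul><ul><ul><li>Frequency tables </li></ul></ul>
<table>
<tr><th>Term / Document</th><th>d1</th><th>d2</th><th>d3</th><th>d4</th></tr>
<tr><td>sex</td><td>1</td><td>0</td><td>0</td><td>0</td></tr>
<tr><td>drug</td><td>1</td><td>0</td><td>1</td><td>3</td></tr>
<tr><td>videotape</td><td>1</td><td>0</td><td>0</td><td>3</td></tr>
<tr><td>connection</td><td>0</td><td>0</td><td>0</td><td>2</td></tr>
<tr><td>slip</td><td>0</td><td>2</td><td>2</td><td>0</td></tr>
<tr><td>boat</td><td>0</td><td>1</td><td>0</td><td>0</td></tr>
</table>
    49. 56. Document databases <ul><li>Query processing </li></ul><ul><ul><li>Computing document relevance </li></ul></ul><ul><ul><ul><li>Term distance </li></ul></ul></ul><ul><ul><ul><li>Cosine distance </li></ul></ul></ul>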
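As an illustration of the cosine measure, the sketch below ranks documents d1 to d4 against a query vector. The vectors are read off the frequency table from the previous slide, and the query (the terms "drug" and "videotape") is a made-up example.

```python
import math

# Term order: sex, drug, videotape, connection, slip, boat
docs = {
    "d1": [1, 1, 1, 0, 0, 0],
    "d2": [0, 0, 0, 0, 2, 1],
    "d3": [0, 1, 0, 0, 2, 0],
    "d4": [0, 3, 3, 2, 0, 0],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

query = [0, 1, 1, 0, 0, 0]  # hypothetical query: "drug videotape"
ranking = sorted(docs, key=lambda d: cosine_similarity(docs[d], query), reverse=True)
print(ranking)  # ['d4', 'd1', 'd3', 'd2']
```

Because the cosine ignores vector length, d4 ranks first even though it also contains an unrelated term, while d2 shares no term with the query and scores 0.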
    50. 57. Document databases <ul><ul><li>Query types </li></ul></ul><ul><ul><ul><li>Find the documents that contain certain terms </li></ul></ul></ul><ul><ul><ul><li>Find the documents that contain certain terms but not certain other terms </li></ul></ul></ul><ul><ul><ul><li>Find the document nearest to the query vector </li></ul></ul></ul><ul><ul><ul><li>Find the k documents nearest to the query vector </li></ul></ul></ul><ul><ul><ul><li>Find the documents within a given distance of the query vector </li></ul></ul></ul>
    51. 58. Document databases <ul><ul><li>Using indexes </li></ul></ul><ul><ul><ul><li>R-tree </li></ul></ul></ul><ul><ul><ul><ul><li>Not suitable as a high-dimensional index structure </li></ul></ul></ul></ul><ul><ul><ul><li>TV-tree </li></ul></ul></ul><ul><ul><ul><ul><li>Similar to the R-tree, but each node considers only a subset of the dimensions </li></ul></ul></ul></ul><ul><ul><ul><li>Inverted list </li></ul></ul></ul><ul><ul><ul><li>Signature files </li></ul></ul></ul>
    52. 59. Document databases <ul><li>Inverted list </li></ul><ul><ul><li>An inverted table organized by term </li></ul></ul><ul><ul><li>Using the frequency table as an example </li></ul></ul><ul><ul><ul><li>Sex: d1 </li></ul></ul></ul><ul><ul><ul><li>Drug: d1, d3, d4 </li></ul></ul></ul><ul><ul><ul><li>Videotape: d1, d4 </li></ul></ul></ul><ul><ul><li>Query examples and types </li></ul></ul><ul><ul><ul><li>and, or, not </li></ul></ul></ul><ul><ul><ul><li>Cannot handle similarity queries </li></ul></ul></ul><ul><ul><li>Disadvantages </li></ul></ul><ul><ul><ul><li>Large size </li></ul></ul></ul><ul><ul><ul><ul><li>Compression techniques </li></ul></ul></ul></ul>
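A minimal sketch of an inverted list and the and/or/not queries it supports is given below. The term frequencies are read off the frequency table shown earlier; the variable names are illustrative.

```python
from collections import defaultdict

# Document -> {term: frequency}, as read off the frequency table.
freq = {
    "d1": {"sex": 1, "drug": 1, "videotape": 1},
    "d2": {"slip": 2, "boat": 1},
    "d3": {"drug": 1, "slip": 2},
    "d4": {"drug": 3, "videotape": 3, "connection": 2},
}

# Invert the mapping: term -> set of documents that contain it.
index = defaultdict(set)
for doc, terms in freq.items():
    for term in terms:
        index[term].add(doc)

drug_and_videotape = index["drug"] & index["videotape"]   # AND
sex_or_boat = index["sex"] | index["boat"]                # OR
drug_not_videotape = index["drug"] - index["videotape"]   # AND NOT
print(sorted(index["drug"]), sorted(drug_and_videotape),
      sorted(sex_or_boat), sorted(drug_not_videotape))
```

Boolean queries reduce to set intersection, union, and difference on the posting sets, which is why this structure answers containment queries quickly but has no notion of similarity.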
    53. 60. Document databases <ul><li>Signature files </li></ul><ul><ul><li>Each keyword has a corresponding code </li></ul></ul><ul><ul><li>The signature of a document is obtained by superimposing the codes of the keywords it contains </li></ul></ul><ul><ul><li>Search example </li></ul></ul><ul><ul><li>Query types that can be handled </li></ul></ul><ul><ul><ul><li>And, Or </li></ul></ul></ul><ul><ul><ul><li>Not ? </li></ul></ul></ul>
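The superimposed-coding idea can be sketched as follows. The 16-bit signature width and the hash-based keyword codes are assumptions made for this example; the slides do not specify how codes are assigned.

```python
import hashlib

BITS = 16  # assumed signature width

def code(word):
    """Hypothetical keyword code: up to two bits chosen by hashing the word."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return (1 << (digest[0] % BITS)) | (1 << (digest[1] % BITS))

def signature(words):
    """Superimpose (bitwise OR) the codes of all keywords."""
    sig = 0
    for word in words:
        sig |= code(word)
    return sig

def may_contain_all(doc_sig, words):
    """AND query: every bit of the query signature must be set in the document."""
    query_sig = signature(words)
    return doc_sig & query_sig == query_sig

doc_sig = signature(["drug", "videotape", "connection"])
print(may_contain_all(doc_sig, ["drug", "videotape"]))  # True
```

The filter never misses a document that really contains the keywords, but it can report false positives, so candidates still have to be verified against the text. That also hints at why NOT queries are awkward: a set bit does not prove that a particular keyword is present.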
    54. 61. Document databases <ul><ul><ul><li>Discussion </li></ul></ul></ul><ul><ul><ul><ul><li>R-tree and TV-tree can handle similarity queries </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Inverted indices and signature files cannot handle similarity queries; they can only handle queries for documents that contain certain terms </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Signature files are not well suited to queries for documents that do not contain certain terms </li></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Give an example to explain why </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><li>R-tree is not suitable for high-dimensional data </li></ul></ul></ul></ul>
    55. 62. Image databases <ul><li>Query examples </li></ul><ul><ul><li>Example 1: Find the images that look like this image </li></ul></ul><ul><ul><li>Example 2: Find all images that have a red square in the upper-left corner and are blue in the lower part </li></ul></ul><ul><li>Information that can represent an image </li></ul><ul><ul><li>Information unrelated to the image content </li></ul></ul><ul><ul><ul><li>Author </li></ul></ul></ul><ul><ul><ul><li>Time of creation </li></ul></ul></ul><ul><ul><ul><li>Place of creation </li></ul></ul></ul><ul><ul><ul><li>etc. </li></ul></ul></ul>
    56. 63. <ul><ul><li>Features related to the image content </li></ul></ul><ul><ul><ul><li>Color distribution </li></ul></ul></ul><ul><ul><ul><ul><li>Can be represented by a color histogram </li></ul></ul></ul></ul><ul><ul><ul><li>Texture </li></ul></ul></ul><ul><ul><ul><li>Contained objects </li></ul></ul></ul><ul><ul><ul><ul><li>Shape </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Color </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Size </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Position </li></ul></ul></ul></ul><ul><ul><ul><li>Dominant colors </li></ul></ul></ul><ul><ul><ul><li>etc. </li></ul></ul></ul>
    57. 64. Image database search <ul><li>Query by keyword </li></ul><ul><ul><li>Each image is described by textual attributes; an index can be built on each attribute, and queries can be issued in SQL </li></ul></ul><ul><li>Query by example (QBE): </li></ul><ul><ul><li>The user shows the system an example image, and the system decides which answers to return based on the similarity between each image in the database and the example </li></ul></ul><ul><ul><li>Query types </li></ul></ul><ul><ul><ul><li>Find the image nearest to the query example </li></ul></ul></ul><ul><ul><ul><li>Find the k images nearest to the query example </li></ul></ul></ul><ul><ul><ul><li>Find the images within a given distance of the query example </li></ul></ul></ul>
    58. 65. Image distance and similarity <ul><li>Color Similarity </li></ul><ul><li>Texture Similarity </li></ul><ul><li>Shape Similarity </li></ul><ul><li>Object &amp; Relationship similarity </li></ul>
    59. 66. Color similarity <ul><li>Proportion of each color </li></ul><ul><ul><li>Ex: R:20%, G:50%, B:30% </li></ul></ul><ul><li>Color histogram </li></ul><ul><li>d<sub>hist</sub>(I,Q) = (h(I) - h(Q))<sup>T</sup> A (h(I) - h(Q)) </li></ul><ul><ul><li>A is a similarity matrix: colors that are very similar should have similarity values close to one. </li></ul></ul>
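The quadratic-form histogram distance can be evaluated directly from its definition. In the sketch below the three-bin histograms and the similarity matrices are made-up values chosen only to show the effect of A.

```python
def quadratic_form_distance(h_i, h_q, a):
    """d_hist(I, Q) = (h(I) - h(Q))^T A (h(I) - h(Q))."""
    diff = [x - y for x, y in zip(h_i, h_q)]
    n = len(diff)
    return sum(diff[r] * a[r][c] * diff[c] for r in range(n) for c in range(n))

h_image = [0.2, 0.5, 0.3]  # hypothetical 3-bin histograms
h_query = [0.3, 0.5, 0.2]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Assumed similarity matrix: bins 0 and 2 are treated as similar colors.
similar = [[1, 0, 0.5], [0, 1, 0], [0.5, 0, 1]]

d_plain = quadratic_form_distance(h_image, h_query, identity)
d_similar = quadratic_form_distance(h_image, h_query, similar)
print(d_plain, d_similar)
```

With the identity matrix the measure is the squared Euclidean distance (about 0.02 here). When bins 0 and 2 are declared similar, mass that moves between them is penalized less and the distance drops to about 0.01.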
    60. 67. Color layout <ul><li>Color layout matching : compares each grid square of the query to the corresponding grid square of a potential matching image and combines the results into a single image distance: d<sub>gridded_color</sub>(I,Q) = Σ<sub>g</sub> d<sub>color</sub>(C<sup>I</sup>(g), C<sup>Q</sup>(g)) </li></ul><ul><li>where C<sup>I</sup>(g) represents the color in grid square g of a database image I and C<sup>Q</sup>(g) represents the color in the corresponding grid square g of the query image Q. Some suitable representations of color are </li></ul><ul><ul><ul><li>Mean </li></ul></ul></ul><ul><ul><ul><li>Mean and standard deviation </li></ul></ul></ul><ul><ul><ul><li>Multi-bin histogram </li></ul></ul></ul>
    61. 68. Texture similarity <ul><li>Pick and click </li></ul><ul><ul><li>Suppose T(I) is a texture description vector which is a vector of numbers that summarizes the texture in a given image I (for example: Laws texture energy measures), then the texture distance measure is defined by </li></ul></ul><ul><li>Texture layout </li></ul>
    62. 69. Shape similarity <ul><li>Shape Histogram </li></ul><ul><li>Boundary Matching </li></ul><ul><li>Sketch Matching </li></ul>
    63. 70. <ul><li>Judging by content, do these three images look alike? </li></ul><ul><ul><li>Color information </li></ul></ul><ul><ul><li>Position information </li></ul></ul>Implementation example: Image-A, Image-B, Image-C
    64. 71. Color model: RGB color space vs. HSV color space
    65. 72. <ul><li>Obtaining color and position information </li></ul><ul><ul><li>Divide the image into grid cells </li></ul></ul><ul><ul><li>Find the representative color of each cell </li></ul></ul><ul><ul><li>Adjacent cells with the same color are merged into larger cells </li></ul></ul><ul><ul><li>The color, position, and shape of the resulting large regions are the key information used later for matching. </li></ul></ul>
    66. 74. Similarity comparison <ul><li>Which factors might a user consider when judging whether two images are similar? </li></ul><ul><ul><li>Color layout </li></ul></ul><ul><ul><li>Color distribution </li></ul></ul><ul><ul><li>Object position </li></ul></ul><ul><ul><li>Object size </li></ul></ul><ul><ul><li>Object shape </li></ul></ul>
    67. 77. Example: query image with regions A (13), B (10), and C (5); SIZE = 13 + 10 + 5 = 28
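    68. 78. Music databases <ul><li>Listen to the following songs or music clips. Do you know their titles? Music 1 2 3 4 5 6 7 8 9 10 </li></ul><ul><li>How did you recognize each song? How would we design a computer to do the same thing? </li></ul>
    69. 79. Features of music <ul><li>Static music information, such as the key signature and time signature </li></ul><ul><li>Acoustical features, such as loudness and pitch </li></ul><ul><li>Thematic features, such as melodies, rhythms, and chords, for example “sol-sol-sol-mi”, “0.5-0.5-0.5-2”, and “C-Am-Dm-G7” </li></ul><ul><li>Structural features: the two basic rules of classical music form, the hierarchical rule and the repetition rule </li></ul>
    70. 80. Feature sampling <ul><li>Relative pitch vs. absolute pitch: melody transposition </li></ul><ul><ul><li>Consider the problems caused by matching on absolute pitch </li></ul></ul><ul><ul><ul><li>Problems that arise when the key is raised or lowered </li></ul></ul></ul><ul><ul><li>Sampling the rhythm has the same problem </li></ul></ul><ul><li>Extract patterns from complete passages </li></ul><ul><li>The sampling problem for multiple tracks </li></ul>
    71. 81. Feature encoding <ul><li>After the features have been extracted, encode them with a suitable encoding scheme </li></ul><ul><ul><li>It must cope with raised or lowered pitch </li></ul></ul><ul><ul><li>It must cope with faster or slower tempo </li></ul></ul><ul><ul><li>Music that sounds similar should be encoded into codes that are close to each other </li></ul></ul>
    72. 82. Example <ul><li>Use important note sequences that occur repeatedly to represent a song </li></ul><ul><ul><li>Hierarchical rule: music object -> movements -> sentences -> phrases -> figures </li></ul></ul><ul><ul><li>Repetition rule, such as “C6-Ab5-Ab5-C6” and “F6-C6-C6-Eb6” </li></ul></ul>
    73. 83. Repeating patterns: definition <ul><li>For a substring X of a sequence of notes S, if X appears more than once in S, we call X a repeating pattern of S. The repeating frequency of the repeating pattern X, denoted as freq(X), is the number of appearances of X in S. The length of the repeating pattern X, denoted |X|, is the number of notes in X. </li></ul>
    74. 84. Repeating patterns: example <ul><li>“C-D-E-F-C-D-E-C-D-E-F” RP: Repeating Pattern, RPF: Repeating Pattern Frequency </li></ul>
<table>
<tr><th>RP</th><td>C-D-E-F</td><td>C-D-E</td><td>D-E-F</td><td>C-D</td><td>D-E</td><td>E-F</td><td>C</td><td>D</td><td>E</td><td>F</td></tr>
<tr><th>RPF</th><td>2</td><td>3</td><td>2</td><td>3</td><td>3</td><td>2</td><td>3</td><td>3</td><td>3</td><td>2</td></tr>
</table>
    75. 85. Repeating patterns <ul><li>Definition of nontrivial: A repeating pattern X is nontrivial if and only if there does not exist another repeating pattern Y such that freq(X)=freq(Y) and X is a substring of Y. </li></ul><ul><li>Example: of the 10 RPs on the previous slide, only “C-D-E-F” and “C-D-E” are nontrivial </li></ul>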
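The definitions above can be checked by brute force on the example string. The sketch below simply enumerates every substring, which is fine for short melodies but is not one of the efficient algorithms described on the following slides.

```python
def repeating_patterns(notes):
    """Return {pattern: frequency} for every substring that occurs more than once."""
    counts = {}
    n = len(notes)
    for i in range(n):
        for j in range(i + 1, n + 1):
            pattern = tuple(notes[i:j])
            counts[pattern] = counts.get(pattern, 0) + 1
    return {p: f for p, f in counts.items() if f > 1}

def is_substring(x, y):
    """True if the note tuple x occurs contiguously inside y."""
    return any(y[i:i + len(x)] == x for i in range(len(y) - len(x) + 1))

def nontrivial(rps):
    """Keep X unless some other repeating pattern Y contains X with the same frequency."""
    return {x: f for x, f in rps.items()
            if not any(y != x and rps[y] == f and is_substring(x, y) for y in rps)}

melody = "C-D-E-F-C-D-E-C-D-E-F".split("-")
rps = repeating_patterns(melody)
result = nontrivial(rps)
print(len(rps), result)
```

For this melody the code finds the 10 repeating patterns listed in the table and reports “C-D-E-F” (frequency 2) and “C-D-E” (frequency 3) as the only nontrivial ones.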
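    76. 86. The Correlative-Matrix(1) <ul><li>Phrase </li></ul><ul><li>Melody string S = “C6-Ab5-Ab5-C6-C6-Ab5-Ab5-C6-Db5-C6-Bb5-C6” </li></ul><ul><li>Repeating Patterns </li></ul>
<table>
<tr><th>RP</th><th>PL (Pattern Length)</th><th>RPF</th></tr>
<tr><td>C6-Ab5-Ab5-C6</td><td>4</td><td>2</td></tr>
<tr><td>C6</td><td>1</td><td>6</td></tr>
<tr><td>Ab5</td><td>1</td><td>4</td></tr>
</table>
    77. 87. The Correlative-Matrix(2) Construction of correlative matrix T<sub>12,12</sub> [Figure: upper-triangular 12×12 matrix whose rows and columns are labeled with the notes of S. Nonzero entries, by row: T<sub>1,4</sub>=1, T<sub>1,5</sub>=1, T<sub>1,8</sub>=1, T<sub>1,10</sub>=1, T<sub>1,12</sub>=1; T<sub>2,3</sub>=1, T<sub>2,6</sub>=2, T<sub>2,7</sub>=1; T<sub>3,6</sub>=1, T<sub>3,7</sub>=3; T<sub>4,5</sub>=1, T<sub>4,8</sub>=4, T<sub>4,10</sub>=1, T<sub>4,12</sub>=1; T<sub>5,8</sub>=1, T<sub>5,10</sub>=1, T<sub>5,12</sub>=1; T<sub>6,7</sub>=1; T<sub>8,10</sub>=1, T<sub>8,12</sub>=1; T<sub>10,12</sub>=1]
    78. 88. The Correlative-Matrix(3) <ul><li>Find all RPs and their RFs. </li></ul><ul><ul><li>Define a candidate set CS whose entries have the form (pattern, rep_count, sub_count) </li></ul></ul><ul><ul><li>CS is initially empty; RPs are then computed from T and inserted into CS </li></ul></ul><ul><ul><li>Because the conditions are (T<sub>i,j</sub> = 1) or (T<sub>i,j</sub> > 1), and (T<sub>(i+1),(j+1)</sub> = 0) or (T<sub>(i+1),(j+1)</sub> ≠ 0), there are the following four cases </li></ul></ul>
    79. 89. The Correlative-Matrix(4) <ul><ul><li>Case 1: (T<sub>i,j</sub> = 1) and (T<sub>(i+1),(j+1)</sub> = 0), e.g. T<sub>1,4</sub> = 1, T<sub>2,5</sub> = 0: insert (“C6”,1,0) into CS </li></ul></ul><ul><ul><li>Case 2: (T<sub>i,j</sub> = 1) and (T<sub>(i+1),(j+1)</sub> ≠ 0), e.g. T<sub>1,5</sub> = 1, T<sub>2,6</sub> = 2: modify (“C6”,1,0) into (“C6”,2,1) </li></ul></ul><ul><ul><li>Case 3: (T<sub>i,j</sub> > 1) and (T<sub>(i+1),(j+1)</sub> ≠ 0), e.g. T<sub>2,6</sub> = 2, T<sub>3,7</sub> = 3: insert (“C6-Ab5”,1,1) and (“Ab5”,1,1) into CS </li></ul></ul><ul><ul><li>Case 4: (T<sub>i,j</sub> > 1) and (T<sub>(i+1),(j+1)</sub> = 0), e.g. T<sub>4,8</sub> = 4, T<sub>5,9</sub> = 0: insert (“C6-Ab5-Ab5-C6”,1,0), (“Ab5-Ab5-C6”,1,1) and (“Ab5-C6”,1,1) into CS and change (“C6”,6,1) into (“C6”,7,2) </li></ul></ul>
    80. 90. The Correlative-Matrix(5) <ul><ul><li>Computing the RF: rep_count = 0.5f(f-1), that is, f = (1 + SQRT(1 + 8*rep_count))/2. In this example (“C6”,15,1) means that the rep_count of C6 is 15, so f = (1 + SQRT(1 + 8×15))/2 = 6. Similarly, the RF of “Ab5” is 4 and the RF of “C6-Ab5-Ab5-C6” is 2 </li></ul></ul>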
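The matrix construction and the frequency formula can be sketched as follows. The recurrence used here (an entry is one more than its upper-left neighbor when the two notes match, and zero otherwise) is inferred from the example values T<sub>1,5</sub>=1, T<sub>2,6</sub>=2, T<sub>3,7</sub>=3, T<sub>4,8</sub>=4; the code uses 0-based indices and does not implement the candidate-set bookkeeping of the four cases.

```python
import math

def correlative_matrix(s):
    """Upper-triangular matrix: t[i][j] is the length of the match ending at i and j (i < j)."""
    n = len(s)
    t = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if s[i] == s[j]:
                t[i][j] = 1 + (t[i - 1][j - 1] if i > 0 else 0)
    return t

def frequency_from_rep_count(rep_count):
    """Invert rep_count = f(f-1)/2 to recover the repeating frequency f."""
    return round((1 + math.sqrt(1 + 8 * rep_count)) / 2)

melody = "C6-Ab5-Ab5-C6-C6-Ab5-Ab5-C6-Db5-C6-Bb5-C6".split("-")
T = correlative_matrix(melody)

# Slide entries T(1,4), T(2,6), T(3,7), T(4,8) in 0-based form:
print(T[0][3], T[1][5], T[2][6], T[3][7])  # 1 2 3 4
print(frequency_from_rep_count(15))        # 6
```

The rep_count of a pattern is the number of pairs of its occurrences, which is why f occurrences give f(f-1)/2 and why a rep_count of 15 for C6 corresponds to a frequency of 6.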
    81. 91. The String-Join Approach(1) <ul><li>Melody string “C-D-E-F-C-D-E-C-D-E-F” </li></ul><ul><li>Step 1: Find all RPs of length 1 and record each as {X, freq(X), (position1, position2, …)}. In this example we find {“C”,3,(1,5,8)}, {“D”,3,(2,6,9)}, {“E”,3,(3,7,10)}, and {“F”,2,(4,11)} </li></ul>
    82. 92. The String-Join Approach(2) <ul><li>Next, the RPs of length 2 are obtained by joining (denoted “∞”) the RPs above. For example, to find “C-D”, given {“C”,3,(1,5,8)} and {“D”,3,(2,6,9)}, we can determine that “C-D” also occurs at (1,5,8), which is written as {“C”,3,(1,5,8)}∞{“D”,3,(2,6,9)} = {“C-D”,3,(1,5,8)} </li></ul>
    83. 93. The String-Join Approach(3) <ul><li>Similarly, {“D”,3,(2,6,9)}∞{“E”,3,(3,7,10)} = {“D-E”,3,(2,6,9)} and {“E”,3,(3,7,10)}∞{“F”,2,(4,11)} = {“E-F”,2,(3,10)} </li></ul><ul><li>RPs of length 4 can be obtained by joining RPs of length 2, e.g. {“C-D”,3,(1,5,8)}∞{“E-F”,2,(3,10)} = {“C-D-E-F”,2,(1,8)} </li></ul>
    84. 94. The String-Join Approach(4) <ul><li>For length 3: because freq(“C-D-E-F”) = freq(“E-F”) = 2, not only “E-F” but also “D-E-F” is trivial (otherwise freq(“E-F”) would have to be greater than 2). Furthermore, {“C-D”,3,(1,5,8)}∞{“D-E”,3,(2,6,9)} = {“C-D-E”,3,(1,5,8)} and freq(“C-D-E”) > freq(“C-D-E-F”), so “C-D-E” is nontrivial </li></ul><ul><li>Finally, the nontrivial repeating patterns in this example are “C-D-E-F” and “C-D-E” </li></ul>
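The join operation used in these steps can be sketched as below. The explicit overlap parameter (the number of notes shared by the end of the first pattern and the start of the second) is an assumption introduced here to cover both the “C-D” ∞ “E-F” and the “C-D” ∞ “D-E” joins; the frequency of the result is the number of surviving positions.

```python
def join(x, y, overlap=0):
    """Join two repeating patterns given as (pattern tuple, 1-based positions).

    A start position p of x survives if y starts at p + len(x) - overlap.
    """
    (pat_x, pos_x), (pat_y, pos_y) = x, y
    shift = len(pat_x) - overlap
    if pat_x[shift:] != pat_y[:overlap]:
        raise ValueError("patterns do not agree on the overlapping notes")
    starts_y = set(pos_y)
    positions = tuple(p for p in pos_x if p + shift in starts_y)
    return pat_x + pat_y[overlap:], positions

C = (("C",), (1, 5, 8))
D = (("D",), (2, 6, 9))
E = (("E",), (3, 7, 10))
F = (("F",), (4, 11))

CD = join(C, D)
DE = join(D, E)
EF = join(E, F)
CDEF = join(CD, EF)
CDE = join(CD, DE, overlap=1)
print(CD, EF, CDEF, CDE)
```

Each join only intersects two position lists, so longer patterns are found without rescanning the melody string.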
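    85. 95. Discussion <ul><li>Relative pitch vs. absolute pitch: melody transposition </li></ul><ul><li>Extract patterns from complete passages </li></ul><ul><li>Conversion between different music formats </li></ul><ul><li>Open problem: features that are important but not repeated </li></ul>
    86. 96. Video databases <ul><li>Content organization </li></ul><ul><ul><li>Which parts of the content are users interested in? </li></ul></ul><ul><ul><li>How should this content be stored so that queries can be processed efficiently? </li></ul></ul><ul><ul><li>How should the query language be designed, and how does it differ from traditional SQL? </li></ul></ul><ul><ul><li>Can the content of a video be extracted automatically? </li></ul></ul>
    87. 97. Video content information <ul><li>Objects </li></ul><ul><ul><li>Plain shape descriptions </li></ul></ul><ul><ul><ul><li>Can be automated </li></ul></ul></ul><ul><ul><li>Meaningful object descriptions </li></ul></ul><ul><ul><ul><li>People </li></ul></ul></ul><ul><ul><ul><ul><li>Leading actor, leading actress… </li></ul></ul></ul></ul><ul><ul><ul><li>Animals </li></ul></ul></ul><ul><ul><ul><ul><li>Pig, cat, dog… </li></ul></ul></ul></ul><ul><ul><ul><li>Inanimate objects </li></ul></ul></ul><ul><ul><ul><ul><li>Suitcase, key… </li></ul></ul></ul></ul><ul><ul><ul><li>Almost impossible to automate </li></ul></ul></ul>
    88. 98. <ul><li>Activities </li></ul><ul><ul><li>Plain descriptions </li></ul></ul><ul><ul><ul><li>Object motion trajectories </li></ul></ul></ul><ul><ul><ul><ul><li>How to encode a trajectory into a code that a computer can compare is an important problem </li></ul></ul></ul></ul><ul><ul><ul><li>Can be automated </li></ul></ul></ul><ul><ul><li>Meaningful behavior descriptions </li></ul></ul><ul><ul><ul><li>A car accident, man A hands a suitcase to woman B… </li></ul></ul></ul><ul><ul><ul><li>Must be built on top of the plain descriptions </li></ul></ul></ul><ul><ul><ul><li>Not easy to automate completely </li></ul></ul></ul>
    89. 99. Constructing video content information <ul><li>Two kinds of information </li></ul><ul><ul><li>Static </li></ul></ul><ul><ul><ul><li>Treat a frame as an image </li></ul></ul></ul><ul><ul><ul><li>Use image search techniques </li></ul></ul></ul><ul><ul><li>Dynamic </li></ul></ul><ul><ul><ul><li>Treat consecutive frames as an action </li></ul></ul></ul><ul><ul><ul><li>Object motion trajectories must be considered </li></ul></ul></ul>
    90. 100. <ul><li>Static information </li></ul>
    91. 101. <ul><li>Dynamic information </li></ul>
    92. 102. Preface (Cont’d) <ul><li>Motion trajectories </li></ul>
    93. 103. <ul><li>Video analysis </li></ul><ul><ul><li>Shot </li></ul></ul><ul><ul><ul><li>A video segment filmed in a single continuous camera take </li></ul></ul></ul><ul><ul><ul><li>The unit from which a video is composed </li></ul></ul></ul><ul><ul><ul><li>Frames within the same shot have similar content </li></ul></ul></ul><ul><ul><ul><ul><li>A representative frame can be chosen from a shot to stand for the whole shot </li></ul></ul></ul></ul><ul><ul><ul><li>Shot detection </li></ul></ul></ul><ul><ul><ul><ul><li>Detect shot boundaries from changes in the color distribution </li></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Can be automated, but sudden color changes may cause false detections </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><li>Current shot segmentation tools achieve very high accuracy (>95%) </li></ul></ul></ul></ul>
    94. 104. <ul><ul><li>Scene </li></ul></ul><ul><ul><ul><li>Consists of multiple shots that describe the same event </li></ul></ul></ul><ul><ul><ul><li>Can serve as the unit of a query </li></ul></ul></ul><ul><ul><li>Object motion trajectories </li></ul></ul><ul><ul><ul><li>Find the trajectory along which an object moves; it can represent an event </li></ul></ul></ul>
    95. 105. Semantic concept queries <ul><li>Semantic queries can be issued </li></ul><ul><ul><li>Find the images that contain sky and sea </li></ul></ul><ul><ul><li>Find the videos in which a plane flies across the sky </li></ul></ul><ul><li>Use associations among low-level feature values to find the semantic meaning of the media </li></ul><ul><ul><li>Semi-automatic classification </li></ul></ul><ul><ul><li>Classification </li></ul></ul><ul><ul><li>Association pattern mining </li></ul></ul><ul><li>Concepts and semantic networks </li></ul>
    96. 106. Classification <ul><li>Goal </li></ul><ul><ul><li>Predict the class of the data </li></ul></ul><ul><li>Steps </li></ul><ul><ul><li>Build a classification model </li></ul></ul><ul><ul><ul><li>Based on a training data set </li></ul></ul></ul><ul><ul><li>Evaluate the accuracy of the classification model </li></ul></ul><ul><ul><ul><li>Based on a testing data set </li></ul></ul></ul><ul><ul><li>Predict the class of new data </li></ul></ul>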
    97. 107. Training Data Classification Algorithms IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’ Classifier (Model)
    98. 108. Classifier Testing Data Unseen Data (Jeff, Professor, 4) Tenured?
    99. 109. Classification <ul><li>Algorithms </li></ul><ul><ul><li>Decision tree </li></ul></ul><ul><ul><li>Bayesian Belief Networks </li></ul></ul><ul><ul><li>k-nearest neighbor classifier </li></ul></ul><ul><ul><li>case-based reasoning </li></ul></ul><ul><ul><li>Genetic algorithm </li></ul></ul><ul><ul><li>Rough set approach </li></ul></ul><ul><ul><li>Fuzzy set approaches </li></ul></ul><ul><ul><li>Neural Network </li></ul></ul>
    100. 110. Training data set
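    101. 111. Decision tree [Figure: the root tests “age?”. Branch “<=30” leads to “student?” (no: no, yes: yes); branch “30..40” leads directly to yes; branch “>40” leads to “credit rating?” (excellent: no, fair: yes)]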
    102. 112. Naïve bayesian Network :example P(n) = 5/14 P(p) = 9/14
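    103. 113.
<table>
<tr><th>Attribute</th><th>Value</th><th>P(value|p)</th><th>P(value|n)</th></tr>
<tr><td>outlook</td><td>sunny</td><td>2/9</td><td>3/5</td></tr>
<tr><td>outlook</td><td>overcast</td><td>4/9</td><td>0</td></tr>
<tr><td>outlook</td><td>rain</td><td>3/9</td><td>2/5</td></tr>
<tr><td>temperature</td><td>hot</td><td>2/9</td><td>2/5</td></tr>
<tr><td>temperature</td><td>mild</td><td>4/9</td><td>2/5</td></tr>
<tr><td>temperature</td><td>cool</td><td>3/9</td><td>1/5</td></tr>
<tr><td>humidity</td><td>high</td><td>3/9</td><td>4/5</td></tr>
<tr><td>humidity</td><td>normal</td><td>6/9</td><td>1/5</td></tr>
<tr><td>windy</td><td>true</td><td>3/9</td><td>3/5</td></tr>
<tr><td>windy</td><td>false</td><td>6/9</td><td>2/5</td></tr>
</table>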
    104. 114. Play-tennis example: classifying X <ul><li>An unseen sample X = <rain, hot, high, false> </li></ul><ul><li>P(X|p)·P(p) = P(rain|p)·P(hot|p)·P(high|p)·P(false|p)·P(p) = 3/9·2/9·3/9·6/9·9/14 = 0.010582 </li></ul><ul><li>P(X|n)·P(n) = P(rain|n)·P(hot|n)·P(high|n)·P(false|n)·P(n) = 2/5·2/5·4/5·2/5·5/14 = 0.018286 </li></ul><ul><li>Sample X is classified in class n </li></ul>
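The classification of X can be reproduced with exact fractions. The sketch below hard-codes only the probabilities used on this slide and assumes the usual naive Bayes conditional-independence assumption; the names prior, cond, and score are illustrative.

```python
from fractions import Fraction as F

prior = {"p": F(9, 14), "n": F(5, 14)}
# Conditional probabilities P(value | class) for the values that occur in X.
cond = {
    "p": {"rain": F(3, 9), "hot": F(2, 9), "high": F(3, 9), "false": F(6, 9)},
    "n": {"rain": F(2, 5), "hot": F(2, 5), "high": F(4, 5), "false": F(2, 5)},
}

def score(cls, sample):
    """P(X | cls) * P(cls), assuming the attributes are independent given the class."""
    result = prior[cls]
    for value in sample:
        result *= cond[cls][value]
    return result

x = ["rain", "hot", "high", "false"]
scores = {cls: score(cls, x) for cls in prior}
predicted = max(scores, key=scores.get)
print({c: round(float(s), 6) for c, s in scores.items()}, predicted)
# {'p': 0.010582, 'n': 0.018286} n
```

Only the comparison between the two products matters, so the common denominator P(X) never has to be computed.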
    105. 115. Bayesian Belief Networks [Figure: network with the nodes FamilyHistory, Smoker, LungCancer, Emphysema, PositiveXRay, and Dyspnea] The conditional probability table for the variable LungCancer:
<table>
<tr><th></th><th>(FH, S)</th><th>(FH, ~S)</th><th>(~FH, S)</th><th>(~FH, ~S)</th></tr>
<tr><td>LC</td><td>0.8</td><td>0.5</td><td>0.7</td><td>0.1</td></tr>
<tr><td>~LC</td><td>0.2</td><td>0.5</td><td>0.3</td><td>0.9</td></tr>
</table>
    106. 116. Bayesian Belief Networks
    107. 117. The k -Nearest Neighbor Algorithm [Figure: a query point x<sub>q</sub> surrounded by training points labeled + and _]
    108. 118. Rough Set Approach <ul><li>Rough sets are used to approximately or “roughly” define equivalence classes </li></ul><ul><li>A rough set for a given class C is approximated by two sets: </li></ul><ul><ul><li>a lower approximation (certain to be in C) </li></ul></ul><ul><ul><li>an upper approximation (cannot be described as not belonging to C) </li></ul></ul>
    109. 119. Fuzzy set approach
    110. 120. Association pattern mining <ul><li>Goal </li></ul><ul><ul><li>Find associations among items or objects </li></ul></ul><ul><ul><li>Association </li></ul></ul><ul><ul><ul><li>The items must occur together often enough (support) </li></ul></ul></ul><ul><ul><ul><li>The conditional probability of occurring together must be high enough (confidence) </li></ul></ul></ul><ul><li>Algorithms </li></ul><ul><ul><li>Apriori algorithm </li></ul></ul><ul><ul><li>Lattice approach </li></ul></ul><ul><ul><li>FP-tree </li></ul></ul>
    111. 121. Mining association rules: example <ul><li>For rule A ⇒ C : </li></ul><ul><ul><li>support = support({ A C }) = 50% </li></ul></ul><ul><ul><li>confidence = support({ A C })/support({ A }) = 66.6% </li></ul></ul><ul><li>The Apriori principle: </li></ul><ul><ul><li>Any subset of a frequent itemset must be frequent </li></ul></ul>Min. support 50% Min. confidence 50%
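Support and confidence are simple ratios over the transaction list. The slide's transaction table did not survive extraction, so the four transactions below are an assumed data set, chosen so that the rule A ⇒ C gets the same 50% support and 66.6% confidence.

```python
# Assumed transactions (not taken from the slide's lost table).
transactions = [
    {"A", "B", "C"},
    {"A", "C"},
    {"A", "D"},
    {"B", "E", "F"},
]

def support(itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """Estimated P(rhs | lhs) = support(lhs ∪ rhs) / support(lhs)."""
    return support(lhs | rhs) / support(lhs)

rule_support = support({"A", "C"})
rule_confidence = confidence({"A"}, {"C"})
print(rule_support, round(rule_confidence, 3))  # 0.5 0.667
```

Note that the reverse rule C ⇒ A has the same support but a confidence of 100% on this data, since every transaction containing C also contains A.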
    112. 122. Apriori algorithm <ul><li>Join Step : C<sub>k</sub> is generated by joining L<sub>k-1</sub> with itself </li></ul><ul><li>Prune Step : Any (k-1)-itemset that is not frequent cannot be a subset of a frequent k-itemset </li></ul><ul><li>Pseudo-code : </li></ul><ul><ul><ul><li>C<sub>k</sub> : candidate itemset of size k </li></ul></ul></ul><ul><ul><ul><li>L<sub>k</sub> : frequent itemset of size k </li></ul></ul></ul><ul><ul><ul><li>L<sub>1</sub> = {frequent items}; </li></ul></ul></ul><ul><ul><ul><li>for ( k = 1; L<sub>k</sub> != ∅; k ++) do begin </li></ul></ul></ul><ul><ul><ul><li>C<sub>k+1</sub> = candidates generated from L<sub>k</sub> ; </li></ul></ul></ul><ul><ul><ul><li>for each transaction t in database do </li></ul></ul></ul><ul><ul><ul><ul><li>increment the count of all candidates in C<sub>k+1</sub> that are contained in t </li></ul></ul></ul></ul><ul><ul><ul><li>L<sub>k+1</sub> = candidates in C<sub>k+1</sub> with min_support </li></ul></ul></ul><ul><ul><ul><li>end </li></ul></ul></ul><ul><ul><ul><li>return ∪<sub>k</sub> L<sub>k</sub> ; </li></ul></ul></ul>
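The pseudo-code translates almost line by line into a short program. The version below is a sketch that favors clarity over speed (it rescans all transactions for every candidate), and the sample transactions are an assumed data set rather than the one on the slides.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return the set of all frequent itemsets (as frozensets)."""
    n = len(transactions)

    def frequent(candidates):
        return {c for c in candidates
                if sum(c <= t for t in transactions) / n >= min_support}

    level = frequent({frozenset([item]) for t in transactions for item in t})  # L1
    result = set(level)
    k = 1
    while level:
        # Join step: merge pairs of frequent k-itemsets into (k+1)-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k))}
        level = frequent(candidates)  # L(k+1)
        result |= level
        k += 1
    return result

# Assumed sample database.
db = [frozenset("ABC"), frozenset("AC"), frozenset("AD"), frozenset("BEF")]
itemsets = apriori(db, min_support=0.5)
print(sorted("".join(sorted(s)) for s in itemsets))  # ['A', 'AC', 'B', 'C']
```

The prune step is where the Apriori principle pays off: a candidate is discarded without touching the database as soon as one of its subsets is known to be infrequent.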
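    113. 123. Example [Figure: Apriori run on database D: scan D to obtain C<sub>1</sub> and L<sub>1</sub>; generate C<sub>2</sub> and scan D to obtain L<sub>2</sub>; generate C<sub>3</sub> and scan D to obtain L<sub>3</sub>]
    114. 124. FP-tree algorithm <ul><li>Compress a large database into a compact data structure </li></ul><ul><ul><li>FP-tree </li></ul></ul><ul><ul><ul><li>Contains only the data relevant to mining association patterns </li></ul></ul></ul><ul><ul><ul><li>Avoids costly database scans </li></ul></ul></ul>
    115. 125. FP-tree construction, min_support = 0.5
<table>
<tr><th>TID</th><th>Items bought</th><th>(Ordered) frequent items</th></tr>
<tr><td>100</td><td>{ f, a, c, d, g, i, m, p }</td><td>{ f, c, a, m, p }</td></tr>
<tr><td>200</td><td>{ a, b, c, f, l, m, o }</td><td>{ f, c, a, b, m }</td></tr>
<tr><td>300</td><td>{ b, f, h, j, o }</td><td>{ f, b }</td></tr>
<tr><td>400</td><td>{ b, c, k, s, p }</td><td>{ c, b, p }</td></tr>
<tr><td>500</td><td>{ a, f, c, e, l, p, m, n }</td><td>{ f, c, a, m, p }</td></tr>
</table>
<ul><li>Steps: </li></ul><ul><li>Scan DB once, find frequent 1-itemset (single item pattern) </li></ul><ul><li>Order frequent items in frequency descending order </li></ul><ul><li>Scan DB again, construct FP-tree </li></ul>
    116. 126. [Figure: FP-tree. The root {} has the children f:4 and c:1. Below f:4: c:3, then a:3, then m:2, then p:2; a:3 also has the child b:1 followed by m:1; f:4 also has the child b:1. Below c:1: b:1, then p:1.] Header table (item: frequency): f: 4, c: 4, a: 3, b: 3, m: 3, p: 3
    117. 127. Main mining procedure of the FP-tree <ul><li>For each node in the FP-tree, construct its conditional pattern base </li></ul><ul><li>For each conditional pattern base, construct its conditional FP-tree </li></ul><ul><li>Repeat the steps above until </li></ul><ul><ul><li>the FP-tree consists of a single path </li></ul></ul>
    118. 128. Step 1: For each node in the FP-tree, construct its conditional pattern base
<table>
<tr><th>Item</th><th>Conditional pattern base</th></tr>
<tr><td>c</td><td>f:3</td></tr>
<tr><td>a</td><td>fc:3</td></tr>
<tr><td>b</td><td>fca:1, f:1, c:1</td></tr>
<tr><td>m</td><td>fca:2, fcab:1</td></tr>
<tr><td>p</td><td>fcam:2, cb:1</td></tr>
</table>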
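The construction steps and Step 1 can be sketched together as follows. The transactions are the five from the construction slide, min_count = 3 corresponds to min_support = 0.5 over five transactions, and the tie-break order "fcabmp" is an assumption taken from the slide's header table, since several items have equal frequency. The class and function names are illustrative.

```python
from collections import Counter

class FPNode:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count, self.children = 0, {}

def build_fp_tree(transactions, min_count, tie_break):
    # Scan 1: count items and keep the frequent ones in descending frequency order.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = [i for i in counts if counts[i] >= min_count]
    frequent.sort(key=lambda i: (-counts[i], tie_break.index(i)))
    rank = {item: r for r, item in enumerate(frequent)}
    # Scan 2: insert each transaction's ordered frequent items as a path.
    root, header = FPNode(None, None), {i: [] for i in frequent}
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.get):
            if item not in node.children:
                node.children[item] = FPNode(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += 1
    return root, header

def conditional_pattern_base(header, item):
    """Prefix path of every node of `item`, weighted by that node's count."""
    base = {}
    for node in header[item]:
        path, parent = [], node.parent
        while parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base["".join(reversed(path))] = node.count
    return base

db = ["facdgimp", "abcflmo", "bfhjo", "bcksp", "afcelpmn"]
root, header = build_fp_tree(db, min_count=3, tie_break="fcabmp")
print(conditional_pattern_base(header, "m"))  # {'fca': 2, 'fcab': 1}
print(conditional_pattern_base(header, "p"))  # {'fcam': 2, 'cb': 1}
```

The header lists play the role of the node links in the figure: following them visits every occurrence of an item without traversing the whole tree.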
    119. 129. Step 2: For each conditional pattern base, construct its conditional FP-tree <ul><li>m-conditional pattern base: </li></ul><ul><ul><li>fca:2, fcab:1 </li></ul></ul>m-conditional FP-tree: {}, f:3, c:3, a:3 (a single path). All frequent patterns concerning m: m, fm, cm, am, fcm, fam, cam, fcam
    120. 130. Mining Frequent Patterns by Creating Conditional Pattern-Bases
<table>
<tr><th>Item</th><th>Conditional pattern-base</th><th>Conditional FP-tree</th></tr>
<tr><td>p</td><td>{(fcam:2), (cb:1)}</td><td>{(c:3)}|p</td></tr>
<tr><td>m</td><td>{(fca:2), (fcab:1)}</td><td>{(f:3, c:3, a:3)}|m</td></tr>
<tr><td>b</td><td>{(fca:1), (f:1), (c:1)}</td><td>Empty</td></tr>
<tr><td>a</td><td>{(fc:3)}</td><td>{(f:3, c:3)}|a</td></tr>
<tr><td>c</td><td>{(f:3)}</td><td>{(f:3)}|c</td></tr>
<tr><td>f</td><td>Empty</td><td>Empty</td></tr>
</table>
    121. 131. Step 3: Recursively mine the conditional FP-tree Cond. pattern base of “am”: (fc:3) Cond. pattern base of “cm”: (f:3) {} f:3 cm-conditional FP-tree Cond. pattern base of “cam”: (f:3) {} f:3 cam-conditional FP-tree {} f:3 c:3 a:3 m-conditional FP-tree {} f:3 c:3 am-conditional FP-tree
    122. 132. Performance analysis: data set T25I20D10K
    123. 133. Association pattern mining <ul><li>Traditional association pattern mining almost always looks for associations between items </li></ul><ul><li>In multimedia applications </li></ul><ul><ul><li>Mutually exclusive relationships are also very important </li></ul></ul><ul><li>They can help improve classification accuracy </li></ul>
    124. 134. Concepts and semantic networks <ul><li>Concepts </li></ul><ul><ul><li>The basic notions of knowledge representation </li></ul></ul><ul><ul><li>Semantic notions of the objects in the world </li></ul></ul>
    125. 135. Concepts and semantic networks <ul><ul><li>Relationships among concepts </li></ul></ul><ul><ul><ul><li>Multi-resolution </li></ul></ul></ul>
    126. 136. Concepts and semantic networks <ul><li>Semantic network </li></ul><ul><ul><li>Nodes </li></ul></ul><ul><ul><ul><li>Objects, concepts, or states </li></ul></ul></ul><ul><ul><li>Links </li></ul></ul><ul><ul><ul><li>Relationships between nodes </li></ul></ul></ul>
    127. 138. <ul><li>References </li></ul><ul><ul><li>V.S. Subrahmanian, Principles of Multimedia Database Systems, Morgan Kaufmann. </li></ul></ul><ul><ul><li>C.Y. Tsai, A.L.P. Chen and K. Essig, “Efficient Image Retrieval Approaches for Different Similarity Requirements”, Proc. SPIE Conference on Storage and Retrieval for Image and Video Databases, 2000. </li></ul></ul><ul><ul><li>Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2000. </li></ul></ul>
    128. 139. Content-Based Interactivity
    129. 141. Paper study : topic 1 A Semantic Modeling Approach for Video Retrieval by Content Edoardo Ardizzone Mohand-Said Hacid ICMCS 1999 July
    130. 142. Introduction <ul><li>Using keywords or free text to describe the necessary semantic objects is not sufficient. </li></ul><ul><li>The issues that need to be addressed are </li></ul><ul><li>the representation of video information </li></ul><ul><li>the organization of this information </li></ul><ul><li>user-friendly representation </li></ul>
    131. 143. Introduction(cont.) <ul><li>We exploit two languages </li></ul><ul><li>One for defining the schema (i.e. the </li></ul><ul><li>structure) </li></ul><ul><li>The other for querying through the schema </li></ul><ul><li>And two layers for representing a video's conceptual content </li></ul><ul><li>Object layer </li></ul><ul><li>Schema layer </li></ul>
    132. 144. Two layers for a video's conceptual content <ul><li>Object layer: collects the objects of interest, their descriptions, and the relations among them. Objects in a video sequence are represented as visual entities. </li></ul><ul><li>Schema layer: intended to capture the structure and knowledge needed for video retrieval. Visual entities can be classified into a hierarchical structure. </li></ul>
    133. 145. Schema Language—example1
    134. 146. Query Language (QL) <ul><li>Querying a DB means retrieving stored objects that satisfy certain conditions or qualifications and hence are interesting for a user. </li></ul><ul><li>In OODB , classes are used to represent sets of objects. </li></ul>
    135. 147. QL cont. <ul><li>Queries are represented as concepts in our abstract language . </li></ul><ul><li>The syntax and semantics of a concept language for making queries </li></ul>
    136. 148. QL- Example <ul><li>“ Sequences of movies directed by Kevin Costner in which he is also an actor” </li></ul>
    137. 149. QL- Example <ul><li>“ the set of movies whose directors are also producers of some films” </li></ul>
    138. 150. Semantic Annotation of Sports Video <ul><li>Video isn't just a sequence of images; it adds the temporal dimension. </li></ul><ul><li>An approach for semantic annotation of sports videos that include several different sports and even non-sports content </li></ul>
    139. 151. Introduction --Typical sequence of shots in sports video
    140. 153. Classifying visual shot features
    141. 154. Implementation --Classifying visual shot features (cont.)
    142. 155. Conclusion <ul><li>There is growing interest in video databases and in dealing with access problems. </li></ul><ul><li>One of the central problems in the creation of robust and scalable systems for manipulating video information lies in representing video content </li></ul>
    143. 156. Conclusion cont. <ul><li>This framework is appropriate for supporting conceptual and intensional queries </li></ul><ul><li>It can perform exact as well as partial or fuzzy matching </li></ul><ul><li>Some physical features: color, object shape, etc. </li></ul>
    144. 157. Paper study: topic 2 Indexing methods for approximate string matching IEEE data engineering bulletin,2000 Gonzalo Navarro, Ricardo Baeza-Yates, Erkki Sutinen, Jorma Tarhio
    145. 158. outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>Partitioning into exact search </li></ul><ul><li>Intermediate partitioning </li></ul><ul><li>summarization </li></ul>
    146. 159. Introduction <ul><li>Definition </li></ul><ul><ul><li>Given a long text T<sub>1…n</sub> of length n and a comparatively short pattern P<sub>1…m</sub> of length m, both sequences over an alphabet Σ of size σ, find the text positions that match the pattern with at most k “errors”. </li></ul></ul><ul><li>Applications </li></ul><ul><ul><li>Retrieving musical passages similar to a sample </li></ul></ul><ul><ul><li>Finding DNA subsequences after possible mutations </li></ul></ul><ul><ul><li>Searching text under the presence of typing or spelling errors </li></ul></ul>
    147. 160. outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>Partitioning into exact search </li></ul><ul><li>Intermediate partitioning </li></ul><ul><li>summarization </li></ul>
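    148. 161. Suffix trees. Suffixes of the text “gaaccgacct”: 1 gaaccgacct, 2 aaccgacct, 3 accgacct, 4 ccgacct, 5 cgacct, 6 gacct, 7 acct, 8 cct, 9 ct, 10 t. Weak point: large space requirement, about 9 times the text size.
    149. 162. Suffix array: requires less space, about 4 times the text size. [Figure: suffix array example]
    150. 163. Q-grams, Q-samples [Figure: the text “abracadabra” (positions 1 to 11) with its q-gram index and its q-samples] Q-samples, unlike q-grams, do not overlap, and there may even be some space between each pair of samples.
    151. 164. Edit distance ed(“SURVEY”,”SURGERY”); the final result is 2
<table>
<tr><th></th><th></th><th>S</th><th>U</th><th>R</th><th>G</th><th>E</th><th>R</th><th>Y</th></tr>
<tr><th></th><td>0</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>6</td><td>7</td></tr>
<tr><th>S</th><td>1</td><td>0</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>6</td></tr>
<tr><th>U</th><td>2</td><td>1</td><td>0</td><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td></tr>
<tr><th>R</th><td>3</td><td>2</td><td>1</td><td>0</td><td>1</td><td>2</td><td>3</td><td>4</td></tr>
<tr><th>V</th><td>4</td><td>3</td><td>2</td><td>1</td><td>1</td><td>2</td><td>3</td><td>4</td></tr>
<tr><th>E</th><td>5</td><td>4</td><td>3</td><td>2</td><td>2</td><td>1</td><td>2</td><td>3</td></tr>
<tr><th>Y</th><td>6</td><td>5</td><td>4</td><td>3</td><td>3</td><td>2</td><td>2</td><td>2</td></tr>
</table>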
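The table is produced by the standard dynamic-programming recurrence for edit distance, sketched below with unit costs for insertion, deletion, and substitution. The function name edit_distance_table is illustrative.

```python
def edit_distance_table(a, b):
    """Return the full DP table; table[i][j] = ed(a[:i], b[:j])."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        table[i][0] = i                      # delete all of a[:i]
    for j in range(len(b) + 1):
        table[0][j] = j                      # insert all of b[:j]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            table[i][j] = min(
                table[i - 1][j] + 1,         # deletion
                table[i][j - 1] + 1,         # insertion
                table[i - 1][j - 1] + (a[i - 1] != b[j - 1]),  # match or substitution
            )
    return table

table = edit_distance_table("SURVEY", "SURGERY")
distance = table[-1][-1]
print(distance)  # 2
```

The two errors correspond to substituting G for V and inserting an R; the bottom-right cell of the table holds the result.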
    152. 165. outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>Partitioning into exact search </li></ul><ul><li>Intermediate partitioning </li></ul><ul><li>summarization </li></ul>
    153. 166. Neighborhood generation. Pattern: abc with 1 error. k-neighborhood: {*bc, a*c, ab*} ∪ {ab, ac, bc} ∪ {*abc, a*bc, abc*}. Searching: look up the neighborhood in the text “abracadabra”; results: {abr}, {ac}, {abr}, … The k-neighborhood (candidate set) can be quite large, so this approach works well only for small m and k.
    154. 167. outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>Partitioning into exact search </li></ul><ul><li>Intermediate partitioning </li></ul><ul><li>summarization </li></ul>
    155. 168. Partitioning into exact search Pattern: abr with 1 error. Partition the pattern into (k+s) pieces: {a}, {br}. Filtration: exact search for the pieces in the text “abracadabra”. Verification: check the text areas around each piece occurrence. Results: {abra}, {abra}, ... 1. For large error levels, the text areas to verify cover almost all the text. 2. As s grows, the pieces get shorter and there are more matches to check, but the filter becomes stricter.
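The filtration and verification steps can be sketched in Python for the simplest case s = 1, that is, k+1 pieces. If the pattern occurs with at most k errors, at least one of the k+1 pieces must occur unchanged, so exact piece occurrences select the text windows to verify. The sketch scans the text with `str.find` rather than using an index, and the function names and the window size (pattern length plus k on each side) are assumptions for this example.

```python
def split_pattern(pattern, pieces):
    """Split pattern into contiguous pieces; returns (offset, piece) pairs."""
    base, extra = divmod(len(pattern), pieces)
    parts, start = [], 0
    for p in range(pieces):
        length = base + (1 if p >= pieces - extra else 0)  # longer pieces go last
        parts.append((start, pattern[start:start + length]))
        start += length
    return parts


def matches_with_errors(window, pattern, k):
    """True if some substring of window is within edit distance k of pattern."""
    col = list(range(len(pattern) + 1))
    if col[-1] <= k:
        return True
    for c in window:
        new = [0]  # a match may start at any window position
        for i, pc in enumerate(pattern, 1):
            new.append(min(col[i] + 1, new[i - 1] + 1, col[i - 1] + (pc != c)))
        col = new
        if col[-1] <= k:
            return True
    return False


def partition_search(text, pattern, k):
    """Return the 0-based (start, end) windows that contain an approximate match."""
    m = len(pattern)
    if k + 1 > m:
        raise ValueError("pattern is too short to split into k+1 non-empty pieces")
    verified = set()
    for offset, piece in split_pattern(pattern, k + 1):
        pos = text.find(piece)                      # filtration: exact search
        while pos != -1:
            lo = max(0, pos - offset - k)
            hi = min(len(text), pos - offset + m + k)
            if matches_with_errors(text[lo:hi], pattern, k):  # verification
                verified.add((lo, hi))
            pos = text.find(piece, pos + 1)
    return sorted(verified)


print(partition_search("abracadabra", "abr", 1))
```

Short pieces such as {a} occur often, so many windows are verified; this is the cost the slide refers to for high error levels.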
    156. 169. Outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>Partitioning into exact search </li></ul><ul><li>Intermediate partitioning </li></ul><ul><li>Summarization </li></ul>
    157. 170. Intermediate Partitioning (j = 2) Pattern: abr with 1 error. Partition the pattern into j pieces (j = 2): {a}, {br}. Neighborhood generation: allow ⌊k/j⌋ errors in each piece. Search the text “abracadabra” for the piece neighborhoods, then verify the candidate areas. Results: {abra}, {abra}, ... With j = k+1, this is the same as partitioning into exact search.
    158. 171. Intermediate Partitioning (j = 1) Pattern: abr with 1 error. Partition the pattern into j pieces (j = 1): {abr}. Neighborhood generation: allow ⌊k/j⌋ errors, giving {*abr, a*br, ab*r, abr*} ∪ {ab, br, ar} ∪ {ab*, *br, a*r}. Search the text “abracadabra”. Results: {abra}, {abra}, ... With j = 1, this is the same as neighborhood generation. 1. Which value of j to use? The search time decreases as j moves from 1 to k+1, but the verification cost grows.
    159. 172. Outline <ul><li>Introduction </li></ul><ul><li>Basic concepts </li></ul><ul><li>Neighborhood generation </li></ul><ul><li>Partitioning into exact search </li></ul><ul><li>Intermediate partitioning </li></ul><ul><li>Summarization </li></ul>
    160. 173. Summarization
    161. 174. Paper study: topic 3 Lazy Users and Automatic Video Retrieval Tools in (the) Lowlands. The Lowlands Team: CWI (1), TNO (2), University of Amsterdam (3), University of Twente (4), The Netherlands. Jan Baan (2), Alex van Ballegooij (1), Jan Mark Geusebroek (3), Jurgen den Hartog (2), Djoerd Hiemstra (4), Johan List (1), Thijs Westerveld (4), Ioannis Patras (3), Stephan Raaijmakers (2), Cees Snoek (3), Leon Todoran (3), Jeroen Vendrig (3), Arjen P. de Vries (1) and Marcel Worring (3). Proceedings of the 10th Text Retrieval Conference (TREC), 2001
    162. 175. Outline <ul><li>Introduction </li></ul><ul><li>Detector-based processing </li></ul><ul><li>Probabilistic multimedia retrieval </li></ul><ul><li>Interactive experiment </li></ul><ul><li>Lazy users </li></ul><ul><li>Discussion </li></ul><ul><li>Conclusion </li></ul>
    163. 176. Basic key subjects of multimedia databases <ul><li>Indexing </li></ul><ul><ul><li>K-d tree, point quadtree, MX-quadtree, R-tree, suffix tree, TV-tree, ... </li></ul></ul><ul><ul><li>Determined by the database designer. </li></ul></ul><ul><li>Similarity </li></ul><ul><ul><li>No standard. </li></ul></ul><ul><ul><li>How similar two items are is decided by the user. </li></ul></ul>
    164. 177. Users are always lazy! <ul><li>Fact 1: </li></ul><ul><ul><li>Almost all end users know nothing about “queries”. </li></ul></ul><ul><li>Fact 2: </li></ul><ul><ul><li>What they want may be only a concept that they cannot describe clearly. </li></ul></ul><ul><li>Fact 3: </li></ul><ul><ul><li>Users prefer selecting to answering questions. </li></ul></ul>
    165. 178. Introduction <ul><li>Uses two complementary automatic approaches: </li></ul><ul><ul><li>Visual content </li></ul></ul><ul><ul><li>Transcript </li></ul></ul><ul><li>The experiment focuses on revealing relationships between: </li></ul><ul><ul><li>Different modalities </li></ul></ul><ul><ul><li>The amount of human processing </li></ul></ul><ul><ul><li>The quality of results </li></ul></ul>
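    166. 179. Introduction <ul><li>Run: Description </li></ul><ul><ul><li>1: Detector-based, automatic </li></ul></ul><ul><ul><li>2: Combined 1-3, automatic </li></ul></ul><ul><ul><li>3: Transcript-based, automatic </li></ul></ul><ul><ul><li>4: Query articulation, interactive </li></ul></ul><ul><ul><li>5: Combined 1-4, interactive, by a lazy user </li></ul></ul>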
    167. 180. Detector-based processing <ul><li>Architecture of the automatic system </li></ul>
    168. 181. Detector-based processing (cont.) <ul><li>Detectors for exact queries, which yield a yes/no answer depending on whether a set of predicates is satisfied. </li></ul><ul><li>Detectors for approximate queries, which yield a measure that expresses how similar the examined item is to the query. </li></ul>
    169. 182. Detector-based processing (cont.) [Figure: processing flow with the stages “Selected detector”, “Analysis of the topic description”, “Query by example”, “Filter out irrelevant material”, and “Final ranked results”]
    170. 183. Detectors <ul><li>Camera technique detection </li></ul><ul><ul><li>zoom, pan, tilt, ... </li></ul></ul><ul><li>Face detector </li></ul><ul><ul><li>no face, 1-face, 2-faces, ..., 5-faces, many-faces </li></ul></ul><ul><li>Caption retrieval </li></ul><ul><ul><li>Text segmentation, OCR, fuzzy string matching </li></ul></ul><ul><li>Monologue detection </li></ul><ul><ul><li>Shot should contain speech </li></ul></ul><ul><ul><li>Shot should have a static or unknown camera technique </li></ul></ul><ul><ul><li>Shot should have a minimum length </li></ul></ul><ul><li>Detectors based on color invariant features </li></ul><ul><ul><li>Keyframes are stored with color histograms </li></ul></ul>
    171. 184. Probabilistic multimedia retrieval <ul><li>We assume our documents are shots from video. </li></ul><ul><li>Models of discrete signals (i.e. text). </li></ul><ul><ul><li>Mixture of discrete probability measures </li></ul></ul><ul><li>Models of continuous signals (i.e. image). </li></ul><ul><ul><li>Mixture of continuous probability measures </li></ul></ul>
    172. 185. Probabilistic multimedia retrieval <ul><li>Using Bayes’ rule: </li></ul><ul><li>If a query consists of several independent parts (e.g. a textual part Q_t and a visual part Q_v) </li></ul>
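The formulas for this slide are not preserved in the transcript. As a hedged sketch in standard notation, Bayes' rule applied to ranking shots for a query Q, and one common reading of "independent parts" (the textual part Q_t and the visual part Q_v assumed conditionally independent given the shot), can be written as follows. The symbol names are assumptions, and the original paper's exact formulation may differ.

```latex
% Bayes' rule for the probability of a shot given the query
P(\mathit{Shot} \mid Q) = \frac{P(Q \mid \mathit{Shot})\, P(\mathit{Shot})}{P(Q)}

% Assumed factorisation when the query parts are independent given the shot
P(Q \mid \mathit{Shot}) = P(Q_t \mid \mathit{Shot})\, P(Q_v \mid \mathit{Shot})
```

Since P(Q) is the same for every shot, ranking by the numerator gives the same order as ranking by the posterior.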
    173. 186. Probabilistic multimedia retrieval <ul><li>Hierarchical data model of video </li></ul>[Figure: hierarchy with the levels video, scenes, shots, frames]
    174. 187. Probabilistic multimedia retrieval <ul><li>Text retrieval </li></ul><ul><ul><li>Using the Sphinx3 speech recognition system from Carnegie Mellon University </li></ul></ul><ul><ul><li>Input: query keywords </li></ul></ul><ul><ul><li>Retrieval at the shot level </li></ul></ul>
    175. 188. Probabilistic multimedia retrieval <ul><li>Image retrieval </li></ul><ul><ul><li>Retrieving the key frames of shots </li></ul></ul><ul><ul><li>Cut the key frame of each shot into blocks of 8 x 8 pixels </li></ul></ul><ul><ul><li>Process each block with the Discrete Cosine Transform (DCT), which is used in the JPEG compression standard. </li></ul></ul>
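The block-wise DCT step can be sketched in pure Python. This is a direct, unoptimised implementation of the orthonormal 2-D DCT-II on 8 x 8 blocks (the same transform and scaling JPEG uses), not the code of the system described in the paper. The function names, and the assumption that the image is a 2-D list of grey values whose dimensions are multiples of 8, are made for this illustration.

```python
import math


def split_into_blocks(image, size=8):
    """Cut a 2-D list into size x size blocks in row-major order.

    Assumes both image dimensions are multiples of size.
    """
    return [[row[c:c + size] for row in image[r:r + size]]
            for r in range(0, len(image), size)
            for c in range(0, len(image[0]), size)]


def dct_8x8(block):
    """Orthonormal 2-D DCT-II of an 8 x 8 block given as a list of 8 rows."""
    n = 8

    def alpha(u):
        return math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


flat = [[10.0] * 8 for _ in range(8)]
coeffs = dct_8x8(flat)
print(round(coeffs[0][0], 6))  # a flat block has only a DC coefficient
```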
    176. 189. Interactive experiments <ul><li>Topic lists. </li></ul><ul><ul><li>http://www-nlpir.nist.gov/projects/t01v/topicsoverview.html </li></ul></ul><ul><li>Topic 33: White fort </li></ul><ul><li>Topic 19: Lunar rover </li></ul><ul><li>Topic 8: Jupiter </li></ul><ul><li>Topic 25: Star Wars </li></ul>
    177. 190. Topic 33: White fort [Figure: example and known-item keyframe] Using Run 1: any color-based technique worked well for this query.
    178. 191. Topic 19: Lunar rover [Figure: example, known-item keyframe, and color histogram] <ul><li>Color-based retrieval techniques are not useful in this case. </li></ul><ul><li>By Run 4: </li></ul><ul><li>Allow users to make their own world knowledge explicit: in scenes on the moon, the sky is black. </li></ul>
    179. 192. Topic 8: Jupiter [Figure: example, keyframes of some correct answers, and color sets] <ul><li>At first thought, this query may seem easy to solve. </li></ul><ul><li>But it is apparent that the colors differ between photos. </li></ul><ul><li>Using three color histograms and their interrelationships. </li></ul>
    180. 193. Topic 25: Star Wars [Figure: example and keyframes of some correct answers] <ul><li>Text retrieval (if you know the names): R2D2, C3PO </li></ul><ul><li>The first filter selects only those images that have a sufficient amount of golden content. </li></ul><ul><li>Second, a set of filters reduces the data set by selecting those images that contain the color sets shown. </li></ul>
    181. 194. Lazy users <ul><li>Lazy users identify result sets instead of correct answers (so our interactive results do not have 100% precision). </li></ul><ul><li>The combination strategies used to construct run 5 consisted of: </li></ul><ul><ul><li>Choose the run that looks best </li></ul></ul><ul><ul><li>Concatenate or interleave the top-N from various runs </li></ul></ul><ul><ul><li>Continue with an automatic, seeded search strategy </li></ul></ul>
    182. 195. Discussion <ul><li>How video retrieval systems should be evaluated. </li></ul><ul><li>The inhomogeneity of the topics </li></ul><ul><ul><li>“ sailboat on the beach” vs. “yacht on the sea” </li></ul></ul><ul><li>The low quality of the data </li></ul><ul><ul><li>photos of Jupiter </li></ul></ul><ul><li>The evaluation measures used </li></ul>
    183. 196. Conclusion <ul><li>Our evaluation demonstrates the importance of combining various techniques to analyze the multiple modalities. </li></ul><ul><li>The optimal technique always depends on the query. </li></ul><ul><li>User interaction is still required to decide upon a good strategy. </li></ul>
    184. 197. Paper study: topic 4 VIDEO INDEXING BY MOTION ACTIVITY MAPS. Wei Zeng, Wen Gao, Debin Zhao. Proceedings of the 2002 International Conference on Image Processing, Volume 1, 2002, pages 912-915
    185. 198. Outline <ul><li>Introduction </li></ul><ul><ul><li>motion indexing </li></ul></ul><ul><li>Motion Activity Map---MAM </li></ul><ul><li>Definition of MAM </li></ul><ul><li>Generation of MAM </li></ul><ul><li>Organization of MAMs </li></ul><ul><li>Experimental results </li></ul><ul><li>Conclusion </li></ul>
    186. 199. Introduction <ul><li>The goal is a video indexing technique that extracts crucial information from videos for efficient visual content-based queries. </li></ul><ul><li>This is intended to foster content-based indexing and retrieval. </li></ul><ul><li>Video indexing should be based on a good feature representation, such as motion features. </li></ul><ul><li>Motion features depict the dynamic content of a video and enrich its semantics, for example running and flying. </li></ul>
    187. 200. Motion indexing <ul><li>Techniques and systems for motion indexing can be categorized into four types. </li></ul><ul><ul><li>Feature-based approach </li></ul></ul><ul><ul><li>Trajectory-based approach </li></ul></ul><ul><ul><li>Semantic-based approach </li></ul></ul><ul><ul><li>Image-based approach </li></ul></ul>
    188. 201. Feature-based approach <ul><li>Computes the motion parameters of a predefined motion model. </li></ul><ul><li>Has been adopted by MPEG-7 (still a draft) </li></ul><ul><li>Example </li></ul>
    189. 203. Trajectory-based approach <ul><li>This approach is often chosen by object-based systems for indexing video. </li></ul>
    190. 205. Semantic-based approach <ul><li>Provides semantic events or actions of motion. </li></ul><ul><li>Reference paper “A Semantic Event-Detection Approach and Its Application to Detecting Hunts in Wildlife Video” </li></ul>
    191. 207. Image-based approach <ul><li>Gives synthesized pictures generated from the motion in the video. </li></ul><ul><li>MAM is an image-based approach </li></ul>
    192. 208. Concepts of MAM(1) <ul><li>A motion activity map is an image that accumulates the motion activity on specific grids along the temporal axis of a video. </li></ul>[Figure: grid cell (i, j) along the time axis t]
    193. 209. Concepts of MAM(2) <ul><li>It is an image-based representation of the magnitude and spatial distribution of motion. </li></ul><ul><li>One video clip can generate several MAMs, and all MAMs are organized into a hierarchical tree view according to the structure of the video. </li></ul>
    194. 210. Definition of MAM(1) <ul><li>A motion activity map is an image synthesized from the motion vector field, and the motion vector field can be defined as the following temporal function X(t), where $t = t_0, t_1, \ldots, t_k$. </li></ul><ul><ul><li>$X(t) = v(i, j, t), \quad (i, j) \in \Omega$ </li></ul></ul>where $v_x(i, j, t)$ and $v_y(i, j, t)$ are the x-axis component and y-axis component of the motion vector on the grid (i, j).
    195. 211. Definition of MAM(2) <ul><li>Based on the motion vector field X(t), the motion activity map (MAM) is computed for each grid $(i, j) \in \Omega$, </li></ul>where $f(v(i, j, t))$ is the motion activity measure function on grid (i, j) and $\Omega$ is the grid set of the video.
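The formula itself is not preserved in the transcript. Based on the earlier description of a MAM as an image that accumulates motion activity on each grid along the temporal axis, one plausible reading is a sum of f(v(i, j, t)) over time for every grid (i, j). The Python sketch below implements that reading; both the summation and the default measure (the magnitude of the motion vector) are assumptions, not the paper's confirmed definition.

```python
import math


def motion_activity_map(vector_fields, measure=None):
    """Accumulate a per-grid motion activity measure over time.

    vector_fields: one 2-D grid per time step, each cell a (vx, vy) tuple.
    measure: the activity function f; defaults to the vector magnitude (assumed).
    """
    if measure is None:
        def measure(v):
            return math.hypot(v[0], v[1])
    rows, cols = len(vector_fields[0]), len(vector_fields[0][0])
    mam = [[0.0] * cols for _ in range(rows)]
    for field in vector_fields:          # iterate over t = t_0 ... t_k
        for i in range(rows):
            for j in range(cols):
                mam[i][j] += measure(field[i][j])
    return mam


fields = [
    [[(3, 4), (0, 0)]],   # t_0: one row, two grid cells
    [[(0, 1), (0, 0)]],   # t_1
]
print(motion_activity_map(fields))
```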
    196. 212. Generation of MAM [Figure: generation pipeline with the stages video, temporal video segmentation, motion vector field, MAM computing, MAM quantization, MAM spatial segmentation, and region-based MAMs; demo video segmentation labelled “Hall shall”]
    197. 213. Organization of MAMs <ul><li>A video can be segmented at different levels, such as shots and sub-shots, so many MAMs can correspond to a single video shot. </li></ul><ul><li>All the MAMs of a video can be organized into a hierarchical tree representing the structure of the video. </li></ul>
    198. 214. Organization of MAMs [Figure: interactive video retrieval system with the components video, temporal segmentation, MAM computing, layered spatial segmentation, MAM database, and MAM display]
    199. 215. Experimental results [Figure: (a) key frame based MAM, (b) MAM, (c-f) region representation of MAM]
    200. 216. Conclusion <ul><li>Video → shot → MAM; sub-shot1 → MAM1; sub-shot2 → MAM2 </li></ul><ul><li>Every MAM can be segmented into a region representation. </li></ul><ul><li>To optimize the MAM-based representation, we mark each pixel of the MAM with a specific color according to its intensity. </li></ul>
    201. 217. Paper study: topic 5 SOM-Based R*-Tree for Similarity Retrieval. Kun-seok Oh, Yaokai Feng, Kunihiko Kaneko, Akifumi Makinouchi, Sang-hyun Bae. Proceedings of the Seventh International Conference on Database Systems for Advanced Applications, 2001
    202. 218. Outline <ul><li>Self-Organizing Maps (SOM) </li></ul><ul><li>R*-Tree </li></ul><ul><li>SOM-Based R*-Tree </li></ul><ul><li>Experiments </li></ul><ul><li>Conclusion </li></ul>
    203. 219. Self-Organizing Maps (SOM) What is a SOM? 1. A SOM provides a mapping from high-dimensional feature vectors onto a two-dimensional space. 2. The mapping preserves the topology of the feature vectors. 3. The map is called a topological feature map, and it preserves the mutual relationships (similarity) in the feature space of the input data. 4. The vectors contained in each node of the topological feature map are usually called codebook vectors.
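    204. 220. Self-Organizing Maps (SOM) <ul><li>We use 100 neurons arranged in a 10×10 two-dimensional array for the computer simulation. The input vectors used for testing are also two-dimensional, and they follow a uniform probability distribution. </li></ul>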
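    205. 221. Self-Organizing Maps (SOM) Figure: self-organizing feature map for uniformly distributed data: (a) randomly initialized weight vectors; (b) weight vectors after 50 iterations; (c) weight vectors after 1,000 iterations; (d) weight vectors after 10,000 iterations.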
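    206. 222. Self-Organizing Maps (SOM) <ul><li>The distribution of the neurons in the feature map does reflect the probability distribution of the input vectors. One point to emphasize is that the probability distribution of the data is not reflected linearly in the map. </li></ul>Figure: data drawn from three Gaussian clusters.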
    207. 223. Self-Organizing Maps (SOM) SOM algorithm: 1. Initialize the map neurons. 2. Input a feature vector x. 3. Find the winner neuron (BMN: Best-Match Node). 4. Adjust the weights of all neurons. 5. Repeat from step 2 until no further adjustment occurs.
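The five steps above can be sketched in pure Python. This is a minimal illustration under several assumptions not stated on the slide: a Gaussian neighborhood around the winner, a learning rate and neighborhood radius that shrink linearly, and a fixed number of epochs in place of the "until no further adjustment" stopping rule. All function and parameter names are hypothetical.

```python
import math
import random


def best_match_node(weights, x):
    """Step 3: grid position (row, col) of the weight vector closest to x."""
    best, best_d = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map on data (a list of equal-length vectors)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Step 1: random initial weights in [0, 1).
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    sigma0 = max(rows, cols) / 2
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # shrinking learning rate
        sigma = max(sigma0 * (1 - frac), 0.5)    # shrinking neighborhood radius
        for x in data:                           # Step 2: present an input
            br, bc = best_match_node(weights, x)
            for r in range(rows):                # Step 4: adjust all neurons
                for c in range(cols):
                    h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                    w = weights[r][c]
                    for d in range(dim):
                        w[d] += lr * h * (x[d] - w[d])
    return weights


som = train_som([[0.1, 0.2], [0.9, 0.8], [0.5, 0.5], [0.2, 0.9]], rows=3, cols=3, epochs=5)
print(best_match_node(som, [0.9, 0.8]))
```

Neurons close to the winner on the grid move most, which is what lets the map preserve the topology of the input space.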
    208. 224. R*-Tree <ul><li>The R*-tree improves the performance of the R-tree by modifying the insertion and split algorithms and by introducing a forced reinsertion mechanism. </li></ul><ul><li>The R*-tree was proposed as an index structure for spatial data such as geographical and CAD data. </li></ul>
    209. 225. R*-Tree <ul><li>Each internal node contains an array of (p, MBR) entries, where p is a pointer to a child node of this internal node, and MBR is the minimum bounding rectangle of the child node pointed to by p. </li></ul><ul><li>Each leaf node contains an array of (OID, MBR) entries for spatial objects, where OID is an object identifier, and MBR is the minimum bounding rectangle of the object identified by OID. </li></ul>
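The two node layouts can be sketched with Python dataclasses. This shows only the data structure and the bounding-rectangle arithmetic that insertion decisions rely on; it does not implement the R*-tree insertion, split, or forced reinsertion algorithms. The class and function names are assumptions for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class MBR:
    """Minimum bounding rectangle given by its low and high corner."""
    low: Tuple[float, ...]
    high: Tuple[float, ...]

    def union(self, other):
        """Smallest rectangle covering both self and other."""
        return MBR(tuple(map(min, self.low, other.low)),
                   tuple(map(max, self.high, other.high)))

    def area(self):
        result = 1.0
        for lo, hi in zip(self.low, self.high):
            result *= hi - lo
        return result

    def enlargement(self, other):
        """Extra area needed for self to also cover other."""
        return self.union(other).area() - self.area()


@dataclass
class LeafNode:
    entries: List[Tuple[int, MBR]] = field(default_factory=list)      # (OID, MBR)


@dataclass
class InternalNode:
    entries: List[Tuple[object, MBR]] = field(default_factory=list)   # (child, MBR)


def node_mbr(node):
    """MBR covering every entry of a non-empty node."""
    box = node.entries[0][1]
    for _, mbr in node.entries[1:]:
        box = box.union(mbr)
    return box


leaf = LeafNode([(1, MBR((0, 0), (1, 1))), (2, MBR((2, 1), (3, 4)))])
root = InternalNode([(leaf, node_mbr(leaf))])
print(node_mbr(root))
```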
    210. 226. R*-Tree (cont.) Space of point data
    211. 227. R*-Tree (cont.) Tree access structure
    212. 228. SOM-Based R*-Tree <ul><li>1. Clustering similar images </li></ul><ul><ul><li>We first generate the topological feature map using the SOM. We then generate the BMIL (best-matching-image-list) by computing the distance between the feature vector and the codebook vectors from the topological feature map. </li></ul></ul><ul><ul><li>The BMN (best-match node: the node with minimum distance) is chosen from the map nodes. </li></ul></ul><ul><ul><li>Next, the weight vectors are updated. </li></ul></ul>
    213. 229. SOM-Based R*-Tree (cont.)
    214. 230. SOM-Based R*-Tree (cont.) <ul><li>2. Construction </li></ul><ul><ul><li>To construct the R*-tree, we select a CBV (codebook vector) from the topological feature map as an entry. </li></ul></ul><ul><ul><li>If it belongs to an empty node, we select the next codebook vector. Otherwise, we determine the leaf node into which the codebook vector is inserted. </li></ul></ul><ul><ul><li>A leaf of the SOM-based R*-tree has the following structure: </li></ul></ul>
    215. 231. Experiments <ul><li>We performed experiments to compare the SOM-based R*-tree with the SOM and the R*-tree. </li></ul><ul><li>Image database used: 40,000 artificial/natural images (stored on local disk) </li></ul><ul><li>Image size: 128*128 pixels </li></ul><ul><li>Performed on: COMPAQ Deskpro (OS: FreeBSD) with 128 MB RAM </li></ul>
    216. 232. Experiments (cont.) <ul><li>Feature extraction: </li></ul><ul><ul><li>Haar wavelets are used to compute the feature vector </li></ul></ul><ul><ul><li>The color space is YIQ (NTSC transmission primaries) </li></ul></ul><ul><ul><li>Each element of this feature vector represents an average of 32*32 pixels of the original image. </li></ul></ul><ul><ul><li>The color feature vector has 48 dimensions (4*4*3, where 3 is the three channels of YIQ-space) </li></ul></ul>
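The dimension count (4*4*3 = 48) follows from averaging each 32*32 block of a 128*128 image in each of the three YIQ channels. The Python sketch below computes such a vector by plain block averaging; it does not run a Haar wavelet decomposition, so it only approximates the feature extraction described here. The function name, the input layout (rows of RGB tuples with components in [0, 1]), and the ordering of the output are assumptions.

```python
import colorsys


def color_feature_vector(image, block=32):
    """Average Y, I, Q over each block x block tile of an RGB image.

    image: 2-D list of (r, g, b) tuples with components in [0, 1];
    both dimensions are assumed to be multiples of block.
    Returns a flat list with three values (Y, I, Q) per tile, row by row.
    """
    height, width = len(image), len(image[0])
    features = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            sums = [0.0, 0.0, 0.0]
            for y in range(top, top + block):
                for x in range(left, left + block):
                    yiq = colorsys.rgb_to_yiq(*image[y][x])
                    for ch in range(3):
                        sums[ch] += yiq[ch]
            features.extend(s / (block * block) for s in sums)
    return features


white = [[(1.0, 1.0, 1.0)] * 128 for _ in range(128)]
vector = color_feature_vector(white)
print(len(vector))  # 4 * 4 tiles * 3 channels
```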
    217. 233. Experiments (cont.) <ul><li>Construction of the SOM-based R*-tree </li></ul>
    218. 234. Experiments (cont.)
    219. 235. Experiments (cont.) <ul><li>We experimented with four types of searches: </li></ul><ul><li>(I) normal SOM including empty nodes </li></ul><ul><li>(II) normal SOM with empty nodes eliminated </li></ul><ul><li>(III) normal R*-tree </li></ul><ul><li>(IV) SOM-based R*-tree with empty nodes eliminated </li></ul>
    220. 236. Experiments (cont.) <ul><li>Retrieval from SOM with empty nodes </li></ul><ul><li>Retrieval from SOM without empty nodes </li></ul>
    221. 237. Experiments (cont.)
    222. 238. Conclusion <ul><li>For high-dimensional data, we use a topological feature map and a best-matching-image-list (BMIL) obtained via the learning of a SOM. </li></ul><ul><li>In an experiment, we performed a similarity search using real image data and compared the performance of the SOM-based R*-tree with a normal SOM and a normal R*-tree, based on retrieval time cost. </li></ul>