TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM

Published in: Technology

  1. Content · System architecture · Experimental Results · Conclusion
     TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
     MediaEval Benchmarking Initiative for Multimedia Evaluation
     Jozef Vavrek, Matúš Pleva, Jozef Juhár
     Department of Electronics and Multimedia Communications, Technical University of Košice, Slovak Republic
     e-mail: {jozef.vavrek; matus.pleva; jozef.juhar}@tuke.sk
     04 October, 2012
  2. Content
     1 System architecture: Segmentation; Feature Extraction; Support Vector Machine Method; Searching Algorithm
     2 Experimental Results
     3 Conclusion
  3. Proposed query-by-example searching architecture
     [Block diagram: audio documents (utterances) and audio documents (queries) → Segmentation → Feature extraction → DTW (MCA) → Support Vector Machine]
  4. Segmentation and pre-processing
     • Segmentation: utterances are cut into segments of variable length, l_segment = l_query (rectangular window). Use: input to the next phase of pre-processing and feature extraction.
     • Pre-processing: pre-emphasis filtering and a Hamming window of length l_window = l_query/100 with 50% overlap. Use: to emphasize higher-frequency components, to reduce abrupt changes within the spectrum of the signal, and to increase the classification performance of the SVM classifier.
     [Figure: an utterance divided into segments 1 to 4 with l_segment = l_query; framing of the query with l_window = l_query/100.]
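The segmentation and framing described on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the slide does not say whether consecutive segments overlap (the sketch uses non-overlapping ones), and the pre-emphasis coefficient `alpha=0.97` and all function names are hypothetical choices, not taken from the slides.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1] (alpha is assumed)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


def split_into_segments(utterance, query_len):
    """Cut the utterance into consecutive segments of l_segment = l_query samples.

    Non-overlapping segments are an assumption; a trailing remainder is dropped.
    """
    return [utterance[s:s + query_len]
            for s in range(0, len(utterance) - query_len + 1, query_len)]


def hamming_frames(segment, query_len, overlap=0.5):
    """Frame a segment with a Hamming window of l_window = l_query / 100 samples."""
    frame_len = max(1, query_len // 100)
    hop = max(1, int(frame_len * (1 - overlap)))
    if frame_len > 1:
        window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
                  for n in range(frame_len)]
    else:
        window = [1.0]
    return [[segment[start + n] * window[n] for n in range(frame_len)]
            for start in range(0, len(segment) - frame_len + 1, hop)]
```

With a 400-sample query, each frame is 4 samples long and consecutive frames start 2 samples apart, which corresponds to the 50% overlap named on the slide.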
  5. Feature Extraction
     [Figure: feature extraction chain: frames → transformation (DFT, FFT) → filtering (Mel filter bank) → log of amplitude spectrum → IDFT (DCT) → feature vector matrix of frames (instances) by coefficients (features) 0 to 12.]
     Feature sets: MFCCs; MFCCs+ZCR; MFCCs+ZCR+MPEG-7 (ASS, ASC, ASF, ASE).
     [Figure: similarity matrix (cost matrix) of dimension 13x13 between the utterance segment and the query, with the MCA and avgMCA values.]
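The minimum cost alignment (MCA) taken from the DTW cost matrix can be illustrated with a textbook dynamic-programming sketch. The slides do not state the local distance or the step pattern, so the Euclidean distance and the three standard steps (insertion, deletion, match) used here are assumptions, and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Local distance between two feature vectors (assumed; not given on the slide)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_min_cost(query, segment, dist=euclidean):
    """Return the accumulated cost of the cheapest DTW alignment path."""
    n, m = len(query), len(segment)
    inf = float("inf")
    # acc[i][j] = cheapest cost of aligning query[:i] with segment[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(query[i - 1], segment[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j],      # step in the query only
                                   acc[i][j - 1],      # step in the segment only
                                   acc[i - 1][j - 1])  # step in both
    return acc[n][m]
```

A low returned cost means the two feature sequences can be warped onto each other cheaply; how the slides normalise this value into avgMCA is not recoverable from the transcript and is left out.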
  6. Support Vector Machine classifier
     Linear SVM with soft and hard margin, defined by the decision hyperplane
     $$d(\mathbf{w}, \mathbf{x}, b) = \mathbf{w} \cdot \mathbf{x} + b = \sum_{i=1}^{l} w_i x_i + b \qquad (1)$$
     [Figure: two panels in the (x1, x2) plane, hard margin and soft margin, each showing Class 1 (y = +1), Class 2 (y = −1) and the decision hyperplane.]
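Equation (1) amounts to a dot product plus a bias, and the sign of the result gives the predicted class. A minimal sketch follows; the function names and the tie-breaking rule at d = 0 are assumptions made for illustration.

```python
def decision(w, x, b):
    """Evaluate d(w, x, b) = w . x + b from equation (1)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, x, b):
    """Map the decision value to a class label; d = 0 is assigned to +1 (assumed)."""
    return 1 if decision(w, x, b) >= 0 else -1
```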
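  7. Nonlinear SVM classifier
     Mapping into the high-dimensional feature space by kernel functions:
     $$d(\mathbf{x}) = \sum_{i=1}^{l} \alpha_i y_i \, \mathbf{z}(\mathbf{x}) \cdot \mathbf{z}(\mathbf{x}_i) + b \qquad (2)$$
     $$K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{z}_i \cdot \mathbf{z}_j = \Phi(\mathbf{x}_i) \cdot \Phi(\mathbf{x}_j) \qquad (3)$$
     [Figure: points in the (x1, x2) input plane mapped by Φ(·) into the feature space.]
     Kernel functions used:
     Mathematical expression                                              Type
     $K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i \cdot \mathbf{x}_j$                       Linear
     $K(\mathbf{x}_i, \mathbf{x}_j) = (\gamma\, \mathbf{x}_i \cdot \mathbf{x}_j + 1)^d$      Polynomial of degree d
     $K(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\gamma \|\mathbf{x}_i - \mathbf{x}_j\|^2)$       Gaussian Radial Basis Function (RBF)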
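The three kernels listed on this slide, and the kernelised decision function of equations (2) and (3), can be written out directly. This is a sketch for illustration; the default values of `gamma` and `degree` and all function names are assumptions, since the slides give only the formulas.

```python
import math


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def linear_kernel(xi, xj):
    """K(xi, xj) = xi . xj"""
    return dot(xi, xj)


def polynomial_kernel(xi, xj, gamma=1.0, degree=2):
    """K(xi, xj) = (gamma * xi . xj + 1) ** degree"""
    return (gamma * dot(xi, xj) + 1) ** degree


def rbf_kernel(xi, xj, gamma=1.0):
    """K(xi, xj) = exp(-gamma * ||xi - xj||^2)"""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))


def kernel_decision(x, support_vectors, labels, alphas, b, kernel):
    """d(x) = sum_i alpha_i * y_i * K(x, x_i) + b, with K replacing z(x) . z(x_i)."""
    return sum(a * y * kernel(x, sv)
               for a, y, sv in zip(alphas, labels, support_vectors)) + b
```

Because the kernel is evaluated on the original vectors, the mapping Φ never has to be computed explicitly.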
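  8. SVM based searching (classification) algorithm
     [Flowchart, reconstructed from the slide labels:]
     • Each utterance is cut into Segment 1, Segment 2, Segment 3, ..., Segment N, each of length l_query.
     • The query (e.g. query001) and each segment are framed with l_window = l_query/100, and each frame is described by 13 MFCCs (coefficients 0 to 12). Each query/segment pair forms a two-class problem with labels +1 and −1.
     • Compute the MCA of DTW and compare it with a threshold (branch: < threshold).
     • Train an SVM with a linear kernel and C = 1 to obtain the SVM model.
     • Compute miss(+1), miss(−1) and the number of iterations.
     • Compare with a threshold (branch: > threshold): query detected.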
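One possible reading of the SVM stage of this flowchart is sketched below: the query frames and the segment frames are given opposite labels, a linear soft-margin SVM with C = 1 is trained on them, and the share of misclassified training frames is reported. Everything beyond "linear kernel, C = 1" is an assumption: the label assignment (query = +1), the subgradient-descent trainer with its learning rate and epoch count, and the function names are hypothetical, and the slides do not say which quantity is finally compared with the threshold.

```python
def _decision(w, x, b):
    """Linear decision value w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, C=1.0, epochs=200, lr=0.01):
    """Batch subgradient descent on 0.5*||w||^2 + C * sum(hinge loss) (assumed trainer)."""
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the regulariser 0.5 * ||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * _decision(w, xi, b) < 1:  # frame violates the margin
                for j in range(dim):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_misclassification_rate(query_frames, segment_frames):
    """Train query (+1) against segment (-1) frames and return the training miss rate."""
    X = query_frames + segment_frames
    y = [1] * len(query_frames) + [-1] * len(segment_frames)
    w, b = train_linear_svm(X, y)
    # A frame exactly on the hyperplane (decision value 0) is counted as a miss.
    misses = sum(1 for xi, yi in zip(X, y) if yi * _decision(w, xi, b) <= 0)
    return misses / len(X)
```

Under this reading, a query and a segment with nearly identical frames cannot be separated by any hyperplane, so the miss rate is high; clearly different frames are separated with a miss rate near zero.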
  9. Experimental results
     $$\text{Score parameter} = \frac{\text{number of iterations}}{100} = 2.82$$
     $$\text{Error rate} = 1 - \frac{\text{correctly predicted frames}}{\text{all tested frames}} = 0.18$$
     $$\text{Misclassification rate} = \frac{\text{miss}(+) + \text{miss}(-)}{\text{all predicted data}} = 0.12$$
     Evaluation results of the tested algorithm:
     Database set    P(FA)      P(Miss)    ATWV
     evalQ-devC      1.54617    0.960      -0.052
     devQ-evalC      1.62595    0.948      -0.233
     evalQ-evalC     1.68694    0.974      -0.164
     devQ-devC       1.78786    0.943      -0.194
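The three quantities defined on this slide are simple ratios, sketched below. The function names are hypothetical, and the counts in the usage note are invented purely so that the ratios reproduce the values printed on the slide; the actual frame counts are not given in the source.

```python
def score_parameter(num_iterations):
    """Score parameter = number of iterations / 100."""
    return num_iterations / 100


def error_rate(correct_frames, tested_frames):
    """Error rate = 1 - correctly predicted frames / all tested frames."""
    return 1 - correct_frames / tested_frames


def misclassification_rate(miss_pos, miss_neg, predicted_total):
    """Misclassification rate = (miss(+) + miss(-)) / all predicted data."""
    return (miss_pos + miss_neg) / predicted_total
```

For example, 282 iterations would give a score parameter of 2.82, 82 correct frames out of 100 an error rate of 0.18, and 7 + 5 misses out of 100 predictions a misclassification rate of 0.12.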
  10. Conclusions and Future Work
     • A query-by-example search system was proposed, based on the minimum cost alignment of the DTW algorithm and the misclassification error rate of an unsupervised SVM.
     • No other resources were used during development.
     • Detection performance was poor, with a high number of false alarms and missed detections, caused by the variable length of the queries and by detected terms with similar spectral characteristics within each utterance.
     • The computational (search) time of the proposed algorithm is relatively high.
     • Future work: design an effective query-by-example search system with lower computational time and fewer missed detections.
  11. Thank You For Your Attention
