# Face Detection Techniques and Emotional Speech Recognition Techniques


### Transcript

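• 1. Part I: Face Detection Techniques. Part II: Emotional Speech Recognition Techniques. March 11, 2011 (Fri.). 김성호, Department of Electronic Engineering, Yeungnam University. Brown Bag Seminar
• 2. Part I: Face Detection Research [presented at the IPIU 2011 conference]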
• Motivation
• 3. Proposed Object Representation Scheme: joint appearance and shape model
• Viewpoint: for a 2D object, (object center, scale); for a 3D object, the 3D object pose
• Figure/ground mask: boundary shape, figure/ground information
• Local appearance: appearance codebook, part pose
• 4. Visual Context in the Joint Appearance & Shape Model
• How to integrate those contextual cues?
• Spatial context: part-part context (bottom-up) → grouping property
• Hierarchical context: object-background context (top-down) → supporting contextually related category; part-whole context (bottom-up/top-down) → predicting figure-ground
• [Figure labels: weak neighbor support; strong neighbor support; cooperative (BU+TD)]
• 5. Mathematical Formulation for Categorization (1/2)
• Solution: Category label (C), Viewpoint (V), Mask (M)
• Key issue: the prior is difficult to model because of its complex, high-dimensional structure
• Our approach: utilize a graphical model, specifically a directed graphical model (Bayesian net), over appearance and pose
• [Figure: Bayesian net with node labels V, M, F, A, X, G, {C,B} and N; top-down and bottom-up paths; legend: viewpoint, figure-ground, codebook index; leaf labels b1 to b6 and f1 to f5]
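The formula itself did not survive in the transcript. As a rough sketch only, a formulation of this kind is commonly written as a maximum a posteriori estimate over the three unknowns. The symbol Z for the observed image features is an assumption introduced here for illustration; it does not appear on the slide.

$$
(\hat{C}, \hat{V}, \hat{M}) = \arg\max_{C, V, M} \; p(C, V, M \mid Z), \qquad p(C, V, M \mid Z) \propto p(Z \mid C, V, M)\, p(C, V, M)
$$

Under this reading, the "key issue" on the slide refers to the joint prior p(C, V, M), which a directed graphical model would factor into simpler conditional terms.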
• 6. Learning for Distributed Category Representation
• CC: category-specific codebook for top-down inference
• UC: universal codebook for bottom-up inference
• Representation: joint appearance and boundary with viewpoint (examples: car, airplane)
• Issue: how to select the optimal codebook (CB) for category representation?
• Previous constellation model: fixed number of parts → cannot handle large variations
• Why distributed? → To handle large intra-class variations
• 7. Codebook Selection: Reducing Surface Markings
• Focus
• Which codebook can reduce the effect of surface markings?
• Our strategy
• Intermediate blurring
• Statistical property → entropy
• [Figure labels: repeatable part; surface marking part]
• 8. Entropy of Candidate Codebook
• Low entropy → surface marking
• High entropy → semantic parts
• Finding: high-entropy codebooks should be selected to reduce surface markings
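The slides do not say which distribution the entropy is computed over. The sketch below assumes the Shannon entropy of a candidate patch's quantized gray-level histogram and keeps candidates above a threshold. The patch data, the bin count, and the `threshold` value are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(pixels, num_bins=8, max_value=256):
    """Shannon entropy in bits of a patch's quantized gray-level histogram."""
    bin_width = max_value / num_bins
    # Quantize each gray level into one of num_bins histogram bins.
    counts = Counter(min(int(p / bin_width), num_bins - 1) for p in pixels)
    total = len(pixels)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical 8x8 candidate patches, flattened to 64 gray levels in [0, 255].
candidates = {
    "flat": [128] * 64,                    # a single gray level
    "varied": [4 * i for i in range(64)],  # levels spread over all bins
}

entropies = {name: shannon_entropy(patch) for name, patch in candidates.items()}

# Assumed selection rule: keep candidates whose entropy exceeds a threshold.
threshold = 1.5  # hypothetical value, not taken from the slides
selected = [name for name, h in entropies.items() if h > threshold]
```

With these toy patches only the "varied" candidate passes the threshold; in practice the threshold would have to be tuned on training data.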
• 9. Inference Flow Related to the Category Model
• [Flow diagram labels: input; dense feature; matching to UC; grouping (similarity & proximity); part-part context (estimate weight); part-whole context; category model (car, airplane, ..., background; CB, UCB, CCB); car category; multi-modal viewpoint; multi-modal figure-ground mask; final result]
• 10. Demo of Categorization and Segmentation
• 11. Category Detection: Caltech Face Dataset [DB1]
• 435 face images with clutter
• 468 background images
• Learning
• Randomly select 15 faces
• Randomly select 15 background images
• Test
• 200 novel face images
• 200 novel background images
| Method | N train (unsegmented) | N train (segmented) | ROC EER (region error < 25%) |
|---|---|---|---|
| [Weber00] | 200 | 0 | 94.0% |
| [Fergus03] | 220 | 0 | 96.4% |
| [Shotton05] | 50 | 10 | 96.5% |
| Ours | 0 | 15 | 97.3% |

• [DB1] http://www.robots.ox.ac.uk/~vgg/data3.html
• [Weber00] M. Weber, M. Welling, and P. Perona, “Unsupervised learning of models for recognition”, In Proc. ECCV, pp. 18–32, 2000.
• [Fergus03] R. Fergus, P. Perona, and A. Zisserman, “Object class recognition by unsupervised scale invariant learning”, In Proc. CVPR, 2003.
• [Shotton05] J. Shotton, A. Blake, and R. Cipolla, “Contour-based learning for object detection”, In Proc. ICCV, 2005.
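The comparison for this experiment is reported as ROC EER. A minimal sketch of how an equal error rate can be computed from detector scores follows, assuming higher scores mean "face". The scores are made up for illustration, and reading the reported percentages as 1 minus the EER is an assumption, not something the slides state.

```python
def equal_error_rate(pos_scores, neg_scores):
    """Return (threshold, rate) where miss rate and false-alarm rate are closest."""
    best = None
    for t in sorted(set(pos_scores) | set(neg_scores)):
        # Miss rate: positives scored below the threshold.
        fnr = sum(s < t for s in pos_scores) / len(pos_scores)
        # False-alarm rate: negatives scored at or above the threshold.
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        gap = abs(fnr - fpr)
        if best is None or gap < best[0]:
            best = (gap, t, (fnr + fpr) / 2)
    return best[1], best[2]


# Hypothetical detector scores for face and background test images.
face_scores = [0.9, 0.8, 0.7, 0.4]
background_scores = [0.6, 0.3, 0.2, 0.1]

threshold, eer = equal_error_rate(face_scores, background_scores)
detection_rate_at_eer = 1.0 - eer
```

The search only tries thresholds that occur as scores, which is enough for a small test set but gives a coarse estimate when the two error curves cross between samples.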
• 12. Examples of Face Detection
• 13. [Figure panels: test image; bottom-up viewpoints; bottom-up mask; hypothesized viewpoint; hypothesized mask; final inference result by Boosted MCMC]
• 14. Test Results in Real Scene (KAIST)
• Note: The model is learned on the Caltech DB and tested on real images.
• 15. Conclusions and Discussions
• A joint appearance and boundary model with viewpoint is a suitable object model for object categorization in cluttered scenes.
• Visual contexts (part-part, part-whole, and object-background context) can resolve ambiguous figure-ground assignments.
• A Bayesian network can model both categorization and figure-ground segmentation.
• Boosted MCMC can provide efficient inference for cluttered objects.
• Future work
• Modeling a more flexible figure-ground mask
• Using boundary shape in the likelihood calculation
• 16. Part II: Emotional Speech Recognition Research - Introduction
• Speech
• A sequence of elementary acoustic symbols
• Information in speech
• Gender information, age, accent, speaker’s identity, health, and emotion
• Emotional speech recognition
• This area has recently attracted increased attention
• Convergence project: supports quantitative analysis of anti-Korean sentiment.
• 17. Structure of Emotional Speech Recognition
• Key components
• Feature extractor
• Classifier
• [Diagram labels: feature extractor (MFCC); classifier (SVM or nearest class mean classifier); recognized emotions]
• 18. Feature for Emotional Speech Recognition
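• Mel Frequency Cepstral Coefficients (MFCC)
• Convey short-time energy information in the frequency domain
• Signal
• Fourier transform (frequency domain)
• Map the power spectrum onto the mel scale
• Take the log of the power at each mel frequency
• Take the discrete cosine transform of the log mel powers
• Final MFCC: amplitudes of the resulting spectrum
• Mel scale: a frequency scale spaced according to the pitch differences that humans perceive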
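The MFCC steps can be sketched for a single frame as below, using only the standard library. The mel formula `2595 * log10(1 + f / 700)` is the commonly used one. The frame length, sample rate, filter count, and test tones are assumptions chosen to keep the example small, and the final discrete cosine transform is the standard step that turns log mel energies into cepstral coefficients. A real system would also apply pre-emphasis, windowing, and a fast FFT.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum via a naive DFT (fine for short illustrative frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        spectrum.append(abs(s) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose centers are equally spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2)
    mel_points = [mel_max * i / (num_filters + 1) for i in range(num_filters + 2)]
    # Fractional DFT-bin position of each filter edge and center.
    bins = [mel_to_hz(m) * n_fft / sample_rate for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, num_filters=10, num_coeffs=5):
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Log of the energy collected by each mel filter (epsilon avoids log(0)).
    log_energies = [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
                    for row in bank]
    # DCT-II of the log mel energies gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energies))
            for c in range(num_coeffs)]


# Hypothetical test frames: 128 samples of a pure tone at 8 kHz sampling.
SAMPLE_RATE = 8000
low_tone = [math.sin(2 * math.pi * 500 * i / SAMPLE_RATE) for i in range(128)]
high_tone = [math.sin(2 * math.pi * 3000 * i / SAMPLE_RATE) for i in range(128)]

low_mfcc = mfcc(low_tone, SAMPLE_RATE)
high_mfcc = mfcc(high_tone, SAMPLE_RATE)
```

The two tones yield clearly different coefficient vectors, which is the property a classifier downstream relies on.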
• 19. Classifier: Support Vector Machine
• Learning: finding the optimal classifier in feature space
• Recognition: performed by the learned classifier
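The slides do not state the kernel or solver used. As an illustration only, the sketch below trains a two-class linear SVM by subgradient descent on the regularized hinge loss, which is a stand-in technique rather than the presenter's setup. The feature vectors, learning rate, and regularization constant are hypothetical.

```python
def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Learning: fit weights w and bias b by subgradient descent on hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample violates the margin: move the boundary toward fixing it.
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Margin satisfied: only the regularizer shrinks w.
                w = [wi - lr * reg * wi for wi in w]
    return w, b


def svm_predict(w, b, x):
    """Recognition: sign of the learned decision function."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical 2-D feature vectors for two emotion classes (+1 and -1).
samples = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0), (-2.0, -2.0), (-3.0, -1.0), (-1.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(samples, labels)
predictions = [svm_predict(w, b, x) for x in samples]
```

Multi-class emotion recognition would need several such classifiers (for example one-versus-rest), which is omitted here.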
• 20. Classifier: Nearest Class Mean
• Learning: finding the class means in feature space
• Recognition: finding the nearest class mean
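A minimal sketch of the nearest class mean classifier follows, assuming Euclidean distance in feature space (the slides do not name the metric). The 2-D feature vectors and emotion labels are toy values for illustration.

```python
import math


def fit_class_means(samples, labels):
    """Learning: average the feature vectors of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_nearest_mean(means, x):
    """Recognition: return the class whose mean is closest to x."""
    return min(means, key=lambda y: math.dist(means[y], x))


# Hypothetical 2-D features (for example, two MFCC coefficients per utterance).
samples = [(1.0, 1.2), (0.8, 1.0), (4.0, 4.2), (4.2, 3.8)]
labels = ["neutral", "neutral", "angry", "angry"]

means = fit_class_means(samples, labels)
predicted = predict_nearest_mean(means, (3.9, 4.0))
```

Because learning reduces to one mean per class, the method has no hyperparameters to tune, which can help when only a few training utterances are available.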
• 21. Exp.1 on EMO Database
• Composition
• 7 emotion classes (happy, angry, anxious, fearful, bored, disgusted, neutral)
• 10 sentences
• 10 voice actors (5 male, 5 female)
• Language: German
• [Figure panels: anger; happy; boredom]
• 22. Recognition using Nearest Class Mean Classifier
• Learning: 150 (randomly selected), test: 150
Recognition rate: 47.0%
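The evaluation protocol (150 randomly selected utterances for learning, 150 for testing, then a recognition rate) can be sketched as follows. The utterance ids, the seed, and the example label lists are placeholders; the slides do not say how the random selection was seeded or whether it was stratified by emotion.

```python
import random


def random_split(items, num_train, seed=0):
    """Shuffle with a fixed seed and split into learning and test sets."""
    rng = random.Random(seed)
    indices = list(range(len(items)))
    rng.shuffle(indices)
    train = [items[i] for i in indices[:num_train]]
    test = [items[i] for i in indices[num_train:]]
    return train, test


def recognition_rate(predicted, actual):
    """Fraction of test utterances whose predicted emotion is correct."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Placeholder ids standing in for 300 utterances.
utterances = list(range(300))
train_set, test_set = random_split(utterances, 150)

# Hypothetical predictions against ground truth for four test utterances.
rate = recognition_rate(["happy", "angry", "happy", "bored"],
                        ["happy", "angry", "bored", "bored"])
```

A single random split gives a noisy estimate at this sample size, so repeating the split with several seeds and averaging the rate would make classifier comparisons more reliable.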
• 23. Recognition using SVM
• Recognition rate: 38.0%
• The Nearest Class Mean classifier outperforms the SVM.
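• 24. Exp. 2: Train on German → test on Japanese
• Surprise
• Sadness
• Joy
• Recognition is unstable because of the differences between German and Japanese.
• 25. Exp. 3: Train on Japanese → test on Japanese
• Database: 5 emotions, 57 audio clips (from episode 4 of 언덕 위의 구름, "Clouds Over the Hill")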