SlideShare a Scribd company logo
Acoustic Features to Predict Topic Change in
Instructional Video
Vineet Kumar , Sushant Gupta
Computer Science & Engineering Department
Jadavpur University
{vntkumar8,sushantgupta1996}@gmail.com
4 March 2017
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Motivation
India is one of the largest producer of engineers in the world. The
growth of Massive Open Online Courses (MOOCs) are considered
as one of the biggest revolution in education in last several
decades. One significant impact of the MOOC phenomenon is that
they have accelerated the widespread availability of quality Open
Educational Resources (OER).
MOOCs have given an open challenge to all current methods of
education system. They have high potential of acceptability among
all kind of learners. We believe that Instructional Videos are going
to be next generation textbooks.
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
120 Hours of Educational Content from Youtube
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Feature Extraction
We wanted to use acoustic features of the videos, hence after an
extensive study of sound and audio processing literature and found
that MFCC & InterSpeech 2009 Feature Set are general best
choice for such acoustic processing tasks.
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Feature Extraction
We wanted to use acoustic features of the videos, hence after an
extensive study of sound and audio processing literature and found
that MFCC & InterSpeech 2009 Feature Set are general best
choice for such acoustic processing tasks.
hgjg
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Disambiguity Resolution
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Data Set
Taking out the features for every 50ms of a video, our data
set was pretty large with around
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Data Set
Taking out the features for every 50ms of a video, our data
set was pretty large with around
9 Million instances
In a video, we noticed around 15-20 topic changes in an hour.
If we assume topic change to be 1 second transition then in
an hour,i.e, 3600 seconds we had only 15-20 seconds where
topic has actually changed. This lead to a class imbalance
problem. We had used a nice method to tackle this class
imbalance problem by down-sampling.
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
DataSet Labeling
This was the most time consuming task of our project. We had to
watch each and every video and manually note down the exact
time of topic change as accurate in terms of second. This labelling
was done by both of us independently. Labelling Scheme is :-
45 instances above are -ve (0)
20 instances above are skipped
30 instances above are +ve (1)
Exact time of topic change (1)
30 instances below are +ve (1)
20 instances below are skipped
45 instances below are -ve (0)
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Dimension Reduction
Can we reduce the dimension of feature space ?
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Dimension Reduction
Can we reduce the dimension of feature space ?
Classical problem of Dimensionality reduction
We used Learning Vector Quantization to reduce the feature
space. We ranked the features according to their importance.
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Dimension Reduction
Can we reduce the dimension of feature space ?
Classical problem of Dimensionality reduction
We used Learning Vector Quantization to reduce the feature
space. We ranked the features according to their importance.
An LVQ system is represented by prototypes
W = (w(i), ..., w(n)) which are defined in the feature space
of observed data. In training algorithms one determines, for
each data point, the prototype which is closest to the input
according to a given distance measure. The position of this
so-called winner prototype is then adapted, i.e. the winner is
moved closer if it correctly classifies the data point or moved
away if it classifies the data point incorrectly.
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Dimension Reduction - I
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Model Training
Now we have manageable number of instances
Train the model using SVM & Random Forest
SVM with RBF
78% of accuracy is achieved
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Future Directions - I
This work is an aid to help already available state of art table
of content generation for video. Visual cues and acoustic
features both are combined to accurately pinpoint the topic
change transition and build a table of content kind of thing
for educational video.
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Future Directions - I
This work is an aid to help already available state of art table
of content generation for video. Visual cues and acoustic
features both are combined to accurately pinpoint the topic
change transition and build a table of content kind of thing
for educational video.
Improving Accuracy In order to improve our accuracy we are
planning to train our model on Support Vector Machines for
Multiple instance Learning. Instead of receiving a set of
instances which are individually labeled, the learner receives a
set of labeled bags, each containing many instances. In the
simple case of multiple-instance binary classification, a bag
may be labeled negative if all the instances in it are negative.
On the other hand, a bag is labeled positive if there is at least
one instance in it which is positive. From a collection of
labeled bags, the learner tries to either (i) induce a concept
that will label individual instances correctly or (ii) learn how
to label bags without inducing the concept.
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
Future Directions - II
Song detection
war scene detection rather any particular topic of interest
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
References I
Stuart Andrews, Ioannis Tsochantaridis, and Thomas
Hofmann.
Support vector machines for multiple-instance learning.
In Proceedings of the 15th International Conference on Neural
Information Processing Systems, NIPS’02, pages 577–584,
Cambridge, MA, USA, 2002. MIT Press.
Arijit Biswas, Ankit Gandhi, and Om Deshmukh.
Mmtoc: A multimodal method for table of content creation in
educational videos.
In Proceedings of the 23rd ACM International Conference on
Multimedia, MM ’15, pages 621–630, New York, NY, USA,
2015. ACM.
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
References II
Florian Eyben, Felix Weninger, Florian Gross, and Bj¨orn
Schuller.
Recent developments in opensmile, the munich open-source
multimedia feature extractor.
In Proceedings of the 21st ACM International Conference on
Multimedia, MM ’13, pages 835–838, New York, NY, USA,
2013. ACM.
Teuvo Kohonen.
The handbook of brain theory and neural networks.
chapter Learning Vector Quantization, pages 537–540. MIT
Press, Cambridge, MA, USA, 1998.
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
We are very much thankful to Xerox Research Center India
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
We are very much thankful to Xerox Research Center India
Thank You !!!
Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change

More Related Content

Similar to Acoustic Features to Predict Topic Change

Btp report final_lalit
Btp report final_lalitBtp report final_lalit
Btp report final_lalit
Toastmaster International
 
Event recognition image & video segmentation
Event recognition image & video segmentationEvent recognition image & video segmentation
Event recognition image & video segmentation
eSAT Journals
 
Video inpainting using backgroung registration
Video inpainting using backgroung registrationVideo inpainting using backgroung registration
Video inpainting using backgroung registration
eSAT Publishing House
 
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
IRJET Journal
 
76201967
7620196776201967
76201967
IJRAT
 
Predicting Movie Success Using Neural Network
Predicting Movie Success Using Neural NetworkPredicting Movie Success Using Neural Network
Predicting Movie Success Using Neural Network
International Journal of Science and Research (IJSR)
 
Cartoonization of images using machine Learning
Cartoonization of images using machine LearningCartoonization of images using machine Learning
Cartoonization of images using machine Learning
IRJET Journal
 
On the Influence Propagation of Web Videos
On the Influence Propagation of Web VideosOn the Influence Propagation of Web Videos
On the Influence Propagation of Web Videos
abidhavp
 
Image Compression Using Hybrid Svd Wdr And Svd Aswdr
Image Compression Using Hybrid Svd Wdr And Svd AswdrImage Compression Using Hybrid Svd Wdr And Svd Aswdr
Image Compression Using Hybrid Svd Wdr And Svd Aswdr
Melanie Smith
 
IRJET- Implementation of Emotion based Music Recommendation System using SVM ...
IRJET- Implementation of Emotion based Music Recommendation System using SVM ...IRJET- Implementation of Emotion based Music Recommendation System using SVM ...
IRJET- Implementation of Emotion based Music Recommendation System using SVM ...
IRJET Journal
 
Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it!
Sudeep Das, Ph.D.
 
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Koteswar Rao Jerripothula
 
Evaluation v1.1
Evaluation v1.1Evaluation v1.1
Evaluation v1.1
FrontRoomFilms
 
A Model Of Opinion Mining For Classifying Movies
A Model Of Opinion Mining For Classifying MoviesA Model Of Opinion Mining For Classifying Movies
A Model Of Opinion Mining For Classifying Movies
Andrew Molina
 
Audio Classification using Artificial Neural Network with Denoising Algorithm...
Audio Classification using Artificial Neural Network with Denoising Algorithm...Audio Classification using Artificial Neural Network with Denoising Algorithm...
Audio Classification using Artificial Neural Network with Denoising Algorithm...
IRJET Journal
 
IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...
IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...
IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...
IRJET Journal
 
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videos
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videosAdria Recasens, DeepMind – Multi-modal self-supervised learning from videos
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videos
Codiax
 
REVIEW PPT.pptx
REVIEW PPT.pptxREVIEW PPT.pptx
REVIEW PPT.pptx
SaravanaD2
 
User Testing: Losing Weight (UX Cocktail Hour)
User Testing: Losing Weight (UX Cocktail Hour)User Testing: Losing Weight (UX Cocktail Hour)
User Testing: Losing Weight (UX Cocktail Hour)
Paul Veugen
 
project topics-1.pptx
project topics-1.pptxproject topics-1.pptx
project topics-1.pptx
purushothamansinivas
 

Similar to Acoustic Features to Predict Topic Change (20)

Btp report final_lalit
Btp report final_lalitBtp report final_lalit
Btp report final_lalit
 
Event recognition image & video segmentation
Event recognition image & video segmentationEvent recognition image & video segmentation
Event recognition image & video segmentation
 
Video inpainting using backgroung registration
Video inpainting using backgroung registrationVideo inpainting using backgroung registration
Video inpainting using backgroung registration
 
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...IRJET-  	  Analysis of Music Recommendation System using Machine Learning Alg...
IRJET- Analysis of Music Recommendation System using Machine Learning Alg...
 
76201967
7620196776201967
76201967
 
Predicting Movie Success Using Neural Network
Predicting Movie Success Using Neural NetworkPredicting Movie Success Using Neural Network
Predicting Movie Success Using Neural Network
 
Cartoonization of images using machine Learning
Cartoonization of images using machine LearningCartoonization of images using machine Learning
Cartoonization of images using machine Learning
 
On the Influence Propagation of Web Videos
On the Influence Propagation of Web VideosOn the Influence Propagation of Web Videos
On the Influence Propagation of Web Videos
 
Image Compression Using Hybrid Svd Wdr And Svd Aswdr
Image Compression Using Hybrid Svd Wdr And Svd AswdrImage Compression Using Hybrid Svd Wdr And Svd Aswdr
Image Compression Using Hybrid Svd Wdr And Svd Aswdr
 
IRJET- Implementation of Emotion based Music Recommendation System using SVM ...
IRJET- Implementation of Emotion based Music Recommendation System using SVM ...IRJET- Implementation of Emotion based Music Recommendation System using SVM ...
IRJET- Implementation of Emotion based Music Recommendation System using SVM ...
 
Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it!
 
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
Image segmentation using advanced fuzzy c-mean algorithm [FYP @ IITR, obtaine...
 
Evaluation v1.1
Evaluation v1.1Evaluation v1.1
Evaluation v1.1
 
A Model Of Opinion Mining For Classifying Movies
A Model Of Opinion Mining For Classifying MoviesA Model Of Opinion Mining For Classifying Movies
A Model Of Opinion Mining For Classifying Movies
 
Audio Classification using Artificial Neural Network with Denoising Algorithm...
Audio Classification using Artificial Neural Network with Denoising Algorithm...Audio Classification using Artificial Neural Network with Denoising Algorithm...
Audio Classification using Artificial Neural Network with Denoising Algorithm...
 
IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...
IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...
IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi...
 
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videos
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videosAdria Recasens, DeepMind – Multi-modal self-supervised learning from videos
Adria Recasens, DeepMind – Multi-modal self-supervised learning from videos
 
REVIEW PPT.pptx
REVIEW PPT.pptxREVIEW PPT.pptx
REVIEW PPT.pptx
 
User Testing: Losing Weight (UX Cocktail Hour)
User Testing: Losing Weight (UX Cocktail Hour)User Testing: Losing Weight (UX Cocktail Hour)
User Testing: Losing Weight (UX Cocktail Hour)
 
project topics-1.pptx
project topics-1.pptxproject topics-1.pptx
project topics-1.pptx
 

Recently uploaded

sieving analysis and results interpretation
sieving analysis and results interpretationsieving analysis and results interpretation
sieving analysis and results interpretation
ssuser36d3051
 
Swimming pool mechanical components design.pptx
Swimming pool  mechanical components design.pptxSwimming pool  mechanical components design.pptx
Swimming pool mechanical components design.pptx
yokeleetan1
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
Madan Karki
 
Low power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniquesLow power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniques
nooriasukmaningtyas
 
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
Mukeshwaran Balu
 
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdfBPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
MIGUELANGEL966976
 
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODELDEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
gerogepatton
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
IJECEIAES
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
VICTOR MAESTRE RAMIREZ
 
International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
gerogepatton
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
jpsjournal1
 
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdfIron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
RadiNasr
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
kandramariana6
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
Rahul
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
camseq
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
Madan Karki
 
Wearable antenna for antenna applications
Wearable antenna for antenna applicationsWearable antenna for antenna applications
Wearable antenna for antenna applications
Madhumitha Jayaram
 
Understanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine LearningUnderstanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine Learning
SUTEJAS
 
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
University of Maribor
 

Recently uploaded (20)

sieving analysis and results interpretation
sieving analysis and results interpretationsieving analysis and results interpretation
sieving analysis and results interpretation
 
Swimming pool mechanical components design.pptx
Swimming pool  mechanical components design.pptxSwimming pool  mechanical components design.pptx
Swimming pool mechanical components design.pptx
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
 
Low power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniquesLow power architecture of logic gates using adiabatic techniques
Low power architecture of logic gates using adiabatic techniques
 
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
 
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdfBPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
 
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODELDEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
 
International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...International Conference on NLP, Artificial Intelligence, Machine Learning an...
International Conference on NLP, Artificial Intelligence, Machine Learning an...
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdfIron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
Iron and Steel Technology Roadmap - Towards more sustainable steelmaking.pdf
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
 
Modelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdfModelagem de um CSTR com reação endotermica.pdf
Modelagem de um CSTR com reação endotermica.pdf
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
 
Wearable antenna for antenna applications
Wearable antenna for antenna applicationsWearable antenna for antenna applications
Wearable antenna for antenna applications
 
Understanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine LearningUnderstanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine Learning
 
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
 

Acoustic Features to Predict Topic Change

  • 1. Acoustic Features to Predict Topic Change in Instructional Video Vineet Kumar , Sushant Gupta Computer Science & Engineering Department Jadavpur University {vntkumar8,sushantgupta1996}@gmail.com 4 March 2017 Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 2. Motivation India is one of the largest producer of engineers in the world. The growth of Massive Open Online Courses (MOOCs) are considered as one of the biggest revolution in education in last several decades. One significant impact of the MOOC phenomenon is that they have accelerated the widespread availability of quality Open Educational Resources (OER). MOOCs have given an open challenge to all current methods of education system. They have high potential of acceptability among all kind of learners. We believe that Instructional Videos are going to be next generation textbooks. Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 3. 120 Hours of Educational Content from Youtube Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 4. Feature Extraction We wanted to use acoustic features of the videos, hence after an extensive study of sound and audio processing literature and found that MFCC & InterSpeech 2009 Feature Set are general best choice for such acoustic processing tasks. Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 5. Feature Extraction We wanted to use acoustic features of the videos, hence after an extensive study of sound and audio processing literature and found that MFCC & InterSpeech 2009 Feature Set are general best choice for such acoustic processing tasks. hgjg Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 6. Disambiguity Resolution Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 7. Data Set Taking out the features for every 50ms of a video, our data set was pretty large with around Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 8. Data Set Taking out the features for every 50ms of a video, our data set was pretty large with around 9 Million instances In a video, we noticed around 15-20 topic changes in an hour. If we assume topic change to be 1 second transition then in an hour,i.e, 3600 seconds we had only 15-20 seconds where topic has actually changed. This lead to a class imbalance problem. We had used a nice method to tackle this class imbalance problem by down-sampling. Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 9. DataSet Labeling This was the most time consuming task of our project. We had to watch each and every video and manually note down the exact time of topic change as accurate in terms of second. This labelling was done by both of us independently. Labelling Scheme is :- 45 instances above are -ve (0) 20 instances above are skipped 30 instances above are +ve (1) Exact time of topic change (1) 30 instances below are +ve (1) 20 instances below are skipped 45 instances below are -ve (0) Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 10. Dimension Reduction Can we reduce the dimension of feature space ? Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 11. Dimension Reduction Can we reduce the dimension of feature space ? Classical problem of Dimensionality reduction We used Learning Vector Quantization to reduce the feature space. We ranked the features according to their importance. Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 12. Dimension Reduction Can we reduce the dimension of feature space ? Classical problem of Dimensionality reduction We used Learning Vector Quantization to reduce the feature space. We ranked the features according to their importance. An LVQ system is represented by prototypes W = (w(i), ..., w(n)) which are defined in the feature space of observed data. In training algorithms one determines, for each data point, the prototype which is closest to the input according to a given distance measure. The position of this so-called winner prototype is then adapted, i.e. the winner is moved closer if it correctly classifies the data point or moved away if it classifies the data point incorrectly. Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 13. Dimension Reduction - I Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 14. Model Training Now we have manageable number of instances Train the model using SVM & Random Forest SVM with RBF 78% of accuracy is achieved Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 15. Future Directions - I This work is an aid to help already available state of art table of content generation for video. Visual cues and acoustic features both are combined to accurately pinpoint the topic change transition and build a table of content kind of thing for educational video. Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 16. Future Directions - I This work is an aid to help already available state of art table of content generation for video. Visual cues and acoustic features both are combined to accurately pinpoint the topic change transition and build a table of content kind of thing for educational video. Improving Accuracy In order to improve our accuracy we are planning to train our model on Support Vector Machines for Multiple instance Learning. Instead of receiving a set of instances which are individually labeled, the learner receives a set of labeled bags, each containing many instances. In the simple case of multiple-instance binary classification, a bag may be labeled negative if all the instances in it are negative. On the other hand, a bag is labeled positive if there is at least one instance in it which is positive. From a collection of labeled bags, the learner tries to either (i) induce a concept that will label individual instances correctly or (ii) learn how to label bags without inducing the concept. Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 17. Future Directions - II Song detection war scene detection rather any particular topic of interest Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 18. References I Stuart Andrews, Ioannis Tsochantaridis, and Thomas Hofmann. Support vector machines for multiple-instance learning. In Proceedings of the 15th International Conference on Neural Information Processing Systems, NIPS’02, pages 577–584, Cambridge, MA, USA, 2002. MIT Press. Arijit Biswas, Ankit Gandhi, and Om Deshmukh. Mmtoc: A multimodal method for table of content creation in educational videos. In Proceedings of the 23rd ACM International Conference on Multimedia, MM ’15, pages 621–630, New York, NY, USA, 2015. ACM. Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 19. References II Florian Eyben, Felix Weninger, Florian Gross, and Bj¨orn Schuller. Recent developments in opensmile, the munich open-source multimedia feature extractor. In Proceedings of the 21st ACM International Conference on Multimedia, MM ’13, pages 835–838, New York, NY, USA, 2013. ACM. Teuvo Kohonen. The handbook of brain theory and neural networks. chapter Learning Vector Quantization, pages 537–540. MIT Press, Cambridge, MA, USA, 1998. Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 20. We are very much thankful to Xerox Research Center India Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change
  • 21. We are very much thankful to Xerox Research Center India Thank You !!! Vineet Kumar , Sushant Gupta Acoustic Features to Predict Topic Change