Multimodal Analysis Of Stand-up Comedians
Audio, Video and Lexical Analysis
Yash Singh, Madhav Sharan, Sree Priyanka Uppu, Nandan PC,
Harsh Fatepuria, Rahul Agrawal
Motivation
Data set Description
Feature Engineering
Feature Analysis
Machine Learning
Conclusion
Why stand-up comedians?
● We love watching stand-up
● They express a variety of emotions
● Audience feedback is available in the form of laughter
● Relatively new
Motivation
H1: Certain facial expressions could contribute to laughter
Hypotheses
H2 : Pauses and word elongation contribute towards laughter
Hypotheses
H3 : Voice modulation - Pitch and intensity changes can also play a
crucial role
H4 : Laughter is sequential in nature, meaning small laughs could
add up to bigger laughs.
• We collected 3 hours 46 minutes of data from ‘The Tonight Show
Starring Jimmy Fallon’ and ‘Late Night with Conan O’Brien’.
• 46 videos (11.76 GB), approximately 5 minutes each
• 27 male and 19 female artists
• The backdrop in the videos is dark
• The artist faces the camera for most of each video (80-90%).
Data Collection
• For facial feature extraction, we manually blacked out the frames in
which the camera does not capture the artist’s face, so the video
features for those frames are set to 0.
• Audio features can be 0 during a pause or while the audience is
laughing.
Pre Processing
• Manually segment the videos based on punch lines
• Annotate the laughter level in each segment, based on the product of
mean pitch and mean intensity, as:
o Big (55% to 100% intensity)
o Small (36% to 55% intensity)
o No (0% to 36% intensity)
• A pitch range of 75 to 625 Hz gives a 10 ms analysis time step
and covers a wide range of frequencies.
• The pitch of laughter varies across videos, so it is normalized to
the range [0,1].
Data Annotation
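The annotation rule on this slide can be sketched as follows. This is a minimal illustration, assuming the per-segment score is the product of mean pitch and mean intensity, min-max normalized to [0, 1] within one video, with the slide's thresholds applied to that normalized score. The function names and sample values are hypothetical, and treating each lower bound as inclusive is also an assumption.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All segments identical: no spread to normalize over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def laughter_label(score):
    """Map a normalized score in [0, 1] to a laughter level.

    Thresholds follow the slide: No (0-36%), Small (36-55%), Big (55-100%).
    Inclusive lower bounds are an assumption.
    """
    if score >= 0.55:
        return "big"
    if score >= 0.36:
        return "small"
    return "no"


def annotate_segments(mean_pitch, mean_intensity):
    """Label each segment of one video from its laughter pitch and intensity."""
    raw = [p * i for p, i in zip(mean_pitch, mean_intensity)]
    return [laughter_label(s) for s in min_max_normalize(raw)]


# Hypothetical values for four segments of one video.
labels = annotate_segments([100.0, 250.0, 300.0, 400.0],
                           [40.0, 55.0, 60.0, 70.0])
print(labels)
```

Normalizing per video rather than over the whole data set keeps a quiet recording from being labeled "no laughter" throughout, which matches the slide's remark that laughter pitch varies across videos.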
OpenSmile
• Extracted 5 low-level descriptors:
✧ Musical chroma features (tone)
✧ Prosody features (loudness and pitch)
✧ Energy (1)
✧ MFCC (13 coefficients, 0 to 12, from 26 Mel-frequency bands)
• All features were captured at a frame step of 10 ms.
• Processing: aggregated each feature by its mean and standard
deviation over each segment
Feature Engineering - Audio
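The per-segment aggregation can be sketched in a few lines. This is a minimal illustration using only the standard library, assuming each segment is a list of equal-length frame vectors (one per 10 ms frame); the function name is hypothetical, and the use of the population standard deviation is an assumption, since the slides do not say which variant was used.

```python
from statistics import mean, pstdev


def aggregate_segment(frames):
    """Collapse frame-level feature vectors into one segment-level vector.

    frames: list of equal-length lists, one per frame.
    Returns the per-feature means followed by the per-feature
    standard deviations.
    """
    if not frames:
        raise ValueError("segment has no frames")
    columns = list(zip(*frames))  # one tuple of values per feature
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population std (assumption)
    return means + stds


# Three hypothetical frames with two features each.
frames = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
segment_vector = aggregate_segment(frames)
print(segment_vector)
```

Aggregating this way gives every segment a fixed-length vector regardless of how many frames it contains, which is what a tree-based classifier needs as input.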
OpenFace
• Extracted:
✧ Eye gaze direction vectors in world coordinates for both eyes
✧ Head location with respect to the camera (millimeters) and head
rotation (radians)
✧ 68 facial landmark locations in 2D pixel coordinates (x, y)
✧ 33 rigid and non-rigid shape parameters
✧ 11 AU intensities and AU occurrences
• Processing: aggregated each feature by its mean and standard deviation
over each segment
Feature Engineering - Video
• Analyzed features such as Action Units (AUs), gaze (y and z directions), head
pose (rotation), various facial landmark points, frown, and eyebrow raise.
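One way to turn landmarks into a frown feature is the distance between the inner ends of the two eyebrows, scaled by the distance between the outer eye corners so that it does not depend on how close the artist is to the camera. The sketch below assumes the common 68-point landmark layout with 0-based indices 21 and 22 for the inner eyebrow ends and 36 and 45 for the outer eye corners. The slides do not say which points or which normalization were used, so the indices, the scaling, and the function name are all assumptions.

```python
import math

# 0-based indices in the common 68-point layout (assumed, not from the slides).
INNER_BROW_RIGHT, INNER_BROW_LEFT = 21, 22
OUTER_EYE_RIGHT, OUTER_EYE_LEFT = 36, 45


def frown_distance(landmarks):
    """Return the inner-eyebrow gap divided by the outer-eye-corner span.

    landmarks: list of 68 (x, y) pixel coordinates for one frame.
    Smaller values mean the eyebrows are drawn closer together.
    """
    if len(landmarks) != 68:
        raise ValueError("expected 68 landmarks")
    brow_gap = math.dist(landmarks[INNER_BROW_RIGHT], landmarks[INNER_BROW_LEFT])
    eye_span = math.dist(landmarks[OUTER_EYE_RIGHT], landmarks[OUTER_EYE_LEFT])
    if eye_span == 0:
        raise ValueError("degenerate landmarks: zero eye span")
    return brow_gap / eye_span


# Hypothetical frame: only the four points used above are set.
points = [(0.0, 0.0)] * 68
points[21], points[22] = (40.0, 50.0), (60.0, 50.0)
points[36], points[45] = (10.0, 60.0), (90.0, 60.0)
frown = frown_distance(points)
print(frown)
```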
IBM Watson
● Pauses
● Last pause
● Word elongation
● Sentiments
Feature Engineering - Textual
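Pause and elongation features can be derived from word-level timestamps such as those a speech-to-text service returns. The sketch below assumes each word arrives as a (word, start, end) tuple in seconds. The 0.3 s pause threshold, the seconds-per-character elongation measure, and the function name are illustrative assumptions, not values taken from the slides.

```python
def pause_features(words, min_pause=0.3):
    """Compute pause and elongation features for one segment.

    words: list of (word, start_seconds, end_seconds), in spoken order.
    min_pause: smallest silent gap counted as a pause (assumed threshold).
    """
    # Silent gap between the end of each word and the start of the next.
    gaps = [nxt[1] - cur[2] for cur, nxt in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= min_pause]
    # Seconds per character as a rough proxy for word elongation.
    per_char = [(end - start) / len(w) for w, start, end in words if w]
    return {
        "num_pauses": len(pauses),
        "last_pause": pauses[-1] if pauses else 0.0,
        "max_elongation": max(per_char, default=0.0),
    }


# Hypothetical timestamps for a four-word segment.
words = [("so", 0.0, 0.2), ("I", 0.25, 0.3), ("said", 1.0, 1.5), ("no", 2.5, 3.5)]
features = pause_features(words)
print(features)
```

Here the last pause is taken to be the final qualifying gap inside the segment, which in a segment cut at a punch line is the pause closest to it.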
H1: AU-related features
[Plots: AU 07 (lid tightener); AU 14 (dimpler)]
Feature Analysis - Visual
H1: Facial features
[Plot: frown (distance)]
Feature Analysis - Visual
H2: Pause-related features
[Plots: last pause length; number of pauses]
Feature Analysis - Textual
H3: Pitch-, loudness-, and energy-related features
[Plots: pitch variation; loudness variation]
Feature Analysis - Audio
H3: Pitch-, loudness-, and energy-related features
[Plots: energy variation; energy mean]
Feature Analysis - Audio
Multimodal analysis using a boosted decision tree
classifier
Machine Learning
H1 : Certain facial expressions could contribute to laughter
H2 : Pauses and word elongation contribute towards laughter
Results
H3 : Voice modulation - Pitch and intensity changes can also play a
crucial role
XGBoost
Early Fusion:
● Min video frames = 100 frames/segment
● Min audio frames = 30 frames/segment
● Min text = 0 words/segment
● No good way to take an equal number of frames from each modality →
early fusion is difficult
H4 : Laughter is sequential in nature :
LSTM - Challenge
H4 : Laughter is sequential in nature :
Late Fusion
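Late fusion sidesteps the frame-count mismatch by training one model per modality and combining only their outputs. The following is a minimal sketch, assuming each modality's classifier emits a probability for the classes no, small, and big laughter and that the fused score is a weighted average. The weights, the probabilities, and the function name are hypothetical; the slides do not state which combination rule was used.

```python
CLASSES = ("no", "small", "big")


def late_fusion(modality_probs, weights=None):
    """Combine per-modality class probabilities by weighted averaging.

    modality_probs: dict mapping modality name to a list of probabilities,
                    one per entry in CLASSES.
    weights: optional dict of per-modality weights (equal if omitted).
    Returns the winning class label and the fused probabilities.
    """
    names = list(modality_probs)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    fused = [
        sum(weights[name] * modality_probs[name][k] for name in names) / total
        for k in range(len(CLASSES))
    ]
    best = max(range(len(CLASSES)), key=fused.__getitem__)
    return CLASSES[best], fused


# Hypothetical outputs of three per-modality classifiers for one segment.
probs = {
    "audio": [0.2, 0.5, 0.3],
    "video": [0.1, 0.3, 0.6],
    "text": [0.3, 0.3, 0.4],
}
label, fused = late_fusion(probs)
print(label, fused)
```

Because each modality is reduced to a class-probability vector before combining, the differing frame rates of audio, video, and text never have to be aligned.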
To do
● Tune LSTM
● Try other classifiers :
✧ SVM
✧ Naive Bayes
Thank you.
