SlideShare a Scribd company logo
Crossing the boundaries between Natural
Language Processing and Computer Vision
Haithem Afli
Analytics Institute - Inspire
Dublin
January 26th, 2017
1/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
ADAPT : The Global Centre of Excellence for Digital
Content and Media Innovation
Member of the ADAPT Machine Translation team
Manager of the ADAPT Social Media research group
2/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
Natural Language : An age-old industry ?
If you think the language industry is new
→ think again !
3/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
Natural Language : An age-old industry ?
If you think the language industry is new
→ think again !
Rosetta Stone (British Museum)
Carved in 196 BCE and re-discovered in 1799
4/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
Language : An age-old industry ?
For as far back as we can see, human has needed to
communicate → Language industry paved the way for the
renaissance of culture
The Tower of Babel and The House of Wisdom in Bagdad
(Bait-al-Hikma)
Nowadays, Natural language processing is a field of computer
science, artificial intelligence, and computational
linguistics
5/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
The age of social media
Rapid growth of user-generated content available on the Web
Facebook updates, tweets on Twitter, WhatsApp messages,
Youtube videos, etc.
→ individual users have been able to actively participate in
the generation of online content in different modalities (Text,
images and vidoes)
⇒ caption generation models become a strong technique to
capture and determine objects in the images and express their
relationships in natural language
⇒ Combining NLP and Computer vision fields
6/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
TRECVID 2016
TREC Video Retrieval Evaluation (TRECVID) goal is to
promote progress in content-based analysis of and retrieval
from digital video
Over the last 15 years TRECvid has had tasks in
shot bound detection, concept detection, instance search,
known item search, example search, surveillance video event
detection, multimedia event detection, video summarisation
→ New pilot on captioning
7/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
2016 Showcase Task - Video to Text Description
Goals and Motivations
Measure how well can automatic system describe a video in
natural language.
Measure how well can an automatic system match high-level
textual descriptions to low-level computer vision features.
Transfer successful image captioning technology to the video
domain.
Real world Applications
Video summarization
Supporting search and browsing
Accessibility - video description to the blind
Video event prediction
8/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
Video Dataset
Annotation guidelines by NIST :
For each video, annotators were asked to combine 4 facets if
applicable :
Who is the video describing (objects, persons, animals,. . . etc)
What are the objects and beings doing ? (actions, states,
events,. . . etc)
Where (locale, site, place, geographic,...etc)
When such as time of day, season, ...etc
9/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
Samples of captions
10/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
Task 1 : Matching & Ranking
11/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
Task 2 : Description Generation
12/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
DCU participation
Collaboration initiated by Prof. Alan Smeaton (Insight
Centre) and Prof. Andy Way (ADAPT Centre)
13/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
DCU participation : Pre-Processing Vine Videos
Object Concepts : We used the VGG-16 deep convolutional
neural network to map keyframes in the videos to 1,000 object
concept probabilities. We used 10 equally spaced keyframes
per Vine video.
Behaviour Concepts We applied crowd behaviour
recognition to categorise the motion characteristics of a given
Vine sequence. Keyframes are extracted and probability scores
calculated for 94 crowd behaviour concepts such as fight, run,
mob, parade and protest.
Locations were represented by extracting the probability
scores from the softmax layer of VGG16 network pre-trained
on the Places2 Dataset .
14/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
DCU participation : The Caption Generation Sub-Task
we used an attention based model for automatic captions
generation of images extracted from the VTT videos.
Since we segmented the video into several static images, we
generate one caption for each image of the video as one of the
candidates for the video caption using NeuralTalk2, a
CNN-RNN toolkit trained on the MSCOCO data set
NeuralTalk2 takes an image and predicts its sentence
description with a Recurrent Neural Network.
15/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
DCU participation : The Caption Ranking Sub-Task
Caption matching task treated as an Information Retrieval
(IR) task.
IR : Given a query, retrieve a ranked list of documents sorted
by similarity values.
Query : Text comprised of the concept vector associated with
each image.
Retrievable document : The text associated with the captions.
16/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
Very good results on Caption Generation and Ranking
17/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
Possible application : Multimodal Sentiment Analysis
18/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
Continuation ..
Motivated by our performance in caption ranking and caption
generation we will refine our methods used in both tasks by
broadening the number of underlying concepts.
Continue the collaboration with new partners.
19/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
Thank you
haithem.afli@adaptcentre.ie
@AfliHaithem
20/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016

More Related Content

Similar to Analytics2017

Filling the knowledge gap: New metrics for GLAMs and GLAM-Wiki cooperation
Filling the knowledge gap: New metrics for GLAMs and GLAM-Wiki cooperationFilling the knowledge gap: New metrics for GLAMs and GLAM-Wiki cooperation
Filling the knowledge gap: New metrics for GLAMs and GLAM-Wiki cooperation
Iolanda Pensa
 
[2018 台灣人工智慧學校校友年會] Textual Data Analytics in Finance / 王釧茹
[2018 台灣人工智慧學校校友年會] Textual Data Analytics in Finance / 王釧茹[2018 台灣人工智慧學校校友年會] Textual Data Analytics in Finance / 王釧茹
[2018 台灣人工智慧學校校友年會] Textual Data Analytics in Finance / 王釧茹
台灣資料科學年會
 
06.Heritage Management 2018, Wikipedia
06.Heritage Management 2018, Wikipedia06.Heritage Management 2018, Wikipedia
06.Heritage Management 2018, Wikipedia
Iolanda Pensa
 
Interactive Video Search - Tutorial at ACM Multimedia 2015
Interactive Video Search - Tutorial at ACM Multimedia 2015Interactive Video Search - Tutorial at ACM Multimedia 2015
Interactive Video Search - Tutorial at ACM Multimedia 2015
klschoef
 
The global API Ecosystem: A deep dive into gaming
The global API Ecosystem: A deep dive into gamingThe global API Ecosystem: A deep dive into gaming
The global API Ecosystem: A deep dive into gaming
Jukka Huhtamäki
 
Multimedia Information Retrieval: Bytes and pixels meet the challenges of hum...
Multimedia Information Retrieval: Bytes and pixels meet the challenges of hum...Multimedia Information Retrieval: Bytes and pixels meet the challenges of hum...
Multimedia Information Retrieval: Bytes and pixels meet the challenges of hum...
maranlar
 
Introduction about AAT-Taiwan Project
Introduction about AAT-Taiwan ProjectIntroduction about AAT-Taiwan Project
Introduction about AAT-Taiwan ProjectAAT Taiwan
 
Thinking the archives of 2020: Opportunitiws, priorities, Issues
Thinking the archives of 2020: Opportunitiws, priorities, IssuesThinking the archives of 2020: Opportunitiws, priorities, Issues
Thinking the archives of 2020: Opportunitiws, priorities, Issues
FIAT/IFTA
 
Team 1: Easy-to-Use Satellite Images Discovery in a Map Viewer
Team 1: Easy-to-Use Satellite Images Discovery in a Map Viewer Team 1: Easy-to-Use Satellite Images Discovery in a Map Viewer
Team 1: Easy-to-Use Satellite Images Discovery in a Map Viewer
plan4all
 
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
diannepatricia
 
Lecture Capture: Is the Customer Always Right? New Technologies in Education ...
Lecture Capture: Is the Customer Always Right? New Technologies in Education ...Lecture Capture: Is the Customer Always Right? New Technologies in Education ...
Lecture Capture: Is the Customer Always Right? New Technologies in Education ...
Rachel Maxwell
 
Industry 4.0
Industry 4.0Industry 4.0
Benchmarking Linked Data Introductory Remarks
Benchmarking Linked Data Introductory RemarksBenchmarking Linked Data Introductory Remarks
Benchmarking Linked Data Introductory Remarks
Holistic Benchmarking of Big Linked Data
 
FutureTDM Symposium_DEMOS
FutureTDM Symposium_DEMOSFutureTDM Symposium_DEMOS
FutureTDM Symposium_DEMOS
FutureTDM
 
Introduction to data mining
Introduction to data miningIntroduction to data mining
Introduction to data miningTaha Mokfi
 
Building the future of television - Lynda Hardman
Building the future of television - Lynda HardmanBuilding the future of television - Lynda Hardman
Building the future of television - Lynda HardmanVolt
 
Look, Listen and Act [Navigation via Reinforcement Learning]
Look, Listen and Act [Navigation via Reinforcement Learning]Look, Listen and Act [Navigation via Reinforcement Learning]
Look, Listen and Act [Navigation via Reinforcement Learning]
이 의령
 
Web Storytelling and Open Data Publishing for Tourism
Web Storytelling and Open Data Publishing for TourismWeb Storytelling and Open Data Publishing for Tourism
Web Storytelling and Open Data Publishing for Tourism
Andrea Volpini
 
Towards a Benchmark for Expressive Stream Reasoning
Towards a Benchmark for Expressive Stream ReasoningTowards a Benchmark for Expressive Stream Reasoning
Towards a Benchmark for Expressive Stream Reasoning
Riccardo Tommasini
 
The Archives Forum - The National Archives - 02 March 2011
The Archives Forum - The National Archives - 02 March 2011The Archives Forum - The National Archives - 02 March 2011
The Archives Forum - The National Archives - 02 March 2011David F. Flanders
 

Similar to Analytics2017 (20)

Filling the knowledge gap: New metrics for GLAMs and GLAM-Wiki cooperation
Filling the knowledge gap: New metrics for GLAMs and GLAM-Wiki cooperationFilling the knowledge gap: New metrics for GLAMs and GLAM-Wiki cooperation
Filling the knowledge gap: New metrics for GLAMs and GLAM-Wiki cooperation
 
[2018 台灣人工智慧學校校友年會] Textual Data Analytics in Finance / 王釧茹
[2018 台灣人工智慧學校校友年會] Textual Data Analytics in Finance / 王釧茹[2018 台灣人工智慧學校校友年會] Textual Data Analytics in Finance / 王釧茹
[2018 台灣人工智慧學校校友年會] Textual Data Analytics in Finance / 王釧茹
 
06.Heritage Management 2018, Wikipedia
06.Heritage Management 2018, Wikipedia06.Heritage Management 2018, Wikipedia
06.Heritage Management 2018, Wikipedia
 
Interactive Video Search - Tutorial at ACM Multimedia 2015
Interactive Video Search - Tutorial at ACM Multimedia 2015Interactive Video Search - Tutorial at ACM Multimedia 2015
Interactive Video Search - Tutorial at ACM Multimedia 2015
 
The global API Ecosystem: A deep dive into gaming
The global API Ecosystem: A deep dive into gamingThe global API Ecosystem: A deep dive into gaming
The global API Ecosystem: A deep dive into gaming
 
Multimedia Information Retrieval: Bytes and pixels meet the challenges of hum...
Multimedia Information Retrieval: Bytes and pixels meet the challenges of hum...Multimedia Information Retrieval: Bytes and pixels meet the challenges of hum...
Multimedia Information Retrieval: Bytes and pixels meet the challenges of hum...
 
Introduction about AAT-Taiwan Project
Introduction about AAT-Taiwan ProjectIntroduction about AAT-Taiwan Project
Introduction about AAT-Taiwan Project
 
Thinking the archives of 2020: Opportunitiws, priorities, Issues
Thinking the archives of 2020: Opportunitiws, priorities, IssuesThinking the archives of 2020: Opportunitiws, priorities, Issues
Thinking the archives of 2020: Opportunitiws, priorities, Issues
 
Team 1: Easy-to-Use Satellite Images Discovery in a Map Viewer
Team 1: Easy-to-Use Satellite Images Discovery in a Map Viewer Team 1: Easy-to-Use Satellite Images Discovery in a Map Viewer
Team 1: Easy-to-Use Satellite Images Discovery in a Map Viewer
 
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
 
Lecture Capture: Is the Customer Always Right? New Technologies in Education ...
Lecture Capture: Is the Customer Always Right? New Technologies in Education ...Lecture Capture: Is the Customer Always Right? New Technologies in Education ...
Lecture Capture: Is the Customer Always Right? New Technologies in Education ...
 
Industry 4.0
Industry 4.0Industry 4.0
Industry 4.0
 
Benchmarking Linked Data Introductory Remarks
Benchmarking Linked Data Introductory RemarksBenchmarking Linked Data Introductory Remarks
Benchmarking Linked Data Introductory Remarks
 
FutureTDM Symposium_DEMOS
FutureTDM Symposium_DEMOSFutureTDM Symposium_DEMOS
FutureTDM Symposium_DEMOS
 
Introduction to data mining
Introduction to data miningIntroduction to data mining
Introduction to data mining
 
Building the future of television - Lynda Hardman
Building the future of television - Lynda HardmanBuilding the future of television - Lynda Hardman
Building the future of television - Lynda Hardman
 
Look, Listen and Act [Navigation via Reinforcement Learning]
Look, Listen and Act [Navigation via Reinforcement Learning]Look, Listen and Act [Navigation via Reinforcement Learning]
Look, Listen and Act [Navigation via Reinforcement Learning]
 
Web Storytelling and Open Data Publishing for Tourism
Web Storytelling and Open Data Publishing for TourismWeb Storytelling and Open Data Publishing for Tourism
Web Storytelling and Open Data Publishing for Tourism
 
Towards a Benchmark for Expressive Stream Reasoning
Towards a Benchmark for Expressive Stream ReasoningTowards a Benchmark for Expressive Stream Reasoning
Towards a Benchmark for Expressive Stream Reasoning
 
The Archives Forum - The National Archives - 02 March 2011
The Archives Forum - The National Archives - 02 March 2011The Archives Forum - The National Archives - 02 March 2011
The Archives Forum - The National Archives - 02 March 2011
 

More from Haithem Afli

Navigating the Age of Generative AI with a Human-Centred Approach
Navigating the Age of Generative AI with a Human-Centred ApproachNavigating the Age of Generative AI with a Human-Centred Approach
Navigating the Age of Generative AI with a Human-Centred Approach
Haithem Afli
 
How NLP is reshaping Fintech
How NLP is reshaping Fintech How NLP is reshaping Fintech
How NLP is reshaping Fintech
Haithem Afli
 
Looking Beyond the AI & IoT Research and Industrial Opportunities: How two Br...
Looking Beyond the AI & IoTResearch and Industrial Opportunities:How two Br...Looking Beyond the AI & IoTResearch and Industrial Opportunities:How two Br...
Looking Beyond the AI & IoT Research and Industrial Opportunities: How two Br...
Haithem Afli
 
AI Meets Digital Health, Social Science and AgriTech
AI Meets Digital Health, Social Science and AgriTechAI Meets Digital Health, Social Science and AgriTech
AI Meets Digital Health, Social Science and AgriTech
Haithem Afli
 
Affective Analytics and Visualization for Ensemble event-driven stock market ...
Affective Analytics and Visualization for Ensemble event-driven stock market ...Affective Analytics and Visualization for Ensemble event-driven stock market ...
Affective Analytics and Visualization for Ensemble event-driven stock market ...
Haithem Afli
 
Introduction to Natural Language Processing
Introduction to Natural Language Processing  Introduction to Natural Language Processing
Introduction to Natural Language Processing
Haithem Afli
 
Natural Language Engineering in the Golden Age of Artificial Intelligence
 Natural Language Engineering in the Golden Age of Artificial Intelligence Natural Language Engineering in the Golden Age of Artificial Intelligence
Natural Language Engineering in the Golden Age of Artificial Intelligence
Haithem Afli
 
Industrial Internet Consortium 2019
Industrial Internet Consortium 2019Industrial Internet Consortium 2019
Industrial Internet Consortium 2019
Haithem Afli
 
Parallel text extraction from multimodal comparable corpora
Parallel text extraction from multimodal comparable corporaParallel text extraction from multimodal comparable corpora
Parallel text extraction from multimodal comparable corporaHaithem Afli
 

More from Haithem Afli (9)

Navigating the Age of Generative AI with a Human-Centred Approach
Navigating the Age of Generative AI with a Human-Centred ApproachNavigating the Age of Generative AI with a Human-Centred Approach
Navigating the Age of Generative AI with a Human-Centred Approach
 
How NLP is reshaping Fintech
How NLP is reshaping Fintech How NLP is reshaping Fintech
How NLP is reshaping Fintech
 
Looking Beyond the AI & IoT Research and Industrial Opportunities: How two Br...
Looking Beyond the AI & IoTResearch and Industrial Opportunities:How two Br...Looking Beyond the AI & IoTResearch and Industrial Opportunities:How two Br...
Looking Beyond the AI & IoT Research and Industrial Opportunities: How two Br...
 
AI Meets Digital Health, Social Science and AgriTech
AI Meets Digital Health, Social Science and AgriTechAI Meets Digital Health, Social Science and AgriTech
AI Meets Digital Health, Social Science and AgriTech
 
Affective Analytics and Visualization for Ensemble event-driven stock market ...
Affective Analytics and Visualization for Ensemble event-driven stock market ...Affective Analytics and Visualization for Ensemble event-driven stock market ...
Affective Analytics and Visualization for Ensemble event-driven stock market ...
 
Introduction to Natural Language Processing
Introduction to Natural Language Processing  Introduction to Natural Language Processing
Introduction to Natural Language Processing
 
Natural Language Engineering in the Golden Age of Artificial Intelligence
 Natural Language Engineering in the Golden Age of Artificial Intelligence Natural Language Engineering in the Golden Age of Artificial Intelligence
Natural Language Engineering in the Golden Age of Artificial Intelligence
 
Industrial Internet Consortium 2019
Industrial Internet Consortium 2019Industrial Internet Consortium 2019
Industrial Internet Consortium 2019
 
Parallel text extraction from multimodal comparable corpora
Parallel text extraction from multimodal comparable corporaParallel text extraction from multimodal comparable corpora
Parallel text extraction from multimodal comparable corpora
 

Recently uploaded

Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
ViralQR
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 

Recently uploaded (20)

Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 

Analytics2017

  • 1. Crossing the boundaries between Natural Language Processing and Computer Vision Haithem Afli Analytics Institute - Inspire Dublin January 26th, 2017 1/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 2. ADAPT : The Global Centre of Excellence for Digital Content and Media Innovation Member of the ADAPT Machine Translation team Manager of the ADAPT Social Media research group 2/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 3. Natural Language : An age-old industry ? If you think the language industry is new → think again ! 3/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 4. Natural Language : An age-old industry ? If you think the language industry is new → think again ! Rosetta Stone (British Museum) Carved in 196 BCE and re-discovered in 1799 4/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 5. Language : An age-old industry ? For as far back as we can see, human has needed to communicate → Language industry paved the way for the renaissance of culture The Tower of Babel and The House of Wisdom in Bagdad (Bait-al-Hikma) Nowadays, Natural language processing is a field of computer science, artificial intelligence, and computational linguistics 5/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 6. The age of social media Rapid growth of user-generated content available on the Web Facebook updates, tweets on Twitter, WhatsApp messages, Youtube videos, etc. → individual users have been able to actively participate in the generation of online content in different modalities (Text, images and vidoes) ⇒ caption generation models become a strong technique to capture and determine objects in the images and express their relationships in natural language ⇒ Combining NLP and Computer vision fields 6/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 7. TRECVID 2016 TREC Video Retrieval Evaluation (TRECVID) goal is to promote progress in content-based analysis of and retrieval from digital video Over the last 15 years TRECvid has had tasks in shot bound detection, concept detection, instance search, known item search, example search, surveillance video event detection, multimedia event detection, video summarisation → New pilot on captioning 7/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 8. 2016 Showcase Task - Video to Text Description Goals and Motivations Measure how well can automatic system describe a video in natural language. Measure how well can an automatic system match high-level textual descriptions to low-level computer vision features. Transfer successful image captioning technology to the video domain. Real world Applications Video summarization Supporting search and browsing Accessibility - video description to the blind Video event prediction 8/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 9. Video Dataset Annotation guidelines by NIST : For each video, annotators were asked to combine 4 facets if applicable : Who is the video describing (objects, persons, animals,. . . etc) What are the objects and beings doing ? (actions, states, events,. . . etc) Where (locale, site, place, geographic,...etc) When such as time of day, season, ...etc 9/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 10. Samples of captions 10/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 11. Task 1 : Matching & Ranking 11/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 12. Task 2 : Description Generation 12/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 13. DCU participation Collaboration initiated by Prof. Alan Smeaton (Insight Centre) and Prof. Andy Way (ADAPT Centre) 13/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 14. DCU participation : Pre-Processing Vine Videos Object Concepts : We used the VGG-16 deep convolutional neural network to map keyframes in the videos to 1,000 object concept probabilities. We used 10 equally spaced keyframes per Vine video. Behaviour Concepts We applied crowd behaviour recognition to categorise the motion characteristics of a given Vine sequence. Keyframes are extracted and probability scores calculated for 94 crowd behaviour concepts such as fight, run, mob, parade and protest. Locations were represented by extracting the probability scores from the softmax layer of VGG16 network pre-trained on the Places2 Dataset . 14/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 15. DCU participation : The Caption Generation Sub-Task we used an attention based model for automatic captions generation of images extracted from the VTT videos. Since we segmented the video into several static images, we generate one caption for each image of the video as one of the candidates for the video caption using NeuralTalk2, a CNN-RNN toolkit trained on the MSCOCO data set NeuralTalk2 takes an image and predicts its sentence description with a Recurrent Neural Network. 15/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 16. DCU participation : The Caption Ranking Sub-Task Caption matching task treated as an Information Retrieval (IR) task. IR : Given a query, retrieve a ranked list of documents sorted by similarity values. Query : Text comprised of the concept vector associated with each image. Retrievable document : The text associated with the captions. 16/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 17. Very good results on Caption Generation and Ranking 17/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 18. Possible application : Multimodal Sentiment Analysis 18/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 19. Continuation .. Motivated by our performance in caption ranking and caption generation we will refine our methods used in both tasks by broadening the number of underlying concepts. Continue the collaboration with new partners. 19/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016
  • 20. Thank you haithem.afli@adaptcentre.ie @AfliHaithem 20/ 20 Haithem Afli (@AfliHaithem) NLP & CV @ TrecVid 2016