*Biniam Asnake
*Dawit Mulugeta
Presentation Outline:
• Introduction to Multimedia Mining (MM)
• Article Reviews:
1. Visual Mining of Multimedia Data for Social and
Behavioral Studies
2. Multimedia Data Mining for Traffic Video Sequences
3. Tune into the voice of your customer with voice
mining
• Conclusion
• Recommendations
Introduction
• Advances in multimedia acquisition and storage
technology have led to tremendous growth in
very large and detailed multimedia databases.
• A large amount of high-resolution high-quality
multimedia data has been collected in
research laboratories in various scientific
disciplines, especially in social, behavioral and
cognitive studies.
• If these multimedia files are analyzed, useful
information to users can be revealed.
… Introduction
• Multimedia mining deals with the
extraction of implicit knowledge,
multimedia data relationships, or
other patterns not explicitly
stored in multimedia files.
(S. Kotsiantis et al., 2006)
• Multimedia mining is an interdisciplinary
endeavor that draws upon expertise in
computer vision, multimedia processing,
multimedia retrieval, data mining, machine
learning, databases and artificial intelligence.
… Introduction
• How to automatically and effectively discover
new knowledge from rich multimedia data poses
a compelling challenge.
• Multimedia data mining consists of two stages.
1) Researchers extract some derived data
from raw multimedia data.
• This step can be implemented by human coding or by
using image/speech processing programs.
2) Researchers work on derived data with the
goal of finding interesting patterns.
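The two stages can be sketched as follows. This is a minimal illustration on a toy signal, not code from the reviewed articles; the function names `extract_derived` and `find_patterns`, the window size and the threshold are all hypothetical.

```python
from statistics import mean

def extract_derived(raw_samples, window):
    """Stage 1: reduce raw samples to one derived value (mean absolute level) per window."""
    return [
        mean(abs(s) for s in raw_samples[i:i + window])
        for i in range(0, len(raw_samples) - window + 1, window)
    ]

def find_patterns(derived, threshold):
    """Stage 2: report the indices of windows whose derived value exceeds a threshold."""
    return [i for i, v in enumerate(derived) if v > threshold]

# Toy raw signal (made-up numbers standing in for audio samples).
raw = [0, 1, -1, 0, 5, -6, 7, -5, 0, 1, 0, -1]
derived = extract_derived(raw, window=4)
loud_windows = find_patterns(derived, threshold=2)
```

In practice stage 1 would be human coding or an image/speech processing program, and stage 2 a data mining method; only the split between the two stages is taken from the slide.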
Visual Mining of Multimedia Data
for Social and Behavioral Studies
Chen Yu, Yiwen Zhong, Thomas
Smith, Ikhyun Park, Weixia Huang
Visualization approaches for multivariate data
• TimeSearcher
– is a time series exploratory and visualization tool that allows
users to query time series.
• ThemeRiver
– is used to visualize thematic changes in large document
collections.
• VizTree
– is designed to visually mine and monitor massive time series
data.
• Spiral
– is mainly used to compare and analyze periodic structures in
time series data.
• Van Wijk et al
– designed a cluster and calendar-based approach for the
visualization of calendar-based data.
Identified Problems
• Current methods of visualization deal with
linear time or highly periodic time;
– not designed to handle event-based data which is
typical in multimedia applications.
• Those methods focus on visualization,
navigation, or query only.
Objective
• This new approach provides an interactive
tool to integrate visualization with data
mining.
Multimedia Dataset Used
• Video:
– there were three video streams recorded simultaneously
with the frequency of 10 frames per second, and the
resolution of each frame is 320x240.
• Audio:
– The speech of the participants was recorded at a frequency of
44.1 kHz.
• Motion tracking:
– there were two sensors, one on each participant’s head. Each
sensor provided 6 dimensional (x, y, z, head, pitch, and roll)
data points at a frequency of 120Hz.
• In total, the dataset consists of about 90,000 image
frames, 864,000 position data points, and 50 minutes of
speech.
Visualization of Multimedia Data
There are two major display components in the application:
a multimedia playback window and a visualization window.
The goal is to visually explore the derived data streams and
discover new patterns and findings.
Data Representation and Visualization
• The time-based/temporal data can be
categorized into two kinds:
1. CONTINUOUS VARIABLES:
• related to time points (a series of
single measurements at particular
moments in time)
2. EVENT VARIABLES:
• related to time intervals
(e.g. the onset and offset of an event)
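One possible way to represent the two kinds of temporal data is sketched below. The class and field names are assumptions made for illustration; the article's own data format is not described on the slide.

```python
from dataclasses import dataclass

@dataclass
class ContinuousSample:
    time: float   # a single time point, in seconds
    value: float  # the measurement taken at that moment

@dataclass
class Event:
    onset: float   # start of the interval, in seconds
    offset: float  # end of the interval, in seconds
    label: str

    @property
    def duration(self) -> float:
        """Length of the interval between onset and offset."""
        return self.offset - self.onset

# Hypothetical example values.
head_height = [ContinuousSample(0.0, 1.20), ContinuousSample(0.1, 1.21)]
speaking = Event(onset=2.0, offset=3.5, label="speech")
```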
(1) Continuous Time Series Data
• 3 ways to visually explore continuous time
series data:
{1} as individual data streams
{2} as a set of multiple data streams
{3} as an arithmetic combination of
multiple data streams
1. Using curves to visualize
individual data streams
• A novel feature added -> HISTOGRAM DISPLAY.
• The purpose is to allow users to explore individual
data streams and examine both the overall
statistics of a data stream (Global Histogram) and
the statistics within a local window (Local
Histogram).
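The difference between the global and the local histogram can be sketched as below. The fixed bin width and the index-based window are assumptions for illustration, not details of the tool.

```python
from collections import Counter

def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of a bin."""
    return Counter((v // bin_width) * bin_width for v in values)

def local_histogram(values, start, end, bin_width):
    """Histogram restricted to the samples in the window [start, end)."""
    return histogram(values[start:end], bin_width)

# Made-up data stream.
stream = [1, 2, 2, 3, 8, 9, 9, 9, 2, 1]
global_hist = histogram(stream, bin_width=5)            # statistics of the whole stream
local_hist = local_histogram(stream, 4, 8, bin_width=5)  # statistics inside one window
```

Comparing the two shows when a local window departs from the overall distribution: here the window holds only high values while the stream as a whole is mostly low.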
2. Using gray-level representation to
visualize a set of multiple data streams
• Purpose -> to visually display and explore two
kinds of information:
(1) possible correlation between multiple data
streams
(2) interesting joint patterns across multiple data
streams.
3. Using area graphs to visualize an arithmetic
combination of multiple data streams
• Users can combine multiple temporal variables
together (by + and -) in various ways and then
visually explore the combined distribution.
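A minimal sketch of combining time-aligned streams by + and -, assuming every stream has one value per shared time step. The helper name `combine` and the sign list are hypothetical.

```python
def combine(streams, signs):
    """Add or subtract time-aligned streams sample by sample."""
    if len({len(s) for s in streams}) != 1:
        raise ValueError("streams must have the same length")
    return [
        sum(sign * s[i] for s, sign in zip(streams, signs))
        for i in range(len(streams[0]))
    ]

a = [1, 2, 3]
b = [10, 10, 10]
c = [0, 1, 0]
combined = combine([a, b, c], signs=[+1, +1, -1])  # a + b - c
```

The combined list is what an area graph would then display.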
(2) Event Data
• Events are presented as bars of color, with
their size on screen corresponding to their
duration.
• Users can visually explore (1) the frequency of an
event, (2) its duration and (3) its periodicity.
To handle potentially more complex patterns
involving more variables and logic operations,
users can define a new event variable.
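Defining a new event variable from existing ones can be sketched with a logical AND over intervals. Events are represented here as (onset, offset) pairs, and the variable names are hypothetical.

```python
def intersect(events_a, events_b):
    """Return the intervals during which both event variables are active (logical AND)."""
    result = []
    for a_on, a_off in events_a:
        for b_on, b_off in events_b:
            onset, offset = max(a_on, b_on), min(a_off, b_off)
            if onset < offset:  # keep only overlaps with a positive duration
                result.append((onset, offset))
    return sorted(result)

looking = [(0.0, 2.0), (5.0, 8.0)]
speaking = [(1.0, 6.0)]
looking_while_speaking = intersect(looking, speaking)
```

Other logic operations (OR, NOT) could be built on the same interval representation.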
(3) Concurrent visualization of
Continuous and Event variables
The display panel will highlight those
continuous values at the moments when the
selected events happen.
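Selecting the continuous values that fall inside the selected events can be sketched as follows, assuming samples are (time, value) pairs and events are (onset, offset) pairs. The function name is hypothetical.

```python
def values_during(samples, events):
    """Keep the (time, value) samples whose time falls inside any event interval."""
    return [
        (t, v) for t, v in samples
        if any(onset <= t < offset for onset, offset in events)
    ]

samples = [(0.0, 1.0), (1.0, 1.5), (2.0, 0.8), (3.0, 1.9)]
events = [(0.5, 1.5), (3.0, 4.0)]
highlighted = values_during(samples, events)
```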
Event-based Interactive Visual Exploration
• By visually exploring the data instance by instance,
users can directly compare those moments to detect the
similarities between them.
• Many multimedia data are essentially event-driven.
Event Grouping
• Users can visually examine each instance of an event,
and categorize the instances into groups, which are then saved.
• The overall grouping results can then be visualized in one
single panel.
Flexible Interfaces between
Visualization and Data Processing
• The media playback panel allows users to play back video
and audio data at various speeds. On top of this,
– the researchers designed and implemented one
critical component to connect multimedia playback with
visual data mining (raw multimedia data <-> exploring derived data).
• To increase the flexibility to be compatible
with data mining,
– this system allows users to use any programming
language (such as MATLAB, R, C/C++) to obtain new
results.
The researchers' Future Work
• to conduct a systematic evaluation of
the prototype system
– using an experimental paradigm
– to have a better idea of:
• what the advantages and limitations of the
current system are and
• what will need to be improved.
Conclusion of the Article
• The visualization tool developed allows
users
– not only to easily examine and synthesize
information into new ideas and
hypotheses, but also
– to quickly quantify and test the insights
gained from visualization.
Multimedia Data Mining for
Traffic Video Sequences
Shu-Ching Chen, Mei-Ling Shyu,
Chengcui Zhang, Jeff Strickrott
Introduction and Motivation
• Traffic video analysis can discover and provide
useful information such as:
– queue detection, vehicle classification, traffic flow,
and incident detection at intersections.
• Some municipalities are installing video camera
systems to monitor and extract traffic control
information from their highways in real time.
Identified Problems
• The current transportation applications and research
work:
– either do not connect to databases or
have limited capabilities to index and store the
collected data, and
– cannot provide organized, unsupervised,
conveniently accessible and easy-to-use
multimedia information to traffic planners.
• In order to discover and provide some important but
previously unknown knowledge from the traffic video
sequences to the traffic planners, multimedia data
mining techniques need to be employed.
The Proposed Framework
• Includes:
–Background Subtraction
–Vehicle Object Identification and Tracking
–Multimedia Augmented Transition Network
(MATN) model and
–Multimedia Input Strings
Background Subtraction
• It is a technique to remove non-moving
components from a video sequence.
• This technique was used to enhance the basic SPCPE
(Simultaneous Partition and Class Parameter Estimation)
algorithm, an unsupervised video segmentation method,
to get better segmentation results.
• The main assumption is that the camera remains stationary.
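The idea of removing non-moving components can be sketched with simple frame differencing against a background reference frame. This is only an illustration of background subtraction under the stationary-camera assumption; it is not the SPCPE algorithm, and the gray-level values and threshold are made up.

```python
def subtract_background(frame, background, threshold):
    """Mark pixels whose gray level differs from the background by more than threshold."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]

# Tiny 2x3 gray-level images: a bright object has entered the middle column.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[10, 90, 12], [11, 95, 10]]
mask = subtract_background(frame, background, threshold=20)
```

Pixels marked 1 are candidate moving components; small differences such as sensor noise fall under the threshold and are dropped.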
Object Tracking
• The first step is to extract the segments in each class.
• Then the minimal bounding box and the centroid
point for each segment are obtained.
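Computing the minimal bounding box and the centroid can be sketched as below, assuming a binary mask that contains a single segment. The function name and the (row, column) coordinate convention are assumptions.

```python
def bounding_box_and_centroid(mask):
    """Return ((min_row, min_col, max_row, max_col), (row_c, col_c)) of the foreground pixels."""
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not pixels:
        return None
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    box = (min(rows), min(cols), max(rows), max(cols))
    centroid = (sum(rows) / len(pixels), sum(cols) / len(pixels))
    return box, centroid

mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
box, centroid = bounding_box_and_centroid(mask)
```

Tracking can then compare centroids and boxes between successive frames.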
Using MATNs & Multimedia Input Strings
to Model Video Key Frames
• A Multimedia Augmented Transition Network
(MATN) model
– can be represented diagrammatically by a labeled
directed graph, called a transition graph.
• A Multimedia Input String is
–accepted by the grammar if there is a path of
transitions which corresponds to the sequence of symbols in
the string and which leads from a specified initial
state to one of a set of specified final states.
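The acceptance condition can be sketched as a walk over a transition graph. This simplification treats the graph as a plain deterministic state machine and leaves out whatever augmentations the MATN model adds; the state names and symbols are hypothetical.

```python
def accepts(transitions, initial, finals, symbols):
    """Follow one transition per input symbol; accept if the path ends in a final state."""
    state = initial
    for symbol in symbols:
        key = (state, symbol)
        if key not in transitions:
            return False  # no path of transitions matches the string
        state = transitions[key]
    return state in finals

# Hypothetical transition graph: (state, input symbol) -> next state.
transitions = {
    ("S0", "K1"): "S1",
    ("S1", "K2"): "S2",
}
ok = accepts(transitions, "S0", {"S2"}, ["K1", "K2"])
rejected = accepts(transitions, "S0", {"S2"}, ["K2"])
```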
… MATNs and Multimedia Input Strings
• Key frames serve as the indices for a shot.
• In this paper, each frame is divided into nine sub-regions
with the corresponding subscript numbers.
• Each key frame is represented by:
– an input symbol in a multimedia input string
– “&” symbol between two vehicle objects
• is used to denote that the vehicle objects appear in the same
frame.
– subscripted numbers
• are used to distinguish the relative spatial positions of the
vehicle objects relative to the target object “ground”.
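Building the input symbol for one key frame can be sketched as follows. The numbering of the sub-regions (1 to 9, row by row from the top left), the object names and the `name_region` notation are assumptions made for illustration; the paper's exact notation is not reproduced here.

```python
def sub_region(x, y, width, height):
    """Map a centroid to one of nine sub-regions, numbered 1..9 row by row (assumed order)."""
    col = min(int(3 * x / width), 2)
    row = min(int(3 * y / height), 2)
    return 3 * row + col + 1

def key_frame_symbol(objects, width, height):
    """Join the objects of one key frame with '&', each tagged with its sub-region."""
    return "&".join(
        f"{name}_{sub_region(x, y, width, height)}" for name, (x, y) in objects
    )

# Hypothetical centroids (x, y) of two vehicle objects in a 320x240 frame.
frame_objects = [("car", (30, 40)), ("bus", (300, 200))]
symbol = key_frame_symbol(frame_objects, width=320, height=240)
```

A multimedia input string would then be the sequence of such symbols, one per key frame.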
[Figure: an example of the nine sub-regions and their corresponding subscript numbers; an example MATN model; a multimedia input string that represents two key frames]
Experiment Setup
• The traffic video sequence was:
– captured with a Sony Handycam CCD TR64 and
– digitized with a Brooktree Bt848 based capture card
on a Windows NT 2000 Celeron-based platform.
• The video sequence consists of about 16 minutes of
video with approximately constant lighting conditions.
• A small portion of the traffic video is used to
illustrate how the proposed framework can be
applied to traffic applications to answer
spatio-temporal queries like:
“Estimate the traffic flow of this road
intersection from 8:00 AM to 8:30 AM.”
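A query of this kind can be sketched over tracked vehicle observations. The record layout (vehicle id, time seen) and the rule of counting each vehicle once at its first sighting are assumptions for illustration, not the paper's query mechanism.

```python
from datetime import time

def traffic_flow(observations, start, end):
    """Count distinct vehicle ids first seen within [start, end)."""
    first_seen = {}
    for vehicle_id, seen_at in observations:
        if vehicle_id not in first_seen or seen_at < first_seen[vehicle_id]:
            first_seen[vehicle_id] = seen_at
    return sum(1 for t in first_seen.values() if start <= t < end)

# Hypothetical tracking output: (vehicle id, time of observation).
observations = [
    ("v1", time(7, 59)), ("v1", time(8, 0)),
    ("v2", time(8, 5)), ("v3", time(8, 29)), ("v4", time(8, 31)),
]
flow = traffic_flow(observations, time(8, 0), time(8, 30))
```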
Experiment Results
• Using the background subtraction technique,
– both the efficiency of the segmentation
process and the accuracy of the segmentation
results are improved, achieving more accurate
video indexing and annotation.
Conclusion
• The proposed framework can model complex
situations such as traffic video for intersection
monitoring.
• Figure: segmentation results as well as the multimedia
input strings for frames 4, 9, 15, 16 and 35.
• The leftmost column gives the original video frames;
• the second column shows difference images obtained by
subtracting the background reference frame from the original frames;
• the third column shows the vehicle segments extracted
from the video frames, and
• the rightmost column shows the bounding boxes of the
vehicle objects.
Tune into the voice of your
customer with voice mining
By Manya Mayes
Introduction
• Understanding customer comments that arrive as text, audio and
video (word-for-word records, e-mail, voice mail, surveys and the Web,
and most recently social networking sites such as YouTube and
Facebook) will determine the business transactions of an organization.
• Voice mining in particular is growing, and it helps to
identify the reasons for calls, the effectiveness of
marketing campaigns, the competitors most mentioned by
your clients and why certain products sell more than others, and to
predict the customer satisfaction level of every interaction.
• Combining voice capture with business intelligence, analytics
and text mining provides valuable customer intelligence for
marketing and competitive intelligence business functions.
Introduction (Cont.)
• In addition to the traditional keyboard-entered comments of customer
feedback, companies may also record the audio of these customer
interactions spoken by both the agent and the customer.
• Manually listening to and interpreting customers’ feedback is often
inaccurate and inconsistent.
• As a result, automated methods are becoming more prevalent.
• An automated phonetic index search is the typical approach to
understanding customer audio information, with voice-to-text
transcription of particular segments that are identified by domain expertise.
• Stored audio signals can be transcribed and analyzed to predict what is
most likely to happen next, such as determining the likelihood that the
customer will close his or her account.
• Techniques such as segmentation are used to automatically group or
classify call transcriptions.
The process: analyzing audio data and
Phonetic index search
• Analyzing audio data can help you identify the call reasons,
the effectiveness of campaigns, the competitors
mentioned by clients, and can predict the customer
satisfaction level.
• The audio signal itself can be analyzed for a wide variety of
information, together with the metadata.
– The captured metadata fields include call length,
emotion/stress detection, silence, number of holds,
number of transfers and the like.
The process (Cont.)
• Phonemes are the basic units of sounds in a
language and a phonetic index is a partial
transcription of an audio signal.
• Metadata about calls can be used for reporting
purposes and incorporated into analytical models
for discovery purposes, for example to identify a dissatisfied
customer.
• A phonetic index search automatically transforms
the captured audio signal into a sequence of
phonemes or sounds.
• Phonetic indexing allows fast searching of the
signal.
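Searching a phonetic index can be sketched as looking for a phoneme sequence inside the sequence produced from the audio. The phoneme symbols below are hypothetical, the recognition step that produces them is not shown, and this sketch uses exact matching only.

```python
def find_phoneme_sequence(index, query):
    """Return the start positions where the query phonemes occur contiguously in the index."""
    n = len(query)
    return [i for i in range(len(index) - n + 1) if index[i:i + n] == query]

# Hypothetical phoneme sequence for an utterance such as "I want to cancel".
index = ["AY", "W", "AA", "N", "T", "T", "UW", "K", "AE", "N", "S", "AH", "L"]
hits = find_phoneme_sequence(index, ["K", "AE", "N", "S", "AH", "L"])
```

Because the search runs over a compact symbol sequence instead of the raw audio signal, it can be fast, which is the point made in the last bullet.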
Categorizing calls
• Calls are categorized based on the phonetic index search,
and full text transcription is combined with the results of the search
indexes.
• Transcriptions are usually only performed on certain
calls
– e.g., calls where customers suggest they will close their
accounts, cancel their subscriptions or call with service
problems.
• Providing a full transcription of all customer calls
and combining it with the metadata about the call can:
– describe the issues that customers are calling about and predict
which customers are most likely to close their accounts, etc.
– allow appropriate action to be taken before it is too
late.
Voice mining using SAS Text Miner and
its advantage
• SAS can read the audio outputs that are captured using Call
Miner, NICE Systems and other similar tools.
• The information provided by the voice capture includes:
– the categories created by the phonetic index search,
– the metadata about the call and the call transcriptions.
• SAS provides industry-leading data integration with the
ability to access a wide variety of data sources and formats,
enabling information to be delivered to users in a way that
they can use it.
– SAS Text Miner provides access to more than 200 document formats
and users are able to gather information from voice vendors of
choice.
Voice mining (Cont.)
• Automatically clustering/segmenting documents
groups the call transcriptions.
– The method is used to understand the types of issues
customers are calling about.
• Profiling these segments using metadata about the call
and related customer information provides further
information about the segments.
• Predictive modeling is a data-driven and
consistent method to understand what might happen
next, and it enables the call center agent to take preventive
actions.
• The customer’s experience over the phone can help
predict loyalty, churn, satisfaction and more.
Integrating structured data for segment
profiling
• To get an even clearer picture of the results of
text clustering, related structured data (metadata
about the call and related customer information)
was used to further describe the issues.
• The results show that call length and the call hold
indicator provide additional information in the
billing issues cluster.
• Terms that are highly associated with the
selected term are displayed in a hyperbolic tree
structure.
Predicting Cancellation of Subscription
• One instance
• In order to predict the likelihood of
cancellation of subscription, a churn prediction model
is used, which includes:
– the call outcome (result of the call), showing whether or not the
customer cancelled his or her subscription
– the data describing the interaction with the customer such
as the transcriptions of the calls, the metadata about the
calls, demographics, purchasing behavior and
frequency/monetary information.
• The model to predict cancellation of subscription should
use historical data up to, but not including, the call
where the customer actually cancels his or her
subscription.
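The rule of using historical data up to, but not including, the cancelling call can be sketched as below. The record layout (`date`, `cancelled`) and the dates are hypothetical.

```python
def training_history(calls):
    """Return the calls that precede the first call where the customer cancelled."""
    history = []
    for call in sorted(calls, key=lambda c: c["date"]):
        if call["cancelled"]:
            break  # exclude the cancelling call and anything after it
        history.append(call)
    return history

# Hypothetical call records for one customer (ISO dates sort chronologically as strings).
calls = [
    {"date": "2009-03-01", "cancelled": False},
    {"date": "2009-01-15", "cancelled": False},
    {"date": "2009-04-20", "cancelled": True},
]
history = training_history(calls)
```

Leaving out the cancelling call keeps the model from learning from the very conversation in which the outcome is stated.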
Predicting (Cont.)
• The artificial value of 1 is given whenever the term
“cancel” or any of its variations (such as cancels,
cancelled, cancelling, cancellation, etc.) was found and
a value of 0 otherwise.
• The Text Miner node then takes the call transcriptions
and uses linguistic techniques to identify terms,
multiple-word terms, parts of speech, stems, etc., and
uses statistical techniques to give the customer
feedback text a numeric transformation.
• The data is then passed to the Regression, Neural
Network and Decision Tree nodes to build multiple
competing models using the churn outcome and the
text transformations.
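The 0/1 indicator for the term "cancel" and its variations, described in the first bullet of this slide, can be sketched with a regular expression. This is an illustrative stand-in in Python, not the SAS code used in the article.

```python
import re

# Matches "cancel" and variations such as cancels, cancelled, cancelling, cancellation.
CANCEL_PATTERN = re.compile(r"\bcancel\w*", re.IGNORECASE)

def cancel_indicator(transcript):
    """Return 1 if any form of 'cancel' appears in the transcript, else 0."""
    return 1 if CANCEL_PATTERN.search(transcript) else 0

# Made-up transcript snippets.
flags = [
    cancel_indicator("I would like to cancel my subscription."),
    cancel_indicator("What is the cancellation fee?"),
    cancel_indicator("My bill looks wrong this month."),
]
```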
Predicting (Cont.)
• The metadata about the call and related customer
information also may be used at this time to
improve model lift.
• The Model Comparison node then takes the results
of each of the preceding models and selects the
“best” model based on which model correctly
classifies the text as predicting churn or no churn.
• Once a best model has been selected, the
underlying code is then used to apply the model to
new data. This is known as model scoring or model
deployment.
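Choosing the "best" of several competing models by how often each one classifies churn or no churn correctly can be sketched as follows. The predictions are made up, and accuracy is an assumed selection criterion; the Model Comparison node may use other statistics.

```python
def accuracy(predictions, actual):
    """Fraction of cases classified correctly as churn (1) or no churn (0)."""
    return sum(p == a for p, a in zip(predictions, actual)) / len(actual)

def select_best(model_predictions, actual):
    """Return the name of the model with the highest accuracy."""
    return max(model_predictions, key=lambda name: accuracy(model_predictions[name], actual))

# Hypothetical outcomes and predictions from three competing models.
actual = [1, 0, 0, 1, 0]
model_predictions = {
    "regression": [1, 0, 1, 0, 0],
    "neural_network": [1, 0, 0, 1, 1],
    "decision_tree": [1, 0, 0, 1, 0],
}
best = select_best(model_predictions, actual)
```

Scoring then means applying the selected model to new, unseen calls.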
Predicting (Cont.)
• The underlying SAS code behind the predictive model
described above was saved and registered as a SAS
Stored Process via the SAS Management Console.
• Several stored processes are created to highlight
various deployments of the MSNTV transcribed data.
• Since the current voice technology does not allow for
real-time transcription, voice captures cannot be
deployed in real time.
• The results are customized to show the original text
and the corresponding prediction of service
cancellation.
Predicting (Cont.)
• The user can manipulate the resulting
spreadsheet to show a graphical
representation of the cancellations of
subscriptions. The SAS tasks available via the
SAS Add-In for Microsoft Office are displayed.
• SAS BI dashboards display additional
information about the MSNTV data. The
dashboard is configured to show several views
of the call center data.
Predicting (Cont.)
• The propensity-to-cancel indicator shows about a
38 percent chance of customers cancelling their
subscriptions.
• This predictive power can enable
companies to retain key
customers and avoid
the costs associated
with undue churn.
Conclusion
• Using the voice mining tools and creating a
stored process can make valuable information and
knowledge available to business analysts and
managers who might not have had access to this
information previously.
• Despite data quality issues, SAS Text Miner did a
remarkable job of finding consistent patterns in the
customer and agent comments.
• By actually hearing and understanding what
customers are already telling you, numerous
indicators can be used to build loyalty, reduce churn
and make your products safer.
Recommendations
• Despite the importance of multimedia mining, there is
no local research on multimedia mining and only a little
research on multimedia retrieval (especially image retrieval).
• Therefore, we recommend conducting research on
multimedia mining for audio, speech, video as well as
advanced image retrieval systems.
• Organizations like libraries, museums and other information
centers (like Television and Radio broadcasters) that have
digital repositories should use the advantages provided by
the application of multimedia mining.
• Other organizations (such as transportation and traffic offices)
are also recommended to digitize the information which is
kept in non-computer readable formats and apply multimedia
mining on top of it.
Multimedia Mining

More Related Content

What's hot

I. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHMI. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHM
vikas dhakane
 
File replication
File replicationFile replication
File replication
Klawal13
 
Developing a Map Reduce Application
Developing a Map Reduce ApplicationDeveloping a Map Reduce Application
Developing a Map Reduce Application
Dr. C.V. Suresh Babu
 
3 tier data warehouse
3 tier data warehouse3 tier data warehouse
3 tier data warehouse
J M
 
Address in the target code in Compiler Construction
Address in the target code in Compiler ConstructionAddress in the target code in Compiler Construction
Address in the target code in Compiler Construction
Muhammad Haroon
 
Unit 2
Unit 2Unit 2
Type Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLikeType Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLike
United International University
 
5.3 mining sequential patterns
5.3 mining sequential patterns5.3 mining sequential patterns
5.3 mining sequential patterns
Krish_ver2
 
Bayesian learning
Bayesian learningBayesian learning
Bayesian learning
Vignesh Saravanan
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Simplilearn
 
Heap Management
Heap ManagementHeap Management
Heap Management
Jenny Galino
 
Data mining primitives
Data mining primitivesData mining primitives
Data mining primitives
lavanya marichamy
 
Register allocation and assignment
Register allocation and assignmentRegister allocation and assignment
Register allocation and assignment
Karthi Keyan
 
Hill climbing algorithm in artificial intelligence
Hill climbing algorithm in artificial intelligenceHill climbing algorithm in artificial intelligence
Hill climbing algorithm in artificial intelligence
sandeep54552
 
Target language in compiler design
Target language in compiler designTarget language in compiler design
Target language in compiler design
Muhammad Haroon
 
Statistics and Data Mining
Statistics and  Data MiningStatistics and  Data Mining
Statistics and Data MiningR A Akerkar
 
Lect7 Association analysis to correlation analysis
Lect7 Association analysis to correlation analysisLect7 Association analysis to correlation analysis
Lect7 Association analysis to correlation analysis
hktripathy
 

What's hot (20)

I. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHMI. AO* SEARCH ALGORITHM
I. AO* SEARCH ALGORITHM
 
File replication
File replicationFile replication
File replication
 
Developing a Map Reduce Application
Developing a Map Reduce ApplicationDeveloping a Map Reduce Application
Developing a Map Reduce Application
 
3 tier data warehouse
3 tier data warehouse3 tier data warehouse
3 tier data warehouse
 
Address in the target code in Compiler Construction
Address in the target code in Compiler ConstructionAddress in the target code in Compiler Construction
Address in the target code in Compiler Construction
 
Unit 2
Unit 2Unit 2
Unit 2
 
Type Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLikeType Checking(Compiler Design) #ShareThisIfYouLike
Type Checking(Compiler Design) #ShareThisIfYouLike
 
5.3 mining sequential patterns
5.3 mining sequential patterns5.3 mining sequential patterns
5.3 mining sequential patterns
 
Np cooks theorem
Np cooks theoremNp cooks theorem
Np cooks theorem
 
Bayesian learning
Bayesian learningBayesian learning
Bayesian learning
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
 
Heap Management
Heap ManagementHeap Management
Heap Management
 
Data mining primitives
Data mining primitivesData mining primitives
Data mining primitives
 
Register allocation and assignment
Register allocation and assignmentRegister allocation and assignment
Register allocation and assignment
 
Hill climbing algorithm in artificial intelligence
Hill climbing algorithm in artificial intelligenceHill climbing algorithm in artificial intelligence
Hill climbing algorithm in artificial intelligence
 
3. mining frequent patterns
3. mining frequent patterns3. mining frequent patterns
3. mining frequent patterns
 
Target language in compiler design
Target language in compiler designTarget language in compiler design
Target language in compiler design
 
Statistics and Data Mining
Statistics and  Data MiningStatistics and  Data Mining
Statistics and Data Mining
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
 
Lect7 Association analysis to correlation analysis
Lect7 Association analysis to correlation analysisLect7 Association analysis to correlation analysis
Lect7 Association analysis to correlation analysis
 

Similar to Multimedia Mining

slide-171212080528.pptx
slide-171212080528.pptxslide-171212080528.pptx
slide-171212080528.pptx
SharanrajK22MMT1003
 
Real Time Object Dectection using machine learning
Real Time Object Dectection using machine learningReal Time Object Dectection using machine learning
Real Time Object Dectection using machine learning
pratik pratyay
 
Key Frame Extraction for Salient Activity Recognition
Key Frame Extraction for Salient Activity RecognitionKey Frame Extraction for Salient Activity Recognition
Key Frame Extraction for Salient Activity Recognition
Suhas Pillai
 
TechnicalBackgroundOverview
TechnicalBackgroundOverviewTechnicalBackgroundOverview
TechnicalBackgroundOverviewMotaz El-Saban
 
A Low-Cost IoT Application for the Urban Traffic of Vehicles, Based on Wirele...
A Low-Cost IoT Application for the Urban Traffic of Vehicles, Based on Wirele...A Low-Cost IoT Application for the Urban Traffic of Vehicles, Based on Wirele...
A Low-Cost IoT Application for the Urban Traffic of Vehicles, Based on Wirele...
Fatima Qayyum
 
Towards a distributed framework to analyze multimodal data.pdf
Towards a distributed framework to analyze multimodal data.pdfTowards a distributed framework to analyze multimodal data.pdf
Towards a distributed framework to analyze multimodal data.pdf
CarlosRodrigues517978
 
Industrial IoT to Predictive Analytics: A Reverse Engineering Approach from S...
Industrial IoT to Predictive Analytics: A Reverse Engineering Approach from S...Industrial IoT to Predictive Analytics: A Reverse Engineering Approach from S...
Industrial IoT to Predictive Analytics: A Reverse Engineering Approach from S...
Lokukaluge Prasad Perera
 
New Method for Traffic Density Estimation Based on Topic Model
New Method for Traffic Density Estimation Based on Topic ModelNew Method for Traffic Density Estimation Based on Topic Model
New Method for Traffic Density Estimation Based on Topic Model
Nidhi Shirbhayye
 
Census Hub Project
Census Hub ProjectCensus Hub Project
Census Hub Project
Vincenzo Patruno
 
ruSMART 2013 presentation
ruSMART 2013 presentationruSMART 2013 presentation
ruSMART 2013 presentation
Oscar Rodríguez Rocha
 
Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...
Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...
Interaktívne webové mapy ako nástroj pre analýzu heterogénnych dát pre krízov...
Geokomunita
 
Digital Pathology Information Web Services (DPIWS): Convergence in Digital Pa...
Digital Pathology Information Web Services (DPIWS): Convergence in Digital Pa...Digital Pathology Information Web Services (DPIWS): Convergence in Digital Pa...
Digital Pathology Information Web Services (DPIWS): Convergence in Digital Pa...
Yves Sucaet
 
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Enrico Motta
 
3D-ICONS Guidelines
3D-ICONS Guidelines 3D-ICONS Guidelines
3D-ICONS Guidelines
3D ICONS Project
 
Harnessing Big Data_UCLA
Harnessing Big Data_UCLAHarnessing Big Data_UCLA
Harnessing Big Data_UCLAPaul Barsch
 
Traffic sign recognition and detection using SVM and CNN
Traffic sign recognition and detection using SVM and CNNTraffic sign recognition and detection using SVM and CNN
Traffic sign recognition and detection using SVM and CNN
IRJET Journal
 
A Transfer Learning Approach to Traffic Sign Recognition
A Transfer Learning Approach to Traffic Sign RecognitionA Transfer Learning Approach to Traffic Sign Recognition
A Transfer Learning Approach to Traffic Sign Recognition
IRJET Journal
 
B08 A3pc 82 Diapo Girardot En
B08 A3pc 82 Diapo Girardot EnB08 A3pc 82 Diapo Girardot En
B08 A3pc 82 Diapo Girardot En
Territorial Intelligence
 
2019 cvpr paper overview by Ho Seong Lee
2019 cvpr paper overview by Ho Seong Lee2019 cvpr paper overview by Ho Seong Lee
2019 cvpr paper overview by Ho Seong Lee
Moazzem Hossain
 
2019 cvpr paper_overview
2019 cvpr paper_overview2019 cvpr paper_overview
2019 cvpr paper_overview
LEE HOSEONG
 

Similar to Multimedia Mining (20)

slide-171212080528.pptx
slide-171212080528.pptxslide-171212080528.pptx
slide-171212080528.pptx
 
Real Time Object Dectection using machine learning
Real Time Object Dectection using machine learningReal Time Object Dectection using machine learning
Real Time Object Dectection using machine learning
 
Key Frame Extraction for Salient Activity Recognition
Key Frame Extraction for Salient Activity RecognitionKey Frame Extraction for Salient Activity Recognition
Key Frame Extraction for Salient Activity Recognition
 



Multimedia Mining

  • 2. Presentation Outline: • Introduction to MM • Article Reviews: 1. Visual Mining of Multimedia Data for Social and Behavioral Studies 2. Multimedia Data Mining for Traffic Video Sequences 3. Tune into the voice of your customer with voice mining • Conclusion • Recommendations
  • 3. Introduction • Advances in multimedia acquisition and storage technology have led to tremendous growth in very large and detailed multimedia databases. • A large amount of high-resolution high-quality multimedia data has been collected in research laboratories in various scientific disciplines, especially in social, behavioral and cognitive studies. • If these multimedia files are analyzed, useful information to users can be revealed.
  • 4. … Introduction • Multimedia mining deals with the extraction of implicit knowledge, multimedia data relationships, or other patterns not explicitly stored in multimedia files. (S. Kotsiantis et al., 2006) • Multimedia mining is an interdisciplinary endeavor that draws upon expertise in computer vision, multimedia processing, multimedia retrieval, data mining, machine learning, databases and artificial intelligence.
  • 5. … Introduction • How to automatically and effectively discover new knowledge from rich multimedia data poses a compelling challenge. • Multimedia data mining consists of two stages. 1) Researchers extract some derived data from raw multimedia data. • This step can be implemented by human coding or by using image/speech processing programs. 2) Researchers work on the derived data with the goal of finding interesting patterns.
  • 6. Visual Mining of Multimedia Data for Social and Behavioral Studies Chen Yu, Yiwen Zhong, Thomas Smith, Ikhyun Park, Weixia Huang
  • 7. Visualization approaches for multivariate data • TimeSearcher – is a time series exploratory and visualization tool that allows users to query time series. • ThemeRiver – is used to visualize thematic changes in large document collections. • VizTree – is designed to visually mine and monitor massive time series data. • Spiral – is mainly used to compare and analyze periodic structures in time series data. • Van Wijk et al. – designed a cluster and calendar-based approach for the visualization of calendar-based data.
  • 8. Identified Problems • Current methods of visualization deal with linear time or highly periodic time; – not designed to handle event-based data which is typical in multimedia applications. • Those methods focus on visualization, navigation, or query only. Objective • This new approach provides an interactive tool to integrate visualization with data mining.
  • 9.
  • 10. Multimedia Dataset Used • Video: – there were three video streams recorded simultaneously with the frequency of 10 frames per second, and the resolution of each frame is 320x240. • Audio: – The speech of the participants was recorded at a frequency of 44.1 kHz. • Motion tracking: – there were two sensors, one on each participant’s head. Each sensor provided 6 dimensional (x, y, z, head, pitch, and roll) data points at a frequency of 120Hz. • In total, the dataset consists of about 90,000 image frames, 864,000 position data points, and 50 minutes of speech.
  • 11. Visualization of Multimedia Data There are two major display components in the application: a multimedia playback window and a visualization window. The visualization window is used to visually explore the derived data streams and discover new patterns and findings.
  • 12. Data Representation and Visualization • The time-based/temporal data can be categorized into two kinds: 1. CONTINUOUS VARIABLES: • related to time points (a series of single measurements at particular moments in time) 2. EVENT VARIABLES: • related to time intervals (e.g. the onset and offset of an event)
  • 13. (1) Continuous Time Series Data • 3 ways to visually explore continuous time series data: {1} as individual data streams {2} as a set of multiple data streams {3} as an arithmetic combination of multiple data streams
  • 14. 1. Using curves to visualize individual data streams • A novel feature added -> HISTOGRAM DISPLAY. • The purpose is to allow users to explore individual data streams and examine both the overall statistics of a data stream (Global Histogram) and the statistics within a local window (Local Histogram).
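The difference between the global and the local histogram can be sketched in a few lines. This is only an illustration under stated assumptions, not the researchers' implementation: the function names, the fixed-width binning and the sample values are all hypothetical.

```python
from collections import Counter


def histogram(values, bin_width, lo):
    """Count how many values fall into each fixed-width bin starting at `lo`."""
    counts = Counter(int((v - lo) // bin_width) for v in values)
    return dict(sorted(counts.items()))


def local_histogram(values, start, end, bin_width, lo):
    """Histogram restricted to the samples inside the zoom window [start, end)."""
    return histogram(values[start:end], bin_width, lo)


# Hypothetical data stream (for example, one normalized sensor channel).
stream = [0.1, 0.4, 0.35, 0.8, 0.9, 0.95, 0.2, 0.15]

# The global histogram covers the whole stream and does not change.
global_hist = histogram(stream, bin_width=0.5, lo=0.0)

# The local histogram is recomputed whenever the zoom window moves.
local_hist = local_histogram(stream, 3, 6, bin_width=0.5, lo=0.0)
```

Moving the zoom window only changes the `start` and `end` arguments, which is why the local histogram updates while the global one stays constant.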
  • 15. 2. Using gray-level representation to visualize a set of multiple data streams • Purpose ->to visually display and explore two kinds of information: (1) possible correlation between multiple data streams (2) interesting joint patterns across multiple data streams.
  • 16. 3. Using area graphs to visualize an arithmetic combination of multiple data streams • Users can combine multiple temporal variables together (by + and -) in various ways and then visually explore the combined distribution.
  • 17. (2) Event Data • Events are presented as bars of color, with their size on screen corresponding to their duration. • Users can visually explore (1) the frequency of an event, (2) its duration and (3) its periodicity.
  • 18. To handle potentially more complex patterns involving more variables and logic operations, users can define a new event variable.
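One way to picture a user-defined event variable is as a logical AND of two existing event variables. The sketch below assumes each event variable is a sorted list of non-overlapping (onset, offset) intervals in seconds; the representation, the function name and the example variables are assumptions made for illustration, not details taken from the article.

```python
def intersect_events(a, b):
    """Return the intervals where an event in `a` overlaps an event in `b` (logical AND).

    Both inputs must be sorted lists of non-overlapping (onset, offset) tuples.
    """
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical event variables.
looking = [(0.0, 2.0), (5.0, 8.0)]
speaking = [(1.0, 6.0), (7.5, 9.0)]

# New event variable: moments when both events are active.
looking_while_speaking = intersect_events(looking, speaking)
```

Other logic operations (OR, NOT) could be built the same way on interval lists.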
  • 19. (3) Concurrent visualization of Continuous and Event variables The display panel will highlight those continuous values at the moments when the selected events happen.
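Highlighting continuous values during selected events amounts to filtering samples by event intervals. A minimal sketch, assuming a uniformly sampled stream and events given as (onset, offset) pairs in seconds; the names and values are hypothetical.

```python
def samples_during_events(samples, rate_hz, events):
    """Return (time, value) pairs whose timestamp falls inside any event interval."""
    selected = []
    for index, value in enumerate(samples):
        t = index / rate_hz  # timestamp of this sample in seconds
        if any(onset <= t < offset for onset, offset in events):
            selected.append((t, value))
    return selected


# Hypothetical continuous variable sampled at 10 Hz and one selected event.
head_pitch = [0.0, 0.1, 0.3, 0.2, 0.5, 0.4]
events = [(0.2, 0.4)]

highlighted = samples_during_events(head_pitch, rate_hz=10, events=events)
```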
  • 20. Event-based Interactive Visual Exploration By visually exploring the data instance by instance, users can directly compare those moments to detect the similarities between them. Many multimedia data are essentially event-driven.
  • 21. Event Grouping • Users can visually examine each instance of an event, and categorize the instances into groups. -> Saved • The overall grouping results can then be visualized in one single panel.
  • 22. Flexible Interfaces between Visualization and Data Processing • The media playback panel allows users to play back video and audio data at various speeds. On top of this, – the researchers designed and implemented one critical component to connect multimedia playback with visual data mining (raw multimedia data <-> exploring derived data). • To increase flexibility and compatibility with data mining, – the system allows users to use any programming language (such as MATLAB, R, C/C++) to obtain new results.
  • 23. The researchers' Future Work • To conduct a systematic evaluation of the prototype system – using an experimental paradigm – to get a better idea of: • what the advantages and limitations of the current system are and • what will need to be improved.
  • 24. Conclusion of the Article • The visualization tool developed allows users – not only to easily examine and synthesize information into new ideas and hypotheses, but also – to quickly quantify and test the insights gained from visualization.
  • 25. Multimedia Data Mining for Traffic Video Sequences Shu-Ching Chen, Mei-Ling Shyu, Chengcui Zhang, Jeff Strickrott
  • 26. Introduction and Motivation • Traffic video analysis can discover and provide useful information such as: – queue detection, vehicle classification, traffic flow, and incident detection at intersections. • Some municipalities are installing video camera systems to monitor and extract traffic control information from their highways in real time.
  • 27. Identified Problems • The current transportation applications and research work either: – Do not connect to databases or – have limited capabilities to index and store the collected data – cannot provide organized, unsupervised, conveniently accessible and easy-to-use multimedia information to traffic planners. • In order to discover and provide some important but previously unknown knowledge from the traffic video sequences to the traffic planners, multimedia data mining techniques need to be employed.
  • 28. The Proposed Framework • Includes: –Background Subtraction –Vehicle Object Identification and Tracking –Multimedia Augmented Transition Network (MATN) model and –Multimedia Input Strings
  • 29. Background Subtraction • It is a technique to remove non-moving components from a video sequence. • This technique was used to enhance the basic SPCPE (Simultaneous Partition and Class Parameter Estimation) algorithm, an unsupervised video segmentation method, to get better segmentation results.
  • 30. The main assumption is that the camera remains stationary
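The idea of background subtraction with a stationary camera can be illustrated with simple frame differencing. This is not the SPCPE algorithm or the authors' method; it is a plain thresholded difference, with frames as nested lists of gray levels and a threshold value chosen arbitrarily for the example.

```python
def subtract_background(frame, background, threshold):
    """Mark pixels whose absolute difference from the background exceeds `threshold`.

    Returns a binary mask: 1 for likely moving (foreground) pixels, 0 otherwise.
    """
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Hypothetical 3x3 grayscale background reference frame and current frame.
background = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
frame = [[10, 12, 10], [10, 200, 190], [10, 10, 11]]

mask = subtract_background(frame, background, threshold=30)
```

Because the camera does not move, a single reference frame can stand in for the background; small differences such as sensor noise fall below the threshold and are ignored.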
  • 31. Object Tracking • The 1st step -> to extract the segments in each class. • Then the minimal bounding box and the centroid point for each segment are obtained.
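The minimal bounding box and centroid of a segment can be computed directly from its pixel coordinates. The sketch below assumes, for simplicity, that a binary mask contains a single segment; separating multiple segments (for example by connected-component labeling) is left out, and the function name is hypothetical.

```python
def bounding_box_and_centroid(mask):
    """Return ((min_row, min_col, max_row, max_col), (centroid_row, centroid_col)).

    Treats all non-zero pixels in `mask` as one segment; returns None if the mask is empty.
    """
    pixels = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    ]
    if not pixels:
        return None
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    box = (min(rows), min(cols), max(rows), max(cols))
    centroid = (sum(rows) / len(pixels), sum(cols) / len(pixels))
    return box, centroid


# Hypothetical binary mask containing one small vehicle segment.
segment_mask = [[0, 0, 0], [0, 1, 1], [0, 0, 0]]
box, centroid = bounding_box_and_centroid(segment_mask)
```

Tracking then reduces to following how the centroid and bounding box move from frame to frame.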
  • 32. Using MATNs & Multimedia Input Strings to Model Video Key Frames • A Multimedia Augmented Transition Network (MATN) model – can be represented diagrammatically by a labeled directed graph, called a transition graph. • A Multimedia Input String is –accepted by the grammar if there is a path of transitions which corresponds to the sequence of symbols in the string and which leads from a specified initial state to one of a set of specified final states.
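The acceptance condition described here (a path of transitions matching the symbols, ending in a final state) can be sketched with a plain deterministic transition table. This is a simplification for illustration only: a real MATN has additional machinery that is not modeled here, and the states and symbols below are hypothetical.

```python
def accepts(transitions, start, finals, symbols):
    """Return True if `symbols` traces a path from `start` to one of the `finals`.

    `transitions` maps (state, symbol) -> next state.
    """
    state = start
    for symbol in symbols:
        key = (state, symbol)
        if key not in transitions:
            return False  # no transition for this symbol: the string is rejected
        state = transitions[key]
    return state in finals


# Hypothetical transition graph with two key-frame symbols.
transitions = {
    ("S0", "K1"): "S1",
    ("S1", "K2"): "S2",
}

accepted = accepts(transitions, "S0", {"S2"}, ["K1", "K2"])
rejected = accepts(transitions, "S0", {"S2"}, ["K2", "K1"])
```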
  • 33. … MATNs and Multimedia Input Strings • Key frames serve as the indices for a shot. • In this paper, each frame is divided into nine sub-regions with the corresponding subscript numbers. • Each key frame is represented by: – an input symbol in a multimedia input string – “&” symbol between two vehicle objects • is used to denote that the vehicle objects appear in the same frame. – subscripted numbers • are used to distinguish the relative spatial positions of the vehicle objects relative to the target object “ground”.
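Mapping a vehicle's centroid to one of the nine sub-regions is a small piece of arithmetic. The sketch below assumes a 3x3 grid numbered 1 to 9 in row-major order from the top-left corner; the paper's actual numbering may differ, and the frame size used in the example is hypothetical.

```python
def sub_region(x, y, width, height):
    """Return the sub-region number (1..9) of point (x, y) in a 3x3 grid.

    Assumes row-major numbering starting at the top-left; this numbering is an
    assumption for illustration, not necessarily the one used in the paper.
    """
    col = min(int(3 * x / width), 2)   # clamp so x == width stays in the last column
    row = min(int(3 * y / height), 2)  # clamp so y == height stays in the last row
    return row * 3 + col + 1


# Hypothetical 320x240 frame.
top_left = sub_region(0, 0, 320, 240)
center = sub_region(160, 120, 320, 240)
bottom_right = sub_region(319, 239, 320, 240)
```

The resulting number plays the role of the subscript attached to a vehicle object in the multimedia input string.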
  • 34. Multimedia Input String that represents two key frames Example: the nine sub-regions and their corresponding subscript numbers an example MATN model
  • 35. Experiment Setup • The traffic video sequence was: – captured with a Sony Handycam CCD TR64 and – digitized with a Brooktree Bt848 based capture card on a Windows NT 2000 Celeron-based platform. • The video sequence consists of about 16 minutes of video with approximately constant lighting conditions. • A small portion of the traffic video is used to illustrate how the proposed framework can be applied to traffic applications to answer spatio-temporal queries like: “Estimate the traffic flow of this road intersection from 8:00 AM to 8:30 AM.”
  • 36. Experiment Results • Using the background subtraction technique, – both the efficiency of the segmentation process and the accuracy of the segmentation results are improved, achieving more accurate video indexing and annotation. Conclusion • The proposed framework can model complex situations such as traffic video for intersection monitoring.
  • 37. • Segmentation results as well as the multimedia input strings for frames 4, 9, 15, 16 and 35. • The leftmost column gives the original video frames; • the second column shows difference images obtained by subtracting the background reference frame from the original frames; • the third column shows the vehicle segments extracted from the video frames, and • the rightmost column shows the bounding boxes of the vehicle objects
  • 38. Tune into the voice of your customer with voice mining By Manya Mayes
  • 39. Introduction • Understanding customer comments that come in the form of text, audio and video (word-for-word records, e-mail, voice mail, surveys and the Web, and most recently social networking sites such as YouTube and Facebook) will determine the business transactions of an organization. • Voice mining in particular is growing and helps to identify the reasons for calls, the effectiveness of marketing campaigns, the competitors most mentioned by your clients, why certain products sell more than others, and to predict the customer satisfaction level of every interaction. • Combining voice capture with business intelligence, analytics and text mining provides valuable customer intelligence for marketing and competitive intelligence business functions.
  • 40. Introduction (Cont.) • In addition to the traditional keyboard-entered comments of customer feedback, companies may also record the audio of these customer interactions spoken by both the agent and the customer. • Manually listening to and interpreting customers’ feedback is often inaccurate and inconsistent. • As a result, automated methods are becoming more prevalent. • An automated phonetic index search is the typical approach to understanding customer audio information, using voice-to-text transcription of particular segments that are identified through domain expertise. • Stored audio signals can be transcribed and analyzed to predict what is most likely to happen next, such as determining the likelihood that the customer will close his or her account. • Techniques such as segmentation are used to automatically group or classify call transcriptions.
  • 41. The process: analyzing audio data and phonetic index search • Analyzing audio data can help you identify the call reasons, the effectiveness of campaigns, the competitors mentioned by clients, and can predict the customer satisfaction level. • The audio signal itself can be analyzed for a wide variety of information along with the metadata. – The captured metadata fields include call length, emotion/stress detection, silence, number of holds, number of transfers and the like.
  • 42. The process (Cont.) • Phonemes are the basic units of sounds in a language and a phonetic index is a partial transcription of an audio signal. • Metadata about calls can be used for reporting purposes and incorporated into analytical models for discovery purposes and to identify a dissatisfied customer. • A phonetic index search automatically transforms the captured audio signal into a sequence of phonemes or sounds. • Phonetic indexing allows fast searching of the signal.
  • 43. Categorizing calls • Calls are categorized based on the phonetic index search and the full-text transcription, together with the results of the search indexes. • Transcriptions are usually only performed on certain calls – e.g., calls where customers suggest they will close their accounts, cancel their subscriptions or call with service problems. • Providing a full transcription of all customer calls and combining it with the metadata about the call can: – describe the issues that customers are calling about and predict which customers are most likely to close their accounts, etc. – allow appropriate action to be taken before it is too late.
  • 44. Voice mining using SAS Text Miner and its advantage • SAS can read the audio outputs that are captured using Call Miner, NICE Systems, other similar tools. • The information provided by the voice capture includes: – the categories created by the phonetic index search, – the metadata about the call and the call transcriptions. • SAS provides industry-leading data integration with the ability to access a wide variety of data sources and formats, enabling information to be delivered to users in a way that they can use it. – SAS Text Miner provides access to more than 200 document formats and users are able to gather information from voice vendors of choice
  • 45. Voice mining (Cont.) • Automatically clustering/segmenting documents and profiling these segments using metadata about the call provides further information about each segment. – The method is used to understand the types of issues customers are calling about. • Profiling these segments using metadata about the call and related customer information provides further information about the segments. • Predictive modeling is a data-driven and consistent method for understanding what might happen next, and it enables the center agent to take preventive actions. • The customer’s experience over the phone can help predict loyalty, churn, satisfaction and more.
  • 46. Integrating structured data for segment profiling • To get an even clearer picture of the results of text clustering, related structured data (metadata about the call and related customer information) was used to further describe the issues. • The results show that call length and the call hold indicator provide additional information in the billing issues cluster. • Terms that are highly associated with the selected term are displayed in a hyperbolic tree structure.
  • 47. Predicting Cancellation of Subscription • Once Instance • In order to predict the likelihood of cancellation of a subscription, a churn prediction model is used, which includes: – the call outcome (result of the call), showing whether or not the customer cancelled his or her subscription – the data describing the interaction with the customer, such as the transcriptions of the calls, the metadata about the calls, demographics, purchasing behavior and frequency/monetary information. • The model to predict cancellation of subscription should use historical data up to, but not including, the call where the customer actually cancels his or her subscription.
  • 49. Predicting (Cont.) • The artificial value of 1 is given whenever the term “cancel” or any of its variations (such as cancels, cancelled, cancelling, cancellation, etc.) was found and a value of 0 otherwise. • The Text Miner node then takes the call transcriptions and uses linguistic techniques to identify terms, multiple-word terms, parts of speech, stems, etc., and uses statistical techniques to give the customer feedback text a numeric transformation. • The data is then passed to the Regression, Neural Network and Decision Tree nodes to build multiple competing models using the churn outcome and the text transformations.
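The 0/1 "cancel" indicator described above can be sketched with a regular expression. This is a hypothetical stand-in written in Python, not the SAS code used in the article; matching any word that starts with "cancel" is an assumption about what counts as a variation.

```python
import re

# Matches "cancel" and words beginning with it (cancels, cancelled, cancellation, ...).
CANCEL_PATTERN = re.compile(r"\bcancel\w*", re.IGNORECASE)


def cancel_flag(transcript):
    """Return 1 if the transcript mentions "cancel" or a variation of it, else 0."""
    return 1 if CANCEL_PATTERN.search(transcript) else 0


# Hypothetical call transcriptions.
flags = [
    cancel_flag("I want to cancel my subscription"),
    cancel_flag("Cancellation was never confirmed"),
    cancel_flag("My bill looks wrong this month"),
]
```

The resulting flag would serve as the target outcome alongside the numeric text transformations.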
  • 50. Predicting(Cont.) • The metadata about the call and related customer information also may be used at this time to improve model lift. • The Model Comparison node then takes the results of each of the preceding models and selects the “best” model based on which model correctly classifies the text as predicting churn or no churn. • Once a best model has been selected, the underlying code is then used to apply the model to new data. This is known as model scoring or model deployment.
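Selecting the "best" of several competing models can be illustrated by comparing how often each one classifies the churn outcome correctly. The sketch below uses plain classification accuracy on made-up predictions; the criteria actually applied by the Model Comparison node are not described in the source, so this is only an assumed stand-in.

```python
def accuracy(predictions, outcomes):
    """Fraction of predictions that match the observed churn outcomes."""
    correct = sum(1 for p, o in zip(predictions, outcomes) if p == o)
    return correct / len(outcomes)


def select_best(model_predictions, outcomes):
    """Return the name of the model with the highest accuracy."""
    return max(
        model_predictions,
        key=lambda name: accuracy(model_predictions[name], outcomes),
    )


# Hypothetical observed outcomes (1 = churn, 0 = no churn) and model predictions.
outcomes = [1, 0, 1, 0]
model_predictions = {
    "regression": [1, 0, 0, 0],
    "neural_network": [0, 0, 1, 1],
    "decision_tree": [1, 0, 1, 0],
}

best_model = select_best(model_predictions, outcomes)
```

Scoring new data then means applying only the selected model to calls it has not seen before.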
  • 51. Predicting (Cont.) • The underlying SAS code behind the predictive model described above was saved and registered as a SAS Stored Process via the SAS Management Console. • Several stored processes are created to highlight various deployments of the MSNTV transcribed data. • Since the current voice technology does not allow for real-time transcription, voice captures cannot be deployed in real time. • The results are customized to show the original text and the corresponding prediction of service cancellation.
  • 52. Predicting (Cont) • The user can manipulate the resulting spreadsheet to show a graphical representation of the cancellations of subscriptions. The SAS tasks available via the SAS Add-In for Microsoft Office are displayed. • SAS BI dashboards display additional information about the MSNTV data. The dashboard is configured to show several views of the call center data.
  • 53. Predicting (Cont.) • The propensity-to-cancel indicator shows about a 38 percent chance of customers cancelling their subscriptions. • This predictive power can enable companies to retain key customers and avoid the costs associated with undue churn.
  • 54. Conclusion • Using voice mining tools and creating a stored process can make valuable information and knowledge available to business analysts and managers who might not have had access to this information previously. • Despite data quality issues, SAS Text Miner did a remarkable job of finding consistent patterns in the customer and agent comments. • By actually hearing and understanding what customers are already telling you, numerous indicators can be used to build loyalty, reduce churn and make your products safer.
  • 55. Recommendations • Despite the importance of multimedia mining, there is no local research on multimedia mining and only a little research on multimedia retrieval (especially image retrieval). • Therefore, we recommend conducting research on multimedia mining for audio, speech and video, as well as on advanced image retrieval systems. • Organizations like libraries, museums and other information centers (such as television and radio broadcasters) that have digital repositories should use the advantages provided by the application of multimedia mining. • Other organizations (such as transportation and traffic offices) are also recommended to digitize the information that is kept in non-computer-readable formats and apply multimedia mining on top of it.
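
The two ideas at the core of the Predicting slides, the 0/1 “cancel” flag and the selection of a “best” model among competing ones, can be sketched in a few lines of Python. This is only an illustrative sketch and not the SAS code from the presentation: the regular expression, the function names, the sample predictions and the use of plain accuracy as the comparison measure are all assumptions made for the example.

```python
import re

# Assumed pattern: "cancel" plus any trailing word characters, so it also
# matches cancels, cancelled, cancelling, cancellation, and so on.
CANCEL_PATTERN = re.compile(r"\bcancel\w*", re.IGNORECASE)


def cancel_flag(transcript: str) -> int:
    """Return 1 if the transcript mentions "cancel" or a variation, else 0."""
    return 1 if CANCEL_PATTERN.search(transcript) else 0


def accuracy(predictions, targets):
    """Fraction of predictions that match the known churn outcome."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


def select_best_model(model_predictions, targets):
    """Pick the model name whose predictions classify the most cases correctly.

    model_predictions maps a (hypothetical) model name to its list of
    0/1 churn predictions for the same set of calls.
    """
    return max(
        model_predictions,
        key=lambda name: accuracy(model_predictions[name], targets),
    )


if __name__ == "__main__":
    # Hypothetical call transcripts, invented for this example.
    calls = [
        "I want to cancel my subscription",
        "My box will not connect to the internet",
    ]
    print([cancel_flag(text) for text in calls])

    # Hypothetical predictions from three competing models.
    targets = [1, 0, 1, 0]
    competing = {
        "regression": [1, 0, 0, 0],
        "neural_network": [0, 1, 1, 0],
        "decision_tree": [1, 0, 1, 0],
    }
    print(select_best_model(competing, targets))
```

Accuracy is used here only because the slides describe the best model as the one that correctly classifies churn versus no churn; a real comparison could use other measures, such as lift.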

Editor's Notes

  1. The multimedia playback window is a digital media player that allows users to access video and audio data and play them back in various ways. The visualization window is the main tool that allows users to visually explore the derived data streams and discover new patterns and findings.
  2. The local histogram is updated as users move the zoom box while the global histogram is constant.
  3. The local histogram is updated as users move the zoom box while the global histogram is constant.
  4. Our visualization of multiple event variables allows users to see not only individual events but also joint events.
  5. The researchers observed that many multimedia data are essentially event-driven.
  6. The tool provides flexible interfaces between visualization and data mining, as long as users write the results into text files with pre-defined formats. It is important that users can refer to the raw multimedia data while exploring derived data.