The document outlines a presentation on multimedia data mining. It discusses three articles: 1) a tool for visually mining multimedia data for social studies, 2) a framework for mining traffic video sequences, and 3) using voice mining to understand customer feedback. It also provides an introduction to multimedia data mining and recommendations.
2. Presentation Outline:
• Introduction to Multimedia Data Mining
• Article Reviews:
1. Visual Mining of Multimedia Data for Social and Behavioral Studies
2. Multimedia Data Mining for Traffic Video Sequences
3. Tune into the Voice of Your Customer with Voice Mining
• Conclusion
• Recommendations
3. Introduction
• Advances in multimedia acquisition and storage technology have led to tremendous growth in very large and detailed multimedia databases.
• A large amount of high-resolution, high-quality multimedia data has been collected in research laboratories across scientific disciplines, especially in social, behavioral and cognitive studies.
• When these multimedia files are analyzed, information useful to users can be revealed.
4. … Introduction
• Multimedia mining deals with the extraction of implicit knowledge, multimedia data relationships, or other patterns not explicitly stored in multimedia files. (S. Kotsiantis et al., 2006)
• Multimedia mining is an interdisciplinary endeavor that draws upon expertise in computer vision, multimedia processing, multimedia retrieval, data mining, machine learning, databases and artificial intelligence.
5. … Introduction
• How to automatically and effectively discover new knowledge from rich multimedia data poses a compelling challenge.
• Multimedia data mining consists of two stages:
1) Researchers extract derived data from raw multimedia data.
• This step can be implemented by human coding or by using image/speech processing programs.
2) Researchers work on the derived data with the goal of finding interesting patterns.
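The two-stage pipeline above can be sketched in a few lines of Python. This is a minimal illustration only; the function names, the brightness feature, and the threshold are hypothetical choices, not taken from the article:

```python
# Minimal sketch of the two-stage multimedia mining pipeline.
# Stage 1 extracts derived data (here: per-frame mean brightness);
# Stage 2 mines the derived stream for a simple pattern (here: bright
# frames). All names and thresholds are illustrative assumptions.

def extract_derived_data(raw_frames):
    """Stage 1: reduce each raw frame (a 2-D grid of pixel values)
    to a single derived measurement (the mean pixel value)."""
    return [sum(map(sum, frame)) / (len(frame) * len(frame[0]))
            for frame in raw_frames]

def mine_patterns(derived, threshold):
    """Stage 2: find the frame indices whose derived value exceeds
    the given threshold."""
    return [i for i, v in enumerate(derived) if v > threshold]

# Two tiny 2x2 "frames" stand in for raw multimedia data.
frames = [[[10, 10], [10, 10]], [[200, 200], [200, 200]]]
derived = extract_derived_data(frames)           # [10.0, 200.0]
patterns = mine_patterns(derived, threshold=100)
print(patterns)                                  # [1]
```

The point of the split is that Stage 2 never touches raw pixels or audio samples; it works only on the much smaller derived streams.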
6. Visual Mining of Multimedia Data for Social and Behavioral Studies
Chen Yu, Yiwen Zhong, Thomas Smith, Ikhyun Park, Weixia Huang
7. Visualization approaches for multivariate data
• TimeSearcher
– an exploratory time series visualization tool that allows users to query time series.
• ThemeRiver
– used to visualize thematic changes in large document collections.
• VizTree
– designed to visually mine and monitor massive time series data.
• Spiral
– mainly used to compare and analyze periodic structures in time series data.
• Van Wijk et al.
– designed a cluster- and calendar-based approach for the visualization of calendar-based data.
8. Identified Problems
• Current visualization methods deal with linear time or highly periodic time;
– they are not designed to handle event-based data, which is typical in multimedia applications.
• Those methods focus on visualization, navigation, or query only.
Objective
• The new approach provides an interactive tool that integrates visualization with data mining.
9.
10. Multimedia Dataset Used
• Video:
– three video streams were recorded simultaneously at a frequency of 10 frames per second; the resolution of each frame is 320x240.
• Audio:
– the speech of the participants was recorded at a frequency of 44.1 kHz.
• Motion tracking:
– there were two sensors, one on each participant’s head. Each sensor provided 6-dimensional (x, y, z, head, pitch, and roll) data points at a frequency of 120 Hz.
• In total, the dataset consists of about 90,000 image frames, 864,000 position data points, and 50 minutes of speech.
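As a quick sanity check on these totals, the stated rates can be multiplied out (an illustrative back-of-the-envelope calculation, not from the article):

```python
# Back-of-the-envelope check of the dataset totals reported above.
seconds = 50 * 60                      # 50 minutes of recording

# Video: 3 streams at 10 frames per second.
frames = 3 * 10 * seconds
print(frames)                          # 90000, matching the stated total

# Motion tracking: 2 sensors at 120 Hz over 50 minutes.
position_points = 2 * 120 * seconds
print(position_points)                 # 720000

# The stated 864,000 position points instead correspond to a
# 60-minute tracking session (2 * 120 * 3600), so the motion
# capture apparently ran longer than the 50 minutes of speech.
assert 2 * 120 * 60 * 60 == 864_000
```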
11. Visualization of Multimedia Data
There are two major display components in the application: a multimedia playback window and a visualization window.
The visualization window is used to visually explore the derived data streams and discover new patterns and findings.
12. Data Representation and Visualization
• Time-based/temporal data can be categorized into two kinds:
1. CONTINUOUS VARIABLES:
• related to time points (a series of single measurements at particular moments in time)
2. EVENT VARIABLES:
• related to time intervals (e.g. the onset and offset of an event)
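The two kinds of temporal variables map naturally onto two small data structures. The following is a sketch only; the class names and fields are assumptions, not the article's data model:

```python
# Sketch of the two kinds of temporal variables described above.
# Class names and fields are illustrative assumptions.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ContinuousVariable:
    """A series of single measurements at particular moments in time."""
    name: str
    times: List[float]    # time points, in seconds
    values: List[float]   # one measurement per time point

@dataclass
class EventVariable:
    """A set of time intervals, each given by an onset and an offset."""
    name: str
    intervals: List[Tuple[float, float]]  # (onset, offset) pairs, seconds

    def durations(self):
        return [off - on for on, off in self.intervals]

gaze = ContinuousVariable("gaze_x", times=[0.0, 0.1, 0.2],
                          values=[12.0, 14.5, 13.1])
speech = EventVariable("speaking", intervals=[(0.5, 2.0), (3.0, 3.5)])
print(speech.durations())  # [1.5, 0.5]
```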
13. (1) Continuous Time Series Data
• Three ways to visually explore continuous time series data:
{1} as individual data streams
{2} as a set of multiple data streams
{3} as an arithmetic combination of multiple data streams
14. 1. Using curves to visualize individual data streams
• A novel feature added -> HISTOGRAM DISPLAY.
• The purpose is to allow users to explore individual data streams and examine both the overall statistics of a data stream (Global Histogram) and the statistics within a local window (Local Histogram).
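The global-versus-local histogram idea can be sketched as follows. This is an illustration under assumed bin edges and an assumed window, not the article's implementation:

```python
# Sketch: a "global" histogram over a whole data stream versus a
# "local" histogram restricted to a window of it. The bin edges and
# the window bounds are illustrative assumptions.

def histogram(values, edges):
    """Count how many values fall into each [edges[i], edges[i+1]) bin."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts

stream = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
edges = [0, 5, 10]                        # two bins: [0,5) and [5,10)

global_hist = histogram(stream, edges)    # over the whole stream
local_hist = histogram(stream[0:5], edges)  # over a local window

print(global_hist)  # [5, 5]
print(local_hist)   # [2, 3]
```

Comparing the two reveals whether the local window is statistically typical of the stream as a whole.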
15. 2. Using gray-level representation to visualize a set of multiple data streams
• Purpose -> to visually display and explore two kinds of information:
(1) possible correlations between multiple data streams
(2) interesting joint patterns across multiple data streams.
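A gray-level view of multiple streams amounts to normalizing each stream to a fixed intensity range and stacking the results as rows of an image. This is a minimal sketch; the 0-255 range and the per-stream min-max normalization are assumptions, not the article's method:

```python
# Sketch: map each data stream to gray levels (0-255) so several
# streams can be stacked as rows of an intensity image. The per-stream
# min-max normalization is an illustrative assumption.

def to_gray_row(stream):
    lo, hi = min(stream), max(stream)
    span = hi - lo or 1              # avoid division by zero on flat streams
    return [round(255 * (v - lo) / span) for v in stream]

streams = [
    [0.0, 0.5, 1.0],                 # ramps up
    [10.0, 10.0, 20.0],              # flat, then jumps
]
image = [to_gray_row(s) for s in streams]
print(image)  # [[0, 128, 255], [0, 0, 255]]
```

Once streams share a common intensity scale, co-occurring bright or dark columns across rows hint at correlations and joint patterns.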
16. 3. Using area graphs to visualize an arithmetic combination of multiple data streams
• Users can combine multiple temporal variables together (by + and -) in various ways and then visually explore the combined distribution.
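Combining time-aligned streams with + and - reduces to element-wise arithmetic (a sketch; the variable names are illustrative, not from the article):

```python
# Sketch: element-wise arithmetic combination of two time-aligned
# data streams, as would feed an area graph. Names are illustrative.

def combine(a, b, op):
    """Element-wise '+' or '-' over two equal-length streams."""
    if op == "+":
        return [x + y for x, y in zip(a, b)]
    if op == "-":
        return [x - y for x, y in zip(a, b)]
    raise ValueError("op must be '+' or '-'")

looking_at_object = [1, 0, 1, 1]
holding_object = [0, 0, 1, 1]

print(combine(looking_at_object, holding_object, "+"))  # [1, 0, 2, 2]
print(combine(looking_at_object, holding_object, "-"))  # [1, 0, 0, 0]
```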
17. (2) Event Data
• Events are presented as bars of color, with their size on screen corresponding to their duration.
• Users can visually explore (1) the frequency of an event, (2) its duration, and (3) its periodicity.
18. To handle potentially more complex patterns involving more variables and logic operations, users can define a new event variable.
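Defining a new event variable from existing ones can be sketched as a logical combination over a common time grid. The 1-second sampling grid here is an assumed simplification, not the article's mechanism:

```python
# Sketch: derive a new event variable as the logical AND of two
# existing event variables, by sampling both onto a common time grid.
# The coarse 1-second grid is an assumed simplification.

def active(intervals, t):
    """True if time t falls inside any (onset, offset) interval."""
    return any(on <= t < off for on, off in intervals)

def and_event(a, b, grid):
    """New event variable: the moments where both inputs are active."""
    return [t for t in grid if active(a, t) and active(b, t)]

speaking = [(0, 3), (6, 8)]     # (onset, offset) pairs, in seconds
gesturing = [(2, 7)]
grid = range(0, 10)

print(and_event(speaking, gesturing, grid))  # [2, 6]
```

The same pattern extends to OR and NOT, giving users a small logic over events without touching the raw multimedia data.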
19. (3) Concurrent visualization of Continuous and Event variables
The display panel will highlight those continuous values at the moments when the selected events happen.
20. Event-based Interactive Visual Exploration
• Many multimedia data are essentially event-driven.
• By visually exploring the data instance by instance, users can directly compare those moments to detect the similarities between them.
21. Event Grouping
• Users can visually examine each instance of an event,
and categorize the instances into groups, which are then saved.
• The overall grouping results can then be visualized in one
single panel.
22. Flexible Interfaces between
Visualization and Data Processing
• The media playback panel allows users to play back video
and audio data at various speeds.
• On top of this, the researchers designed and implemented
one critical component that connects multimedia playback
with visual data mining, so that users can refer to the raw
multimedia data while exploring the derived data.
• To increase flexibility and compatibility with data mining,
the system allows users to use any programming language
(e.g. MATLAB, R, C/C++) to obtain new results, as long as
the results are written into text files with pre-defined formats.
23. The researchers' Future Work
• to conduct a systematic evaluation of
the prototype system
–using an experimental paradigm
–to get a better idea of:
• the advantages and limitations of the
current system and
• what will need to be improved.
24. Conclusion of the Article
• The visualization tool developed allows
users
–not only to easily examine and synthesize
information into new ideas and
hypotheses,
–but also to quickly quantify and test the
insights gained from visualization.
25. Multimedia Data Mining for
Traffic Video Sequences
Shu-Ching Chen, Mei-Ling Shyu,
Chengcui Zhang, Jeff Strickrott
26. Introduction and Motivation
• Traffic video analysis can discover and provide
useful information such as:
– queue detection, vehicle classification, traffic flow,
and incident detection at intersections.
• Some municipalities are installing video camera
systems to monitor and extract traffic control
information from their highways in real time.
27. Identified Problems
• The current transportation applications and research
work either:
– do not connect to databases, or
– have limited capabilities to index and store the
collected data,
– and cannot provide organized, unsupervised,
conveniently accessible and easy-to-use
multimedia information to traffic planners.
• In order to discover and provide some important but
previously unknown knowledge from the traffic video
sequences to the traffic planners, multimedia data
mining techniques need to be employed.
28. The Proposed Framework
• Includes:
–Background Subtraction
–Vehicle Object Identification and Tracking
–Multimedia Augmented Transition Network
(MATN) model and
–Multimedia Input Strings
29. Background Subtraction
• It is a technique to remove non-moving
components from a video sequence.
• This technique was used
to enhance the basic SPCPE algorithm
(Simultaneous Partition and Class Parameter Estimation),
an unsupervised video segmentation method,
in order to get better segmentation results.
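Background subtraction itself can be illustrated with a toy sketch, assuming grayscale frames stored as nested lists and a fixed difference threshold (the threshold value here is an arbitrary choice, not taken from the paper):

```python
def background_subtract(frame, background, threshold=30):
    # Mark pixels that differ from the background reference by more than threshold;
    # non-moving components fall below the threshold and are suppressed.
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

background = [[100, 100, 100],
              [100, 100, 100]]
frame      = [[100, 180, 100],
              [100, 175, 100]]
mask = background_subtract(frame, background)  # 1 = moving pixel
```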
31. Object Tracking
• The 1st step is to extract the segments in each class.
• Then the minimal bounding box and the centroid
point for each segment are obtained.
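These two quantities are straightforward to compute; a minimal sketch assuming a segment is given as a list of (x, y) pixel coordinates:

```python
def bounding_box_and_centroid(pixels):
    # Minimal axis-aligned bounding box and centroid of a segment's pixels.
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    box = (min(xs), min(ys), max(xs), max(ys))          # (x_min, y_min, x_max, y_max)
    centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
    return box, centroid

segment = [(2, 3), (3, 3), (2, 4), (3, 4)]
box, centroid = bounding_box_and_centroid(segment)
```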
32. Using MATNs & Multimedia Input Strings
to Model Video Key Frames
• A Multimedia Augmented Transition Network
(MATN) model
– can be represented diagrammatically by a labeled
directed graph, called a transition graph.
• A Multimedia Input String is
–accepted by the grammar if there is a path of
transitions which corresponds to the sequence of symbols in
the string and which leads from a specified initial
state to one of a set of specified final states.
33. … MATNs and Multimedia Input Strings
• Key frames serve as the indices for a shot.
• In this paper, each frame is divided into nine sub-
regions with the corresponding subscript numbers.
• Each key frame is represented by:
– an input symbol in a multimedia input string
– “&” symbol between two vehicle objects
• is used to denote that the vehicle objects appear in the same
frame.
– subscripted numbers
• are used to distinguish the relative spatial positions of the
vehicle objects relative to the target object “ground”.
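A toy sketch of assembling such a string, with hypothetical object names and an assumed parenthesized per-frame delimiter (the paper's exact notation may differ):

```python
def frame_symbol(objects):
    # One input symbol per key frame: object name plus sub-region subscript,
    # with "&" denoting objects appearing in the same frame.
    return "&".join(f"{name}{region}" for name, region in objects)

def multimedia_input_string(key_frames):
    # Concatenate per-frame symbols; parentheses delimit each key frame.
    return "".join(f"({frame_symbol(f)})" for f in key_frames)

# Hypothetical objects: the ground G in sub-region 5, vehicles V1 and V2.
frames = [[("G", 5), ("V1", 2)],
          [("G", 5), ("V1", 2), ("V2", 8)]]
s = multimedia_input_string(frames)
```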
34. Multimedia Input String that represents two key frames
[Figures: the nine sub-regions and their corresponding
subscript numbers; an example MATN model]
35. Experiment Setup
• The traffic video sequence was:
– captured with a Sony Handycam CCD TR64 and
– digitized with a Brooktree Bt848-based capture card
on a Windows NT 2000 Celeron-based platform.
• The video sequence consists of about 16 minutes of
video with approximately constant lighting conditions.
• A small portion of the traffic video is used to
illustrate how the proposed framework can be
applied to traffic applications to answer spatio-
temporal queries like:
“Estimate the traffic flow of this road
intersection from 8:00 AM to 8:30 AM.”
36. Experiment Results
• Using the background subtraction technique,
– both the efficiency of the segmentation
process and the accuracy of the segmentation
results are improved, achieving more accurate
video indexing and annotation.
Conclusion
• The proposed framework can model complex
situations such as traffic video for intersection
monitoring.
37. • Segmentation results as
well as the multimedia
input strings for frames
4, 9, 15, 16 and 35.
• The leftmost column
gives the original video
frames;
• the second column
shows difference images
obtained by subtracting
the background
reference frame from
the original frames;
• the third column shows
the vehicle segments
extracted from the
video frames, and
• the rightmost column
shows the bounding
boxes of the vehicle
objects
38. Tune into the voice of your
customer with voice mining
By Manya Mayes
39. Introduction
• Understanding customer comments that arrive as text,
audio and video (word-for-word records, e-mail, voice
mail, surveys, the Web and, most recently, social
networking sites such as YouTube and Facebook) will
determine the business transactions of an organization.
• Voice mining in particular is growing, and helps to
identify the reasons customers call, the effectiveness of
marketing campaigns, the competitors most mentioned by
your clients, and why certain products sell more than others,
and to predict the customer satisfaction level of every interaction.
• Combining voice capture with business intelligence, analytics
and text mining provides valuable customer intelligence for
marketing and competitive intelligence business functions.
40. Introduction(Cont.)
• In addition to the traditional keyboard-entered comments of customer
feedback, companies may also record the audio of these customer
interactions spoken by both the agent and the customer.
• Manually listening to and interpreting customers’ feedback is often
inaccurate and inconsistent.
• As a result, automated methods are becoming more prevalent.
• An automated phonetic index search is the typical approach to
understanding customer audio information, using voice-to-text
transcription of particular segments identified by domain expertise.
• Stored audio signals can be transcribed and analyzed to predict what is
most likely to happen next such as determining the likelihood that the
customer will close his or her account.
• Techniques such as segmentation are used to automatically group or
classify call transcriptions.
41. The process: analyzing audio data and
Phonetic index search
• Analyzing audio data can help you identify the call reasons,
the effectiveness of campaigns, the competitors
mentioned by clients, and can predict the customer
satisfaction level.
• The audio signal itself can be analyzed for a wide variety of
information, along with the metadata
– The captured metadata fields include call length,
emotion/stress detection, silence, number of holds,
number of transfers and the like.
42. The process(Cont.)
• Phonemes are the basic units of sounds in a
language and a phonetic index is a partial
transcription of an audio signal.
• Metadata about calls can be used for reporting
purposes and incorporated into analytical models
for discovery purposes and to identify a dissatisfied
customer.
• A phonetic index search automatically transforms
the captured audio signal into a sequence of
phonemes or sounds.
• Phonetic indexing allows fast searching of the
signal.
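Searching a phonetic index can be sketched as naive subsequence matching over phoneme lists; the phoneme symbols below are illustrative, not a real transcription, and production systems use far more sophisticated matching:

```python
def phonetic_index_search(index, query):
    # Return every position where the query phoneme sequence occurs in the index.
    n = len(query)
    return [i for i in range(len(index) - n + 1) if index[i:i + n] == query]

# A call transcribed into a (hypothetical) phoneme sequence.
call = ["k", "ae", "n", "s", "ah", "l", "m", "ay",
        "k", "ae", "n", "s", "ah", "l"]
hits = phonetic_index_search(call, ["k", "ae", "n", "s", "ah", "l"])
```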
43. Categorizing calls
• Calls are categorized based on the phonetic index search
and full-text transcription, combined with the results of the
search indexes.
• Transcriptions are usually only performed on certain
calls
– e.g., calls where customers suggest they will close their
accounts, cancel their subscriptions or call with service
problems.
• Providing a full transcription of all customer calls
and combining it with the metadata about the call can:
– describe the issues that customers are calling about and predict
which customers are most likely to close their accounts, etc.,
– allowing appropriate action to be taken before it is too
late.
44. Voice mining using SAS Text Miner and
its advantage
• SAS can read the audio outputs that are captured using Call
Miner, NICE Systems and other similar tools.
• The information provided by the voice capture includes:
– the categories created by the phonetic index search,
– the metadata about the call and the call transcriptions.
• SAS provides industry-leading data integration with the
ability to access a wide variety of data sources and formats,
enabling information to be delivered to users in a way that
they can use it.
– SAS Text Miner provides access to more than 200 document formats
and users are able to gather information from voice vendors of
choice
45. Voice mining(Cont.)
• Automatically clustering/segmenting documents helps to
understand the types of issues customers are calling about.
• Profiling these segments using metadata about the call
and related customer information provides further
information about the segments.
• Predictive modeling is a data-driven and consistent
method to understand what might happen next, and
enables the call center agent to take preventive actions.
• The customer’s experience over the phone can help
predict loyalty, churn, satisfaction and more
46. Integrating structured data for segment
profiling
• To get an even clearer picture of the results of
text clustering, related structured data (metadata
about the call and related customer information)
was used to further describe the issues.
• The results show that call length and the call hold
indicator provide additional information in the
billing issues cluster.
• Terms that are highly associated with the
selected term are displayed in a hyperbolic tree
structure.
47. Predicting Cancellation of Subscription
• One instance:
• In order to predict the likelihood of cancellation of
subscription, a churn prediction model is used
which includes:
– the call outcome (the result of the call), showing whether
or not the customer cancelled his or her subscription, and
– the data describing the interaction with the customer such
as the transcriptions of the calls, the metadata about the
calls, demographics, purchasing behavior and
frequency/monetary information.
• The model to predict cancellation of subscription should
use historical data up to, but not including, the call
where the customer actually cancels his or her
subscription.
49. Predicting (Cont)
• An artificial value of 1 is given whenever the term
“cancel” or any of its variations (such as cancels,
cancelled, cancelling, cancellation, etc.) was found and
a value of 0 otherwise.
• The Text Miner node then takes the call transcriptions
and uses linguistic techniques to identify terms,
multiple-word terms, parts of speech, stems, etc., and
uses statistical techniques to give the customer
feedback text a numeric transformation.
• The data is then passed to the Regression, Neural
Network and Decision Tree nodes to build multiple
competing models using the churn outcome and the
text transformations.
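The 1/0 churn flag described above can be sketched with a simple regular expression; this is a minimal stand-in for the text-mining step, not SAS Text Miner's actual linguistic processing:

```python
import re

# Matches "cancel" and its variations (cancels, cancelled, cancellation, ...).
CANCEL = re.compile(r"\bcancel\w*", re.IGNORECASE)

def churn_flag(transcription):
    # 1 if "cancel" or any variation appears in the call transcription, else 0.
    return 1 if CANCEL.search(transcription) else 0

calls = ["I want to cancel my subscription",
         "Please update my billing address",
         "She cancelled the order yesterday"]
flags = [churn_flag(c) for c in calls]
```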
50. Predicting(Cont.)
• The metadata about the call and related customer
information also may be used at this time to
improve model lift.
• The Model Comparison node then takes the results
of each of the preceding models and selects the
“best” model based on which model correctly
classifies the text as predicting churn or no churn.
• Once a best model has been selected, the
underlying code is then used to apply the model to
new data. This is known as model scoring or model
deployment.
51. Predicting (Cont.)
• The underlying SAS code behind the predictive model
described above was saved and registered as a SAS
Stored Process via the SAS Management Console.
• Several stored processes are created to highlight
various deployments of the MSNTV transcribed data.
• Since the current voice technology does not allow for
real-time transcription, voice captures cannot be
deployed in real time.
• The results are customized to show the original text
and the corresponding prediction of service
cancellation.
52. Predicting (Cont)
• The user can manipulate the resulting
spreadsheet to show a graphical
representation of the cancellations of
subscriptions. The SAS tasks available via the
SAS Add-In for Microsoft Office are displayed.
• SAS BI dashboards display additional
information about the MSNTV data. The
dashboard is configured to show several views
of the call center data.
53. Predicting (Cont)
•The propensity-to-cancel
indicator shows about a
38 percent chance of
customers cancelling
their subscriptions.
•The power can enable
companies to retain key
customers and avoid
the costs associated
with undue churn.
54. Conclusion
• Using voice mining tools and creating a stored
process can make valuable information and
knowledge available to business analysts and
managers who might not have had access to this
information previously.
• Despite data quality issues, SAS Text Miner did a
remarkable job of finding consistent patterns in the
customer and agent comments.
• By actually hearing and understanding what
customers are already telling you, numerous
indicators can be used to build loyalty, reduce churn
and make your products safer.
55. Recommendations
• Despite the importance of multimedia mining, there is
no local research on multimedia mining and only a few
studies on multimedia retrieval (esp. image retrieval).
• Therefore, we recommend conducting research on
multimedia mining for audio, speech, video as well as
advanced image retrieval systems.
• Organizations like libraries, museums and other information
centers (like Television and Radio broadcasters) that have
digital repositories should use the advantages provided by
the application of multimedia mining.
• Other organizations (such as transportation and traffic offices)
are also recommended to digitize the information which is
kept in non-computer-readable formats and apply multimedia
mining on top of it.
Editor's Notes
The multimedia playback window is a digital media player that allows users to access video and audio data and play them back in various ways.
The visualization window is the main tool that allows users to visually explore the derived data streams and discover new patterns and findings.
The local histogram is updated as users move the zoom box while the global histogram is constant.
Our visualization of multiple event variables allows users to see not only individual events but also joint events
The researchers observed that many multimedia data are essentially event-driven.
The tool provide flexible interfaces between visualization and data mining.
It is important that users can refer to the raw multimedia data while exploring derived data.
as far as users write the results into text files with pre-defined formats.