For more Details, Feel free to contact us at any time.
Ph: 9841103123, 044-42607879, Website: http://www.tsys.co.in/
Mail Id: tsysglobalsolutions2014@gmail.com.
IEEE TRANSACTIONS ON MULTIMEDIA 2016 TOPICS
Hybrid Zero Block Detection for High Efficiency Video Coding
Abstract - In this paper we propose an efficient hybrid zero block early detection method for
high efficiency video coding (HEVC). Our method detects both genuine zero blocks (GZBs) and
pseudo zero blocks (PZBs). For GZB detection, we use two sum-of-absolute-difference bounds
and one sum-of-absolute-transformed-difference threshold to decrease the GZB detection
complexity. A fast rate-distortion estimation algorithm for HEVC is proposed to improve the
PZB detection rate. Experimental results on the HM platform show that the proposed method
saves about 50% of the rate-distortion optimization (RDO) time, with negligible Bjøntegaard
delta bit rate loss. Our method is faster than other state-of-the-art ZB detection methods for
HEVC by 10%-30%.
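The two-stage idea (cheap SAD bounds first, a transformed-difference check only for borderline blocks) can be sketched as below. This is a toy Python illustration; the thresholds are invented, not the paper's quantization-dependent bounds, and real encoders use a Hadamard transform for the SATD step.

```python
def sad(block, pred):
    # Sum of absolute differences between a block and its prediction.
    return sum(abs(a - b) for a, b in zip(block, pred))

def is_zero_block(block, pred, sad_low=4, sad_high=64, satd_thresh=16):
    """Toy early zero-block test: cheap SAD bounds first, a transformed-
    difference check only when the SAD falls between the two bounds.
    All threshold values here are illustrative."""
    s = sad(block, pred)
    if s <= sad_low:       # residual certainly quantizes to all zeros
        return True
    if s >= sad_high:      # residual certainly keeps nonzero coefficients
        return False
    # Borderline case: a real encoder computes SATD via a Hadamard
    # transform here; we reuse the SAD value as a simple stand-in.
    satd = s
    return satd < satd_thresh

blocks = [([1, 2, 3, 4], [1, 2, 3, 4]),      # identical: zero block
          ([10, 0, 10, 0], [0, 10, 0, 10])]  # large residual: not zero
flags = [is_zero_block(b, p) for b, p in blocks]
```

The point of the two SAD bounds is that most blocks are decided by the cheap test alone, so the costlier transformed-difference check runs only on the ambiguous middle band.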
IEEE Transactions on Multimedia (March 2016)
Consistent Coding Scheme for Single-Image Super-Resolution Via Independent
Dictionaries
Abstract - In this paper, we present a unified frame based on collaborative representation (CR)
for single-image super-resolution (SR), which learns low-resolution (LR) and high-resolution
(HR) dictionaries independently in the training stage and adopts a consistent coding scheme
(CCS) to guarantee the prediction accuracy of HR coding coefficients during SR reconstruction.
The independent LR and HR dictionaries are learned based on CR with l2-norm regularization,
which can well describe the corresponding LR and HR patch space, respectively. Furthermore, a
mapping function is learned to map LR coding coefficients onto the corresponding HR coding
coefficients. Propagation filtering can achieve smoothing over an image while preserving image
context like edges or textural regions. Moreover, to preserve the edge structures of a super-
resolved image and suppress artifacts, a propagation filtering-based constraint and image
nonlocal self-similarity regularization are introduced into the SR reconstruction framework.
Experimental comparison with state-of-the-art single image SR algorithms validates the
effectiveness of the proposed approach.
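The l2-regularized collaborative-representation coding step has the closed form a = (DᵀD + λI)⁻¹Dᵀy. A minimal sketch for a hypothetical two-atom dictionary, solved by Cramer's rule (the learned LR/HR dictionaries, mapping function, and filtering constraints of the full method are omitted):

```python
def cr_coefficients(D, y, lam=0.1):
    """Collaborative-representation coding with l2 regularization:
    solve (D^T D + lam*I) a = D^T y for a two-atom dictionary.
    D is a list of atom columns; y is the signal to encode."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    g00 = dot(D[0], D[0]) + lam
    g01 = dot(D[0], D[1])
    g11 = dot(D[1], D[1]) + lam
    b0, b1 = dot(D[0], y), dot(D[1], y)
    det = g00 * g11 - g01 * g01          # 2x2 Cramer's rule
    return ((b0 * g11 - b1 * g01) / det, (g00 * b1 - g01 * b0) / det)

D = [[1.0, 0.0], [0.0, 1.0]]   # orthonormal toy dictionary
a = cr_coefficients(D, [2.0, 3.0], lam=0.0)
```

With λ = 0 and an orthonormal dictionary the coefficients reduce to plain inner products, which makes the toy case easy to check by hand.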
IEEE Transactions on Multimedia (March 2016)
Joint Inference of Objects and Scenes With Efficient Learning of Text-Object-Scene
Relations
Abstract - The rapid growth of web images presents new challenges as well as opportunities to
the task of image understanding. Conventional approaches rely heavily on fine-grained
annotations, such as bounding boxes and semantic segmentations, which are not available for
web-scale images. In general, images over the Internet are accompanied with descriptive texts,
which are relevant to their contents. To bridge the gap between textual and visual analysis for
image understanding, this paper presents an algorithm to learn the relations between scenes,
objects, and texts with the help of image-level annotations. In particular, the relation between the
texts and objects is modeled as the matching probability between the nouns and the object
classes, which can be solved via a constrained bipartite matching problem. On the other hand, the
relations between the scenes and objects/texts are modeled as the conditional distributions of
their co-occurrence. Built upon the learned cross-domain relations, an integrated model brings
together scenes, objects, and texts for joint image understanding, including scene classification,
object classification and localization, and the prediction of object cardinalities. The proposed
cross-domain learning algorithm and the integrated model elevate the performance of image
understanding for web images in the context of textual descriptions. Experimental results show
that the proposed algorithm significantly outperforms conventional methods in various computer
vision tasks.
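The noun-to-object-class assignment described above is a bipartite matching problem. A brute-force sketch over a hypothetical 3x3 matrix of matching probabilities (the paper solves a constrained version at scale; exhaustive search is viable only for tiny examples):

```python
from itertools import permutations

def best_matching(score):
    """Maximize the total noun-to-object matching score by brute force,
    a stand-in for the constrained bipartite matching in the abstract."""
    n = len(score)
    best, best_perm = float("-inf"), None
    for perm in permutations(range(n)):
        total = sum(score[i][perm[i]] for i in range(n))
        if total > best:
            best, best_perm = total, perm
    return best_perm, best

# Hypothetical matching probabilities between 3 nouns and 3 object classes.
score = [[0.9, 0.1, 0.0],
         [0.2, 0.8, 0.1],
         [0.0, 0.3, 0.7]]
assignment, total = best_matching(score)
```

For realistic vocabulary sizes one would swap the factorial loop for the Hungarian algorithm, which solves the same assignment problem in polynomial time.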
IEEE Transactions on Multimedia (March 2016)
Blind Quality Assessment of Tone-Mapped Images Via Analysis of Information,
Naturalness, and Structure
Abstract - High dynamic range (HDR) imaging techniques have long been used effectively for
fault detection and disease diagnosis in the astronomical and medical fields, and they have
recently also gained considerable attention from the digital image processing
and computer vision communities. While HDR imaging devices are starting to have friendly
prices, HDR display devices are still out of reach of typical consumers. Due to the limited
availability of HDR display devices, in most cases tone mapping operators (TMOs) are used to
convert HDR images to standard low dynamic range (LDR) images for visualization. But
existing TMOs cannot work effectively for all kinds of HDR images, with their performance
largely depending on brightness, contrast, and structure properties of a scene. To accurately
measure and compare the performance of distinct TMOs, in this paper we develop an effective and
efficient no-reference objective quality metric which can automatically assess LDR images
created by different TMOs without access to the original HDR images. Our model is shown to be
statistically superior to recent full- and no-reference quality measures on the existing tone-
mapped image database and a new relevant database built in this work.
IEEE Transactions on Multimedia (March 2016)
Semi-Supervised Bi-Dictionary Learning for Image Classification With Smooth
Representation-Based Label Propagation
Abstract - In this paper, we propose semi-supervised bi-dictionary learning for image
classification with smooth representation-based label propagation (SRLP). Natural images
contain complex contents of multiple objects with complicated background, clutter, and
occlusions, which prevents image features from belonging to a specific category. Therefore, we
employ reconstruction-based classification to implement discriminative dictionary learning in a
probabilistic manner. We jointly learn a discriminative dictionary called anchor in the feature
space and its corresponding soft label called anchor label in the label space, where the
combination of anchor and anchor label is referred to as bi-dictionary. The learnt bi-dictionary is
utilized to bridge the semantic gap in image classification. First, SRLP constructs smoothed
reconstruction problems for bi-dictionary learning. Then, SRLP produces the reconstruction
coefficients in the feature space over the anchor to infer soft labels of samples in the label space.
Experimental results demonstrate that the proposed method is capable of learning a pair of
discriminative dictionaries for image classification in the feature and label spaces and
outperforms state-of-the-art reconstruction-based classification methods.
IEEE Transactions on Multimedia (March 2016)
A Distance-Computation-Free Search Scheme for Binary Code Databases
Abstract - Recently, binary codes have been widely used in many multimedia applications to
approximate high-dimensional multimedia features for practical similarity search due to the
highly compact data representation and efficient distance computation. While the majority of the
hashing methods aim at learning more accurate hash codes, only a few of them focus on indexing
methods to accelerate the search for binary code databases. Among these indexing methods,
most of them suffer from extremely high memory cost or extensive Hamming distance
computations. In this paper, we propose a new Hamming distance search scheme for large scale
binary code databases, which is free of Hamming distance computations to return the exact
results. Without the necessity to compare database binary codes with queries, the search
performance can be improved and databases can be externally maintained. More specifically, we
adopt the inverted multi-index data structure to index binary codes. Importantly, the Hamming
distance information embedded in the structure is utilized in the designed search scheme such
that the verification of exact results no longer relies on Hamming distance computations. As a
step further, we optimize the performance of the inverted multi-index structure by taking the
code distributions among different bits into account for index construction. Empirical results on
large-scale binary code databases demonstrate the superiority of our method over existing
approaches in terms of both memory usage and search efficiency.
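For contrast, the baseline the scheme improves on is an exact linear scan that computes one Hamming distance per database code; the paper's inverted multi-index is designed to avoid exactly this per-code computation. A sketch of that baseline on 4-bit codes stored as integers:

```python
def hamming(a, b):
    # Hamming distance of two binary codes stored as Python ints.
    return bin(a ^ b).count("1")

def linear_scan(db, query, radius):
    """Baseline exact r-neighbor search by explicit Hamming distance,
    i.e. the per-code computation an index-based scheme can avoid."""
    return [code for code in db if hamming(code, query) <= radius]

db = [0b0000, 0b0011, 0b1111, 0b0111]
hits = linear_scan(db, 0b0001, radius=1)
```

The XOR-plus-popcount distance is already very cheap per pair; the scan is expensive only because it touches every database code, which is what indexing removes.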
IEEE Transactions on Multimedia (March 2016)
QoE Evaluation of Multimedia Services Based on Audiovisual Quality and User Interest
Abstract - Quality of experience (QoE) has a significant influence on whether or not a user will
choose a service or product in a competitive market. For multimedia services, various factors in a
communication ecosystem act on users together: they stimulate different senses, induce
multidimensional perceptions of the services, and inevitably increase the difficulty of measuring
and estimating the user's QoE. In this paper, a user-centric objective
QoE evaluation model (QAVIC model for short) is proposed to estimate the user's overall QoE
for audiovisual services, which takes account of perceptual audiovisual quality (QAV) and user
interest in audiovisual content (IC) amongst influencing factors on QoE such as technology,
content, context, and user in the communication ecosystem. To predict the user interest, a
number of general viewing behaviors are considered to formulate the IC evaluation model.
Subjective tests have been conducted for training and validation of the QAVIC model. The
experimental results show that the proposed QAVIC model can estimate the user's QoE
reasonably accurately using a 5-point scale absolute category rating scheme.
IEEE Transactions on Multimedia (March 2016)
A Locality Sensitive Low-Rank Model for Image Tag Completion
Abstract - Many visual applications have benefited from the outburst of web images, yet the
imprecise and incomplete tags arbitrarily provided by users, as the thorn of the rose, may hamper
the performance of retrieval or indexing systems relying on such data. In this paper, we propose
a novel locality sensitive low-rank model for image tag completion, which approximates the
global nonlinear model with a collection of local linear models. To effectively infuse the idea of
locality sensitivity, a simple and effective pre-processing module is designed to learn suitable
representation for data partition, and a global consensus regularizer is introduced to mitigate the
risk of overfitting. Meanwhile, low-rank matrix factorization is employed as local models, where
the local geometry structures are preserved for the low-dimensional representation of both tags
and samples. Extensive empirical evaluations conducted on three datasets demonstrate the
effectiveness and efficiency of the proposed method, where our method outperforms previous
ones by a large margin.
IEEE Transactions on Multimedia (March 2016)
Compressed-Sensed-Domain L1-PCA Video Surveillance
Abstract - We consider the problem of foreground and background extraction from compressed-
sensed (CS) surveillance videos that are captured by a static CS camera. We propose, for the first
time in the literature, a principal component analysis (PCA) approach that computes directly in
the CS domain the low-rank subspace of the background scene. Rather than computing the
conventional L2-norm-based principal components, which are simply the dominant left singular
vectors of the CS-domain data matrix, we compute the principal components under an L1-norm
maximization criterion. The background scene is then obtained by projecting the CS
measurement vector onto the L1 principal components followed by total-variation (TV)
minimization image recovery. The proposed L1-norm procedure directly carries out low-rank
background representation without reconstructing the video sequence and, at the same time,
exhibits significant robustness against outliers in CS measurements compared to L2-norm PCA.
An adaptive CS-L1-PCA method is also developed for low-latency video surveillance. Extensive
experimental studies described in this paper illustrate and support the theoretical developments.
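For the first L1 principal component there is a known exact construction: r = Xb*/||Xb*||₂, where b* maximizes ||Xb||₂ over sign vectors b in {-1, +1}^N. A brute-force sketch (exponential in the sample count, so illustrative only; the toy data show the component staying on the dominant axis despite an outlier):

```python
from itertools import product

def l1_pc(X):
    """First L1 principal component via exhaustive binary search:
    r = X b* / ||X b*||, with b* = argmax over b in {-1,+1}^N of ||X b||_2.
    Exponential in the number of samples; fine for a toy example only."""
    d, n = len(X), len(X[0])
    best_norm2, best_v = -1.0, None
    for b in product((-1, 1), repeat=n):
        v = [sum(X[i][j] * b[j] for j in range(n)) for i in range(d)]
        norm2 = sum(x * x for x in v)
        if norm2 > best_norm2:
            best_norm2, best_v = norm2, v
    norm = best_norm2 ** 0.5
    return [x / norm for x in best_v]

# Four samples along the x-axis, with one outlier in y.
X = [[2.0, -2.0, 3.0, 0.0],   # x-coordinates of the samples
     [0.0, 0.0, 0.0, 1.0]]    # y-coordinates (last sample is an outlier)
r = l1_pc(X)
```

Unlike the L2 principal component, the L1 solution's direction is barely tilted by the single outlier, which is the robustness property the abstract relies on.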
IEEE Transactions on Multimedia (March 2016)
User-Service Rating Prediction by Exploring Social Users' Rating Behaviors
Abstract - With the boom of social media, it is a very popular trend for people to share what
they are doing with friends across various social networking platforms. Nowadays, we have a
vast amount of descriptions, comments, and ratings for local services. The information is
valuable for new users to judge whether the services meet their requirements before using them. In
this paper, we propose a user-service rating prediction approach by exploring social users' rating
behaviors. In order to predict user-service ratings, we focus on users' rating behaviors. In our
opinion, the rating behavior in recommender system could be embodied in these aspects: 1)
when user rated the item, 2) what the rating is, 3) what the item is, 4) what the user interest that
we could dig from his/her rating records is, and 5) how the user's rating behavior diffuses among
his/her social friends. Therefore, we propose a concept of the rating schedule to represent users'
daily rating behaviors. In addition, we propose the factor of interpersonal rating behavior
diffusion to gain a deeper understanding of users' rating behaviors. In the proposed user-service rating
prediction approach, we fuse four factors-user personal interest (related to user and the item's
topics), interpersonal interest similarity (related to user interest), interpersonal rating behavior
similarity (related to users' rating behavior habits), and interpersonal rating behavior diffusion
(related to users' behavior diffusions)-into a unified matrix-factorized framework. We conduct a
series of experiments on the Yelp and Douban Movie datasets. Experimental results show
the effectiveness of our approach.
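The unified framework is matrix-factorization based. A minimal SGD sketch of the plain base model follows; the four behavioral factors the paper fuses in are omitted, and the tiny rating data are invented:

```python
def train_mf(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=200):
    """Plain matrix factorization by stochastic gradient descent: the base
    model that the paper extends with interest and rating-behavior factors
    (those extensions are left out of this sketch)."""
    P = [[0.1] * k for _ in range(n_users)]   # user latent factors
    Q = [[0.1] * k for _ in range(n_items)]   # item latent factors
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            e = r - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (e * qi - reg * pu)
                Q[i][f] += lr * (e * pu - reg * qi)
    return P, Q

# Invented (user, item, rating) triples: item 0 is liked, item 1 is not.
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 1, 1.0)]
P, Q = train_mf(ratings, n_users=2, n_items=2)

def predict(u, i):
    return sum(P[u][f] * Q[i][f] for f in range(2))
```

Even this bare model recovers the preference ordering in the toy data; the paper's contribution is the extra behavioral terms folded into the same factorized objective.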
IEEE Transactions on Multimedia (March 2016)
A Novel Lip Descriptor for Audio-Visual Keyword Spotting Based on Adaptive Decision
Fusion
Abstract - Keyword spotting remains a challenge when applied to real-world environments with
dramatically changing noise. In recent studies, audio-visual integration methods have
demonstrated superiorities since visual speech is not influenced by acoustic noise. However, for
visual speech recognition, individual utterance mannerisms can lead to confusion and false
recognition. To solve this problem, a novel lip descriptor is presented involving both geometry-
based and appearance-based features in this paper. Specifically, a set of geometry-based features
is proposed based on an advanced facial landmark localization method. In order to obtain robust
and discriminative representation, a spatiotemporal lip feature is put forward concerning
similarities among textons and mapping the feature to intra-class subspace. Moreover, a parallel
two-step keyword spotting strategy based on decision fusion is proposed in order to make the
best use of audio-visual speech and adapt to diverse noise conditions. Weights generated using a
neural network combine acoustic and visual contributions. Experimental results on the OuluVS
dataset and PKU-AV dataset demonstrate that the proposed lip descriptor shows competitive
performance compared to the state of the art. Additionally, the proposed audio-visual keyword
spotting (AV-KWS) method based on decision-level fusion significantly improves noise
robustness, attains better performance than feature-level fusion, and remains capable of adapting
to various noisy conditions.
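At its core, decision-level fusion is a noise-adaptive weighted combination of the audio and visual classifier scores. The paper generates the weights with a neural network; the linear SNR ramp below is purely an illustrative stand-in:

```python
def fuse(audio_score, visual_score, snr_db):
    """Adaptive decision fusion toy: the visual stream's weight grows as the
    acoustic SNR drops. The linear ramp and the 30 dB knee are invented;
    the paper learns these weights with a neural network."""
    w_audio = min(1.0, max(0.0, snr_db / 30.0))  # clean audio: trust audio
    return w_audio * audio_score + (1 - w_audio) * visual_score

quiet = fuse(0.9, 0.6, snr_db=30)   # clean conditions: audio dominates
noisy = fuse(0.2, 0.6, snr_db=0)    # heavy noise: visual dominates
```

The design rationale is the one stated in the abstract: visual speech is unaffected by acoustic noise, so its relative weight should rise exactly when the audio channel degrades.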
IEEE Transactions on Multimedia (March 2016)
Collaborative Wireless Freeview Video Streaming With Network Coding
Abstract - Free viewpoint video (FVV) offers compelling interactive experience by allowing
users to switch to any viewing angle at any time. An FVV is composed of a large number of
camera-captured anchor views, with virtual views (not captured by any camera) rendered from
their nearby anchors using techniques such as depth-image-based rendering (DIBR). We
consider a group of wireless users who may interact with an FVV by independently switching
views. We study a novel live FVV streaming network where each user pulls a subset of anchors
from the server via a primary channel. To enhance anchor availability at each user, a user
generates network-coded (NC) packets using some of its anchors and broadcasts them to its
direct neighbors via a secondary channel. Given limited primary and secondary channel
bandwidths at the devices, we seek to maximize the received video quality (i.e., minimize
distortion) by jointly optimizing the set of anchors each device pulls and the anchor combination
to generate NC packets. To the best of our knowledge, this is among the first works addressing
such a joint optimization problem for wireless live FVV streaming with NC-based collaboration.
We first formulate the problem and show that it is NP-hard. We then propose a scalable and
effective algorithm called PAFV (Peer-Assisted Freeview Video). In PAFV, each node
collaboratively and distributedly decides on the anchors to pull and NC packets to share so as to
minimize video distortion in its neighborhood. Extensive simulation studies show that PAFV
outperforms other algorithms, achieving substantially lower video distortion (often by more than
20-50%) with significantly less redundancy (by as much as 70%). Our Android-based video
experiment further confirms the effectiveness of PAFV over comparison schemes.
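The collaboration step rests on network coding: a broadcast packet combines several anchors, and a neighbor that already holds all but one of them can recover the missing anchor. A minimal XOR sketch (practical NC schemes typically use random linear coding over a finite field rather than plain XOR):

```python
def nc_packet(anchors):
    """XOR-combine equal-length anchor packets into one network-coded
    broadcast packet (the simplest possible coding, for illustration)."""
    out = bytes(len(anchors[0]))
    for a in anchors:
        out = bytes(x ^ y for x, y in zip(out, a))
    return out

# A neighbor holding anchor A recovers anchor B from the coded packet.
coded = nc_packet([b"AA", b"BB"])
recovered = bytes(x ^ y for x, y in zip(coded, b"AA"))
```

One broadcast thus serves neighbors with different missing anchors at once, which is why choosing the anchor combination per coded packet becomes an optimization problem.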
IEEE Transactions on Multimedia (March 2016)
A Decision-Tree-Based Perceptual Video Quality Prediction Model and Its Application in
FEC for Wireless Multimedia Communications
Abstract - With the exponential growth of video traffic over wireless networked and embedded
devices, mechanisms are needed to predict and control perceptual video quality to meet the
quality of experience (QoE) requirements in an energy-efficient way. This paper proposes an
energy-efficient QoE support framework for wireless video communications. It consists of two
components: 1) a perceptual video quality model that allows the prediction of video quality in
real-time and with low complexity, and 2) an application layer energy-efficient and content-
aware forward error correction (FEC) scheme for preventing quality degradation caused by
network packet losses. The perceptual video quality model characterizes factors related to video
content as well as distortion caused by compression and transmission. Prediction of perceptual
quality is achieved through a decision tree using a set of observable features from the
compressed bitstream and the network. The proposed model can achieve prediction accuracy of
88.9% and 90.5% on two distinct testing sets. Based on the proposed quality model, a novel FEC
scheme is introduced to protect video packets from losses during transmission. Given a user-
defined perceptual quality requirement, the FEC scheme adjusts the level of protection for
different components in a video stream to minimize network overhead. Simulation results show
that the proposed FEC scheme can enhance the perceptual quality of videos. Compared to
conventional FEC methods for video communications, the proposed FEC scheme can reduce
network overhead by 41% on average.
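The protection mechanism is forward error correction. A single-parity XOR group is its simplest instance: one repair packet lets the receiver rebuild any one lost packet in the group. The paper's scheme additionally adapts the protection level per video component, which this sketch omits:

```python
def add_parity(packets):
    """Single-parity FEC across a group of equal-length packets: append
    one XOR repair packet (a minimal stand-in for content-aware FEC)."""
    parity = bytes(len(packets[0]))
    for p in packets:
        parity = bytes(a ^ b for a, b in zip(parity, p))
    return packets + [parity]

def recover(group, lost_index):
    # XOR of all surviving packets (including parity) rebuilds the lost one.
    rebuilt = bytes(len(group[0]))
    for j, p in enumerate(group):
        if j != lost_index:
            rebuilt = bytes(a ^ b for a, b in zip(rebuilt, p))
    return rebuilt

group = add_parity([b"ab", b"cd", b"ef"])
restored = recover(group, lost_index=1)
```

Content-aware FEC then varies the group size or code strength: perceptually critical packets get more repair symbols, filler packets fewer, which is how overhead is reduced at equal quality.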
IEEE Transactions on Multimedia (April 2016)
mDASH: A Markov Decision-Based Rate Adaptation Approach for Dynamic HTTP
Streaming
Abstract - Dynamic adaptive streaming over HTTP (DASH) has recently been widely deployed
in the Internet. It, however, does not impose any adaptation logic for selecting the quality of
video fragments requested by clients. In this paper, we propose a novel Markov decision-based
rate adaptation scheme for DASH aiming to maximize the quality of user experience under time-
varying channel conditions. To this end, our proposed method takes into account those key
factors that make a critical impact on visual quality, including video playback quality, video rate
switching frequency and amplitude, buffer overflow/underflow, and buffer occupancy. Besides,
to reduce computational complexity, we propose a low-complexity sub-optimal greedy algorithm
which is suitable for real-time video streaming. Our experiments in network test-bed and real-
world Internet all demonstrate the good performance of the proposed method in both objective
and subjective visual quality.
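A one-step greedy rate pick keeps the flavor of the low-complexity algorithm: choose the highest sustainable level and back off when the buffer runs low. The bitrate ladder, thresholds, and back-off rule below are invented for illustration; the paper's greedy algorithm optimizes a Markov-decision reward over the factors listed above instead:

```python
def greedy_rate(levels, throughput, buffer_s, target_buffer=10.0):
    """Toy one-step rate adaptation: pick the highest level sustainable at
    the measured throughput, stepping down one level when the playout
    buffer is low (all constants here are illustrative)."""
    sustainable = [r for r in levels if r <= throughput]
    choice = max(sustainable) if sustainable else min(levels)
    if buffer_s < 0.3 * target_buffer and choice != min(levels):
        # Low buffer: step down one level to guard against stalls.
        ladder = sorted(levels)
        choice = ladder[ladder.index(choice) - 1]
    return choice

levels = [350, 600, 1000, 2000, 3000]  # kbps ladder (hypothetical)
pick_ok = greedy_rate(levels, throughput=1500, buffer_s=8.0)
pick_low = greedy_rate(levels, throughput=1500, buffer_s=1.0)
```

A Markov-decision formulation generalizes this by scoring whole switching trajectories (quality, switch amplitude, stall risk) rather than one step at a time.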
IEEE Transactions on Multimedia (April 2016)
Complexity Control Based on a Fast Coding Unit Decision Method in the HEVC Video
Coding Standard
Abstract - The emerging high-efficiency video coding standard achieves higher coding
efficiency than previous standards by virtue of a set of new coding tools such as the quadtree
coding structure. In this novel structure, the pixels are organized into coding units (CU),
prediction units, and transform units, the sizes of which can be optimized at every level
following a tree configuration. These tools allow highly flexible data representation; however,
they incur a very high computational complexity. In this paper, we propose an effective
complexity control (CC) algorithm based on a hierarchical approach. An early termination
condition is defined at every CU size to determine whether subsequent CU sizes should be
explored. The actual encoding times are also considered to satisfy the target complexity in real
time. Moreover, all parameters of the algorithm are estimated on the fly to adapt its behavior to
the video content, the encoding configuration, and the target complexity over time. The
experimental results prove that our proposal is able to achieve a target complexity reduction of
up to 60% with respect to full exploration, with notable accuracy and limited losses in coding
performance. It was compared with a state-of-the-art CC method and shown to achieve a
significantly better trade-off between coding complexity and efficiency as well as higher
accuracy in reaching the target complexity. Furthermore, a comparison with a state-of-the-art
complexity reduction method highlights the advantages of our CC framework. Finally, we show
that the proposed method performs well when the target complexity varies over time.
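The hierarchical early-termination idea can be sketched as a recursion that stops exploring smaller CU sizes once the cost at the current size is already good enough. Both the cost model and the threshold below are invented; the paper estimates its parameters on the fly from the content and the target complexity:

```python
def encode_cu(depth, cost_fn, max_depth=3, threshold=100.0):
    """Hierarchical CU decision with early termination: if the rate-
    distortion cost at the current size is below a threshold, deeper
    splits are skipped (cost model and threshold are illustrative)."""
    cost = cost_fn(depth)
    tested = [depth]
    if cost < threshold or depth == max_depth:
        return cost, tested                 # early termination: stop here
    sub_cost, sub_tested = encode_cu(depth + 1, cost_fn, max_depth, threshold)
    tested += sub_tested
    return min(cost, sub_cost), tested

# Toy cost model: the cost drops by 4x with each split level.
best, visited = encode_cu(0, lambda d: 300.0 / (4 ** d))
```

Complexity control then amounts to tuning the threshold online so the fraction of skipped depths hits the encoding-time budget.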
IEEE Transactions on Multimedia (April 2016)
A Low-Power Video Recording System With Multiple Operation Modes for H.264 and
Light-Weight Compression
Abstract - An increasing demand for mobile video recording systems makes it important to
reduce power consumption and to increase battery lifetime. The H.264/AVC compression is
widely used for many video recording systems because of its high compression efficiency;
however, the complex coding structure of H.264/AVC compression requires large power
consumption. A light-weight video compression (LWC), based on discrete wavelet transform
and set partitioning in hierarchical trees, consumes less power than H.264/AVC compression
thanks to its relatively simple coding structure, although its compression efficiency is lower than
that of H.264/AVC compression. This paper proposes a low-power video recording system that
combines both the H.264/AVC encoder with high compression efficiency and LWC with low
power consumption. The LWC is used to compress video data for temporal storage while the
H.264/AVC encoder is used for permanent storage of data when some events are detected. For
further power reduction, a down-sampling operation is utilized for permanent data storage. For
an effective use of the two compressions with the down-sampling operation, an appropriate
scheme is selected according to the proportion of long-term to short-term storage and the target
bitrate. The proposed system reduces power consumption by up to 72.5% compared to that in a
conventional video recording system.
IEEE Transactions on Multimedia (April 2016)
Human Visual System-Based Saliency Detection for High Dynamic Range Content
Abstract - The human visual system (HVS) attempts to select salient areas to reduce cognitive
processing efforts. Computational models of visual attention try to predict the most relevant and
important areas of videos or images viewed by the human eye. Such models, in turn, can be
applied to areas such as computer graphics, video coding, and quality assessment. Although
several models have been proposed, only one of them is applicable to high dynamic range (HDR)
image content, and no work has been done for HDR videos. Moreover, the main shortcoming of
the existing models is that they cannot simulate the characteristics of HVS under the wide
luminous range found in HDR content. This paper addresses these issues by presenting a
computational approach to model the bottom-up visual saliency for HDR input by combining
spatial and temporal visual features. An analysis of eye movement data affirms the effectiveness
of the proposed model. Comparisons employing three well-known quantitative metrics show that
the proposed model substantially improves predictions of visual attention for HDR content.
IEEE Transactions on Multimedia (April 2016)
Multimodal Personality Recognition in Collaborative Goal-Oriented Tasks
Abstract - Incorporating research on personality recognition into computers, both from a
cognitive as well as an engineering perspective, would facilitate the interactions between humans
and machines. Previous attempts on personality recognition have focused on a variety of
different corpora (ranging from text to audiovisual data), scenarios (interviews, meetings),
channels of communication (audio, video, text), and different subsets of personality traits (out of
the five ones from the Big Five Model). Our study uses simple acoustic and visual nonverbal
features extracted from multimodal data, which have been recorded in previously uninvestigated
scenarios, and considers all five personality traits rather than just a subset. First, we look at the
human-machine interaction scenario, where we introduce the display of different “collaboration
levels.” Second, we look at the contribution of the human-human interaction (HHI) scenario to
the emergence of personality traits. Investigating the HHI scenario creates a stronger basis for
future human-agent interactions. Our goal is to study, from a computational perspective, the
emergence degree of the five personality traits in these two scenarios. The results demonstrate
the relevance of each of the two scenarios when it comes to the degree of emergence of certain
traits and the feasibility to automatically recognize personality under different conditions.
IEEE Transactions on Multimedia (April 2016)
Core Failure Mitigation in Integer Sum-of-Product Computations on Cloud Computing
Systems
Abstract - The decreasing mean-time-to-failure estimates in cloud computing systems indicate
that multimedia applications running on such environments should be able to mitigate an
increasing number of core failures at runtime. We propose a new roll-forward failure-mitigation
approach for integer sum-of-product computations, with emphasis on generic matrix
multiplication (GEMM) and convolution/crosscorrelation (CONV) routines. Our approach is
based on the production of redundant results within the numerical representation of the outputs
via the use of numerical packing. This differs from all existing roll-forward solutions that require
a separate set of checksum (or duplicate) results. Our proposal imposes a 37.5% reduction in the
maximum output bitwidth supported in comparison to integer sum-of-product realizations
performed on 32-bit integer representations, which is comparable to the bitwidth requirement of
checksum methods for multiple core failure mitigation. Experiments with state-of-the-art
GEMM and CONV routines running on a c4.8xlarge compute-optimized instance of Amazon
Web Services Elastic Compute Cloud (AWS EC2) demonstrate that the proposed approach is able
to mitigate up to one quadcore failure while achieving processing throughput that is: 1)
comparable to that of the conventional, failure-intolerant, integer GEMM and CONV routines, 2)
substantially superior to that of the equivalent roll-forward failure-mitigation method based on
checksum streams. Furthermore, when used within an image retrieval framework deployed over
a cluster of AWS EC2 spot (i.e., low-cost albeit terminatable) instances, our proposal leads to: 1)
16%-23% cost reduction against the equivalent checksum-based method and 2) more than 70%
cost reduction against conventional failure-intolerant processing on AWS EC2 on-demand (i.e.,
higher-cost albeit guaranteed) instances.
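The core trick is numerical packing: two independent results ride in one wider integer, which is where the bitwidth reduction comes from, since headroom must cover the worst-case sums. A toy sketch for nonnegative sums with a 16-bit split (the split position and data are illustrative, not the paper's GEMM/CONV packing):

```python
def packed_sums(pairs):
    """Numerical-packing sketch: two independent sums share one integer
    (low and high halves), so a redundant accumulator can cross-check
    another core's output. Valid only while both sums are nonnegative
    and stay below 2**SHIFT, i.e. no carry crosses the boundary."""
    SHIFT = 16
    acc = 0
    for x, y in pairs:
        acc += x + (y << SHIFT)        # one addition carries both operands
    low = acc & ((1 << SHIFT) - 1)     # first sum
    high = acc >> SHIFT               # second sum
    return low, high

pairs = [(1, 10), (2, 20), (3, 30)]
low, high = packed_sums(pairs)
```

Because the redundancy lives inside the output representation itself, no separate checksum stream is needed, at the cost of the reduced usable bitwidth the abstract quantifies.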
IEEE Transactions on Multimedia (April 2016)
Factorization Algorithms for Temporal Psychovisual Modulation Display
Abstract - Temporal psychovisual modulation (TPVM) is a new information display technology
which aims to generate multiple visual percepts for different viewers on a single display
simultaneously. In a TPVM system, the viewers wearing different active liquid crystal (LC)
glasses with varying transparency levels can see different images (called personal views). The
viewers without LC glasses can also see a semantically meaningful image (called shared view).
The display frames and weights for the LC glasses in the TPVM system can be computed
through nonnegative matrix factorization (NMF) with three additional constraints: the values of
images and modulation weights should have an upper bound (i.e., limited luminance of the display
and transparency level of the LC); the shared view without using viewing devices should be
considered (i.e., the sum of all basis images should be a meaningful image); and the sparsity of
modulation weights should be considered due to the material property of LC. In this paper, we
propose to solve the constrained NMF problem by a modified version of hierarchical alternating
least squares (HALS) algorithms. Through experiments, we analyze the choice of parameters in
the setup of a TPVM system. This work serves as a guideline for the practical implementation of a
TPVM display system.
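The HALS core alternates closed-form nonnegative least-squares updates on the factors. A rank-1 sketch on an exactly rank-1 toy matrix (the paper's upper-bound, shared-view, and sparsity constraints are left out of this bare core):

```python
def hals_rank1(V, iters=20):
    """Rank-1 HALS for the nonnegative factorization V ~ w h^T: each factor
    gets a closed-form least-squares update, clipped at zero. The paper
    modifies this core with bound, shared-view, and sparsity constraints."""
    m, n = len(V), len(V[0])
    w = [1.0] * m
    h = [1.0] * n
    for _ in range(iters):
        hh = sum(x * x for x in h)
        w = [max(0.0, sum(V[i][j] * h[j] for j in range(n)) / hh)
             for i in range(m)]
        ww = sum(x * x for x in w)
        h = [max(0.0, sum(V[i][j] * w[i] for i in range(m)) / ww)
             for j in range(n)]
    return w, h

V = [[2.0, 4.0], [1.0, 2.0]]   # exactly rank-1: outer product of [2,1] and [1,2]
w, h = hals_rank1(V)
err = sum((V[i][j] - w[i] * h[j]) ** 2 for i in range(2) for j in range(2))
```

On an exactly rank-1 input the alternation converges in a couple of sweeps; in the TPVM setting the basis vectors play the role of display frames and the coefficients the role of LC-glasses weights.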
IEEE Transactions on Multimedia (April 2016)
Free-Energy Principle Inspired Video Quality Metric and Its Use in Video Coding
Abstract - In this paper, we extend the free-energy principle to video quality assessment (VQA)
by incorporating recent psychophysical findings on human visual speed perception
(HVSP). A novel video quality metric, namely the free-energy principle inspired video quality
metric (FePVQ), is therefore developed and applied to perceptual video coding optimization. The
free-energy principle suggests that the human visual system (HVS) can actively predict “orderly”
information and avoid “disorderly” information for image perception. Basically, “orderly” is
associated with the skeletons and edges of objects, and “disorderly” mostly concerns textures in
images. Based on this principle, an image is separated into orderly and disorderly regions, and
processed differently in image quality assessment. For videos, visual attention, or fixation, is
associated with objects exhibiting significant motion according to HVSP, resulting in a motion
strength factor in the FePVQ, so that the free-energy principle is extended into the spatio-temporal
domain for VQA. In addition, we investigate the application of the FePVQ in perceptual rate
distortion optimization (RDO). For this purpose, the FePVQ is realized with low computational
cost by using the relative total variation model and the block-wise motion vectors of video
coding to simulate the free-energy principle and the HVSP, respectively. The experimental
results indicate that the proposed FePVQ is highly consistent with human visual perception. The
linear correlation coefficient and Spearman's rank-order correlation coefficient are up to 0.8324
and 0.8281 on the LIVE video database. Better perceptual quality of encoded video sequences is
achieved by FePVQ-motivated RDO in video coding.
IEEE Transactions on Multimedia (April 2016)
Holons Visual Representation for Image Retrieval
Abstract - Along with the growth of image collections, conventional local features, such as
SIFT, are ineffective for representation or indexing, and more compact visual representations are
required. Due to its intrinsic mechanism, the state-of-the-art vector of locally aggregated
descriptors (VLAD) has a few limitations. Based on this, we propose a new descriptor named holons
visual representation (HVR). The proposed HVR is a derivative mutational self-contained
combination of global and local information. It exploits both global characteristics and the
statistical information of local descriptors in the image dataset. It also takes advantage of the local
features of each image and computes their distribution with respect to the entire local descriptor
space. Accordingly, the HVR is computed by a two-layer hierarchical scheme, which splits the
local feature space and obtains raw partitions, as well as the corresponding refined partitions.
Then, according to the distances from the centroids of partition spaces to local features and their
spatial correlation, we assign the local features into their nearest raw partitions and refined
partitions to obtain the global description of an image. Compared with VLAD, HVR holds
critical structure information and enhances the discriminative power of individual representation
with a small amount of computation cost, while using the same memory overhead. Extensive
experiments on several benchmark datasets demonstrate that the proposed HVR outperforms
conventional approaches in terms of scalability as well as retrieval accuracy for images with
similar intra local information.
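The two-layer assignment-and-aggregation step can be sketched roughly as below. This is a simplified, VLAD-style residual aggregation assuming precomputed raw and refined centroids; it is not the authors' exact HVR computation, and the spatial-correlation weighting is omitted.

```python
import numpy as np

def two_layer_aggregate(descs, coarse, refined):
    """Aggregate local descriptors into one global vector via a two-layer
    partition of descriptor space (a sketch of the raw/refined idea).

    descs:   (n, d) local descriptors
    coarse:  (Kc, d) raw-partition centroids
    refined: (Kc, Kr, d) refined centroids inside each raw partition
    Returns an L2-normalised vector of length Kc*Kr*d.
    """
    Kc, Kr, d = refined.shape
    agg = np.zeros((Kc, Kr, d))
    for x in descs:
        c = np.argmin(np.linalg.norm(coarse - x, axis=1))      # raw partition
        r = np.argmin(np.linalg.norm(refined[c] - x, axis=1))  # refined partition
        agg[c, r] += x - refined[c, r]                         # residual, VLAD-style
    v = agg.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

# Toy usage with one descriptor and hand-made centroids.
descs = np.array([[0.9, 0.1]])
coarse = np.array([[1.0, 0.0], [0.0, 1.0]])
refined = np.array([[[1.0, 0.0], [0.5, 0.0]],
                    [[0.0, 1.0], [0.0, 0.5]]])
v = two_layer_aggregate(descs, coarse, refined)
```

Compared with single-level VLAD, the refined layer stores residuals against finer centroids, which is one way the extra structural information mentioned above can be retained without extra memory per dimension.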
IEEE Transactions on Multimedia (April 2016)
Query-Adaptive Small Object Search Using Object Proposals and Shape-Aware
Descriptors
Abstract - While there has been a significant amount of work on object search and image
retrieval, the focus has primarily been on establishing effective models for whole images,
scenes, and objects occupying a large portion of an image. In this paper, we propose to leverage
object proposals to identify small and smooth-structured objects in a large image database.
Unlike popular methods that explore coarse image-level pairwise similarity, the search is
designed to exploit similarity measures at the proposal level. An effective graph-based query
expansion strategy is designed to assess each of the better-matched proposals against all its
neighbors within the same image for precise localization. Combined with a shape-aware feature
descriptor EdgeBoW, a set of more insightful edge-weights and node-utility measures, the
proposed search strategy can handle varying view angles, illumination conditions, deformation,
and occlusion efficiently. Experiments performed on a number of benchmark datasets show
the powerful and superior generalization ability of this single integrated framework in dealing
with both clutter-intensive real-life images and poor-quality binary document images with equal
dexterity.
IEEE Transactions on Multimedia (April 2016)
Folksonomy-Based Visual Ontology Construction and Its Applications
Abstract - An ontology hierarchically encodes concepts and concept relationships, and has a
variety of applications such as semantic understanding and information retrieval. Previous work
for building ontologies has primarily relied on labor-intensive human contributions or focused on
text-based extraction. In this paper, we consider the problem of automatically constructing a
folksonomy-based visual ontology (FBVO) from user-generated annotated images. A
systematic framework is proposed, consisting of three stages: concept discovery, concept
relationship extraction, and concept hierarchy construction. The noise in user-generated
tags is carefully addressed to guarantee the quality of the derived FBVO. The
constructed FBVO finally consists of 139 825 concept nodes and millions of concept
relationships by mining more than 2.4 million Flickr images. Experimental evaluations show that
the derived FBVO is of high quality and consistent with human perception. We further
demonstrate the utility of the derived FBVO in applications of complex visual recognition and
exploratory image search.
IEEE Transactions on Multimedia (April 2016)
Learning Personalized Models for Facial Expression Analysis and Gesture Recognition
Abstract - Facial expression and gesture recognition algorithms are key enabling technologies
for human-computer interaction (HCI) systems. State-of-the-art approaches for the automatic
detection of body movements and the analysis of emotions from facial features rely heavily on
advanced machine learning algorithms. Most of these methods are designed for the average user,
but the assumption “one-size-fits-all” ignores diversity in cultural background, gender, ethnicity,
and personal behavior, and limits their applicability in real-world scenarios. A possible solution
is to build personalized interfaces, which practically implies learning person-specific classifiers
and usually collecting a significant amount of labeled samples for each novel user. As data
annotation is a tedious and time-consuming process, in this paper we present a framework for
personalizing classification models which does not require labeled target data. Personalization is
achieved by devising a novel transfer learning approach. Specifically, we propose a regression
framework which exploits auxiliary (source) annotated data to learn the relation between person-
specific sample distributions and parameters of the corresponding classifiers. Then, when
considering a new target user, the classification model is computed by simply feeding the
associated (unlabeled) sample distribution into the learned regression function. We evaluate the
proposed approach in different applications: pain recognition and action unit detection using
visual data and gestures classification using inertial measurements, demonstrating the generality
of our method with respect to different input data types and basic classifiers. We also show the
advantages of our approach in terms of accuracy and computational time both with respect to
user-independent approaches and to previous personalization techniques.
IEEE Transactions on Multimedia (April 2016)
Scalable Video Event Retrieval by Visual State Binary Embedding
Abstract - With the exponential increase of media data on the web, fast media retrieval is
becoming a significant research topic in multimedia content analysis. Among the variety of
techniques, learning binary embedding (hashing) functions is one of the most popular approaches
that can achieve scalable information retrieval in large databases, and it is mainly used in the
near-duplicate multimedia search. However, most existing hashing methods are specifically
designed for near-duplicate retrieval at the visual level rather than the semantic level. In this
paper, we propose a Visual State Binary Embedding (VSBE) model to encode the video frames,
which can preserve the essential semantic information in binary matrices, to facilitate fast video
event retrieval in unconstrained cases. Compared with other video binary embedding models,
one advantage of our proposed VSBE model is that it only needs a limited number of key frames
from the training videos for hash function training, so the computational complexity is much
lower in the training phase. At the same time, we apply the pair-wise constraints generated from
the visual states to sketch the local properties of the events at the semantic level, so accuracy is
also ensured. We conducted extensive experiments on the challenging TRECVID MED dataset,
and have proved the superiority of our proposed VSBE model.
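For reference, the basic mechanism by which binary embedding enables fast retrieval can be illustrated with a generic sign-of-projection hashing baseline, shown below. VSBE instead learns its hash functions from key frames with pairwise semantic constraints; this sketch does not do that, and all names here are ours.

```python
import numpy as np

def train_projections(dim, bits, seed=0):
    """Random hyperplane projections. VSBE would learn these from key
    frames under pairwise constraints; random planes are a baseline."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((bits, dim))

def embed(X, P):
    """Map feature vectors (n, dim) to binary codes (n, bits) of 0/1."""
    return (X @ P.T > 0).astype(np.uint8)

def hamming(a, b):
    """Hamming distance between two binary codes."""
    return int(np.count_nonzero(a != b))

# Near-duplicate frames should land close together in Hamming space.
P = train_projections(dim=64, bits=32)
rng = np.random.default_rng(1)
x = rng.standard_normal(64)
near = x + 0.001 * rng.standard_normal(64)  # slightly perturbed frame
far = rng.standard_normal(64)               # unrelated frame
cx, cn, cf = embed(np.stack([x, near, far]), P)
```

Because codes are short bit strings, ranking a large database by Hamming distance is orders of magnitude cheaper than comparing raw descriptors, which is the scalability argument the abstract relies on.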
IEEE Transactions on Multimedia (April 2016)
Link Adaptation for High-Quality Uncompressed Video Streaming in 60-GHz Wireless
Networks
Abstract - The emerging 60-GHz multigigabit-per-second wireless technology enables the
streaming of high-quality “uncompressed” video, which has been impossible with other existing
wireless technologies. To support such a resource-hungry uncompressed video streaming service
with limited wireless resources, it is necessary to design efficient link adaptation policies
selecting suitable transmission rates for the 60-GHz wireless channel environment, thus
optimizing video quality and resource management. For proper design of the link adaptation
policies, we propose a new metric, called expected peak signal-to-noise ratio (ePSNR), to
numerically estimate the video streaming quality. By using the ePSNR as a criterion, we propose
two link adaptation policies with different objectives considering unequal error protection (UEP).
The proposed link adaptation policies attempt to 1) maximize the video quality for given wireless
resources, or 2) minimize the required wireless resources while meeting a target video quality. From
the link adaptation policies, we provide a distributed resource management scheme for multiple
users to maintain satisfactory video streaming quality. Our extensive simulation results
demonstrate that the newly proposed ePSNR metric well represents the level of video
quality. It is also shown that the proposed link adaptation policies can enhance the resource
efficiency while achieving acceptable quality of the video streaming.
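The rate-selection loop driven by an ePSNR-like criterion can be sketched as below. The quality estimator here is a toy stand-in, since the paper derives ePSNR from the channel model and UEP configuration; the rate table and numbers are illustrative only.

```python
def pick_rate(rates, psnr_estimator):
    """Choose the transmission rate maximizing estimated video quality
    (an ePSNR-like criterion). 'rates' maps a rate name to a
    (throughput, loss probability) pair for the current channel."""
    return max(rates, key=lambda r: psnr_estimator(*rates[r]))

def toy_epsnr(throughput_mbps, loss_prob):
    # Illustrative only: quality grows with throughput (capped at the
    # ~3 Gb/s an uncompressed 1080p stream needs) and drops with loss.
    base = min(throughput_mbps / 3000.0, 1.0) * 45.0
    return base - 200.0 * loss_prob

# A faster rate is not always better once its loss probability is high.
rates = {"mcs1": (1000, 0.001), "mcs2": (3000, 0.01), "mcs3": (4000, 0.08)}
best = pick_rate(rates, toy_epsnr)
```

The same argmax structure serves both policies in the abstract: maximize quality for fixed resources, or scan rates for the cheapest one whose estimated quality clears a threshold.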
IEEE Transactions on Multimedia (April 2016)
Multiview and 3D Video Compression Using Neighboring Block Based Disparity Vectors
Abstract - Compression of the statistical redundancy among different viewpoints, i.e., inter-view
redundancy, is a fundamental and critical problem in multiview and three-dimensional (3D)
video coding. To exploit the inter-view redundancy, disparity vectors are required to identify
pixels of the same objects within two different views; in this way, the enhancement coding tools
can be efficiently employed as new modes in block-based video codecs to achieve higher
compression efficiency. Although disparity can be converted from depth, this is not possible in
multiview video coding, where depth information is not available. Even when depth information
is coded, relying on it breaks the so-called multiview compatibility, wherein texture views can be
decoded without depth information. To resolve this problem, in this paper, a neighboring block-based
disparity vector derivation (NBDV) method is proposed. The basic concept of NBDV is to derive
a disparity vector (DV) of a current block by utilizing the motion information of spatially and
temporally neighboring blocks predicted from another view. Through extensive experiments and
analysis, it is shown that the proposed NBDV method achieves efficient DV derivation in
state-of-the-art video codecs while maintaining multiview compatibility at relatively low
complexity. The proposed method has become an essential part of the 3D video standard
extensions of H.264/AVC and HEVC.
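The neighbor-scanning idea behind NBDV can be sketched as below. The checking order and data layout are simplified assumptions for illustration, not the normative 3D-HEVC procedure.

```python
def derive_nbdv(spatial_neighbors, temporal_neighbors):
    """Return the first available disparity vector (DV) found among
    neighboring blocks, temporal neighbors first then spatial ones
    (order simplified here relative to the standardized derivation).

    Each neighbor is a dict; a block predicted from another view
    carries a 'dv' entry (dx, dy), otherwise it has none.
    """
    for block in list(temporal_neighbors) + list(spatial_neighbors):
        if block is not None and 'dv' in block:
            return block['dv']
    return (0, 0)  # fallback zero DV when no neighbor is inter-view predicted

# Usage: one spatial neighbor was predicted from another view.
spatial = [{'mv': (1, 0)}, {'dv': (-3, 0)}]
temporal = [None, {'mv': (0, 2)}]
dv = derive_nbdv(spatial, temporal)
```

The point of the design is visible even in this toy: the DV is recovered purely from already-decoded motion information, so no depth map is needed and multiview compatibility is preserved.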
IEEE Transactions on Multimedia (April 2016)
Predicting the Performance in Decision-Making Tasks: From Individual Cues to Group
Interaction
Abstract - This paper addresses the problem of predicting the performance of decision-making
groups. Towards this goal, we evaluate the predictive power of group attributes and discussion
dynamics by using automatically extracted features, such as group members' aural and visual
cues, interaction between team members, and influence of each team member, as well as self-
reported features such as personality- and perception-related cues, hierarchical structure of the
group, and individual- and group-level task performances. We tackle the inference problem from
two angles depending on the way that features are extracted: 1) a holistic approach based on the
entire meeting, and 2) a sequential approach based on thin slices of the meeting. In the
former, key factors affecting the group performance are identified and the prediction is achieved
by support vector machines. As for the latter, we compare and contrast the classification
performance of a novel influence-model-based classifier with that of a hidden Markov model
(HMM). Experimental results indicate that the group looking cues and the influence cues are
major predictors of group performance and the influence model outperforms the HMM in almost
all experimental conditions. We also show that combining classifiers that cover distinct aspects of
the data improves the classification performance.
IEEE Transactions on Multimedia (April 2016)
Comparison and Evaluation of Sonification Strategies for Guidance Tasks
Abstract - This paper aims to reveal the efficiency of sonification strategies in terms of rapidity,
precision, and overshooting in the case of a one-dimensional guidance task. The sonification
strategies are based on the four main perceptual attributes of a sound (pitch, loudness,
duration/tempo, and timbre) and classified with respect to the presence or not of one or several
auditory references. Perceptual evaluations are used to display the strategies in a
precision/rapidity space and enable prediction of user behavior for a chosen sonification strategy.
The evaluation of sonification strategies constitutes a first step toward general guidelines for
sound design in interactive multimedia systems that involve guidance issues.
IEEE Transactions on Multimedia (April 2016)
3D Ear Identification Using Block-wise Statistics based Features and LC-KSVD
Abstract - Biometric authentication has proven to be an effective method for
recognizing a person’s identity with high confidence. In this field, the use of 3D ear shape is a
recent trend. As a biometric identifier, the ear has several inherent merits. However, although a great
deal of effort has been devoted, there is still large room for improvement in developing a
highly effective and efficient 3D ear identification approach. In this paper, we attempt to fill this
gap to some extent by proposing a novel 3D ear classification scheme that makes use of the label
consistent K-SVD (LC-KSVD) framework. As an effective supervised dictionary learning
algorithm, LC-KSVD learns a single compact discriminative dictionary for sparse coding and a
multi-class linear classifier simultaneously. To use the LC-KSVD framework, one key issue is
how to extract feature vectors from 3D ear scans. To this end, we propose a block-wise statistics
based feature extraction scheme. Specifically, we divide a 3D ear ROI into uniform blocks and
extract a histogram of surface types from each block; histograms from all blocks are then
concatenated to form the desired feature vector. Feature vectors extracted in this way are highly
discriminative and are robust to minor misalignment between samples. Experiments demonstrate
that our approach can achieve better recognition accuracy than the other state-of-the-art methods.
More importantly, its computational complexity is extremely low, making it quite suitable for
large-scale identification applications. Matlab source code is publicly available online at
http://sse.tongji.edu.cn/linzhang/LCKSVDEar/LCKSVDEar.htm.
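The block-wise feature extraction can be sketched as follows. The block grid and the number of surface types are illustrative choices of ours, not the paper's settings; the input is assumed to be a per-pixel surface-type label map computed beforehand.

```python
import numpy as np

def blockwise_histograms(surface_types, blocks=(4, 4), n_types=8):
    """Divide a 2-D map of per-pixel surface-type labels (0..n_types-1)
    into uniform blocks, histogram each block, and concatenate the
    per-block histograms into one feature vector."""
    H, W = surface_types.shape
    by, bx = blocks
    feats = []
    for i in range(by):
        for j in range(bx):
            patch = surface_types[i * H // by:(i + 1) * H // by,
                                  j * W // bx:(j + 1) * W // bx]
            hist = np.bincount(patch.ravel(), minlength=n_types).astype(float)
            feats.append(hist / max(patch.size, 1))  # normalise per block
    return np.concatenate(feats)

# Toy ROI: the bottom half carries a different surface type.
roi = np.zeros((16, 16), dtype=int)
roi[8:, :] = 3
f = blockwise_histograms(roi)  # length 4 * 4 * 8 = 128
```

Because each block is histogrammed independently, a label shifting by a few pixels usually stays within the same block, which is the intuition behind the robustness-to-misalignment claim.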
IEEE Transactions on Multimedia (May 2016)
Sketch-based Image Retrieval by Salient Contour Reinforcement
Abstract - This paper presents a sketch-based image retrieval (SBIR) algorithm. One of the main
challenges in SBIR is measuring the similarity between a sketch
and an image. To tackle this problem, we propose an SBIR approach based on salient contour
reinforcement. In our approach, we divide the image contour into two types. The first is the
global contour map. The second, called the salient contour map, helps find objects in images
that are similar to the query. In addition, based on the two contour maps, we propose a
new descriptor, the angular radial orientation partitioning (AROP) feature. It fully utilizes
the edge pixels’ orientation information in contour maps to identify spatial relationships. Our
AROP feature, based on the two candidate contour maps, is both efficient and effective at
discovering false matches of local features between the sketch and images, and can greatly improve
retrieval performance. A retrieval system based on this algorithm has been built.
The experiments on the image dataset with 0.3 million images show the effectiveness of the
proposed method, and comparisons with other algorithms are also given. Compared to the baseline,
the proposed method achieves 10% higher precision in the top-5 results.
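An angular/radial/orientation partitioning of edge pixels, in the spirit of the AROP feature, might look like the sketch below. The bin counts and the radius normalization are illustrative assumptions of ours, not the authors' exact descriptor.

```python
import numpy as np

def angular_radial_hist(edge_xy, orientations, center,
                        n_ang=8, n_rad=3, n_ori=4, r_max=1.0):
    """Histogram edge pixels over angular sectors, radial rings and
    edge-orientation bins around a center point, then flatten.

    edge_xy:      iterable of (x, y) edge-pixel positions
    orientations: edge orientation (radians) per pixel
    """
    hist = np.zeros((n_ang, n_rad, n_ori))
    cx, cy = center
    for (x, y), theta in zip(edge_xy, orientations):
        dx, dy = x - cx, y - cy
        ang = (np.arctan2(dy, dx) + np.pi) / (2 * np.pi)        # in [0, 1]
        a = min(int(ang * n_ang), n_ang - 1)                    # angular sector
        r = min(int(np.hypot(dx, dy) / r_max * n_rad), n_rad - 1)  # radial ring
        o = min(int((theta % np.pi) / np.pi * n_ori), n_ori - 1)   # orientation bin
        hist[a, r, o] += 1
    return hist.ravel()

# Two edge pixels with different positions and orientations.
edges = [(0.5, 0.0), (0.0, 0.5)]
oris = [0.0, np.pi / 2]
h = angular_radial_hist(edges, oris, center=(0.0, 0.0))
```

Encoding where an edge pixel sits (sector and ring) together with how it is oriented is what lets such a descriptor reject matches that agree in appearance but disagree in spatial layout.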
IEEE Transactions on Multimedia (May 2016)
Democratic Diffusion Aggregation for Image Retrieval
Abstract - Content-based image retrieval is an important research topic in the multimedia field.
In large-scale image search using local features, image features are encoded and aggregated into
a compact vector to avoid indexing each feature individually. In the aggregation step, sum-
aggregation is widely used in much existing work and demonstrates promising performance.
However, it is based on a strong and implicit assumption that the local descriptors of an image
are identically and independently distributed in descriptor space and on the image plane. To address this
problem, we propose a new aggregation method named democratic diffusion aggregation with
weak spatial context embedded. The main idea of our aggregation method is to re-weight the
embedded vectors before sum-aggregation by considering the relevance among local descriptors.
Different from previous work, by conducting a diffusion process on the improved kernel matrix,
we calculate the weighting coefficients more efficiently without any iterative optimization.
Besides, considering the relevance of local descriptors from different images, we also discuss an
efficient query fusion strategy which uses the initial top-ranked image vectors to enhance the
retrieval performance. Experimental results show that our aggregation method exhibits much
higher efficiency (about ×14 faster) and better retrieval accuracy compared with previous
methods, and the query fusion strategy consistently improves the retrieval quality.
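A minimal sketch of re-weighting before sum-aggregation is shown below. The inverse-similarity weights here are a simple non-iterative stand-in for the paper's diffusion-derived coefficients; the intent is only to show how bursty descriptors get down-weighted before the sum.

```python
import numpy as np

def reweighted_sum_aggregate(X):
    """Sum-aggregate row-normalised local descriptors after re-weighting
    each one by the inverse of its total similarity to the others, so
    bursty (highly mutually similar) descriptors contribute less."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    K = np.clip(Xn @ Xn.T, 0.0, None)   # non-negative similarity kernel
    w = 1.0 / K.sum(axis=1)             # descriptors in dense bursts get small weight
    v = (w[:, None] * Xn).sum(axis=0)
    return v / np.linalg.norm(v)

# Three near-duplicate descriptors plus one distinct descriptor: the
# distinct one keeps roughly the same influence as the whole burst.
X = np.array([[1.0, 0.0], [0.99, 0.01], [1.0, 0.01], [0.0, 1.0]])
v = reweighted_sum_aggregate(X)
```

Plain sum-aggregation of this example would be dominated by the three near-duplicates; after re-weighting, both directions contribute comparably, which is exactly the failure of the i.i.d. assumption the abstract points at.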
IEEE Transactions on Multimedia (May 2016)
Tag based Image Search by Social Re-Ranking
Abstract - Social media sharing websites like Flickr allow users to annotate images with free
tags, which significantly contribute to the development of the web image retrieval and
organization. Tag-based image search is an important method to find images contributed by
social users in such social websites. However, making the top-ranked results both relevant and
diverse is challenging. In this paper, we propose a social re-ranking system for tag-based
image retrieval that considers both the relevance and diversity of images. We aim to re-rank
images according to their visual information, semantic information, and social clues. The initial
results include images contributed by different social users, and each user usually contributes several
images. First, we sort these images by inter-user re-ranking: users with a higher contribution
to the given query rank higher. Then we sequentially apply intra-user re-ranking to each
ranked user’s image set, and only the most relevant image from each user’s image set is selected.
These selected images compose the final retrieved results. We build an inverted index structure
for the social image dataset to accelerate the searching process. Experimental results on Flickr
dataset show that our social re-ranking method is effective and efficient.
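The inter-user then intra-user re-ranking pipeline can be sketched as below, assuming relevance scores have already been computed from visual, semantic, and social cues. The "contribution" notion used for ordering users is a simple assumption of ours (each user's best score), not necessarily the paper's measure.

```python
def social_rerank(results):
    """Inter-user then intra-user re-ranking sketch.

    results: list of (user, image, relevance_score) tuples.
    Returns one image per user, users ordered by their contribution.
    """
    by_user = {}
    for user, image, score in results:
        by_user.setdefault(user, []).append((score, image))
    # Inter-user: order users by their best score for the query.
    ranked_users = sorted(by_user,
                          key=lambda u: max(s for s, _ in by_user[u]),
                          reverse=True)
    # Intra-user: keep only each user's single most relevant image.
    return [max(by_user[u])[1] for u in ranked_users]

results = [("alice", "a1", 0.9), ("alice", "a2", 0.4),
           ("bob", "b1", 0.7), ("carol", "c1", 0.8)]
ranking = social_rerank(results)   # one image per contributor
```

Because at most one image per user survives, the final list is diverse by construction, while the per-user and per-image scoring keeps it relevant.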
IEEE Transactions on Multimedia (May 2016)
Learning Geographical Hierarchy Features via a Compositional Model
Abstract - Image location prediction is to estimate the geolocation where an image is taken,
which is important for many image applications, such as image retrieval, image browsing and
organization. Since social image contains heterogeneous contents, such as visual content and
textual content, effectively incorporating these contents to predict location is nontrivial.
Moreover, it is observed that image content patterns and the locations where they may appear
correlate hierarchically. Traditional image location prediction methods mainly adopt a single-
level architecture and assume images are independently distributed in geographical space, which
is not directly adaptable to the hierarchical correlation. In this paper, we propose a
Geographically Hierarchical Bi-modal Deep Belief Network model (GH-BDBN), which is a
compositional learning architecture that integrates a multi-modal deep learning model with a non-
parametric hierarchical prior model. GH-BDBN learns a joint representation capturing the
correlations among different types of image content using a bi-modal DBN, with a
geographically hierarchical prior over the joint representation to model the hierarchical
correlation between image content and location. Then, an efficient inference algorithm is
proposed to learn the parameters and the hierarchical structure of geographical
locations. Experimental results demonstrate the superiority of our model for image location
prediction.
IEEE Transactions on Multimedia (May 2016)
Semantic Discriminative Metric Learning for Image Similarity Measurement
Abstract - With the arrival of the multimedia era, multimedia data has replaced textual data
for transferring information in various fields. As an important form of multimedia data, images are
widely used by many applications, such as face recognition and image classification.
Therefore, accurately annotating each image in a large image set is of vital
importance but challenging. To perform these tasks well, it is crucial to extract suitable features
that characterize the visual content of images and to learn an appropriate distance metric to measure
the similarities between images. Unfortunately, existing feature operators, such as the histogram of
gradients, local binary patterns, and color histograms, capture the visual character of images
but lack the ability to distinguish semantic information. Similarities between such features cannot
reflect the real category correlations due to the well-known semantic gap. To solve this
problem, this paper proposes a regularized distance metric framework called Semantic
Discriminative Metric Learning (SDML). SDML combines geometric mean with normalized
divergences and separates images from different classes simultaneously. The learned distance
metric treats all images from different classes equally, and distinctions between visually similar
classes with entirely different semantic content are emphasized by SDML. This procedure
ensures the consistency between dissimilarities and semantic distinctions, and avoids inaccurate
similarities incurred by the unbalanced distribution of samples. Various experiments on benchmark
image datasets show the excellent performance of the proposed method.
IEEE Transactions on Multimedia (May 2016)
6-DOF Image Localization from Massive Geo-tagged Reference Images
Abstract - The 6-DOF (Degrees Of Freedom) image localization, which aims to calculate the
spatial position and rotation of a camera, is a challenging problem for most location-based
services. In existing approaches, this problem is often tackled by finding the matches between
2D image points and 3D structure points so as to derive the location information via direct linear
transformation algorithm. However, as these 2D-to-3D based approaches need to reconstruct the
3D structure points of the scene, they may not be flexible enough to exploit massive and growing geo-
tagged data. To this end, this paper presents a novel approach for 6-DOF image localization by
fusing candidate poses relative to reference images. In this approach, we propose to localize an
input image according to the position and rotation information of multiple geo-tagged images
retrieved from a reference dataset. From the reference images, an efficient relative pose
estimation algorithm is proposed to derive a set of candidate poses for the input image. Each
candidate pose encodes the relative rotation and direction of the input image with respect to a
specific reference image. Finally, these candidate poses can be fused together by minimizing a
well-defined geometric error so that the 6-DOF location of the input image is effectively derived.
Experimental results show that our method can obtain satisfactory localization accuracy. In
addition, the proposed relative pose estimation algorithm is much faster than existing work.
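One standard way to fuse direction-only candidate poses into a position, consistent with the description above, is a least-squares ray intersection. This is a generic formulation (solve for the point minimizing summed squared distance to all rays), not necessarily the paper's exact geometry error.

```python
import numpy as np

def fuse_positions(centers, directions):
    """Least-squares intersection of rays: each geo-tagged reference
    camera at center c_i sees the query camera along unit direction d_i
    (the directions would come from relative pose estimation). Solves
    (sum_i P_i) p = sum_i P_i c_i with P_i = I - d_i d_i^T."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, d in zip(centers, directions):
        d = d / np.linalg.norm(d)
        P = np.eye(3) - np.outer(d, d)   # projector orthogonal to the ray
        A += P
        b += P @ c
    return np.linalg.solve(A, b)

# Two references at known positions both "see" the query at (1, 1, 0).
centers = [np.array([0.0, 0.0, 0.0]), np.array([2.0, 0.0, 0.0])]
directions = [np.array([1.0, 1.0, 0.0]), np.array([-1.0, 1.0, 0.0])]
p = fuse_positions(centers, directions)
```

With more than two references the system is overdetermined and the solve returns the point closest to all rays, so individual noisy relative-pose estimates average out.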
IEEE Transactions on Multimedia (May 2016)
Delay-Optimized Video Traffic Routing in Software-Defined Interdatacenter Networks
Abstract - Many video streaming applications operate their geo-distributed services in the cloud,
taking advantage of superior connectivities between datacenters to push content closer to users or
to relay live video traffic between end users at a higher throughput. In the meantime, inter-
datacenter networks also carry high volumes of other types of traffic, including service
replication and data backups, e.g., for storage and email services. It is an important research topic
to optimally engineer and schedule inter-datacenter traffic, taking into account the stringent
latency requirements of video flows when transmitted along inter-datacenter links shared with
other types of traffic. Since inter-datacenter networks are usually overprovisioned, unlike prior
work that mainly aims to maximize link utilization, we propose a delay-optimized traffic routing
scheme to explicitly differentiate path selection for different sessions according to their delay
sensitivities, leading to a software-defined inter-datacenter networking overlay implemented at
the application layer. We show that our solution can yield sparse path selection by only solving
linear programs, and thus, in contrast to prior traffic engineering solutions, does not lead to
overly fine-grained traffic splitting, further reducing packet resequencing overhead and the
number of forwarding rules to be installed in each forwarding unit. Real-world experiments
based on a deployment on six globally distributed Amazon EC2 datacenters have shown that our
system can effectively prioritize and improve the delay performance of inter-datacenter video
flows at a low cost.
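For a single delay-sensitive flow in isolation, delay-differentiated path selection reduces to a minimum-delay path search. The sketch below uses Dijkstra's algorithm over per-link delays and deliberately ignores the capacity coupling between flows that the paper's linear programs handle; the topology and delays are made-up examples.

```python
import heapq

def min_delay_path(links, src, dst):
    """Dijkstra over per-link delays between datacenters.

    links: list of (a, b, one_way_delay) undirected edges.
    Returns (total_delay, [node, ...]) or (inf, []) if unreachable.
    """
    graph = {}
    for a, b, delay in links:
        graph.setdefault(a, []).append((b, delay))
        graph.setdefault(b, []).append((a, delay))
    best = {src: (0.0, [src])}
    heap = [(0.0, src, [src])]
    while heap:
        d, node, path = heapq.heappop(heap)
        if node == dst:
            return d, path
        if d > best.get(node, (float("inf"),))[0]:
            continue  # stale heap entry
        for nxt, w in graph.get(node, []):
            nd = d + w
            if nd < best.get(nxt, (float("inf"),))[0]:
                best[nxt] = (nd, path + [nxt])
                heapq.heappush(heap, (nd, nxt, path + [nxt]))
    return float("inf"), []

# Inter-datacenter links as (a, b, one-way delay in ms).
links = [("us-east", "eu", 80), ("us-east", "us-west", 60),
         ("us-west", "ap", 100), ("eu", "ap", 160)]
d, path = min_delay_path(links, "us-east", "ap")
```

In the paper's setting the interesting part is exactly what this sketch omits: multiple flows with different delay sensitivities sharing link capacity, which is why an LP over path variables is used instead of independent shortest-path searches.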
IEEE Transactions on Multimedia (May 2016)
Multiple Human Identification and Cosegmentation: A Human-Oriented CRF Approach
with Poselets
Abstract - Localizing, identifying and extracting humans with consistent appearance jointly
from a personal photo stream is an important problem and has wide applications. The strong
variations in foreground and background and irregularly occurring foreground humans make this
realistic problem challenging. Inspired by the advance in object detection, scene understanding
and image cosegmentation, in this paper we explore explicit constraints to label and segment
human objects rather than other non-human objects and “stuff”. We refer to such a problem as
Multiple Human Identification and Cosegmentation (MHIC). To identify specific human
subjects, we propose an efficient human instance detector by combining an extended color line
model with a poselet-based human detector. Moreover, to capture high level human shape
information, a novel soft shape cue is proposed. It is initialized by the human detector, then
further enhanced through a generalized geodesic distance transform, and refined finally with a
joint bilateral filter. We also propose to capture the rich feature context around each pixel by
using an adaptive cross region data structure, which gives a higher discriminative power than a
single pixel-based estimation. The high-level object cues from the detector and the shape are
then integrated with the low-level pixel cues and mid-level contour cues into a principled
conditional random field (CRF) framework, which can be efficiently solved by using fast graph
cut algorithms. We evaluate our method over a newly created NTU-MHIC human dataset, which
contains 351 images with manually annotated ground-truth segmentation. Both visual and
quantitative results demonstrate that our method achieves state-of-the-art performance for the
MHIC task.
IEEE Transactions on Multimedia (May 2016)
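The CRF framework above combines unary cues (detector, shape) with pairwise contour cues. A tiny illustrative energy of the kind graph cuts minimize, with hypothetical pixel names and costs, looks like this:

```python
# Tiny illustrative CRF energy (unary costs + Potts pairwise smoothness) of
# the kind minimized by graph cuts; a toy, not the paper's MHIC model.

def crf_energy(labels, unary, neighbors, pairwise_weight=1.0):
    """labels: {pixel: 0/1}; unary: {pixel: (cost_bg, cost_fg)};
    neighbors: list of (p, q) neighboring-pixel pairs."""
    e = sum(unary[p][labels[p]] for p in labels)
    # Potts term: penalize label disagreement between neighbors.
    e += sum(pairwise_weight for p, q in neighbors if labels[p] != labels[q])
    return e

unary = {"a": (0.1, 0.9), "b": (0.2, 0.8), "c": (0.9, 0.1)}
neighbors = [("a", "b"), ("b", "c")]
smooth = crf_energy({"a": 0, "b": 0, "c": 1}, unary, neighbors)
noisy = crf_energy({"a": 0, "b": 1, "c": 0}, unary, neighbors)
print(smooth, noisy)  # the smooth labeling has lower energy
```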
Game Theoretic Resource Allocation in Media Cloud with Mobile Social Users
Abstract - Due to the rapid increases in both the population of mobile social users and the
demand for quality of experience (QoE), providing mobile social users with satisfied multimedia
services has become an important issue. The media cloud has been shown to be an efficient solution
to this issue, allowing mobile social users to connect to it through a group of distributed
brokers. However, as the resource in the media cloud is limited, how to allocate resource
among the media cloud, brokers, and mobile social users becomes a new challenge. Therefore, in this
paper, we propose a game-theoretic resource allocation scheme for the media cloud to allocate
resource to mobile social users through brokers. First, a framework of resource allocation
among the media cloud, brokers, and mobile social users is presented. The media cloud can dynamically
determine the price of its resource and allocate it to brokers. Each mobile social user can select
a broker to connect to the media cloud, adjusting his strategy to achieve the maximum revenue,
based on the social features in the community. Next, we formulate the interactions among media
cloud, brokers and mobile social users by a four-stage Stackelberg game. In addition, through the
backward induction method, we propose an iterative algorithm to implement the proposed
scheme and obtain the Stackelberg equilibrium. Finally, simulation results show that each player
in the game can obtain its optimal strategy, at which the Stackelberg equilibrium exists stably.
IEEE Transactions on Multimedia (May 2016)
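Backward induction in a Stackelberg game means the leader anticipates the follower's best response before committing. A minimal two-stage pricing toy (an illustration with an assumed quadratic follower utility, not the paper's four-stage game) can be sketched as:

```python
# Minimal backward-induction sketch of a two-stage Stackelberg pricing game.
# Assumed follower utility: U(d) = v*d - d^2/2 - p*d  =>  best response
# d*(p) = max(v - p, 0). The leader then picks p to maximize revenue p*d*(p).

def follower_best_response(price, v=10.0):
    return max(v - price, 0.0)

def leader_best_price(v=10.0, step=0.01):
    # Backward induction: evaluate the anticipated response for each price.
    best_p, best_rev = 0.0, -1.0
    p = 0.0
    while p <= v:
        rev = p * follower_best_response(p, v)
        if rev > best_rev:
            best_p, best_rev = p, rev
        p += step
    return round(best_p, 2)

print(leader_best_price())  # analytic Stackelberg equilibrium price is v/2 = 5.0
```

The grid search stands in for the iterative algorithm; with the quadratic utility the equilibrium price is v/2.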
DPcode: Privacy-Preserving Frequent Visual Patterns Publication on Cloud
Abstract - Nowadays, cloud has become a promising multimedia data processing and sharing
platform. Many institutes and companies plan to outsource and share their large-scale video and
image datasets on cloud for scientific research and public interest. Among various video
applications, the discovery of frequent visual patterns over graphical data is an exploratory and
important technique. However, privacy concerns over the leakage of sensitive information
contained in the videos/images impede further implementation. Although the frequent visual
patterns mining (FVPM) algorithm aggregates summaries over individual frames and seems not to
pose a privacy threat, the private information contained in individual frames may still be leaked
from the statistical result. In this paper, we study the problem of privacy-preserving publishing of
graphical data FVPM on cloud. We propose the first differentially private frequent visual
patterns mining algorithm for graphical data, named DPcode. We propose a novel mechanism
that integrates the privacy-preserving visual word conversion with the differentially private
mechanism under the noise allocation strategy of the sparse vector technique. The optimized
algorithms properly allocate the privacy budgets among the different phases of the FVPM algorithm over
images and reduce the corresponding data distortion. Extensive experiments are conducted based
on datasets commonly used in visual mining algorithms. The results show that our approach
achieves high utility while satisfying a practical privacy requirement.
IEEE Transactions on Multimedia (May 2016)
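The building blocks of such a scheme are Laplace noise calibrated to sensitivity and a split of the privacy budget across phases. A hedged sketch of just these two pieces (DPcode additionally uses visual-word conversion and the sparse vector technique, which are not reproduced here):

```python
import math
import random

# Hedged sketch: Laplace mechanism over frequency counts with the privacy
# budget epsilon split across phases. Illustrative only; not DPcode itself.

def laplace_noise(scale, rng):
    u = rng.random() - 0.5
    while abs(u) >= 0.5:                  # guard against u == -0.5
        u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_counts(counts, epsilon, phases=2, rng=None):
    """Spend epsilon/phases on this phase; count queries have sensitivity 1."""
    rng = rng or random.Random(0)
    eps_phase = epsilon / phases
    return {k: v + laplace_noise(1.0 / eps_phase, rng) for k, v in counts.items()}

noisy = dp_counts({"patternA": 120, "patternB": 40}, epsilon=1.0)
print(noisy)  # counts perturbed with Laplace noise of scale 2
```

Allocating more budget to the phases that dominate error is exactly the kind of tuning the abstract refers to as reducing data distortion.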
Audio recapture detection with convolutional neural networks
Abstract - In this work, we investigate how features can be effectively learned by deep neural
networks for audio forensic problems. By providing a preliminary feature preprocessing based
on Electric Network Frequency (ENF) analysis, we propose a convolutional neural network
(CNN) for training and classification of genuine and recaptured audio recordings. Hierarchical
representations which contain levels of details of the ENF components are learned from the deep
neural networks and can be used for further classification. The proposed method works for audio
clips as short as 2 seconds, where state-of-the-art methods may fail. Experimental
results demonstrate that the proposed network yields high detection accuracy
with each ENF harmonic component represented as a single-channel input. The performance can
be further improved by a combined input representation which incorporates both the fundamental
ENF and its harmonics. The convergence property of the network and the effect of using analysis
windows of various sizes are also studied. A performance comparison against the support tensor
machine demonstrates the advantage of using CNN for the task of audio recapture detection.
Moreover, visualization of the intermediate feature maps provides some insight into what the
deep neural networks actually learn and how they make decisions.
IEEE Transactions on Multimedia (May 2016)
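The hierarchical representations mentioned above are built by stacking convolution, nonlinearity, and pooling. A single such stage over a hypothetical ENF deviation sequence, written in plain Python as an illustration (not the paper's network):

```python
# Illustrative sketch: one 1-D convolution + ReLU + max-pool stage over an
# ENF frequency-deviation sequence -- the basic block a CNN stacks to learn
# hierarchical ENF features. Toy values; not the paper's architecture.

def conv1d(signal, kernel):
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k)) for i in range(n - k + 1)]

def relu(xs):
    return [max(0.0, x) for x in xs]

def max_pool(xs, size=2):
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, size)]

enf_deviation = [0.01, 0.02, -0.01, 0.05, 0.04, -0.02, 0.00, 0.03]
edge_kernel = [1.0, -1.0]  # responds to frame-to-frame ENF changes
features = max_pool(relu(conv1d(enf_deviation, edge_kernel)))
print(features)
```

In the paper the kernels are learned rather than hand-set, and each ENF harmonic can feed a separate input channel.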
A Context-aware Framework for Reducing Bandwidth Usage of Mobile Video Chats
Abstract - Mobile video chat apps offer users an approachable way to communicate with others.
As high-speed 4G networks are deployed worldwide, the number of mobile video chat app
users keeps increasing. However, video chatting on mobile devices raises financial concerns for users,
since streaming video demands high bandwidth and can use up a large amount of data in dozens
of minutes. Lowering the bandwidth usage of mobile video chats is challenging since video
quality may be compromised. In this paper, we attempt to tame this challenge. Technically, we
propose a context-aware frame rate adaption framework, named LBVC (Low-bandwidth Video
Chat). It follows a sender-receiver cooperative principle that smartly handles the trade-off
between lowering bandwidth usage and maintaining video quality. We implement LBVC by
modifying an open-source app - Linphone and evaluate it with both objective experiments and
subjective studies.
IEEE Transactions on Multimedia (May 2016)
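The sender-side half of such a frame-rate adaptation loop can be sketched as a simple policy; the thresholds and rates below are illustrative assumptions, not LBVC's actual parameters:

```python
# Hedged sketch of context-aware frame-rate adaptation in the spirit of LBVC:
# lower the frame rate when the scene is static or bandwidth is scarce,
# trading bandwidth against perceived quality. Thresholds are assumptions.

def pick_frame_rate(motion_level, bandwidth_kbps):
    """motion_level in [0, 1]; returns a target frame rate in fps."""
    if bandwidth_kbps < 200:   # scarce uplink: save aggressively
        return 5
    if motion_level < 0.2:     # near-static talking head
        return 10
    if motion_level < 0.6:     # moderate motion
        return 20
    return 30                  # high motion needs the full rate

print(pick_frame_rate(0.1, 1000))  # 10
print(pick_frame_rate(0.8, 100))   # 5
```

In the full system the receiver cooperates, e.g., by reporting its viewing context back to the sender.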
Resource Allocation With Video Traffic Prediction in Cloud-Based Space Systems
Abstract - This paper considers the resource allocation problems for video transmission in
space-based information networks. The queueing system analyzed in this study is constituted by
multiple users and a single server. The server is operated as a cloud that can sense the traffic
arrivals to each user's queue and then allocates the transmission resource and service rate for
users. The objectives are to make configurations over time to minimize the time average cost of
the system, and to minimize the waiting time of packets after they enter the queue. Meanwhile,
the constraints on the queue stability of the system must be satisfied. In this paper, we introduce
a predictive backpressure algorithm, which takes future arrivals within a certain prediction
window into account when allocating resources and deciding which packets to
serve first. In addition, this paper designs a multiresolution wavelet decomposition-based
backpropagation network for the prediction of video traffic, which exhibits the long-range
dependence property. Simulation results indicate that the delay of the queueing system can be
reduced through this prediction-based resource allocation, and the prediction accuracy for the
video traffic is improved according to the proposed prediction system.
IEEE Transactions on Multimedia (May 2016)
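The predictive backpressure idea, weighting each queue by its backlog plus forecast arrivals inside the prediction window, can be sketched in a few lines. This is a toy scheduling decision under assumed forecasts, not the paper's algorithm:

```python
# Toy sketch of a prediction-augmented backpressure decision: serve the user
# whose backlog plus predicted arrivals (within a lookahead window) is
# largest, rather than backlog alone. Illustrative assumptions throughout.

def pick_user(queues, predicted, window=2):
    """queues: {user: backlog}; predicted: {user: per-slot arrival forecasts}."""
    def weight(u):
        return queues[u] + sum(predicted[u][:window])
    return max(queues, key=weight)

queues = {"u1": 5, "u2": 4}
predicted = {"u1": [0, 0, 9], "u2": [3, 2, 0]}  # hypothetical forecasts
print(pick_user(queues, predicted))  # u2: 4+3+2 = 9 beats u1: 5+0+0 = 5
```

In the paper the forecasts come from the wavelet-decomposition backpropagation network, which captures the long-range dependence of video traffic.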
SALIC: Social Active Learning for Image Classification
Abstract - In this paper we present SALIC, an active learning method that is placed in the
context of social networks and focuses on selecting the samples that are most appropriate to
expand the training set of a binary classifier. The process of active learning can be fully
automated in this social context by replacing the human oracle with the user tagged images
obtained from social networks. However, the noisy nature of user-contributed tags adds further
complexity to the problem of sample selection since, apart from their informativeness (i.e. how
much they are expected to inform the classifier if we knew their label), our confidence about
their actual content should also be maximized (i.e. how certain the oracle is on its decision about
the contents of an image). The main contribution of this work is in proposing a probabilistic
approach for jointly maximizing the two aforementioned quantities with a view to automate the
process of active learning. Based on this approach the training set is expanded with samples that
maximize the joint probability of selecting a sample given its informativeness and our
confidence for its true content. In the examined noisy context, the oracle’s confidence is
necessary to provide a contextual-based indication of the images’ true contents, while the
samples’ informativeness is required to reduce the computational complexity and minimize the
mistakes of the unreliable oracle. We demonstrate experimentally the validity and superiority of
SALIC over various baselines and state-of-the-art methods. In addition, we show that SALIC allows us
to select training data as effectively as typical active learning, without the cost of manual
annotation. Finally, we argue that the speed-up achieved when learning actively in this social
context (where labels can be obtained without the cost of human annotation) is necessary to cope
with the continuously growing requirements of large-scale applications. In this respect, we show
experimentally that SALIC requires 10 times less training data to reach exactly the
same performance as a straightforward informativeness-agnostic learning approach.
IEEE Transactions on Multimedia (May 2016)
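The joint selection rule described above, maximizing informativeness together with oracle confidence, reduces in its simplest form to ranking candidates by the product of the two quantities. A toy stand-in for SALIC's probabilistic formulation, with hypothetical scores:

```python
# Sketch of joint sample selection: expand the training set with the sample
# maximizing informativeness x oracle (tag) confidence. A toy stand-in for
# SALIC's probabilistic formulation; scores below are hypothetical.

def select_sample(candidates):
    """candidates: list of (sample_id, informativeness, tag_confidence)."""
    return max(candidates, key=lambda c: c[1] * c[2])[0]

pool = [
    ("img1", 0.9, 0.2),  # very informative, but its social tags are noisy
    ("img2", 0.6, 0.8),  # good balance -> highest joint score (0.48)
    ("img3", 0.3, 0.9),  # reliable tags, but the classifier learns little
]
print(select_sample(pool))  # img2
```

Ranking by the product rather than either factor alone is what keeps the unreliable social oracle from dominating the selection.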
Efficient Image Sharpness Assessment Based on Content Aware Total Variation
Abstract - State-of-the-art sharpness assessment methods are mostly based on edge width,
gradient, high-frequency energy, or pixel intensity variation. Such methods take little account
of image content variation in the sharpness assessment, which makes the
sharpness metric less effective across images with different content. In this paper, we propose an
efficient no-reference image sharpness assessment called Content Aware Total Variation
(CATV) by considering the importance of image content variation in sharpness measurement. By
parameterizing the image TV statistics using Generalized Gaussian Distribution (GGD), the
sharpness measure is identified by the standard deviation, and the image content variation
evaluator is indicated by the shape parameter. However, the standard deviation is content
dependent, differing across regions with strong edges, high-frequency textures, low-frequency
textures, and blank areas. By incorporating the shape parameter to moderate the
standard deviation, we propose a content-aware sharpness metric. The experimental results show
that the proposed method is highly correlated with the human vision system and has better
sharpness assessment results than the state-of-the-art techniques on the blurred subset images of
LIVE, TID2008, CSIQ, and IVC databases. Also, our method has very low computational
complexity, making it suitable for online applications. The correlations with the subjective
scores of the four databases and a statistical significance analysis reveal that our method
outperforms previous techniques.
IEEE Transactions on Multimedia (May 2016)
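The idea of a content-moderated spread of total-variation (TV) statistics can be illustrated on a 1-D signal. The sketch below uses a crude dispersion-ratio proxy in place of a fitted GGD shape parameter, so it is an assumption-laden illustration of the principle, not the paper's procedure:

```python
import math

# Toy sketch of content-aware TV sharpness: the spread (std) of local TV
# values is the sharpness cue, moderated by a shape-like content cue.
# The shape proxy here (MAD/std) merely stands in for GGD fitting.

def tv_values(row):
    return [abs(row[i + 1] - row[i]) for i in range(len(row) - 1)]

def sharpness_score(row):
    tv = tv_values(row)
    mean = sum(tv) / len(tv)
    std = math.sqrt(sum((x - mean) ** 2 for x in tv) / len(tv))
    mad = sum(abs(x - mean) for x in tv) / len(tv)
    shape = mad / std if std > 0 else 1.0  # smaller for edge-dominated content
    return std * shape                     # content-moderated spread

sharp_edge = [0, 0, 0, 200, 200, 200]  # one strong edge
blurry = [0, 40, 80, 120, 160, 200]    # smooth ramp
print(sharpness_score(sharp_edge) > sharpness_score(blurry))  # True
```

A sharp edge concentrates the TV mass in a few large values (high spread), while blur smears it into many similar small values, which is what the CATV statistics exploit.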
SUPPORT OFFERED TO REGISTERED STUDENTS:
1. IEEE base paper.
2. Review material as per the individual's university guidelines.
3. Future enhancement.
4. Assistance in answering all critical questions.
5. Training on the programming language.
6. Complete source code.
7. Final report / document.
8. International conference / international journal publication on your project.
FOLLOW US ON FACEBOOK @ TSYS Academic Projects
Emr a scalable graph based ranking model for content-based image retrieval
 
Query adaptive image search with hash codes
Query adaptive image search with hash codesQuery adaptive image search with hash codes
Query adaptive image search with hash codes
 
Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...
Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...
Implementation of Fuzzy Logic for the High-Resolution Remote Sensing Images w...
 
Enhancement and Segmentation of Historical Records
Enhancement and Segmentation of Historical RecordsEnhancement and Segmentation of Historical Records
Enhancement and Segmentation of Historical Records
 
A Review on Matching For Sketch Technique
A Review on Matching For Sketch TechniqueA Review on Matching For Sketch Technique
A Review on Matching For Sketch Technique
 

Recently uploaded

Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.arsicmarija21
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxJiesonDelaCerna
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 

Recently uploaded (20)

Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptx
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 

IEEE MultiMedia 2016 Title and Abstract

For more details, feel free to contact us at any time. Ph: 9841103123, 044-42607879, Website: http://www.tsys.co.in/ Mail Id: tsysglobalsolutions2014@gmail.com.

IEEE TRANSACTIONS ON MULTIMEDIA 2016 TOPICS

Hybrid Zero Block Detection for High Efficiency Video Coding

Abstract - In this paper, we propose an efficient hybrid zero-block early detection method for high efficiency video coding (HEVC). Our method detects both genuine zero blocks (GZBs) and pseudo zero blocks (PZBs). For GZB detection, we use two sum-of-absolute-difference (SAD) bounds and one sum-of-absolute-transformed-difference (SATD) threshold to reduce the GZB detection complexity. A fast rate-distortion estimation algorithm for HEVC is proposed to improve the PZB detection rate. Experimental results on the HM platform show that the proposed method saves about 50% of the rate-distortion optimization (RDO) time with negligible Bjøntegaard delta bit rate loss, and is 10%-30% faster than other state-of-the-art zero-block detection methods for HEVC.

IEEE Transactions on Multimedia (March 2016)

Consistent Coding Scheme for Single-Image Super-Resolution Via Independent Dictionaries

Abstract - In this paper, we present a unified framework based on collaborative representation (CR) for single-image super-resolution (SR), which learns low-resolution (LR) and high-resolution (HR) dictionaries independently in the training stage and adopts a consistent coding scheme (CCS) to guarantee the prediction accuracy of HR coding coefficients during SR reconstruction. The independent LR and HR dictionaries are learned based on CR with l2-norm regularization, which can well describe the corresponding LR and HR patch spaces, respectively. Furthermore, a mapping function is learned to map LR coding coefficients onto the corresponding HR coding coefficients. Propagation filtering can achieve smoothing over an image while preserving image context such as edges and textured regions.
Moreover, to preserve the edge structures of a super-resolved image and suppress artifacts, a propagation-filtering-based constraint and image nonlocal self-similarity regularization are introduced into the SR reconstruction framework. Experimental comparison with state-of-the-art single-image SR algorithms validates the effectiveness of the proposed approach.

IEEE Transactions on Multimedia (March 2016)
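The l2-regularized collaborative representation step described above has a closed-form ridge solution. A minimal sketch in Python, using a hypothetical two-atom dictionary so the 2x2 inverse can be written out explicitly (names and sizes are illustrative, not from the paper):

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cr_code(d1, d2, y, lam):
    """Closed-form l2-regularized (ridge) coding of signal y over a
    two-atom dictionary [d1, d2]: a = (D^T D + lam*I)^-1 D^T y."""
    g11 = dot(d1, d1) + lam          # Gram matrix plus regularizer
    g12 = dot(d1, d2)
    g22 = dot(d2, d2) + lam
    b1, b2 = dot(d1, y), dot(d2, y)  # correlations D^T y
    det = g11 * g22 - g12 * g12      # invert the 2x2 system directly
    return ((g22 * b1 - g12 * b2) / det,
            (g11 * b2 - g12 * b1) / det)

# With orthonormal atoms and lam = 0, coding recovers the signal's
# coordinates exactly; lam > 0 shrinks the coefficients.
a1, a2 = cr_code([1.0, 0.0], [0.0, 1.0], [3.0, 4.0], 0.0)
```

In the CCS setting, LR patches would be coded this way over the LR dictionary, and the learned mapping function would translate those coefficients to HR coefficients over the HR dictionary.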
Joint Inference of Objects and Scenes With Efficient Learning of Text-Object-Scene Relations

Abstract - The rapid growth of web images presents new challenges as well as opportunities for image understanding. Conventional approaches rely heavily on fine-grained annotations, such as bounding boxes and semantic segmentations, which are not available for web-scale images. In general, images on the Internet are accompanied by descriptive texts relevant to their contents. To bridge the gap between textual and visual analysis for image understanding, this paper presents an algorithm that learns the relations between scenes, objects, and texts with the help of image-level annotations. In particular, the relation between texts and objects is modeled as the matching probability between nouns and object classes, which can be solved as a constrained bipartite matching problem. The relations between scenes and objects/texts, on the other hand, are modeled as the conditional distributions of their co-occurrence. Built upon the learned cross-domain relations, an integrated model brings together scenes, objects, and texts for joint image understanding, including scene classification, object classification and localization, and the prediction of object cardinalities. The proposed cross-domain learning algorithm and the integrated model elevate the performance of image understanding for web images in the context of textual descriptions. Experimental results show that the proposed algorithm significantly outperforms conventional methods in various computer vision tasks.
IEEE Transactions on Multimedia (March 2016)

Blind Quality Assessment of Tone-Mapped Images Via Analysis of Information, Naturalness, and Structure

Abstract - High dynamic range (HDR) imaging techniques have long been applied effectively to fault detection and disease diagnosis in the astronomical and medical fields, and they have recently gained much attention from the digital image processing and computer vision communities. While HDR imaging devices are becoming affordable, HDR display devices remain out of reach of typical consumers. Due to this limited availability, tone-mapping operators (TMOs) are in most cases used to convert HDR images to standard low dynamic range (LDR) images for visualization. Existing TMOs, however, cannot work effectively for all kinds of HDR images, with their performance
largely depending on the brightness, contrast, and structure properties of a scene. To accurately measure and compare the performance of distinct TMOs, in this paper we develop an effective and efficient no-reference objective quality metric that can automatically assess LDR images created by different TMOs without access to the original HDR images. Our model is shown to be statistically superior to recent full- and no-reference quality measures on the existing tone-mapped image database and on a new database built in this work.

IEEE Transactions on Multimedia (March 2016)

Semi-Supervised Bi-Dictionary Learning for Image Classification With Smooth Representation-Based Label Propagation

Abstract - In this paper, we propose semi-supervised bi-dictionary learning for image classification with smooth representation-based label propagation (SRLP). Natural images contain complex contents of multiple objects with complicated background, clutter, and occlusions, which prevents image features from belonging to a single specific category. We therefore employ reconstruction-based classification to implement discriminative dictionary learning in a probabilistic manner. We jointly learn a discriminative dictionary, called an anchor, in the feature space and its corresponding soft label, called an anchor label, in the label space; the combination of anchor and anchor label is referred to as a bi-dictionary. The learnt bi-dictionary is used to bridge the semantic gap in image classification. First, SRLP constructs smoothed reconstruction problems for bi-dictionary learning. Then, SRLP produces the reconstruction coefficients in the feature space over the anchor to infer soft labels of samples in the label space.
Experimental results demonstrate that the proposed method is capable of learning a pair of discriminative dictionaries for image classification in the feature and label spaces and outperforms state-of-the-art reconstruction-based classification methods.

IEEE Transactions on Multimedia (March 2016)

A Distance-Computation-Free Search Scheme for Binary Code Databases

Abstract - Recently, binary codes have been widely used in many multimedia applications to approximate high-dimensional multimedia features for practical similarity search, owing to their highly compact data representation and efficient distance computation. While the majority of hashing methods aim at learning more accurate hash codes, only a few focus on indexing methods that accelerate search over binary code databases. Among these indexing methods,
most suffer from extremely high memory cost or extensive Hamming distance computations. In this paper, we propose a new Hamming distance search scheme for large-scale binary code databases that returns exact results while being free of Hamming distance computations. Without the need to compare database binary codes with queries, search performance can be improved and databases can be maintained externally. More specifically, we adopt the inverted multi-index data structure to index binary codes. Importantly, the Hamming distance information embedded in the structure is exploited by the search scheme so that verification of exact results no longer relies on Hamming distance computations. As a further step, we optimize the performance of the inverted multi-index structure by taking the code distributions among different bits into account during index construction. Empirical results on large-scale binary code databases demonstrate the superiority of our method over existing approaches in terms of both memory usage and search efficiency.

IEEE Transactions on Multimedia (March 2016)

QoE Evaluation of Multimedia Services Based on Audiovisual Quality and User Interest

Abstract - Quality of experience (QoE) has a significant influence on whether a user will choose a service or product in the competitive era. For multimedia services, various factors in a communication ecosystem act together on users, stimulating different senses and inducing multidimensional perceptions of the services, which inevitably increases the difficulty of measuring and estimating a user's QoE.
In this paper, a user-centric objective QoE evaluation model (QAVIC model for short) is proposed to estimate the user's overall QoE for audiovisual services. It takes into account perceptual audiovisual quality (QAV) and user interest in audiovisual content (IC) among the factors influencing QoE, such as technology, content, context, and user, in the communication ecosystem. To predict user interest, a number of general viewing behaviors are considered in formulating the IC evaluation model. Subjective tests have been conducted for training and validation of the QAVIC model. The experimental results show that the proposed QAVIC model can estimate the user's QoE reasonably accurately using a 5-point absolute category rating scale.

IEEE Transactions on Multimedia (March 2016)

A Locality Sensitive Low-Rank Model for Image Tag Completion
Abstract - Many visual applications have benefited from the outburst of web images, yet the imprecise and incomplete tags arbitrarily provided by users, as the thorn of the rose, may hamper the performance of retrieval or indexing systems relying on such data. In this paper, we propose a novel locality sensitive low-rank model for image tag completion, which approximates the global nonlinear model with a collection of local linear models. To effectively infuse the idea of locality sensitivity, a simple and effective pre-processing module is designed to learn suitable representations for data partition, and a global consensus regularizer is introduced to mitigate the risk of overfitting. Meanwhile, low-rank matrix factorization is employed for the local models, where local geometric structures are preserved in the low-dimensional representations of both tags and samples. Extensive empirical evaluations on three datasets demonstrate the effectiveness and efficiency of the proposed method, which outperforms previous ones by a large margin.

IEEE Transactions on Multimedia (March 2016)

Compressed-Sensed-Domain L1-PCA Video Surveillance

Abstract - We consider the problem of foreground and background extraction from compressed-sensed (CS) surveillance videos captured by a static CS camera. We propose, for the first time in the literature, a principal component analysis (PCA) approach that computes the low-rank subspace of the background scene directly in the CS domain. Rather than computing the conventional L2-norm-based principal components, which are simply the dominant left singular vectors of the CS-domain data matrix, we compute the principal components under an L1-norm maximization criterion.
The background scene is then obtained by projecting the CS measurement vector onto the L1 principal components, followed by total-variation (TV) minimization image recovery. The proposed L1-norm procedure carries out low-rank background representation directly, without reconstructing the video sequence, and at the same time exhibits significant robustness against outliers in CS measurements compared to L2-norm PCA. An adaptive CS-L1-PCA method is also developed for low-latency video surveillance. Extensive experimental studies described in this paper illustrate and support the theoretical developments.

IEEE Transactions on Multimedia (March 2016)

User-Service Rating Prediction by Exploring Social Users' Rating Behaviors
Abstract - With the boom of social media, it is a very popular trend for people to share what they are doing with friends across various social networking platforms. Nowadays we have a vast amount of descriptions, comments, and ratings for local services, and this information is valuable for new users to judge whether a service meets their requirements before partaking. In this paper, we propose a user-service rating prediction approach that explores social users' rating behaviors. In our view, rating behavior in a recommender system is embodied in these aspects: 1) when the user rated the item, 2) what the rating is, 3) what the item is, 4) what user interest can be mined from his/her rating records, and 5) how the user's rating behavior diffuses among his/her social friends. We therefore propose the concept of a rating schedule to represent users' daily rating behaviors, and introduce the factor of interpersonal rating behavior diffusion to deepen the understanding of users' rating behaviors. In the proposed approach, we fuse four factors into a unified matrix-factorization framework: user personal interest (related to the user and the item's topics); interpersonal interest similarity (related to user interest); interpersonal rating behavior similarity (related to users' rating habits); and interpersonal rating behavior diffusion (related to users' behavior diffusion). We conduct a series of experiments on the Yelp dataset and the Douban Movie dataset. Experimental results show the effectiveness of our approach.
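The unified matrix-factorization framework above fuses several behavioral factors. As a much-simplified illustration of the underlying machinery only (the paper's model additionally fuses the four interpersonal factors), a plain rating matrix factorization trained by stochastic gradient descent can be sketched as:

```python
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.05, epochs=2000):
    """Factorize observed (user, item, rating) triples into latent
    factor matrices P (users) and Q (items) by SGD on squared error."""
    random.seed(0)
    P = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * err * qi   # gradient step on user factor
                Q[i][f] += lr * err * pu   # gradient step on item factor
    return P, Q

# Toy data: two users rating two items on a 1-5 scale.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 1, 2.0)]
P, Q = train_mf(ratings, n_users=2, n_items=2)
predict = lambda u, i: sum(P[u][f] * Q[i][f] for f in range(2))
```

A production model would also regularize the factors and add bias terms; here everything beyond plain SGD is omitted for brevity.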
IEEE Transactions on Multimedia (March 2016)

A Novel Lip Descriptor for Audio-Visual Keyword Spotting Based on Adaptive Decision Fusion

Abstract - Keyword spotting remains a challenge when applied to real-world environments with dramatically changing noise. In recent studies, audio-visual integration methods have demonstrated their superiority, since visual speech is not influenced by acoustic noise. However, for visual speech recognition, individual utterance mannerisms can lead to confusion and false recognition. To solve this problem, a novel lip descriptor involving both geometry-based and appearance-based features is presented in this paper. Specifically, a set of geometry-based features is proposed based on an advanced facial landmark localization method. To obtain a robust and discriminative representation, a spatiotemporal lip feature is put forward that considers similarities among textons and maps the feature to an intra-class subspace. Moreover, a parallel
two-step keyword spotting strategy based on decision fusion is proposed to make the best use of audio-visual speech and adapt to diverse noise conditions. Weights generated by a neural network combine the acoustic and visual contributions. Experimental results on the OuluVS and PKU-AV datasets demonstrate that the proposed lip descriptor shows competitive performance compared to the state of the art. Additionally, the proposed audio-visual keyword spotting (AV-KWS) method based on decision-level fusion significantly improves noise robustness and attains better performance than feature-level fusion, while also adapting to various noise conditions.

IEEE Transactions on Multimedia (March 2016)

Collaborative Wireless Freeview Video Streaming With Network Coding

Abstract - Free viewpoint video (FVV) offers a compelling interactive experience by allowing users to switch to any viewing angle at any time. An FVV is composed of a large number of camera-captured anchor views, with virtual views (not captured by any camera) rendered from their nearby anchors using techniques such as depth-image-based rendering (DIBR). We consider a group of wireless users who may interact with an FVV by independently switching views. We study a novel live FVV streaming network in which each user pulls a subset of anchors from the server via a primary channel. To enhance anchor availability at each user, a user generates network-coded (NC) packets from some of its anchors and broadcasts them to its direct neighbors via a secondary channel. Given the limited primary and secondary channel bandwidths at the devices, we seek to maximize the received video quality (i.e., minimize distortion) by jointly optimizing the set of anchors each device pulls and the anchor combination used to generate NC packets.
To the best of our knowledge, this is among the first works addressing such a joint optimization problem for wireless live FVV streaming with NC-based collaboration. We first formulate the problem and show that it is NP-hard. We then propose a scalable and effective algorithm called PAFV (Peer-Assisted Freeview Video). In PAFV, each node collaboratively and distributedly decides which anchors to pull and which NC packets to share so as to minimize video distortion in its neighborhood. Extensive simulation studies show that PAFV outperforms other algorithms, achieving substantially lower video distortion (often by more than 20-50%) with significantly less redundancy (by as much as 70%). Our Android-based video experiment further confirms the effectiveness of PAFV over comparison schemes.
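The neighbor-sharing idea behind NC packets can be illustrated with the simplest form of network coding over GF(2): a node XOR-combines several anchor packets into one broadcast, and a neighbor that already holds all but one of the inputs recovers the missing packet by XOR-ing again. A minimal sketch (the paper's NC scheme and packet formats are more involved; packet contents here are placeholders):

```python
def xor_combine(packets):
    """XOR equal-length packets together (network coding over GF(2))."""
    out = bytearray(len(packets[0]))
    for p in packets:
        for i, byte in enumerate(p):
            out[i] ^= byte
    return bytes(out)

# Three equal-length anchor packets; the node broadcasts one coded packet.
a = b"anchor_view_01"
b = b"anchor_view_02"
c = b"anchor_view_03"
coded = xor_combine([a, b, c])

# A neighbor holding a and b recovers c from the single broadcast,
# since XOR is its own inverse: (a ^ b ^ c) ^ a ^ b == c.
recovered = xor_combine([coded, a, b])
```

One broadcast can thus serve neighbors missing different anchors, which is what makes NC sharing bandwidth-efficient on the secondary channel.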
IEEE Transactions on Multimedia (March 2016)

A Decision-Tree-Based Perceptual Video Quality Prediction Model and Its Application in FEC for Wireless Multimedia Communications

Abstract - With the exponential growth of video traffic over wireless networked and embedded devices, mechanisms are needed to predict and control perceptual video quality so as to meet quality of experience (QoE) requirements in an energy-efficient way. This paper proposes an energy-efficient QoE support framework for wireless video communications. It consists of two components: 1) a perceptual video quality model that allows video quality to be predicted in real time and with low complexity, and 2) an application-layer, energy-efficient, content-aware forward error correction (FEC) scheme for preventing quality degradation caused by network packet losses. The perceptual video quality model characterizes factors related to video content as well as distortion caused by compression and transmission. Perceptual quality is predicted by a decision tree using a set of observable features from the compressed bitstream and the network. The proposed model achieves prediction accuracies of 88.9% and 90.5% on two distinct test sets. Based on the proposed quality model, a novel FEC scheme is introduced to protect video packets from losses during transmission. Given a user-defined perceptual quality requirement, the FEC scheme adjusts the level of protection for different components in a video stream to minimize network overhead. Simulation results show that the proposed FEC scheme can enhance the perceptual quality of videos and, compared to conventional FEC methods for video communications, reduce network overhead by 41% on average.
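What makes a decision tree attractive for real-time quality prediction is that inference is just a short chain of threshold comparisons. A toy sketch of such inference (the tree shape, feature names, and thresholds below are invented for illustration and are not the paper's learned model):

```python
def tree_predict(node, features):
    """Walk a decision tree stored as nested dicts until a leaf
    (a plain quality label) is reached."""
    while isinstance(node, dict):
        go_left = features[node["feature"]] <= node["thresh"]
        node = node["left"] if go_left else node["right"]
    return node

# Hypothetical two-level tree over bitstream/network features:
# first split on packet loss, then on quantization parameter (QP).
toy_tree = {
    "feature": "packet_loss_rate", "thresh": 0.01,
    "left": {"feature": "qp", "thresh": 32, "left": "good", "right": "fair"},
    "right": "poor",
}
label = tree_predict(toy_tree, {"packet_loss_rate": 0.002, "qp": 28})
```

Each prediction costs only a few comparisons, which is why such a model can run per-stream on an embedded device and drive the FEC protection level online.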
IEEE Transactions on Multimedia (April 2016)

mDASH: A Markov Decision-Based Rate Adaptation Approach for Dynamic HTTP Streaming

Abstract - Dynamic adaptive streaming over HTTP (DASH) has recently been widely deployed on the Internet. It does not, however, impose any adaptation logic for selecting the quality of the video fragments requested by clients. In this paper, we propose a novel Markov decision-based rate adaptation scheme for DASH that aims to maximize the quality of user experience under time-varying channel conditions. To this end, the proposed method takes into account the key factors that critically affect visual quality, including video playback quality, video rate
switching frequency and amplitude, buffer overflow/underflow, and buffer occupancy. Besides, to reduce computational complexity, we propose a low-complexity sub-optimal greedy algorithm which is suitable for real-time video streaming. Our experiments on a network test-bed and the real-world Internet both demonstrate the good performance of the proposed method in both objective and subjective visual quality.

IEEE Transactions on Multimedia (April 2016)

Complexity Control Based on a Fast Coding Unit Decision Method in the HEVC Video Coding Standard

Abstract - The emerging high-efficiency video coding standard achieves higher coding efficiency than previous standards by virtue of a set of new coding tools such as the quadtree coding structure. In this novel structure, the pixels are organized into coding units (CUs), prediction units, and transform units, the sizes of which can be optimized at every level following a tree configuration. These tools allow highly flexible data representation; however, they incur very high computational complexity. In this paper, we propose an effective complexity control (CC) algorithm based on a hierarchical approach. An early termination condition is defined at every CU size to determine whether subsequent CU sizes should be explored. The actual encoding times are also considered to satisfy the target complexity in real time. Moreover, all parameters of the algorithm are estimated on the fly to adapt its behavior to the video content, the encoding configuration, and the target complexity over time. The experimental results prove that our proposal is able to achieve a target complexity reduction of up to 60% with respect to full exploration, with notable accuracy and limited losses in coding performance.
It was compared with a state-of-the-art CC method and shown to achieve a significantly better trade-off between coding complexity and efficiency as well as higher accuracy in reaching the target complexity. Furthermore, a comparison with a state-of-the-art complexity reduction method highlights the advantages of our CC framework. Finally, we show that the proposed method performs well when the target complexity varies over time.

IEEE Transactions on Multimedia (April 2016)

A Low-Power Video Recording System With Multiple Operation Modes for H.264 and Light-Weight Compression
Abstract - An increasing demand for mobile video recording systems makes it important to reduce power consumption and to increase battery lifetime. H.264/AVC compression is widely used in many video recording systems because of its high compression efficiency; however, the complex coding structure of H.264/AVC requires large power consumption. A light-weight video compression (LWC), based on the discrete wavelet transform and set partitioning in hierarchical trees, consumes less power than H.264/AVC compression thanks to its relatively simple coding structure, although its compression efficiency is lower than that of H.264/AVC. This paper proposes a low-power video recording system that combines both the H.264/AVC encoder, with high compression efficiency, and LWC, with low power consumption. The LWC is used to compress video data for temporary storage, while the H.264/AVC encoder is used for permanent storage of data when certain events are detected. For further power reduction, a down-sampling operation is utilized for permanent data storage. For effective use of the two compressions with the down-sampling operation, an appropriate scheme is selected according to the proportion of long-term to short-term storage and the target bitrate. The proposed system reduces power consumption by up to 72.5% compared to a conventional video recording system.

IEEE Transactions on Multimedia (April 2016)

Human Visual System-Based Saliency Detection for High Dynamic Range Content

Abstract - The human visual system (HVS) attempts to select salient areas to reduce cognitive processing efforts. Computational models of visual attention try to predict the most relevant and important areas of videos or images viewed by the human eye.
Such models, in turn, can be applied to areas such as computer graphics, video coding, and quality assessment. Although several models have been proposed, only one of them is applicable to high dynamic range (HDR) image content, and no work has been done for HDR videos. Moreover, the main shortcoming of the existing models is that they cannot simulate the characteristics of the HVS under the wide luminance range found in HDR content. This paper addresses these issues by presenting a computational approach to model the bottom-up visual saliency for HDR input by combining spatial and temporal visual features. An analysis of eye movement data affirms the effectiveness of the proposed model. Comparisons employing three well-known quantitative metrics show that the proposed model substantially improves predictions of visual attention for HDR content.
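The core of many bottom-up saliency models is a center-surround contrast: a pixel is salient to the extent that it differs from its local surround. A toy, stdlib-only sketch of that single ingredient (far simpler than the spatio-temporal HDR model above; the 3x3 surround and the tiny test image are made up for illustration):

```python
# Toy bottom-up saliency sketch: center-surround luminance contrast.
# Each pixel's saliency is the absolute difference between its value and
# the mean of its 3x3 neighbourhood. Generic illustration only.

def saliency(img):
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            neigh = [img[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = abs(img[y][x] - sum(neigh) / len(neigh))
    return out

img = [[0, 0, 0],
       [0, 9, 0],
       [0, 0, 0]]
sal = saliency(img)
# The bright centre pixel stands out most from its dark surround.
assert sal[1][1] == max(v for row in sal for v in row)
```

Full models combine several such feature contrasts (intensity, color, orientation, motion) across scales; for HDR input the luminance range itself must additionally be modeled, which is the point of the paper above.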
IEEE Transactions on Multimedia (April 2016)

Multimodal Personality Recognition in Collaborative Goal-Oriented Tasks

Abstract - Incorporating research on personality recognition into computers, from both a cognitive and an engineering perspective, would facilitate the interactions between humans and machines. Previous attempts at personality recognition have focused on a variety of different corpora (ranging from text to audiovisual data), scenarios (interviews, meetings), channels of communication (audio, video, text), and different subsets of personality traits (out of the five in the Big Five Model). Our study uses simple acoustic and visual nonverbal features extracted from multimodal data, which have been recorded in previously uninvestigated scenarios, and considers all five personality traits, not just a subset. First, we look at the human-machine interaction scenario, where we introduce the display of different “collaboration levels.” Second, we look at the contribution of the human-human interaction (HHI) scenario to the emergence of personality traits. Investigating the HHI scenario creates a stronger basis for future human-agent interactions. Our goal is to study, from a computational approach, the degree of emergence of the five personality traits in these two scenarios. The results demonstrate the relevance of each of the two scenarios when it comes to the degree of emergence of certain traits and the feasibility of automatically recognizing personality under different conditions.
IEEE Transactions on Multimedia (April 2016)

Core Failure Mitigation in Integer Sum-of-Product Computations on Cloud Computing Systems

Abstract - The decreasing mean-time-to-failure estimates in cloud computing systems indicate that multimedia applications running on such environments should be able to mitigate an increasing number of core failures at runtime. We propose a new roll-forward failure-mitigation approach for integer sum-of-product computations, with emphasis on generic matrix multiplication (GEMM) and convolution/cross-correlation (CONV) routines. Our approach is based on the production of redundant results within the numerical representation of the outputs via the use of numerical packing. This differs from all existing roll-forward solutions, which require a separate set of checksum (or duplicate) results. Our proposal imposes a 37.5% reduction in the maximum output bitwidth supported in comparison to integer sum-of-product realizations performed on 32-bit integer representations, which is comparable to the bitwidth requirement of
checksum methods for multiple core failure mitigation. Experiments with state-of-the-art GEMM and CONV routines running on a c4.8xlarge compute-optimized instance of Amazon Web Services Elastic Compute Cloud (AWS EC2) demonstrate that the proposed approach is able to mitigate up to one quad-core failure while achieving processing throughput that is: 1) comparable to that of the conventional, failure-intolerant, integer GEMM and CONV routines, and 2) substantially superior to that of the equivalent roll-forward failure-mitigation method based on checksum streams. Furthermore, when used within an image retrieval framework deployed over a cluster of AWS EC2 spot (i.e., low-cost albeit terminatable) instances, our proposal leads to: 1) a 16%-23% cost reduction against the equivalent checksum-based method and 2) a more than 70% cost reduction against conventional failure-intolerant processing on AWS EC2 on-demand (i.e., higher-cost albeit guaranteed) instances.

IEEE Transactions on Multimedia (April 2016)

Factorization Algorithms for Temporal Psychovisual Modulation Display

Abstract - Temporal psychovisual modulation (TPVM) is a new information display technology which aims to generate multiple visual percepts for different viewers on a single display simultaneously. In a TPVM system, viewers wearing different active liquid crystal (LC) glasses with varying transparency levels can see different images (called personal views). Viewers without LC glasses can also see a semantically meaningful image (called the shared view).
The display frames and weights for the LC glasses in the TPVM system can be computed through nonnegative matrix factorization (NMF) with three additional constraints: the values of images and modulation weights should have upper bounds (i.e., the limited luminance of the display and transparency levels of the LC); the shared view seen without viewing devices should be considered (i.e., the sum of all basis images should be a meaningful image); and the sparsity of modulation weights should be considered due to the material properties of the LC. In this paper, we propose to solve the constrained NMF problem with a modified version of the hierarchical alternating least squares (HALS) algorithm. Through experiments, we analyze the choice of parameters in the setup of the TPVM system. This work serves as a guideline for the practical implementation of a TPVM display system.

IEEE Transactions on Multimedia (April 2016)

Free-Energy Principle Inspired Video Quality Metric and Its Use in Video Coding
Abstract - In this paper, we extend the free-energy principle to video quality assessment (VQA) by incorporating the recent psychophysical study on human visual speed perception (HVSP). A novel video quality metric, namely the free-energy principle inspired video quality metric (FePVQ), is therefore developed and applied to perceptual video coding optimization. The free-energy principle suggests that the human visual system (HVS) can actively predict “orderly” information and avoid “disorderly” information for image perception. Basically, “orderly” is associated with the skeletons and edges of objects, and “disorderly” mostly concerns textures in images. Based on this principle, an image is separated into orderly and disorderly regions, and processed differently in image quality assessment. For videos, visual attention, or fixation, is associated with objects with significant motion according to HVSP, resulting in a motion strength factor in the FePVQ so that the free-energy principle is extended into the spatio-temporal domain for VQA. In addition, we investigate the application of the FePVQ in perceptual rate-distortion optimization (RDO). For this purpose, the FePVQ is realized with low computational cost by using the relative total variation model and the block-wise motion vectors of video coding to simulate the free-energy principle and HVSP, respectively. The experimental results indicate that the proposed FePVQ is highly consistent with HVS perception. The linear correlation coefficient and Spearman's rank-order correlation coefficient are up to 0.8324 and 0.8281 on the LIVE video database. Better perceptual quality of encoded video sequences is achieved by FePVQ-motivated RDO in video coding.
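The two agreement figures quoted above (0.8324 and 0.8281) are the standard way quality metrics are validated against subjective scores: Pearson's linear correlation coefficient and Spearman's rank-order correlation. A stdlib-only sketch of how both are computed, with made-up example scores (real evaluations use databases such as LIVE):

```python
# Pearson's linear correlation and Spearman's rank-order correlation,
# the two agreement measures commonly reported for quality metrics.
# Ties in the data are not handled here, for brevity.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

def spearman(x, y):
    # Spearman = Pearson computed on the ranks of the data.
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank
        return r
    return pearson(ranks(x), ranks(y))

metric_scores = [1.0, 2.0, 3.0, 4.0]   # hypothetical objective scores
subjective    = [1.1, 1.9, 3.2, 3.8]   # hypothetical subjective scores
assert pearson(metric_scores, subjective) > 0.99
assert spearman(metric_scores, subjective) > 0.999
```

Pearson measures linear agreement; Spearman only asks whether the metric ranks videos in the same order as viewers do, so it is insensitive to any monotonic nonlinearity in the metric's scale.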
IEEE Transactions on Multimedia (April 2016)

Holons Visual Representation for Image Retrieval

Abstract - With the enlargement of image scale, conventional local features, such as SIFT, are ineffective for representation or indexing, and more compact visual representations are required. Due to its intrinsic mechanism, the state-of-the-art vector of locally aggregated descriptors (VLAD) has a few limitations. Based on this, we propose a new descriptor named holons visual representation (HVR). The proposed HVR is a self-contained combination of global and local information. It exploits both global characteristics and the statistical information of local descriptors in the image dataset. It also takes advantage of the local features of each image and computes their distribution with respect to the entire local descriptor space. Accordingly, the HVR is computed by a two-layer hierarchical scheme, which splits the
local feature space and obtains raw partitions, as well as the corresponding refined partitions. Then, according to the distances from the centroids of the partition spaces to the local features and their spatial correlation, we assign the local features to their nearest raw partitions and refined partitions to obtain the global description of an image. Compared with VLAD, HVR holds critical structure information and enhances the discriminative power of the individual representation with a small computational cost, while using the same memory overhead. Extensive experiments on several benchmark datasets demonstrate that the proposed HVR outperforms conventional approaches in terms of scalability as well as retrieval accuracy for images with similar intra-local information.

IEEE Transactions on Multimedia (April 2016)

Query-Adaptive Small Object Search Using Object Proposals and Shape-Aware Descriptors

Abstract - While there has been a significant amount of work on object search and image retrieval, the focus has primarily been on establishing effective models for whole images, scenes, and objects occupying a large portion of an image. In this paper, we propose to leverage object proposals to identify small and smooth-structured objects in a large image database. Unlike popular methods that explore a coarse image-level pairwise similarity, the search is designed to exploit similarity measures at the proposal level. An effective graph-based query expansion strategy is designed to assess each of these better-matched proposals against all its neighbors within the same image for precise localization.
Combined with EdgeBoW, a shape-aware feature descriptor, and a set of more insightful edge weights and node-utility measures, the proposed search strategy can handle varying view angles, illumination conditions, deformation, and occlusion efficiently. Experiments performed on a number of benchmark datasets show the powerful and superior generalization ability of this single integrated framework in dealing with both clutter-intensive real-life images and poor-quality binary document images with equal dexterity.

IEEE Transactions on Multimedia (April 2016)

Folksonomy-Based Visual Ontology Construction and Its Applications

Abstract - An ontology hierarchically encodes concepts and concept relationships, and has a variety of applications such as semantic understanding and information retrieval. Previous work
for building ontologies has primarily relied on labor-intensive human contributions or focused on text-based extraction. In this paper, we consider the problem of automatically constructing a folksonomy-based visual ontology (FBVO) from user-generated annotated images. A systematic framework is proposed, consisting of three stages: concept discovery, concept relationship extraction, and concept hierarchy construction. The noise issues of user-generated tags are carefully addressed to guarantee the quality of the derived FBVO. The constructed FBVO finally consists of 139 825 concept nodes and millions of concept relationships, mined from more than 2.4 million Flickr images. Experimental evaluations show that the derived FBVO is of high quality and consistent with human perception. We further demonstrate the utility of the derived FBVO in applications of complex visual recognition and exploratory image search.

IEEE Transactions on Multimedia (April 2016)

Learning Personalized Models for Facial Expression Analysis and Gesture Recognition

Abstract - Facial expression and gesture recognition algorithms are key enabling technologies for human-computer interaction (HCI) systems. State-of-the-art approaches for automatic detection of body movements and analyzing emotions from facial features heavily rely on advanced machine learning algorithms. Most of these methods are designed for the average user, but the “one-size-fits-all” assumption ignores diversity in cultural background, gender, ethnicity, and personal behavior, and limits their applicability in real-world scenarios. A possible solution is to build personalized interfaces, which practically implies learning person-specific classifiers and usually collecting a significant amount of labeled samples for each novel user.
As data annotation is a tedious and time-consuming process, in this paper we present a framework for personalizing classification models which does not require labeled target data. Personalization is achieved by devising a novel transfer learning approach. Specifically, we propose a regression framework which exploits auxiliary (source) annotated data to learn the relation between person-specific sample distributions and the parameters of the corresponding classifiers. Then, when considering a new target user, the classification model is computed by simply feeding the associated (unlabeled) sample distribution into the learned regression function. We evaluate the proposed approach in different applications: pain recognition and action unit detection using visual data, and gesture classification using inertial measurements, demonstrating the generality
of our method with respect to different input data types and basic classifiers. We also show the advantages of our approach in terms of accuracy and computational time with respect to both user-independent approaches and previous personalization techniques.

IEEE Transactions on Multimedia (April 2016)

Scalable Video Event Retrieval by Visual State Binary Embedding

Abstract - With the exponential increase of media data on the web, fast media retrieval is becoming a significant research topic in multimedia content analysis. Among the variety of techniques, learning binary embedding (hashing) functions is one of the most popular approaches that can achieve scalable information retrieval in large databases, and it is mainly used in near-duplicate multimedia search. However, until now most hashing methods have been specifically designed for near-duplicate retrieval at the visual level rather than the semantic level. In this paper, we propose a Visual State Binary Embedding (VSBE) model to encode video frames, which can preserve the essential semantic information in binary matrices, to facilitate fast video event retrieval in unconstrained cases. Compared with other video binary embedding models, one advantage of our proposed VSBE model is that it only needs a limited number of key frames from the training videos for hash function training, so the computational complexity is much lower in the training phase. At the same time, we apply the pair-wise constraints generated from the visual states to sketch the local properties of the events at the semantic level, so accuracy is also ensured. We conducted extensive experiments on the challenging TRECVID MED dataset and have proved the superiority of our proposed VSBE model.
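The general binary-embedding idea behind such retrieval systems can be sketched with the classic random-hyperplane hash: each bit is the sign of the feature vector's dot product with a random direction, and search ranks items by Hamming distance between codes. This stdlib-only sketch illustrates only the generic hashing mechanism, not VSBE's semantically trained, visual-state-constrained embedding; the toy 4-dimensional vectors are made up:

```python
# Generic binary embedding sketch: random-hyperplane hashing.
# Each bit = sign of a dot product with a random Gaussian direction;
# similar vectors tend to agree on most bits (small Hamming distance).
import random

def make_hash(dim, bits, seed=0):
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]
    def embed(v):
        return tuple(int(sum(p * x for p, x in zip(plane, v)) >= 0)
                     for plane in planes)
    return embed

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

embed = make_hash(dim=4, bits=32)
q    = embed([1.0, 0.9, 0.0, 0.1])
near = embed([0.9, 1.0, 0.1, 0.0])    # similar vector
far  = embed([-1.0, 0.0, 1.0, -0.9])  # dissimilar vector
assert hamming(q, near) < hamming(q, far)
```

Hamming distances between short binary codes can be computed with bitwise operations and popcounts, which is what makes hashing-based search scale to very large databases.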
IEEE Transactions on Multimedia (April 2016)

Link Adaptation for High-Quality Uncompressed Video Streaming in 60-GHz Wireless Networks

Abstract - The emerging 60-GHz multigigabit-per-second wireless technology enables the streaming of high-quality “uncompressed” video, which has been impossible with other existing wireless technologies. To support such a resource-hungry uncompressed video streaming service with limited wireless resources, it is necessary to design efficient link adaptation policies that select suitable transmission rates for the 60-GHz wireless channel environment, thus optimizing video quality and resource management. For a proper design of the link adaptation policies, we propose a new metric, called expected peak signal-to-noise ratio (ePSNR), to
numerically estimate the video streaming quality. Using the ePSNR as a criterion, we propose two link adaptation policies with different objectives considering unequal error protection (UEP). The proposed link adaptation policies attempt to 1) maximize the video quality for given wireless resources, or 2) minimize the required wireless resources while meeting the video quality requirement. From the link adaptation policies, we provide a distributed resource management scheme for multiple users to maintain satisfactory video streaming quality. Our extensive simulation results demonstrate that the newly proposed metric, ePSNR, well represents the level of video quality. It is also shown that the proposed link adaptation policies can enhance resource efficiency while achieving acceptable video streaming quality.

IEEE Transactions on Multimedia (April 2016)

Multiview and 3D Video Compression Using Neighboring Block Based Disparity Vectors

Abstract - Reducing the statistical redundancy among different viewpoints, i.e., inter-view redundancy, is a fundamental and critical problem in multiview and three-dimensional (3D) video coding. To exploit the inter-view redundancy, disparity vectors are required to identify pixels of the same objects within two different views; in this way, enhancement coding tools can be efficiently employed as new modes in block-based video codecs to achieve higher compression efficiency. Although disparity can be converted from depth, this is not possible in multiview video coding, since depth information is not considered there. Even when depth information is coded, relying on it breaks the so-called multiview compatibility, wherein texture views can be decoded without depth information.
To resolve this problem, in this paper a neighboring block-based disparity vector derivation (NBDV) method is proposed. The basic concept of NBDV is to derive a disparity vector (DV) for a current block by utilizing the motion information of spatially and temporally neighboring blocks predicted from another view. Through extensive experiments and analysis, it is shown that the proposed NBDV method achieves efficient DV derivation in state-of-the-art video codecs, and it preserves multiview compatibility with relatively low complexity. The proposed method has become an essential part of the 3D video standard extensions of H.264/AVC and HEVC.

IEEE Transactions on Multimedia (April 2016)

Predicting the Performance in Decision-Making Tasks: From Individual Cues to Group Interaction
Abstract - This paper addresses the problem of predicting the performance of decision-making groups. Towards this goal, we evaluate the predictive power of group attributes and discussion dynamics by using automatically extracted features, such as group members' aural and visual cues, interaction between team members, and the influence of each team member, as well as self-reported features such as personality- and perception-related cues, the hierarchical structure of the group, and individual- and group-level task performances. We tackle the inference problem from two angles depending on the way features are extracted: 1) a holistic approach based on the entire meeting, and 2) a sequential approach based on thin slices of the meeting. In the former, key factors affecting group performance are identified and the prediction is achieved by support vector machines. As for the latter, we compare and contrast the classification performance of a novel influence-model-based classifier with that of a hidden Markov model (HMM). Experimental results indicate that group looking cues and influence cues are major predictors of group performance, and that the influence model outperforms the HMM in almost all experimental conditions. We also show that combining classifiers covering unique aspects of the data improves classification performance.

IEEE Transactions on Multimedia (April 2016)

Comparison and Evaluation of Sonification Strategies for Guidance Tasks

Abstract - This paper aims to reveal the efficiency of sonification strategies in terms of rapidity, precision, and overshooting in the case of a one-dimensional guidance task.
The sonification strategies are based on the four main perceptual attributes of a sound (pitch, loudness, duration/tempo, and timbre) and are classified according to whether one or several auditory references are present. Perceptual evaluations are used to place the strategies in a precision/rapidity space and enable prediction of user behavior for a chosen sonification strategy. The evaluation of sonification strategies constitutes a first step toward general guidelines for sound design in interactive multimedia systems that involve guidance issues.

IEEE Transactions on Multimedia (April 2016)

3D Ear Identification Using Block-wise Statistics based Features and LC-KSVD

Abstract - Biometric authentication has proven to be an effective method for recognizing a person's identity with high confidence. In this field, the use of 3D ear shape is a recent trend. As a biometric identifier, the ear has several inherent merits. However, although a great
deal of effort has been devoted, there is still considerable room for improvement in developing a highly effective and efficient 3D ear identification approach. In this paper, we attempt to fill this gap to some extent by proposing a novel 3D ear classification scheme that makes use of the label-consistent K-SVD (LC-KSVD) framework. As an effective supervised dictionary learning algorithm, LC-KSVD learns a single compact discriminative dictionary for sparse coding and a multi-class linear classifier simultaneously. To use the LC-KSVD framework, one key issue is how to extract feature vectors from 3D ear scans. To this end, we propose a block-wise statistics-based feature extraction scheme. Specifically, we divide a 3D ear ROI into uniform blocks and extract a histogram of surface types from each block; the histograms from all blocks are then concatenated to form the desired feature vector. Feature vectors extracted in this way are highly discriminative and are robust to minor misalignment between samples. Experiments demonstrate that our approach can achieve better recognition accuracy than other state-of-the-art methods. More importantly, its computational complexity is extremely low, making it quite suitable for large-scale identification applications. Matlab source code is publicly available online at http://sse.tongji.edu.cn/linzhang/LCKSVDEar/LCKSVDEar.htm.

IEEE Transactions on Multimedia (May 2016)

Sketch-based Image Retrieval by Salient Contour Reinforcement

Abstract - This paper presents a sketch-based image retrieval algorithm. One of the main challenges in sketch-based image retrieval (SBIR) is to measure the similarity between a sketch and an image. To tackle this problem, we propose an SBIR approach based on salient contour reinforcement.
In our approach, we divide the image contour into two types. The first is the global contour map. The second, called the salient contour map, helps find the objects in images that are similar to the query. In addition, based on the two contour maps, we propose a new descriptor, namely the angular radial orientation partitioning (AROP) feature. It fully utilizes the edge pixels' orientation information in the contour maps to identify spatial relationships. Our AROP feature, based on the two candidate contour maps, is both efficient and effective at discovering false matches of local features between sketches and images, and can greatly improve retrieval performance. A retrieval application based on this algorithm has been built. Experiments on an image dataset with 0.3 million images show the effectiveness of the
proposed method, and comparisons with other algorithms are also given. Compared to the baseline, the proposed method achieves 10% higher precision in the top 5 results.

IEEE Transactions on Multimedia (May 2016)

Democratic Diffusion Aggregation for Image Retrieval

Abstract - Content-based image retrieval is an important research topic in the multimedia field. In large-scale image search using local features, image features are encoded and aggregated into a compact vector to avoid indexing each feature individually. In the aggregation step, sum-aggregation is widely used in much existing work and demonstrates promising performance. However, it is based on a strong and implicit assumption that the local descriptors of an image are independently and identically distributed in descriptor space and the image plane. To address this problem, we propose a new aggregation method named democratic diffusion aggregation with weak spatial context embedded. The main idea of our aggregation method is to re-weight the embedded vectors before sum-aggregation by considering the relevance among local descriptors. Different from previous work, by conducting a diffusion process on the improved kernel matrix, we calculate the weighting coefficients more efficiently without any iterative optimization. Besides, considering the relevance of local descriptors from different images, we also discuss an efficient query fusion strategy which uses the initial top-ranked image vectors to enhance the retrieval performance. Experimental results show that our aggregation method exhibits much higher efficiency (about ×14 faster) and better retrieval accuracy compared with previous methods, and the query fusion strategy consistently improves retrieval quality.
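The re-weighting-before-sum-aggregation idea can be sketched in its simplest form: down-weight each local descriptor in proportion to its total similarity to the others, so that a burst of redundant descriptors does not dominate the aggregate. This stdlib-only sketch is a simplified stand-in for the paper's diffusion-based weighting (the toy 2-D descriptors are made up):

```python
# Sketch of re-weighted sum-aggregation: each descriptor's weight is the
# inverse of its total (non-negative) similarity to all descriptors,
# including itself, so mutually redundant descriptors share one "vote".
# Simplified illustration, not the paper's diffusion process.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def democratic_aggregate(descs):
    weights = []
    for d in descs:
        total_sim = sum(max(dot(d, e), 0.0) for e in descs)  # includes self
        weights.append(1.0 / total_sim)
    dim = len(descs[0])
    return [sum(w * d[k] for w, d in zip(weights, descs)) for k in range(dim)]

# Three near-duplicate descriptors plus one distinct descriptor:
descs = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
agg = democratic_aggregate(descs)
# Plain sum-aggregation would give [3.0, 1.0]; re-weighting balances the
# burst of duplicates against the single distinct descriptor.
assert abs(agg[0] - 1.0) < 1e-9 and abs(agg[1] - 1.0) < 1e-9
```

The naive per-descriptor weighting above is quadratic in the number of descriptors; the paper's contribution is computing such coefficients efficiently via a diffusion process on the kernel matrix, without iterative optimization.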
IEEE Transactions on Multimedia (May 2016)

Tag based Image Search by Social Re-Ranking

Abstract - Social media sharing websites like Flickr allow users to annotate images with free tags, which significantly contribute to the development of web image retrieval and organization. Tag-based image search is an important method to find images contributed by social users on such websites. However, making the top-ranked results both relevant and diverse is challenging. In this paper, we propose a social re-ranking system for tag-based image retrieval that considers both image relevance and diversity. We aim at re-ranking images according to their visual information, semantic information, and social clues. The initial results include images contributed by different social users. Usually each user contributes several images. First, we sort these images by inter-user re-ranking. Users with higher contributions to the given query rank higher.
Then we sequentially apply intra-user re-ranking to each ranked user's image set, selecting only the most relevant image from each user's set. These selected images compose the final retrieved results. We build an inverted index structure for the social image dataset to accelerate the searching process. Experimental results on a Flickr dataset show that our social re-ranking method is effective and efficient.
IEEE Transactions on Multimedia (May 2016)

Learning Geographical Hierarchy Features via a Compositional Model
Abstract - Image location prediction estimates the geolocation at which an image was taken, which is important for many image applications, such as image retrieval, image browsing and organization. Since a social image contains heterogeneous content, such as visual and textual content, effectively incorporating this content to predict location is nontrivial.
Moreover, it is observed that image content patterns and the locations where they may appear correlate hierarchically. Traditional image location prediction methods mainly adopt a single-level architecture and assume that images are independently distributed in geographical space, which does not directly capture this hierarchical correlation. In this paper, we propose a Geographically Hierarchical Bi-modal Deep Belief Network model (GH-BDBN), a compositional learning architecture that integrates a multi-modal deep learning model with a non-parametric hierarchical prior model. GH-BDBN learns a joint representation capturing the correlations among different types of image content using a bi-modal DBN, with a geographically hierarchical prior over the joint representation to model the hierarchical correlation between image content and location. Then, an efficient inference algorithm is proposed to learn the parameters and the hierarchical structure of geographical locations. Experimental results demonstrate the superiority of our model for image location prediction.
IEEE Transactions on Multimedia (May 2016)

Semantic Discriminative Metric Learning for Image Similarity Measurement
Abstract - With the arrival of the multimedia era, multimedia data has replaced textual data for transferring information in various fields. As an important form of multimedia data, images are widely used in many applications, such as face recognition and image classification. Therefore, how to accurately annotate each image in a large set of images is of vital importance but challenging. To perform these tasks well, it is crucial to extract suitable features that characterize the visual contents of images and to learn an appropriate distance metric to measure the similarities between images.
Unfortunately, existing feature descriptors, such as the histogram of oriented gradients, local binary patterns and color histograms, capture the visual character of images but lack the ability to distinguish semantic information. Similarities between such features cannot reflect the real category correlations due to the well-known semantic gap. To solve this problem, this paper proposes a regularized distance metric framework called Semantic Discriminative Metric Learning (SDML). SDML combines the geometric mean with normalized divergences and simultaneously separates images from different classes. The learned distance metric treats images from all classes equally, and distinctions between similar classes with entirely different semantic contents are emphasized. This procedure
ensures consistency between dissimilarities and semantic distinctions and avoids the inaccurate similarities incurred by unbalanced locations of samples. Various experiments on benchmark image datasets show the excellent performance of the proposed method.
IEEE Transactions on Multimedia (May 2016)

6-DOF Image Localization from Massive Geo-tagged Reference Images
Abstract - 6-DOF (degrees of freedom) image localization, which aims to calculate the spatial position and rotation of a camera, is a challenging problem for most location-based services. In existing approaches, this problem is often tackled by finding matches between 2D image points and 3D structure points and deriving the location information via the direct linear transformation algorithm. However, as these 2D-to-3D approaches need to reconstruct the 3D structure points of the scene, they may not be flexible enough to exploit massive and ever-growing geo-tagged data. To this end, this paper presents a novel approach to 6-DOF image localization that fuses candidate poses relative to reference images. In this approach, we propose to localize an input image according to the position and rotation information of multiple geo-tagged images retrieved from a reference dataset. From the reference images, an efficient relative pose estimation algorithm derives a set of candidate poses for the input image. Each candidate pose encodes the relative rotation and direction of the input image with respect to a specific reference image. Finally, these candidate poses are fused by minimizing a well-defined geometric error, so that the 6-DOF location of the input image is effectively derived. Experimental results show that our method obtains satisfactory localization accuracy.
In addition, the proposed relative pose estimation algorithm is much faster than existing work.
IEEE Transactions on Multimedia (May 2016)

Delay-Optimized Video Traffic Routing in Software-Defined Interdatacenter Networks
Abstract - Many video streaming applications operate their geo-distributed services in the cloud, taking advantage of the superior connectivity between datacenters to push content closer to users or to relay live video traffic between end users at a higher throughput. In the meantime, inter-datacenter networks also carry high volumes of other types of traffic, including service replication and data backups, e.g., for storage and email services. It is an important research topic to optimally engineer and schedule inter-datacenter traffic, taking into account the stringent latency requirements of video flows when they are transmitted along inter-datacenter links shared with
other types of traffic. Since inter-datacenter networks are usually overprovisioned, unlike prior work that mainly aims to maximize link utilization, we propose a delay-optimized traffic routing scheme that explicitly differentiates path selection for different sessions according to their delay sensitivities, leading to a software-defined inter-datacenter networking overlay implemented at the application layer. We show that our solution can yield sparse path selection by solving only linear programs and thus, in contrast to prior traffic engineering solutions, does not lead to overly fine-grained traffic splitting, further reducing packet resequencing overhead and the number of forwarding rules to be installed in each forwarding unit. Real-world experiments based on a deployment across six globally distributed Amazon EC2 datacenters show that our system can effectively prioritize and improve the delay performance of inter-datacenter video flows at a low cost.
IEEE Transactions on Multimedia (May 2016)

Multiple Human Identification and Cosegmentation: A Human-Oriented CRF Approach with Poselets
Abstract - Jointly localizing, identifying and extracting humans with consistent appearance from a personal photo stream is an important problem with wide applications. The strong variations in foreground and background and the irregularly occurring foreground humans make this realistic problem challenging. Inspired by advances in object detection, scene understanding and image cosegmentation, in this paper we explore explicit constraints to label and segment human objects rather than non-human objects and "stuff". We refer to this problem as Multiple Human Identification and Cosegmentation (MHIC).
To identify specific human subjects, we propose an efficient human instance detector that combines an extended color line model with a poselet-based human detector. Moreover, to capture high-level human shape information, a novel soft shape cue is proposed: it is initialized by the human detector, enhanced through a generalized geodesic distance transform, and finally refined with a joint bilateral filter. We also propose to capture the rich feature context around each pixel using an adaptive cross-region data structure, which gives higher discriminative power than single-pixel estimation. The high-level object cues from the detector and the shape are then integrated with the low-level pixel cues and mid-level contour cues into a principled conditional random field (CRF) framework, which can be efficiently solved using fast graph
cut algorithms. We evaluate our method on the newly created NTU-MHIC human dataset, which contains 351 images with manually annotated ground-truth segmentations. Both visual and quantitative results demonstrate that our method achieves state-of-the-art performance on the MHIC task.
IEEE Transactions on Multimedia (May 2016)

Game Theoretic Resource Allocation in Media Cloud with Mobile Social Users
Abstract - Due to the rapid increase in both the population of mobile social users and the demand for quality of experience (QoE), providing mobile social users with satisfactory multimedia services has become an important issue. The media cloud has been shown to be an efficient solution to this issue, allowing mobile social users to connect to it through a group of distributed brokers. However, as the resources in a media cloud are limited, how to allocate them among the media cloud, brokers and mobile social users becomes a new challenge. Therefore, in this paper, we propose a game-theoretic resource allocation scheme in which the media cloud allocates resources to mobile social users through brokers. First, a framework for resource allocation among the media cloud, brokers and mobile social users is presented: the media cloud dynamically determines the price of resources and allocates them to brokers, while each mobile social user selects a broker through which to connect to the media cloud, adjusting his strategy to maximize his revenue based on the social features of the community. Next, we formulate the interactions among the media cloud, brokers and mobile social users as a four-stage Stackelberg game. In addition, using backward induction, we propose an iterative algorithm that implements the proposed scheme and obtains the Stackelberg equilibrium.
Finally, simulation results show that each player in the game can obtain its optimal strategy at a stable Stackelberg equilibrium.
IEEE Transactions on Multimedia (May 2016)

DPcode: Privacy-Preserving Frequent Visual Patterns Publication on Cloud
Abstract - Nowadays, the cloud has become a promising platform for multimedia data processing and sharing. Many institutes and companies plan to outsource and share their large-scale video and image datasets on the cloud for scientific research and public interest. Among various video applications, the discovery of frequent visual patterns over graphical data is an exploratory and important technique. However, privacy concerns over the leakage of sensitive information contained in the videos and images impede its further implementation. Although the frequent visual
patterns mining (FVPM) algorithm aggregates summaries over individual frames and seems to pose no privacy threat, private information contained in individual frames may still be leaked from the statistical results. In this paper, we study the problem of privacy-preserving publication of FVPM results for graphical data on the cloud. We propose the first differentially private frequent visual patterns mining algorithm for graphical data, named DPcode. We propose a novel mechanism that integrates privacy-preserving visual word conversion with a differentially private mechanism under the noise allocation strategy of the sparse vector technique. The optimized algorithms properly allocate the privacy budgets among the different phases of the FVPM algorithm over images and reduce the corresponding data distortion. Extensive experiments are conducted on datasets commonly used in visual mining algorithms. The results show that our approach achieves high utility while satisfying a practical privacy requirement.
IEEE Transactions on Multimedia (May 2016)

Audio recapture detection with convolutional neural networks
Abstract - In this work, we investigate how features can be effectively learned by deep neural networks for audio forensic problems. By providing preliminary feature preprocessing based on Electric Network Frequency (ENF) analysis, we propose a convolutional neural network (CNN) for training and classification of genuine and recaptured audio recordings. Hierarchical representations that contain different levels of detail of the ENF components are learned by the deep neural network and can be used for further classification. The proposed method works on small audio clips of 2 seconds' duration, where state-of-the-art methods may fail.
Experimental results demonstrate that the proposed network yields high detection accuracy with each ENF harmonic component represented as a single-channel input. The performance can be further improved by a combined input representation that incorporates both the fundamental ENF and its harmonics. The convergence properties of the network and the effect of analysis windows of various sizes are also studied. A performance comparison against the support tensor machine demonstrates the advantage of using a CNN for the task of audio recapture detection. Moreover, visualization of the intermediate feature maps provides some insight into what the deep neural networks actually learn and how they make decisions.
IEEE Transactions on Multimedia (May 2016)

A Context-aware Framework for Reducing Bandwidth Usage of Mobile Video Chats
Abstract - Mobile video chat apps offer users an approachable way to communicate with others. As high-speed 4G networks are deployed worldwide, the number of mobile video chat app users keeps increasing. However, video chatting on mobile devices raises financial concerns for users, since streaming video demands high bandwidth and can use up a large amount of data in dozens of minutes. Lowering the bandwidth usage of mobile video chats is challenging, since video quality may be compromised. In this paper, we attempt to address this challenge. Technically, we propose a context-aware frame-rate adaptation framework named LBVC (Low-bandwidth Video Chat). It follows a sender-receiver cooperative principle that smartly handles the trade-off between lowering bandwidth usage and maintaining video quality. We implement LBVC by modifying an open-source app, Linphone, and evaluate it with both objective experiments and subjective studies.
IEEE Transactions on Multimedia (May 2016)

Resource Allocation With Video Traffic Prediction in Cloud-Based Space Systems
Abstract - This paper considers resource allocation problems for video transmission in space-based information networks. The queueing system analyzed in this study consists of multiple users and a single server. The server is operated as a cloud that senses the traffic arrivals at each user's queue and then allocates the transmission resources and service rates for the users. The objectives are to configure the system over time so as to minimize its time-average cost and to minimize the waiting time of packets after they enter the queue. Meanwhile, the constraints on the queue stability of the system must be satisfied.
In this paper, we introduce a predictive backpressure algorithm that incorporates future arrivals, within a certain prediction window, into resource allocation decisions about which packets to serve first. In addition, we design a multiresolution wavelet-decomposition-based backpropagation network for the prediction of video traffic, which exhibits the long-range dependence property. Simulation results indicate that the delay of the queueing system can be reduced through this prediction-based resource allocation, and that the proposed prediction system improves the prediction accuracy for video traffic.
IEEE Transactions on Multimedia (May 2016)

SALIC: Social Active Learning for Image Classification
Abstract - In this paper we present SALIC, an active learning method that is placed in the context of social networks and focuses on selecting the samples that are most appropriate for expanding the training set of a binary classifier. The process of active learning can be fully automated in this social context by replacing the human oracle with user-tagged images obtained from social networks. However, the noisy nature of user-contributed tags adds further complexity to the problem of sample selection, since, apart from their informativeness (i.e., how much they are expected to inform the classifier if we knew their labels), our confidence about their actual content (i.e., how certain the oracle is in its decision about the contents of an image) should also be maximized. The main contribution of this work is a probabilistic approach for jointly maximizing these two quantities with a view to automating the process of active learning. Based on this approach, the training set is expanded with the samples that maximize the joint probability of selection given their informativeness and our confidence in their true content. In the examined noisy context, the oracle's confidence is necessary to provide a context-based indication of the images' true contents, while the samples' informativeness is required to reduce the computational complexity and minimize the mistakes of the unreliable oracle. We experimentally demonstrate the validity and superiority of SALIC over various baselines and state-of-the-art methods. In addition, we show that SALIC allows us to select training data as effectively as typical active learning, without the cost of manual annotation.
Finally, we argue that the speed-up achieved when learning actively in this social context (where labels can be obtained without the cost of human annotation) is necessary to cope with the continuously growing requirements of large-scale applications. In this respect, we prove experimentally that SALIC requires 10 times less training data to reach exactly the same performance as a straightforward, informativeness-agnostic learning approach.
IEEE Transactions on Multimedia (May 2016)

Efficient Image Sharpness Assessment Based on Content Aware Total Variation
Abstract - State-of-the-art sharpness assessment methods are mostly based on edge width, gradient, high-frequency energy or pixel intensity variation. Such methods give little consideration to image content variation in conjunction with sharpness assessment, which makes the resulting sharpness metrics less effective across images with different content. In this paper, we propose an
efficient no-reference image sharpness assessment method called Content Aware Total Variation (CATV), which takes into account the importance of image content variation in sharpness measurement. By parameterizing the image total variation (TV) statistics with a generalized Gaussian distribution (GGD), the standard deviation identifies the sharpness measure and the shape parameter indicates image content variation. However, the standard deviation is content dependent: it differs among regions with strong edges, high-frequency textures, low-frequency textures, and blank areas. By incorporating the shape parameter to moderate the standard deviation, we propose a content-aware sharpness metric. Experimental results show that the proposed method correlates highly with the human visual system and achieves better sharpness assessment results than state-of-the-art techniques on the blurred subsets of the LIVE, TID2008, CSIQ and IVC databases. Our method also has very low computational complexity, making it suitable for online applications. The correlations with the subjective scores of the four databases and a statistical significance analysis show that our method outperforms previous techniques.
IEEE Transactions on Multimedia (May 2016)

SUPPORT OFFERED TO REGISTERED STUDENTS:
1. IEEE base paper.
2. Review material as per the individual's university guidelines.
3. Future enhancement.
4. Assistance in answering all critical questions.
5. Training on the programming language.
6. Complete source code.
7. Final report / document.
8. International conference / international journal publication on your project.
FOLLOW US ON FACEBOOK @ TSYS Academic Projects