This document provides a literature survey on SIFT-based video watermarking. It discusses several interest point detectors such as Harris corner detector, scale invariant feature transform (SIFT) detector, Harris 3D detector, n-SIFT, and MoSIFT. These detectors aim to identify stable feature points in videos that are invariant to geometric transformations. The document also provides an overview of digital watermarking techniques and applications of SIFT for image watermarking. It discusses challenges in video watermarking such as resisting geometric attacks and collusion. The trends in video watermarking include extending techniques from still images and exploiting video compression formats. The document aims to explore applying SIFT to video watermarking.
Literature Survey on Interest Points based Watermarking
Multimedia System (CS 569)
Literature Survey
On
SIFT based Video Watermarking
Group Members
Priyatham Bollimapalli 10010148
Pydi Peddigari Venkat Sai 10010149
Pasumarthi Venkata Sai Dileep 10010180
Contents
1. Interest Point Detectors
1.1. Motivation
1.2. Harris Corner Detector
1.3. Scale Invariant Feature Point Detector
1.4. Harris 3D
1.5. n-SIFT
1.6. MoSIFT
1.7. Discussion
2. An Overview of Watermarking Techniques
2.1. Introduction
2.2. Digital Watermarking
2.3. Classifications of Watermarking
3. Application of SIFT in Image Watermarking
3.1. Introduction
3.2. Local Invariant Features
3.3. Watermarking Scheme
3.4. Other related works
4. Video Watermarking
4.1. Introduction
4.2. Applications of watermarking video content
4.3. Challenges in video watermarking
4.3.1. Various nonhostile video processings
4.3.2. Resilience against collusion
4.3.3. Real-time watermarking
4.4. The major trends in video watermarking
4.4.1. From still image to video watermarking
4.4.2. Integration of the temporal dimension
4.4.3. Exploiting the video compression formats
4.5. Discussion
5. Application of SIFT in Video Watermarking
6. Bibliography
1. Interest Point Detectors
1.1. Motivation
With the widespread distribution of digital information over the World Wide Web
(WWW), the protection of intellectual property rights has become increasingly
important. Digital information, which includes still images, video, audio and text,
can be copied without loss of quality and distributed efficiently. This ease of
reproduction, retransmission and even manipulation allows a pirate (a
person or organization) to violate the copyright of the real owner.
Digital watermarking is expected to be a perfect tool for protecting intellectual
property rights. The ideal properties of a digital watermark include
imperceptibility and robustness. Imperceptibility means that the watermarked data
should retain the quality of the original as closely as possible. Robustness refers
to the ability to detect the watermark after various types of intentional or
unintentional alterations (so-called attacks).
Robustness of the watermark against geometrical attacks is a major problem in the
field of watermarking. Even a minor geometrical manipulation of the watermarked
image dramatically reduces the ability of the watermark detector to detect the
watermark. Moreover, because of the diversity of applications and devices used by
consumers today, video streams are adapted to the requirements of various
communication channels and user-end display devices. Such streams suffer from
content adaptation attacks, in which scaling of the resolution, quality and frame
rate of the video corrupts the embedded data and hampers watermark detection.
Thus, there is a need to identify stable interest/feature points in videos
which are invariant to rotation, scaling, translation, and partial illumination
changes. These points can be used as the reference locations for both the
watermark embedding and detection processes.
Feature point detectors are used to extract the feature points. This section describes
two of the most popular feature point detectors today, namely the Harris corner
detector and the scale invariant feature point detector. Their extensions to videos in the
form of 3D Harris, n-SIFT and MoSIFT are discussed. The drawbacks of each
technique and the scope for further improvements are also discussed.
1.2. Harris Corner Detector
The ubiquitous Harris Corner detector starts with the assumption that the corners
are the interest points in any image. A corner is defined as a point at which the
direction of the boundary of an object changes abruptly. A corner can therefore be
recognized by looking through a small window: shifting the window in any direction
should give a large change in intensity.
The salient features are
- The intensity variation E produced by shifting the window is approximated
analytically by a Taylor series expansion about the origin of the shifts.
- A circular Gaussian window is used to weight the intensity variations
in the neighborhood, so that variations closer to the center of the
window are assigned higher importance while the weighting stays smooth over
the entire window.
- Instead of considering only the minimum of Ex,y along the directions of shift, its
variation with the direction of the window shift is considered. The
intensity variation is expressed in matrix form and the R-measure is calculated,
which measures the change in intensity in both the x and y directions.
- Corners are detected as the local maxima of R over the 8-neighbourhood of a
pixel.
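The steps listed above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the response is taken in the commonly used form R = det(M) - k * trace(M)^2 with k = 0.04, and the function names, the Gaussian width sigma and the relative threshold rel_threshold are hypothetical choices made for the example.

```python
import numpy as np

def smooth(img, sigma=1.0):
    """Separable Gaussian smoothing (the circular Gaussian window)."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    k /= k.sum()
    p = np.pad(img, radius, mode="edge")
    p = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, p)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, p)

def harris_response(img, sigma=1.0, k=0.04):
    """R-measure from the Gaussian-weighted matrix of gradient products."""
    img = np.asarray(img, dtype=float)
    iy, ix = np.gradient(img)          # derivatives along rows (y) and columns (x)
    sxx = smooth(ix * ix, sigma)
    syy = smooth(iy * iy, sigma)
    sxy = smooth(ix * iy, sigma)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2        # assumed form of the R-measure

def harris_corners(img, sigma=1.0, k=0.04, rel_threshold=0.1):
    """Pixels where R is a strict maximum over the 8-neighbourhood."""
    r = harris_response(img, sigma, k)
    h, w = r.shape
    p = np.pad(r, 1, mode="constant", constant_values=-np.inf)
    neighbours = [p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                  if (dy, dx) != (0, 0)]
    is_peak = r > np.max(neighbours, axis=0)
    return np.argwhere(is_peak & (r > rel_threshold * r.max()))

# Usage: a bright square on a dark background has four corners.
square = np.zeros((40, 40))
square[10:30, 10:30] = 1.0
corners = harris_corners(square)
```

Along a straight edge only one gradient direction is strong, so det(M) is close to zero and R becomes negative; only where both directions vary does R become large and positive, which is why the threshold keeps corners and rejects edges.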
1.3. Scale Invariant Feature Point Detector
The SIFT detector developed by Lowe involves the following step-by-step process.
Scale space peak selection:
The scale-space representation is a set of images represented at different levels of
resolution. The different levels of resolution are created by convolving the
Gaussian kernel G(σ) with the image I(x1, x2):
Is(x1, x2, σ) = G(σ) * I(x1, x2)
where * is the convolution operation in x1 and x2.
The standard deviation σ of the Gaussian kernel is referred to as the scale parameter.
The characteristic scale is a feature relatively independent of the image scale. The
characteristic scale can be defined as the scale at which the result of a differential
operator is maximized. The Laplacian obtains the highest percentage of correct scale
detections.
The feature points are detected through a staged filtering approach that identifies
stable points in the scale-space. To detect the stable keypoint locations in scale
space efficiently, the scale-space extrema of the difference-of-Gaussian function
(DDG(x1, x2, σ)) convolved with the image are used. To detect the local maxima and
minima of DDG(x1, x2, σ), each point is compared with its 8 neighbors at the same
scale and with the 9 neighbors in each of the scales above and below (26 neighbors
in total). If its value is the minimum or maximum of all these points, the point is
an extremum.
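The extremum test just described can be sketched as follows. This is a simplified illustration under stated assumptions: a single octave without downsampling, a base scale sigma0 = 1.6 with a multiplicative step of sqrt(2) between levels, and a hypothetical contrast threshold; the function names are chosen for the example only.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian smoothing, i.e. G(sigma) * I."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    k /= k.sum()
    p = np.pad(np.asarray(img, dtype=float), radius, mode="edge")
    p = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, p)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, p)

def dog_stack(img, sigma0=1.6, step=2 ** 0.5, levels=5):
    """Difference-of-Gaussian images between adjacent scales (assumed scales)."""
    blurred = [gaussian_blur(img, sigma0 * step ** i) for i in range(levels)]
    return np.stack([blurred[i + 1] - blurred[i] for i in range(levels - 1)])

def dog_extrema(dog, threshold=0.01):
    """Points that are strict extrema among their 26 scale-space neighbours."""
    n_scales, h, w = dog.shape
    points = []
    for s in range(1, n_scales - 1):
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                v = dog[s, y, x]
                if abs(v) < threshold:
                    continue  # skip very weak responses
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                # 8 neighbours at the same scale plus 9 above and 9 below
                is_max = (cube >= v).sum() == 1
                is_min = (cube <= v).sum() == 1
                if is_max or is_min:
                    points.append((s, y, x))
    return points

# Usage: a single Gaussian blob should yield an extremum at its center.
yy, xx = np.mgrid[0:33, 0:33]
blob = np.exp(-((yy - 16) ** 2 + (xx - 16) ** 2) / (2 * 2.7 ** 2))
points = dog_extrema(dog_stack(blob))
```

The scale index s of a detected point indicates which pair of adjacent scales produced the strongest response, which is how the detector associates a characteristic scale with each location.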
Key point localization:
Once the candidate points are found, points with low contrast and poorly
localized points are removed by measuring the stability of each feature point at its
location and scale. Similar to the Harris corner detector, this is done using the
Hessian matrix and a Taylor series expansion.
Orientation Assignment:
Orientation of each feature point is assigned by considering the local image
properties. The keypoint descriptor can then be represented relative to this
orientation, achieving invariance to rotation.
An orientation histogram is formed from the gradient orientations of sample points
within a region (a circular window) around the keypoint. Each sample added to the
histogram is weighted by its gradient magnitude and by a Gaussian-weighted
circular window. Peaks in the orientation histogram correspond to dominant
orientations of the local gradients. The highest peak, and any other local peak
within 80% of its height, is used to create a keypoint with that orientation. Some
points will therefore be assigned multiple orientations.
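The histogram and the 80% rule can be illustrated with a short sketch. It assumes 36 bins of 10 degrees each (the bin count is not stated in the text above and is taken here from Lowe's paper) and that each sample already carries its Gaussian window weight; `dominant_orientations` is a hypothetical name.

```python
def dominant_orientations(samples, num_bins=36, peak_ratio=0.8):
    """samples: iterable of (gradient_magnitude, angle_in_degrees, window_weight).
    Returns the bin-centre angles of every local peak that reaches at least
    peak_ratio times the highest peak."""
    hist = [0.0] * num_bins
    bin_width = 360.0 / num_bins
    for magnitude, angle_deg, weight in samples:
        index = int((angle_deg % 360.0) / bin_width) % num_bins
        hist[index] += magnitude * weight
    top = max(hist)
    orientations = []
    for i, value in enumerate(hist):
        left = hist[i - 1]                 # wraps around for i == 0
        right = hist[(i + 1) % num_bins]
        if value > 0 and value >= peak_ratio * top and value > left and value > right:
            orientations.append((i + 0.5) * bin_width)
    return orientations


samples = [(1.0, 5.0, 1.0), (0.9, 95.0, 1.0), (0.3, 200.0, 1.0)]
print(dominant_orientations(samples))  # prints [5.0, 95.0]
```

In this toy input the second peak reaches 90% of the first and so yields a second orientation, while the third peak (30%) is ignored.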
Key point descriptor:
To compute the descriptor, the local gradient data is used. To achieve rotation
invariance, the coordinates of the descriptor and the gradient orientations are
rotated to line up with the orientation of the keypoint. The gradient magnitude is
weighted by a Gaussian function whose width depends on the keypoint scale. This
data is then used to create a set of histograms over a window centered on the
keypoint. SIFT uses a set of 16 histograms, arranged in a 4x4 grid, each with 8
orientation bins. This gives a feature vector containing 4x4x8 = 128 elements.
[Figure: a 2x2 array of orientation histograms]
The numbers found are stored in a vector. The vector is then normalized to unit
length to account for contrast changes and to obtain illumination invariance. To
reduce the effect of non-linear intensity transforms, each element of the unit vector
is clipped to a maximum of 0.2, i.e. the influence of large gradient magnitudes is
reduced. The vector is then renormalized to unit length.
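The normalize, clip, renormalize sequence is easy to state in code. The following is a minimal sketch on a plain list of non-negative histogram values; the helper names are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit Euclidean length (returned unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def sift_normalize(vec, clip=0.2):
    """Unit-normalize, clip each element at `clip`, then renormalize."""
    unit = normalize(vec)
    clipped = [min(v, clip) for v in unit]
    return normalize(clipped)


descriptor = sift_normalize([10.0, 1.0, 1.0, 1.0])
print(descriptor)
```

In this example the first element dominates the raw vector; after clipping and renormalization its share is reduced, which is the intended damping of large gradient magnitudes.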
Key point matching: The descriptors of the two images to be compared are
calculated. For each keypoint, the nearest neighbor, i.e. the keypoint in the other
image with minimum Euclidean distance between descriptors, is found. To reject
ambiguous matches, SIFT accepts a match only if the ratio of the distance to the
best match to the distance to the second-best match is less than 0.8.
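A brute-force version of this matching step can be sketched as below. It follows Lowe's formulation, in which a match is kept when the nearest distance divided by the second-nearest distance is below the threshold; the function name is hypothetical and the exhaustive search stands in for the approximate nearest-neighbor search used in practice.

```python
import math


def match_descriptors(desc_a, desc_b, max_ratio=0.8):
    """Return (index_in_a, index_in_b) pairs that pass the distance ratio test."""
    matches = []
    for i, a in enumerate(desc_a):
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs a second-best candidate
        (best, j), (second, _) = dists[0], dists[1]
        if second > 0 and best / second < max_ratio:
            matches.append((i, j))
    return matches


image_a = [[0.0, 0.0], [5.0, 5.0]]
image_b = [[0.0, 0.1], [10.0, 10.0], [5.2, 5.0], [4.8, 5.0]]
print(match_descriptors(image_a, image_b))  # prints [(0, 0)]
```

The second descriptor of `image_a` has two almost equally close candidates, so its match is discarded as ambiguous.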
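1.4. Harris 3D
The Harris 3-D space-time interest point detector was developed by Ivan Laptev. It
uses the 3-D gradient structure along with a scale-space representation to find
interest points.
The video f(x,y,t) is convolved with a 3D Gaussian g with spatial variance $\sigma^2$
and temporal variance $\tau^2$ to give
$L(x, y, t; \sigma^2, \tau^2) = g(x, y, t; \sigma^2, \tau^2) * f(x, y, t)$
The 3D Harris matrix is then computed similarly to the 2-D Harris matrix, from the
spatio-temporal derivatives of L averaged with a Gaussian window, to obtain
$\mu = g(\cdot; \sigma_i^2, \tau_i^2) * \begin{pmatrix} L_x^2 & L_x L_y & L_x L_t \\ L_x L_y & L_y^2 & L_y L_t \\ L_x L_t & L_y L_t & L_t^2 \end{pmatrix}$
where $\sigma_i^2$ and $\tau_i^2$ are the integration scales.
Spatio-temporal interest points are detected as the local maxima of the 3-D Harris
corner measure defined by
$H = \det(\mu) - k\,\mathrm{trace}^3(\mu)$
The spatial and temporal scales of each interest point are determined as the
maxima of the scale-normalized Laplacian.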
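The corner measure itself is a simple function of the 3x3 second-moment matrix. The sketch below evaluates it for a matrix given as nested lists; the default k = 0.005 is an assumed illustrative value, and `harris3d_response` is a hypothetical name rather than part of any published implementation.

```python
def harris3d_response(mu, k=0.005):
    """Evaluate H = det(mu) - k * trace(mu)**3 for a 3x3 matrix mu."""
    (a, b, c), (d, e, f), (g, h, i) = mu
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    trace = a + e + i
    return det - k * trace ** 3


# Strong variation in x, y and t: positive response.
print(harris3d_response([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))
# No variation along t (a static spatial corner): negative response.
print(harris3d_response([[1, 0, 0], [0, 1, 0], [0, 0, 0]]))
```

A point is only a candidate when the matrix has three significant eigenvalues, which is why the second matrix, with no temporal variation, scores below zero.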
1.5. n-SIFT
n-SIFT is a direct extension of SIFT from 2D images to arbitrary nD images.
n-SIFT uses an nD Gaussian scale-space to find interest points and then describes
them using nD gradients in terms of orientation bins, as the SIFT descriptor does.
The method used to find the interest points is the same as in SIFT and is
illustrated by the following figures.
[Figures: Scale-Space Pyramid; Local Maxima; Orientation Assignment]
n-SIFT creates a $2^{5n-3}$-dimensional feature vector by closely following the
descriptor computation steps of SIFT:
1. First, the gradients in a $16^n$ hypercube around the interest point are calculated
and expressed in terms of magnitude and (n-1) orientation directions. The
gradient magnitudes are weighted by a Gaussian centered at the interest point
location.
2. An (n-1)-direction orientation histogram is built, with each voxel's gradient
magnitude added to the bin corresponding to its orientation; the bin with the
highest value is the one considered.
3. The hypercube is split into $4^n$ subregions, each of which is described by 8
bins for each of the (n-1) directions. Thus each subregion is represented by
$8^{n-1}$ bins and, in total, a $4^n \cdot 8^{n-1} = 2^{5n-3}$-dimensional feature
vector is created.
4. The feature vector is normalized as in the case of SIFT.
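As a quick check of the dimension count in step 3 (a worked expansion added here for illustration), the product of the number of subregions and the number of bins per subregion simplifies to a power of two. For n = 2 it reproduces the 128 elements of the standard SIFT descriptor, and for n = 3, i.e. a video treated as a 3D volume, it gives 4096 elements.

```latex
\begin{align*}
4^{n} \cdot 8^{\,n-1} &= 2^{2n} \cdot 2^{3(n-1)} = 2^{5n-3} \\
n = 2:&\quad 4^{2} \cdot 8^{1} = 2^{7} = 128 \\
n = 3:&\quad 4^{3} \cdot 8^{2} = 2^{12} = 4096
\end{align*}
```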
1.6. MoSIFT
The MoSIFT descriptor describes both the spatial gradient structure and the local
motion structure. The algorithm is shown in the following figure.
The SIFT points whose optical flow exceeds a minimum threshold are chosen as
the spatio-temporal interest points.
For the spatial dimensions, the SIFT descriptor is computed as described before. For
describing local motion, a descriptor is computed from the local optical flow in the
same way the SIFT descriptor is computed from the image gradients. The local 16 x 16
patch is divided into sixteen 4 x 4 patches, each of which is described by an 8-bin
orientation histogram computed from the optical flow. The sixteen 8-bin histograms
are concatenated into a 128-bin optical flow histogram that describes the local
motion. The final descriptor is obtained by concatenating the SIFT and optical flow
descriptors into a 256-dimensional descriptor.
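The bookkeeping of the motion part can be sketched as follows. This is a simplified illustration, assuming the optical flow is given as a 16 x 16 grid of (u, v) vectors and that each vector votes with its magnitude into one of 8 direction bins; Gaussian weighting and interpolation between bins are omitted, and both function names are hypothetical.

```python
import math


def flow_descriptor(flow_patch, cell=4, bins=8):
    """flow_patch: 16x16 grid of (u, v) flow vectors.
    Returns 16 cells x 8 bins = 128 magnitude-weighted direction counts."""
    size = len(flow_patch)
    descriptor = []
    for top in range(0, size, cell):
        for left in range(0, size, cell):
            hist = [0.0] * bins
            for r in range(top, top + cell):
                for c in range(left, left + cell):
                    u, v = flow_patch[r][c]
                    angle = math.atan2(v, u) % (2 * math.pi)
                    index = int(angle / (2 * math.pi) * bins) % bins
                    hist[index] += math.hypot(u, v)
            descriptor.extend(hist)
    return descriptor


def mosift_descriptor(sift_128, flow_patch):
    """Concatenate the appearance part and the motion part (128 + 128 values)."""
    return list(sift_128) + flow_descriptor(flow_patch)


# Uniform motion to the right: every cell votes only into direction bin 0.
patch = [[(1.0, 0.0)] * 16 for _ in range(16)]
motion = flow_descriptor(patch)
print(len(motion), len(mosift_descriptor([0.0] * 128, patch)))  # prints 128 256
```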
1.7. Discussion
A brief critique of each of the detectors and descriptors is given below.
Harris-Corner: The corner points detected by the Harris corner detector are
invariant to rotation, but they are susceptible to scaling of the image, since they
depend on the scale at which the derivatives, and hence the intensity variations, are
computed.
SIFT: The SIFT detector considers local image characteristics and retrieves feature
points that are invariant to image rotation, scaling and translation, and partly
invariant to illumination changes and projective transforms. The interest points are
very robust and can be matched efficiently across similar images.
Harris 3-D: Harris 3D does not provide a method for descriptor computation. As
such, the concatenated histogram of gradients descriptor provided by Piotr Dollár is
used along with the detector. Among the interest points detected by Harris 3-D,
some spatial points with no motion are also captured. These points
have high gradients in all three dimensions, even though there is no motion. This
shows the susceptibility of this technique to intensity variations.
This is a general problem of spatio-temporal interest point detectors, since they do
not explicitly compute the motion between frames and instead rely on the gradient
magnitude, which is susceptible to such intensity variations between the frames.
This problem can be solved to some extent by Gaussian smoothing along the spatial
and temporal domains.
n-SIFT: The feature point detection finds a large number of spatial interest
points without motion. The techniques used to remove unstable edges and points
due to variations in contrast (as in SIFT) were tried here without success.
Such techniques include thresholding on the ratio of all three eigenvalues of the
Hessian matrix, or computing the 2-D Hessian and using Lowe’s threshold criterion.
The problem can be solved to some extent by Harris 3-D, which uses decoupled
Gaussian convolution that effectively weights the gradient computation and
hence handles inter-frame brightness variation.
Another problem with n-SIFT is its large memory usage. Since it treats video as a
3D image and builds octaves from it, the memory requirement is very large. When
handling 1000 frames of resolution 200 x 300, the algorithm was observed to take
8GB of memory.
MoSIFT: MoSIFT captures multiple similar points with the same motion,
which is redundant. The algorithm detects points on a frame-by-frame basis,
and hence the redundancy, though useful for obtaining a sufficient number of points,
might not be efficient in terms of repeatability. This is because optical flow is not
temporally invariant. So instead of merely thresholding the optical flow, the
local characteristics of optical flow may be utilized to further prune the interest
points for stability in the temporal domain.
The optical flow magnitude in the region around an interest point tends to be similar,
and hence a descriptor using the optical flow structure around the interest point need
not be unique. If instead the local motion trajectory is encoded in terms of the optical
flow values of the interest point over time in the nearby frames, the descriptor
can be unique, since natural motion tends to vary with time.
2. An Overview of Watermarking Techniques
2.1. Introduction
In the early days, encryption and access control techniques were used to protect the
ownership of media. More recently, watermarking techniques have been used to protect
the copyright of media. Digital content is spreading rapidly around the world via the
internet, and it is possible to produce any number of copies identical to the original
data without limitation. The rapid development of new IT technologies for
multimedia services has resulted in a strong demand for reliable and secure
copyright protection techniques for multimedia data.
Digital watermarking is a technique to embed invisible or inaudible data within
multimedia content. Watermarked content carries particular data for copyright
purposes. The hidden data is called a watermark, and it can take the form of an image
or any other type of media. In case of an ownership conflict during distribution,
digital watermarking makes it possible to search for and extract the evidence of
ownership.
2.2. Digital Watermarking
The principles of watermark embedding & detection:
If an original image I and a watermark W are given, the watermarked image I’ is
represented as I’ = I + f(I,W). An optional public or secret key k may be used for
this purpose.
[Figure: Generic watermark insertion]
[Figure: Watermark extraction and detection]
The embedded watermark can later be extracted in many ways, and there are several
ways to evaluate the similarity between the original and extracted
watermarks. The most commonly used similarity measures are correlation-based.
A widely used similarity measure is as follows:
To decide whether w and w* match, one tests whether sim(w, w*) > T, where T is
some threshold.
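The embedding rule and the threshold test can be tried out on a toy signal. The sketch below makes several assumptions that are not stated in the text: f(I, W) is taken to be a scaled copy of W with strength `alpha`, extraction is non-blind (it uses the original signal), and sim is the normalized correlation sim(w, w*) = (w* . w) / sqrt(w* . w*), which is one commonly used correlation-based measure. The threshold T = 6 is likewise only an illustrative choice.

```python
import math
import random


def embed(signal, watermark, alpha=0.1):
    """Additive embedding I' = I + f(I, W), here with f(I, W) = alpha * W."""
    return [x + alpha * w for x, w in zip(signal, watermark)]


def extract(watermarked, original, alpha=0.1):
    """Non-blind extraction: subtract the original and undo the scaling."""
    return [(y - x) / alpha for y, x in zip(watermarked, original)]


def sim(w, w_star):
    """Normalized correlation between a reference w and an extracted w*."""
    energy = sum(v * v for v in w_star)
    if energy == 0:
        return 0.0
    return sum(a * b for a, b in zip(w_star, w)) / math.sqrt(energy)


rng = random.Random(0)
host = [rng.uniform(0, 255) for _ in range(1000)]
w = [rng.gauss(0, 1) for _ in range(1000)]
other = [rng.gauss(0, 1) for _ in range(1000)]  # an unrelated watermark

recovered = extract(embed(host, w), host)
T = 6.0
print(sim(w, recovered) > T, sim(other, recovered) > T)
```

With 1000 samples the correct watermark scores far above the threshold, while an unrelated watermark scores close to zero.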
Main characteristics for a watermarking algorithm:
Invisibility: an embedded watermark is not visible.
Robustness: piracy attacks or image processing should not affect the
embedded watermark.
Security: a particular watermark signal is tied to a secret number (key) used for
embedding and extraction.
2.3. Classifications of Watermarking
On perceptivity:
o Visible watermarking
o Invisible watermarking
On robustness:
o Robust watermarking: robustness is the most important factor in
digital watermarking, and robust watermarking
is the most common case.
o Semi-fragile watermarking: a semi-fragile watermark is capable of
tolerating some degree of change to a watermarked image, such as
the addition of quantization noise from lossy compression.
o Fragile watermarking: a fragile watermark is designed to be easily
destroyed if a watermarked image is manipulated in the slightest
manner. This watermarking method can be used for the protection and
verification of original content.
3. Application of SIFT in Image Watermarking
3.1. Introduction
The following literature survey is based on robust image watermarking using local
invariant features by Lee, Kim, et al. Most previous watermarking algorithms are
unable to resist geometric distortions that desynchronize the location where
copyright information is inserted. Here, a watermarking method that is robust to
geometric distortion is described.
Geometric distortion desynchronizes the location of the watermark and hence causes
incorrect watermark detection. Using the media content itself is one solution for
watermark synchronization, and this method belongs to that approach. In this case,
the location of the watermark is not related to image coordinates, but to image
semantics.
In content-based synchronization methods, the selection of features is a major
criterion. It is believed that local image characteristics are more useful than
global ones. As discussed in previous sections, SIFT extracts features by
considering the local image properties and is invariant to rotation, scaling,
translation, and partial illumination changes.
Using SIFT, circular patches invariant to translation and scaling distortions are
generated. The watermark is inserted into the circular patches additively in
the spatial domain, and rotation invariance is achieved using the translation
property of the polar-mapped circular patches.
3.2. Local Invariant Features
As discussed in Section 1, the SIFT descriptor extracts features and their properties,
such as the location (t1, t2), the scale s, and the orientation theta.
Modifications for Watermarking:
The local features from the SIFT descriptor are not directly applicable to
watermarking. The SIFT descriptor was originally devised for image-matching
applications, so it extracts many features that are densely distributed
over the whole image. Hence, the number, distribution, and scale of the features
are adjusted, and features that are susceptible to watermark attacks are removed.
A circular patch is constructed using only the location (t1, t2) and scale s of
the extracted SIFT features, as follows:
$(x - t_1)^2 + (y - t_2)^2 = (k s)^2$
where k is a magnification factor that controls the radius of the circular patches.
These patches are invariant to image scaling and translation as well as spatial
modifications.
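The patch construction can be illustrated by listing the pixels that fall inside such a patch. The sketch reads the patch as a disc centred at (t1, t2) whose radius is k times the feature scale s, which is an assumption about the exact radius law; `circular_patch` is a hypothetical helper.

```python
def circular_patch(t1, t2, s, k, width, height):
    """Return the (x, y) pixels inside the disc of radius k * s around (t1, t2)."""
    radius = k * s
    return [
        (x, y)
        for y in range(height)
        for x in range(width)
        if (x - t1) ** 2 + (y - t2) ** 2 <= radius ** 2
    ]


# A feature at scale 1.0, and the same feature after the image is enlarged 2x.
small = circular_patch(5, 5, 1.0, 2.0, 11, 11)
large = circular_patch(10, 10, 2.0, 2.0, 21, 21)
print(len(small), len(large))  # prints 13 49
```

Because the radius follows the detected scale, enlarging the image enlarges the patch with it, so the patch covers the same image content in both cases.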
The distance between adjacent features must also be taken into consideration. If the
distance is small, patches will overlap in large areas, and if the distance is large,
the number of patches will not be sufficient for the effective insertion of the
watermark. The distance D between adjacent features depends on the dimensions
of the image and is quantized by the r value as follows:
where the width and height of the image are denoted by w and h,
respectively. The r value is a constant to control the distance between adjacent
features and is set at 16 and 32 in the insertion and detection processes,
respectively.
3.3. Watermarking Scheme
Watermark Generation:
A 2-D rectangular watermark that follows a Gaussian distribution is generated
using a random number generator. The rectangular watermark is regarded as
a polar-mapped watermark and is inversely polar-mapped to assign it to the insertion
locations within the circular patches. Note that the size of the circular patches
differs, so a separate circular watermark must be generated for each patch.
M and N are the dimensions of the rectangle and r is the radius of a circular patch.
The circular patch is divided into homocentric regions. To generate the circular
watermark, the x- and the y-axis of the rectangular watermark are inversely polar-
mapped into the radius and angle directions of the patch. The relation between the
coordinates of the rectangular watermark and the circular watermark is represented
as follows:
where x and y are the rectangular watermark coordinates, ri and theta are the
coordinates of the circular watermark, rM is equal to the radius of the patch, and r0
is a fixed fraction of rM.
To increase the robustness and invisibility of the inserted watermark, we transform
the rectangular watermark to be mapped to only the upper half of the patch, i.e., the
y-axis of the rectangular watermark is scaled by the angle of a half circle, not the
angle of a full circle. The lower half of the patch is set symmetrically with respect
to the upper half.
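The inverse polar mapping can be sketched as a lookup from a pixel in the patch to a cell of the rectangular watermark. The exact mapping is not reproduced in the text above, so the sketch assumes a linear one: the radius range [r0, rM) is spread over the M columns and the half-circle angle range [0, pi) over the N rows. It also assumes that the lower half mirrors the upper half through the patch centre (angle taken modulo pi). The function name and the indexing convention `rect[x][y]` are hypothetical.

```python
import math


def circular_watermark_value(rect, dx, dy, r0, rM):
    """Watermark value for the pixel at offset (dx, dy) from the patch centre.
    rect is an M x N grid indexed rect[x][y]; pixels outside the ring
    r0 <= radius < rM carry no watermark."""
    M, N = len(rect), len(rect[0])
    radius = math.hypot(dx, dy)
    if radius < r0 or radius >= rM:
        return 0.0
    theta = math.atan2(dy, dx) % math.pi  # fold the lower half onto the upper half
    x = min(M - 1, int((radius - r0) / (rM - r0) * M))
    y = int(theta / math.pi * N) % N
    return rect[x][y]


rect = [[10 * x + y for y in range(4)] for x in range(2)]  # M = 2, N = 4
print(circular_watermark_value(rect, 2, 0, 1, 3))    # prints 10
print(circular_watermark_value(rect, -2, 0, 1, 3))   # prints 10 (opposite pixel)
print(circular_watermark_value(rect, 0, 1.5, 1, 3))  # prints 2
```

Under these assumptions, opposite pixels of the patch receive the same watermark value, and a rotation of the patch corresponds to a shift along the y index of the rectangle.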
Watermark insertion:
This consists of the following steps:
Circular patches are extracted using the SIFT descriptor. The watermark is inserted
into all the patches of the image to increase the robustness of the scheme.
A circular watermark is generated, which depends on the radius of the
patch, as described above.
The circular watermark is inserted additively in the spatial domain. The insertion
is represented as the spatial addition of the pixels of the image
and the pixels of the circular watermark as follows:
$v'_i = v_i + \Lambda_i w_{ci}$
where $v_i$ and $w_{ci}$ denote the pixels of the image and of the
circular watermark, respectively, and $\Lambda_i$ denotes the perceptual mask that
controls the insertion strength of the watermark.
Watermark detection:
This consists of the following steps:
Circular patches are extracted using the SIFT descriptor. When there are several
patches in an image, watermark detection is applied to all the patches.
The additive watermarking method in the spatial domain inserts the
watermark into the image contents as noise. Therefore, we first apply a
Wiener filter to extract this noise by calculating the difference between the
watermarked image and its Wiener-filtered image, and then regard that
difference as the retrieved watermark.
To measure the similarity between the reference watermark generated during
watermark insertion and the retrieved watermark, the retrieved circular
watermark should be converted into a rectangular watermark by applying the
polar-mapping technique. Considering the fact that the watermark is inserted
symmetrically, we take the mean value from the two semi-circular areas. By
this mapping, the rotation of circular patches is represented as a translation,
and hence we achieve rotation invariance for our watermarking scheme.
As there are several circular patches in an image, ownership is considered
proved if the watermark is detected in at least one patch, and not
otherwise. Because the watermark is inserted into several circular patches rather
than just one, it is highly likely that the scheme will detect the
watermark even after image distortions.
This watermarking scheme is robust against geometric distortion attacks as
well as signal-processing attacks. Scaling and translation invariance is achieved by
extracting circular patches using the SIFT descriptor. Rotation invariance is
achieved by using the translation property of the polar-mapped circular patches.
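The translation property can be demonstrated on polar-mapped grids: a rotation of the patch becomes a circular shift along the angle axis, so the detector can search over shifts. The sketch below assumes plain correlation as the score and grids indexed `[radius_index][angle_index]`; it is an illustration of the idea rather than the detector of Lee, Kim, et al., and the function name is hypothetical.

```python
def best_circular_correlation(reference, retrieved):
    """Try every circular shift along the angle axis and return
    (best_score, best_shift) for the correlation with the reference."""
    M, N = len(reference), len(reference[0])
    best = None
    for shift in range(N):
        score = sum(
            reference[x][y] * retrieved[x][(y + shift) % N]
            for x in range(M)
            for y in range(N)
        )
        if best is None or score > best[0]:
            best = (score, shift)
    return best


M, N = 2, 8
reference = [[1.0 if y == 0 else 0.0 for y in range(N)] for _ in range(M)]
# Simulate a rotated patch: the pattern moves by 3 steps along the angle axis.
rotated = [[reference[x][(y - 3) % N] for y in range(N)] for x in range(M)]
print(best_circular_correlation(reference, rotated))  # prints (2.0, 3)
```

The search recovers both the full correlation score and the size of the shift, i.e. the rotation angle in units of one angular bin.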
3.4. Other related works
Hanling, Jie, et al. proposed a robust image watermarking scheme for digital
images using local invariant features and Independent Component Analysis (ICA).
This method belongs to the blind watermarking category, since it uses ICA for
detection, which does not need the original image.
[Figure: Framework for watermark detection]
It differs in that it uses FastICA for the watermark extraction.
where Iwe is the patch extracted in the detection procedure and
K is a random key. Three signals are then obtained and the watermark is extracted
from them by FastICA. This method is robust against geometric distortion attacks as
well as signal processing attacks.
Another method, proposed by Pham, Miyaki, et al., is a robust object-based
watermarking algorithm using the scale invariant features in conjunction with a
new data embedding method based on the Discrete Cosine Transform
(DCT). The watermark is embedded by modifying the DCT coefficients. To detect the
hidden information in the object, the object region is first detected by
using object matching. Then, by calculating the affine parameters, the object can
be geometrically recovered and the hidden message can easily be read.
[Figures: Embedding scheme; Detecting scheme]
The results have shown that this method can resist very strong attacks such as
scaling by a factor of 0.4, rotation by any angle, etc.
4. Video Watermarking
4.1. Introduction
There exists a complex trade-off between three parameters in digital watermarking:
data payload, fidelity and robustness. The data payload is the amount of
information, i.e. the number of bits, that is encoded by the watermark. The fidelity
is another property of the watermark: the distortion, that the watermarking process
is bound to introduce, should remain imperceptible to a human observer. Finally,
the robustness of a watermarking scheme can be seen as the ability of the
detector to extract the hidden watermark from some altered watermarked data. The
watermarking process is considered as the transmission of a signal through a noisy
channel.
4.2. Applications of watermarking video content
The increasing interest in digital watermarking during the last decade is
most likely due to the increase in concern over copyright protection of digital
content.
Application              Purpose of the embedded watermark
Copy Control             Prevent unauthorized copying
Broadcast Monitoring     Identify the video item being broadcast
Fingerprinting           Trace back a malicious user
Video Authentication     Ensure that the original content has not been tampered with
Copyright Protection     Prove ownership
Enhanced Video Coding    Bring additional information, e.g. error correction
Table 1: Video watermarking: applications and associated purpose
4.3. Challenges in video watermarking
Although watermarking still images and watermarking video are similar problems,
they are not identical. Three challenges for digital video watermarking are:
The presence of nonhostile video processing, which is likely to alter the
watermark signal.
Resilience to collusion, which is much more critical in the context of video.
Real-time operation, which is a requirement for video watermarking.
4.3.1. Various nonhostile video processings
Robustness of digital watermarking has always been evaluated via the survival of
the embedded watermark after attacks. Nonhostile refers to the fact that even
content providers are likely to process their digital data to some extent in order
to manage their resources efficiently.
Photometric Attacks:
This category gathers all the attacks which modify the pixel values in the frames.
Data transmission is likely to introduce some noise for example. Similarly, digital
to analog and analog to digital conversions introduce some distortions in the
video signal. Another common processing step is to perform a gamma correction in
order to increase the contrast. Conversion of a video from one format to another
can also cause such modifications.
Spatial Desynchronization:
Many watermarking algorithms rely on an implicit spatial synchronisation between
the embedder and the detector. A pixel at a given location in the frame is assumed
to be associated with a given bit of the watermark. Many nonhostile video
processings introduce spatial desynchronisation which may result in a drastic loss
of performance of a watermarking scheme.
The pixel position is susceptible to jitter. In particular, positional jitter occurs for
video over poor analog links e.g. broadcasting in a wireless environment.
Temporal desynchronisation:
Temporal desynchronisation may affect the watermark signal. For example, if the
secret key for embedding is different for each frame, simple frame rate
modification would make the detection algorithm fail. Since changing frame rate
is a quite common processing, watermarks should be designed so that they survive
such an operation.
Video editing:
Cut-and-splice and cut-insert-splice are two very common processings used during
video editing. Cut-insert-splice is basically what happens when a commercial is
inserted in the middle of a movie. Moreover, transition effects, like fade-and-
dissolve or wipe-and-matte, can be used in order to smooth the transition between
two scenes of the video.
4.3.2. Resilience against collusion
Collusion is a problem that was pointed out for still images some time
ago. It refers to a set of malicious users who merge their knowledge, i.e. different
watermarked data, in order to produce illegal content, i.e. unwatermarked data.
Such collusion is successful in two distinct cases.
Collusion type I:
The same watermark is embedded into different copies of different data. The
colluders can estimate the watermark from each watermarked data item and
obtain a refined estimate of the watermark by linear combination, e.g. the
average, of the individual estimates. Having a good estimate of the
watermark makes it possible to obtain unwatermarked data by simply subtracting
it from the watermarked data.
Collusion type II:
Different watermarks are embedded into different copies of the same data.
The colluders only have to make a linear combination of the different
watermarked data, e.g. the average, to produce unwatermarked data. Indeed,
averaging different watermarks generally converges toward zero.
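Both collusion types can be reproduced on synthetic 1-D signals. The sketch below is a toy model under stated assumptions: watermarks are unit-variance Gaussian sequences added without scaling, and in the type I case the host signals are modelled as independent zero-mean noise, standing in for the residue left in each per-copy watermark estimate. All names are hypothetical.

```python
import random

rng = random.Random(1)
n = 2000


def gaussian(std=1.0):
    return [rng.gauss(0.0, std) for _ in range(n)]


def mean_of(vectors):
    """Element-wise average of several equally long vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def correlation(a, b):
    return sum(x * y for x, y in zip(a, b)) / len(a)


# Type I: the same watermark w in 50 different hosts.
w = gaussian()
marked = [[h + m for h, m in zip(gaussian(2.0), w)] for _ in range(50)]
estimate = mean_of(marked)                       # hosts average out, w remains
cleaned = [x - e for x, e in zip(marked[0], estimate)]
print(correlation(marked[0], w), correlation(cleaned, w))

# Type II: 20 different watermarks in copies of the same data.
data = [rng.uniform(0, 255) for _ in range(n)]
marks = [gaussian() for _ in range(20)]
copies = [[d + m for d, m in zip(data, mark)] for mark in marks]
colluded = mean_of(copies)                       # each watermark shrinks to 1/20
residual = [c - d for c, d in zip(colluded, data)]
print(correlation(marks[0], marks[0]), correlation(residual, marks[0]))
```

In both cases the first printed value (correlation before the attack) is close to 1 and the second (after the attack) is close to 0, which is why a correlation detector with a fixed threshold fails on the colluded data.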
Collusion is a very important issue in the context of digital video since there are
twice as many opportunities to design collusion as with still images. When video is
considered, the origin of the collusion can be twofold.
1. Inter-videos collusion: This is the initial origin considered for still images.
A set of users have a watermarked version of a video which they gather in
order to produce unwatermarked video content. In the context of copyright
protection, the same watermark is embedded in different videos and
collusion type I is possible. Alternatively, in a fingerprinting application, the
watermark will be different for each user and collusion type II can be
considered. Inter-videos collusion requires several watermarked videos to
produce unwatermarked video content.
2. Intra-video collusion: This is a video-specific origin. Many watermarking
algorithms consider a video as a succession of still images. Watermarking
video then comes down to watermarking a series of still images. Unfortunately
this opens new opportunities for collusion. If the same watermark is inserted
in each frame, collusion type I can be enforced since different images can be
obtained from moving scenes. On the other hand, if alternative watermarks
are embedded in each frame, collusion type II becomes a danger in static
scenes since they produce similar images. As a result, the watermarked
video alone is enough to remove the watermark from the video stream.
The main danger is intra-video collusion, i.e. when a watermarked video
alone is enough to remove the watermark from the video. It has been shown
that both strategies, always inserting the same watermark in each frame and
always inserting a different watermark in each frame, make collusion attacks
conceivable.
A basic rule has been stated so that intra-video collusion is prevented.
The watermarks inserted into two different frames of a video should be as
similar, in terms of correlation, as the two frames are similar. In other
terms, “if two frames look like quite the same, the embedded watermarks
should be highly correlated. On the contrary, if two frames are really
different, the watermark inserted into those frames should be unalike”.
4.3.3. Real-time watermarking
Real-time can be an additional specification for video watermarking. It was not a
real concern with still images. In order to meet the real-time requirement, the
complexity of the watermarking algorithm should obviously be as low as possible.
Moreover, if the watermark can be inserted directly into the compressed stream,
full decompression and recompression are avoided and, consequently,
computational needs are reduced.
Another way of achieving real-time operation is to split the computations. The basic
idea is to perform the intensive computations once and for all on the provider side
and then simple client-dependent processing on request. This can be seen as a form of
preprocessing.
4.4. The major trends in video watermarking
The simplest and most straightforward approach is to consider a video as a
succession of still images and to reuse an existing watermarking scheme for still
images. Another point of view considers and exploits the additional temporal
dimension in order to design new robust video watermarking algorithms.
4.4.1. From still image to video watermarking
The first proposed algorithm for video coding was indeed Moving JPEG (M-JPEG),
which simply compresses each frame of the video with the image
compression standard JPEG. The simplest way of extending a watermarking
scheme for still images is to embed the same watermark in the frames of the video
at a regular rate. On the detector side, the presence of the watermark is checked in
every frame.
Differential Energy Watermarks (DEW) was initially designed for still images
and has been extended to video by watermarking the I-frames of an MPEG
stream. It is based on selectively discarding high frequency DCT coefficients in the
compressed data stream.
4.4.2. Integration of the temporal dimension
Many researchers have investigated how to reduce the visual impact of the
watermark for still images by considering the properties of the Human Visual
System (HVS) such as frequency masking, luminance masking and contrast
masking.
4.4.3. Exploiting the video compression formats
The last trend considers the video data as some data compressed with a
video-specific compression standard. Indeed, most of the time, a video is stored in a
compressed version in order to save storage space. As a result,
watermarking methods have been designed, which embed the watermark directly
into the compressed video stream. Algorithms are adapted so that the watermark
can be directly inserted in the nonzero DCT coefficients of an MPEG video stream.
4.5. Discussion
A watermark can be separated into two parts: one for copyright protection and the
other for customer fingerprinting. However many challenges have to be taken up.
Robustness has to be considered attentively. There are indeed many nonhostile
video processings which might alter the watermark signal. It might not even be
possible to be immune against all those attacks and detailed constraints have to be
defined according to the targeted application. Since collusion is far more critical in
the context of video, it must be seriously considered. Finally the real-time
constraint has to be met in many applications.
5. Application of SIFT in Video Watermarking
This area of research is relatively new and is unexplored to a large extent due to
the lack of robust and suitable interest point detectors. The problems associated
with them are discussed in Section 1.7.
In the paper titled SIFT features in semi-fragile video watermarks by Stefan
Thiemert et al., SIFT is used to detect manipulations in videos. From the detected
SIFT feature points, an authentication message is generated, which is embedded
with a robust video watermark. In the verification process, a temporal filtering
approach is introduced to reduce the distortions caused by content-preserving
manipulations.
6. Bibliography
1. Distinctive image features from scale-invariant keypoints by D. G. Lowe
2. A combined corner and edge detector by C. Harris and M. Stephens
3. n-SIFT: n-Dimensional Scale Invariant Feature Transform by Warren
Cheung and Ghassan Hamarneh
4. On Space-Time Interest Points by Ivan Laptev
5. MoSIFT: Recognizing Human Actions in Surveillance Videos by Ming-Yu
Chen and Alexander Hauptmann
6. A Survey of Watermarking Techniques applied to Multimedia by Sin-Joo
Lee and Sung-Hwan Jung
7. Robust image watermarking using local invariant features by Hae-Yeoun
Lee and Hyungshin Kim
8. Robust Image Watermarking Using Local Invariant Features and
Independent Component Analysis by Zhang Hanling and Liu Jie
9. Geometrically Invariant Object-Based Watermarking using SIFT feature by
Viet Quoc Pham, Takashi Miyaki, Toshihiko Yamasaki and Kiyoharu
Aizawa
10. A guide tour of video watermarking by Gwenaël Doërr and Jean-Luc Dugelay
11. SIFT features in semi-fragile video watermarks by Stefan Thiemert and Martin
Steinebach