A novel approach for satellite imagery storage by classifying the non duplicate regions



International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367 (Print), ISSN 0976 – 6375 (Online), Volume 1, Number 2, Sep - Oct (2010), pp. 147-159, © IAEME, http://www.iaeme.com/ijcet.html

A NOVEL APPROACH FOR SATELLITE IMAGERY STORAGE BY CLASSIFYING THE NON-DUPLICATE REGIONS

Cyju Varghese, Computer Science Department, Karunya University, India. E-Mail: jogycyju@gmail.com
John Blesswin, Computer Science Department, Karunya University, India. E-Mail: johnblesswin@gmail.com
Navitha Varghese, Computer Science Department, Karunya University, India. E-Mail: navithapullan@gmail.com
Sonia Singha, Computer Science Department, Karunya University, India. E-Mail: soniacs09@gmail.com

ABSTRACT

Every day, satellites capture thousands of images that need to be classified properly. In this paper, we address the problem of replacing existing images with newly captured ones. We provide a new solution that stores only the part of the image that does not already exist in the database. Although satellite images have been classified in the past using various techniques, researchers continue to look for alternative classification strategies so that they can select the most appropriate technique for the feature extraction task at hand. To this end, we propose an efficient approach built on an algorithm that uses robust kernel principal component analysis (KPCA) features to reduce the dimensionality of the image. For image clustering, we use the Fuzzy N-Means algorithm. Finally, the data is stored in the database according to its class by using a support vector machine classifier. The proposed scheme thus improves the storage efficiency of satellite images in the database, saves time, and makes the correction of satellite images more efficient.

Index Terms- Compression, Duplicate Detection, Feature Extraction, Image Clustering, Satellite Image

1. INTRODUCTION

Satellite images play an important role in many applications, especially in capturing images of the earth for environmental study and homeland security. ‘Geo’ is a commonly used satellite for capturing earth images. Thousands upon thousands of images are transmitted every day to the ‘digital globe’ database. The topography of the earth changes every day, so frequently updating the images in the database is very tedious. In current applications, each image is replaced in full in the database. Instead of updating the whole image, this paper employs an approach that detects duplicate and non-duplicate blocks in the captured image and updates only the non-duplicate blocks in the corresponding image in the database. The approach makes use of a duplication detection algorithm.

To avoid storing the same image content twice, a duplication detection approach needs to be applied. Traditional approaches to duplication detection of image objects normally partition images into several blocks. These detection methods are designed specifically to separate duplicate and non-duplicate image regions. They can detect duplication when the locations of the extracted objects are invariant to scaling, translation, or rotation. The traditional techniques used in detecting duplication include the discrete wavelet transform (DWT), principal component analysis (PCA), and the Fourier-Mellin transform (FMT). These techniques are restricted to linear features.

Duplicate detection involves dividing the image into overlapping blocks, extracting features from each block, and detecting similar features. Depending on the type of duplication, various measures and mechanisms can be adopted and implemented to counter duplication.

A discrete wavelet transform (DWT) [3][4] maps the time-domain signal f(t) into a real-valued time-frequency domain, and the signals are described by the wavelet coefficients. Five-scale signal decomposition is performed to ensure that all disturbance features in both high and low frequencies are extracted. Thus, the output of the wavelet transform consists of five decomposed scale signals with different levels of resolution. Principal component analysis (PCA) [5] in signal processing can be described as a transform of a given set of input vectors (variables) of the same length, arranged as an n-dimensional vector. FMT [6][7] is a global transform and applies to all pixels in the same way. The Fourier-Mellin transform provides translation, scaling, and rotation invariance. To achieve these properties, the image is divided into overlapping blocks, and the Fourier transform is applied to each block to obtain features.

KPCA [1][2] has an advantage over the other techniques because it performs non-linear feature extraction. It can detect duplication even if a particular portion of an image has been rotated in any direction. Quantitative analyses indicate that the KPCA-based feature obtains excellent performance in additive-noise and lossy JPEG compression environments. This method uses a global geometric transformation and the labeling technique to identify the mentioned duplication. Experiments with a good number of natural images show very promising results when compared with the other conventional approaches. KPCA is mainly used for nonlinear feature extraction, whereas the other techniques are used for linear feature extraction. KPCA extracts more useful features than linear PCA. The initial mapping to a high-dimensional space provides smoother dimensionality reduction than standard PCA.
It does not require nonlinear optimization, only the solution of an eigenvalue problem. Although signal reconstruction is unnecessary for tampering detection, KPCA is computationally more expensive than linear PCA.

2. PROPOSED SCHEME

A satellite image is the input to this application. Before the image is stored in the database, its size is reduced so that it requires less memory. Kernel principal component analysis is used to reduce the dimensionality of the image. Kernel PCA is a non-linear feature extractor that is used to detect duplicate and non-duplicate regions in the satellite image. In Kernel PCA, an important concern is the selection of the kernel function and the computation of the Gram matrix. Gaussian kernels can extract data nonlinearity and can simulate the behavior of other kernels. The Gram matrix can be found from the following equation:

$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right) \qquad (1)$$

where σ is the parameter of the Gaussian kernel. The value of this kernel parameter is very important.

Figure 1 Flowchart of proposed scheme

To compute the principal components, the following steps are followed:
1. Construct one training and one testing matrix.
2. Compute the Gram matrix for the training matrix.
3. Center the training Gram matrix.
4. Diagonalize the centered matrix and compute its eigenvalues and eigenvectors.
5. Construct the test Gram matrix.
6. Center the test Gram matrix.
7. Compute the projection of all vectors onto the eigenvectors.

Extracting image features from the compressed images is the most important step and has a great impact on retrieval performance.

A. Satellite Image Clustering

The fuzzy algorithm builds on the concept of points having significant membership in multiple classes. The points situated in the overlapping regions of different clusters are first identified and excluded from consideration during clustering. Thereafter, these points are given class labels by a support vector machine classifier that is trained on the remaining points. The well-known Fuzzy N-Means algorithm and some recently proposed genetic clustering schemes are used in the process. The image is divided into a number of blocks. Each block can have the same features or different kinds of features. Clustering is performed to group features of the same kind. This step gives an additional advantage for duplication detection in the image.

B. Satellite Image Segmentation

Using KPCA, the image is divided into a number of blocks. Each block can have the same features or different kinds of features. Here image segmentation is performed to group features of the same kind. This step gives an additional advantage for duplication detection. Image segmentation is the basis of image analysis and understanding. Image segmentation is exactly the problem of classifying the pixel set of an image, so clustering analysis is naturally applied to it.

Here we use the Fuzzy N-Means algorithm for image segmentation. Fuzzy N-Means is an improved version of fuzzy C-means. An outlier test is also performed to improve segmentation performance.

The algorithm iterates on two levels: the internal level calculates the new centroids and updates the fuzzy membership matrix, and the external level judges whether the algorithm has converged to the estimated threshold. After the iteration finishes, the generated fuzzy membership matrix gives the degree of membership of each pixel in each cluster centre, and the category of a pixel is determined by the magnitude of its entries in the matrix [8]. Image segmentation means that the image is represented as a set of physically meaningful connected areas.
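The two-level iteration described above can be illustrated with a small sketch. Note the assumptions: this is standard fuzzy C-means on scalar pixel intensities, not the paper's Fuzzy N-Means (the outlier test is omitted), and the names `memberships_for`, `fuzzy_c_means`, the fuzzifier `m`, the threshold `eps`, and the toy pixel values are all hypothetical choices made for illustration.

```python
def memberships_for(values, centroids, m):
    """Fuzzy membership of every value in every cluster (each row sums to 1)."""
    power = -2.0 / (m - 1.0)
    table = []
    for x in values:
        dists = [abs(x - v) for v in centroids]
        if 0.0 in dists:
            # The value sits exactly on one or more centroids.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            table.append([h / sum(hits) for h in hits])
        else:
            weights = [d ** power for d in dists]
            total = sum(weights)
            table.append([w / total for w in weights])
    return table


def fuzzy_c_means(values, centroids, m=2.0, eps=1e-6, max_iter=100):
    """Standard fuzzy C-means; no guard against a cluster losing all weight."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Internal level: update the membership matrix, then the centroids.
        u = memberships_for(values, centroids, m)
        updated = []
        for i in range(len(centroids)):
            w = [row[i] ** m for row in u]
            updated.append(sum(wk * x for wk, x in zip(w, values)) / sum(w))
        # External level: stop once the centroids move less than eps.
        shift = max(abs(a - b) for a, b in zip(updated, centroids))
        centroids = updated
        if shift < eps:
            break
    return centroids, memberships_for(values, centroids, m)


# Toy pixel intensities: a dark group, a bright group, one ambiguous pixel.
pixels = [10.0, 12.0, 11.0, 200.0, 205.0, 198.0, 100.0]
centres, u = fuzzy_c_means(pixels, centroids=[0.0, 255.0])
labels = [row.index(max(row)) for row in u]  # hard label = largest membership
```

Each row of `u` plays the role of one pixel's row in the fuzzy membership matrix, and `labels` assigns each pixel to the cluster in which its membership is largest.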
Algorithm:
Input: Test image
Output: Segmented image
Step 1: Initialize the parameter ⌡ and perform normalization.
Step 2: For k = 1, ..., N, perform the outlier test using the equation
    Outlier test: ||xk - vi||
    where ui(k) = 2
    Update the centroid by: vi(new) = vi(old) + (x - vi(old))
Step 3: Termination test: || - || > Є

The problem of segmenting the image into different clusters is handled iteratively [10] by means of a single parameter. The outlier test is performed to improve the cluster validity index. After the centres of the clusters are found, the fuzzy membership value can be measured at any point. Thus, groups of clusters with similar features are obtained after performing this algorithm.

Image segmentation means that the image is represented as a set of physically meaningful connected areas [9]. Generally, we achieve image segmentation by analyzing the different image characteristics with the Fuzzy N-Means clustering algorithm.

Table 1 List of Symbols

Symbol   Meaning
ui       Fuzzy membership value
Vi       Centroid value
Xk       Pixel value

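Since the next section applies KPCA to every block, it may help to see the seven KPCA steps from Section 2 in code form. The following is only a sketch under stated assumptions: it uses the standard Gaussian kernel with a hypothetical width `sigma`, toy two-dimensional vectors in place of real block features, and power iteration to obtain just the leading eigenpair instead of a full diagonalization. All function names are illustrative and do not come from the paper.

```python
import math


def gaussian_gram(rows, cols, sigma):
    """K[i][j] = exp(-||rows[i] - cols[j]||^2 / (2 * sigma^2))."""
    scale = 2.0 * sigma ** 2
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / scale)
             for y in cols] for x in rows]


def center_train_gram(K):
    """Step 3: center the (symmetric) training Gram matrix in feature space."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def center_test_gram(K_test, K_train):
    """Step 6: center a test Gram matrix using the training statistics."""
    n = len(K_train)
    col_mean = [sum(K_train[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(col_mean) / n
    return [[row[j] - col_mean[j] - sum(row) / n + total_mean
             for j in range(n)] for row in K_test]


def leading_eigenpair(A, iterations=500):
    """Step 4 (simplified): power iteration for the largest eigenpair only."""
    n = len(A)
    v = [float(i + 1) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(Av, v)), v


def project(K_centered, eigenvalue, eigenvector):
    """Step 7: project samples onto the first kernel principal component."""
    alpha = [x / math.sqrt(eigenvalue) for x in eigenvector]
    return [sum(k * a for k, a in zip(row, alpha)) for row in K_centered]


# Steps 1 to 7 on toy vectors standing in for block features.
train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
test = [[0.5, 0.5], [5.5, 5.0]]
sigma = 2.0
K_train = gaussian_gram(train, train, sigma)          # step 2
Kc_train = center_train_gram(K_train)                 # step 3
lam, vec = leading_eigenpair(Kc_train)                # step 4
Kc_test = center_test_gram(gaussian_gram(test, train, sigma), K_train)  # 5-6
train_scores = project(Kc_train, lam, vec)            # step 7
test_scores = project(Kc_test, lam, vec)
```

In this toy setting the first kernel principal component separates the two groups of training vectors, and each test vector receives a score with the same sign as the group it lies closest to. A practical implementation would keep several components and use a numerical library for the eigendecomposition.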
C. Duplication Detection of Satellite Image

The satellite image database contains previously captured images of the real world that are used as training data. Every day, thousands of images are captured by the satellite. To update the database, duplication detection has to be performed before an image is stored. At present, each new image captured by the satellite is stored in the database by replacing the previous one. This process is time consuming and requires additional memory space. This paper proposes a new approach that updates the existing image with only the identified non-duplicate blocks.

To find the duplicate and non-duplicate blocks in the images, a duplication detection algorithm is proposed. The input of this algorithm is the test image block. The duplication detection steps are as follows:

Algorithm:
Input: Test image of N pixels.
Output: Non-duplicate blocks of the image.
Step 1: Initialize the block processing parameters:
    b: number of pixels per block,
    Q: number of quantization bins,
    Rth: number of neighboring rows to search in the lexicographically sorted matrix,
    Dth: minimum offset-magnitude threshold,
    ∊: fraction of the ignored variance along the principal axes, or the fraction of the ignored local variance of the wavelet coefficients,
    M: number of training samples for the KPCA.
Step 2: Apply KPCA to each block of b pixels and compute a transform vector of length L, which is equal to (M, Nt2) for the KPCA-based features with dimension reduction.
Step 3: Construct a data matrix, Mdata, of size Nb × L, whose row elements contain the component-wise quantized features, i.e., ⌊ai/Q⌋.
Step 4: Apply lexicographic sorting to the rows of the above matrix to obtain a new matrix S. Let si be the i-th row of S, which represents the i-th block with its center coordinates (xi, yi).
Step 5: For every row si of S, select a number of adjacent rows, sj, such that |i − j| < Rth, and place all pairs of coordinates (xi, yi) and (xj, yj) for j = 0, 1, ..., (Rth − 1) onto a list Pin.
Step 6: Eliminate all pairs of points whose offset magnitude, Dof, is less than Dth. Construct a set, OF, of the various offsets (m, n) and offset frequencies (fm,n) for all elements in Pin.
Step 7: Create a refined list of point pairs, Pout, from Pin by the algorithm or by using a manual threshold, fth.

The proposed duplication detection algorithm has several parameters that must be selected and justified before use. These are the block size (b), the number of quantization bins (Q), the block-similarity threshold (Rth), the minimum offset-magnitude threshold (Dth), the offset-frequency threshold (fth), and the fraction of ignored variance (∊). The selection of Q depends on the feature variations. The selection of Rth depends on how well lexicographic sorting arranges similar vectors (blocks) in the sorted matrix S. The parameter Dth is used to avoid false detections.

D. Categorization of Satellite Image

Classifying satellite images manually from the identified blocks is a tedious process. To automate this, the computer uses numerical "signatures" for each training class. Each pixel in the image is compared with these signatures and labeled as the class it most closely resembles digitally. Hence, supervised classifiers require the user to decide which classes exist in the image and then to define training areas for these classes. SVM not only allows the best classification performance (e.g., accuracy) on the training data but also leaves much room for the correct classification of future data.

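Steps 3 to 7 of the duplication detection algorithm in Section C (quantize, sort lexicographically, pair neighbouring rows, filter by offset magnitude and offset frequency) can be sketched as follows. This is a simplified illustration, not the authors' implementation: raw pixel values stand in for the KPCA features of Step 2, `side` is the block edge length rather than the pixel count b, `Q` is used as a quantization step in ⌊a/Q⌋, block coordinates are top-left corners rather than centers, and the test image is synthetic with one patch copied on purpose.

```python
import math
import random
from collections import Counter


def find_duplicate_pairs(image, side, Q, Rth, Dth, fth):
    """Return coordinate pairs of blocks judged to be duplicates."""
    height, width = len(image), len(image[0])
    rows = []
    for y in range(height - side + 1):
        for x in range(width - side + 1):
            # Step 3: component-wise quantized feature vector of the block.
            feature = tuple(image[y + dy][x + dx] // Q
                            for dy in range(side) for dx in range(side))
            rows.append((feature, (x, y)))
    rows.sort()  # Step 4: lexicographic sorting of the feature rows.

    candidates = []
    for i in range(len(rows)):
        # Step 5: pair row i with the following rows where |i - j| < Rth.
        for j in range(i + 1, min(i + Rth, len(rows))):
            p, q = rows[i][1], rows[j][1]
            m, n = q[0] - p[0], q[1] - p[1]
            if m < 0 or (m == 0 and n < 0):
                p, q, m, n = q, p, -m, -n  # one canonical sign per offset
            # Step 6: drop pairs whose offset magnitude is below Dth.
            if math.hypot(m, n) >= Dth:
                candidates.append(((m, n), p, q))

    # Steps 6-7: count offset frequencies and keep only the frequent offsets.
    frequency = Counter(offset for offset, _, _ in candidates)
    return [(p, q) for offset, p, q in candidates if frequency[offset] >= fth]


# Synthetic 16 x 16 image with an 8 x 8 patch copied from (0, 0) to (8, 8).
rng = random.Random(0)
image = [[rng.randrange(256) for _ in range(16)] for _ in range(16)]
for y in range(8):
    for x in range(8):
        image[y + 8][x + 8] = image[y][x]

pairs = find_duplicate_pairs(image, side=4, Q=1, Rth=2, Dth=4, fth=10)
```

Blocks that are not reported in any pair would be the non-duplicate blocks that the scheme stores. The offset-frequency filter is what suppresses accidental neighbours in the sorted matrix, which is why the choice of `fth` matters.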
After a few duplicate pixels whose similarity scores exceed the threshold have been detected using the KPCA algorithm, we have positive examples, namely the identified duplicate blocks in D, and negative examples, namely the remaining non-duplicate blocks in N [11].

Table 2 Some Samples of the Test High-Resolution Satellite Image Database

Existing image in database   Captured image   Result after applying the algorithm   MSE   PSNR
(stored image)               (image)          (image)                               0     33.56

Algorithm:
Input: Duplicate regions D and non-duplicate regions N, original image I
Output: Updated image
Step 1: Train the classifier C1 using D and N.
Step 2: Classify the non-duplicate region N to the corresponding class label C in the database.
Step 3: Repeat Step 2 until all the non-duplicate blocks in N are inserted into the original image I.

The duplicate blocks in D and the non-duplicate blocks in N are used to train the classifier (SVM) in order to identify where each non-duplicate block belongs in the already stored image I, thus updating the image. In this way the satellite images are stored efficiently in the database. The proposed scheme works as follows: the captured image is compressed using kernel principal component analysis (KPCA) and its features are extracted. The extracted features are clustered using the Fuzzy N-Means algorithm so that the duplication detection algorithm can be performed efficiently.

Duplication detection is performed by comparing the captured image with the stored image. The duplicate and non-duplicate blocks are thus detected. The missing part of the image stored in the database is then updated by bringing in the non-duplicate blocks.

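The MSE and PSNR values reported for the updated images follow the usual definitions, which can be sketched as below. The sketch assumes 8-bit grayscale images stored as lists of rows, hence the peak value of 255; the function names and the tiny example images are hypothetical.

```python
import math


def mse(a, b):
    """Mean square error between two equally sized grayscale images."""
    count = len(a) * len(a[0])
    return sum((p - q) ** 2
               for row_a, row_b in zip(a, b)
               for p, q in zip(row_a, row_b)) / count


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(a, b)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / error)


original = [[100, 110], [120, 130]]
updated = [[105, 115], [125, 135]]  # every pixel differs by 5
print(mse(original, updated))       # 25.0
print(round(psnr(original, updated), 2))
```

An MSE of zero means the two compared images are identical, in which case the PSNR between that same pair is unbounded. A finite PSNR reported next to a zero MSE therefore has to refer to a different pair of images, for example updated versus original rather than updated versus captured.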
Table 3 Detection Accuracy for JPEG Dataset
Intra-dataset average precision (P%) and recall (R%)

Features   JPEG P   JPEG R
KPCA       73.19    40.27

The KPCA-based feature obtains the best recall (40.27%) and medium precision (73.19%) for JPEG in the compressed and noisy domain, as shown in Table 3. In our experiment, KPCA is performed on JPEG satellite images. It can also be performed on BMP and SNR images. Recall varies roughly in a sigmoid fashion with increasing JPG.

III. EXPERIMENTAL RESULTS

Experimental results on satellite images demonstrate four objectives. More than 100 satellite JPEG images have been tested. Sample tested satellite images are given in Table 2. The first objective is the implementation of KPCA. The dimensionality of the original image is reduced. The image is resized to 256 x 256 before the proposed duplicate detection method is applied, and the features are extracted using KPCA. The second is clustering the extracted features of the compressed image. The Fuzzy N-Means clustering algorithm groups the similar features. This clustered information is used to identify the duplicate and non-duplicate blocks of the image. The third objective is duplication detection. A different color is used to show the non-duplicate blocks of the image. The first set of experiments uses parameters that were empirically fixed to b=64, Q=256, Rth=50, Dth=16, =0, =1. The fourth is classification: the identified duplicate blocks D and non-duplicate blocks N are used to train the SVM classifier, which performs the task of inserting blocks into the database.

In our scheme, the peak signal-to-noise ratio (PSNR) is used to evaluate the quality of the updated image. Similarly, we use the mean square error (MSE) to measure the difference between the updated image and the captured image. The quality of the updated image is assessed from two points of view.
First, under the human visual system, the updated image is almost indistinguishable from the original image. Secondly, the PSNR values of the updated images relative to the original images range from 32 to 34.5 dB. Moreover, all MSEs are equal to zero when the image is exactly updated.

IV. CONCLUSION

This paper has presented a method for detecting duplicated regions in satellite images. An automatic duplication detection technique has been proposed. This technique reduces false detections and eliminates an important threshold parameter. Although its time cost is high, the method can achieve good performance. The next method we apply is clustering. Finally, classification is applied to store the non-duplicate regions of the image in the database.

REFERENCES

[1] M. K. Bashar, K. Noda, N. Ohnishi, and K. Mori, “Exploring Duplicated Regions in Natural Images,” IEEE Transactions on Image Processing, vol. 1, pp. 1-40, March 2010.
[2] M. Turk and A. Pentland, “Eigenfaces for recognition,” Journal of Cognitive Neuroscience, vol. 3, no. 1, 1991.
[3] G. Li, Q. Wu, D. Tu, and S. Sun, “A Sorted Neighborhood Approach for Detecting Duplicated Regions in Image Forgeries based on DWT and SVD,” in Proceedings of IEEE International Conference on Multimedia and Expo, Beijing, China, July 2-5, 2007, pp. 1750-1753.
[4] W. Luo, J. Huang, and G. Qiu, “Robust Detection of Region Duplication Forgery in Digital Image,” in Proceedings of the 18th International Conference on Pattern Recognition, vol. 4, 2006, pp. 746-749.
[5] C. Popescu and H. Farid, “Exposing Digital Forgeries by Detecting Duplicated Image Regions,” Technical Report TR2004-515, Dartmouth College, Computer Science, 2004.
[6] Sevinc Bayram, Taha Sencar, and Nasir Memon, “An efficient and robust method for detecting copy-move forgery,” in Proceedings of ICASSP 2009.
[7] H. Huang, W. Guo, and Y. Zhang, “Detection of Copy-Move Forgery in Digital Images Using SIFT Algorithm,” in Proceedings of IEEE Pacific-Asia Workshop on Computational Intelligence and Industrial Application, vol. 2, pp. 272-276, 2008.
[8] Sang Wan Lee, Yong Soo Kim, and Zeungnam Bien, “A Nonsupervised Learning Framework of Human Behavior Patterns Based on Sequential Actions,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 4, April 2010.
[9] Z. Bien and M.-G. Chun, “A Fuzzy Petri Net Model,” Handbook of Fuzzy Computation, C2.4, IOP Publishing Ltd., 1998.
[10] T. Tajima et al., “Development of a Marketing System for Recognizing Customer Buying Behavior Sensor,” J. Japan Soc. for Fuzzy Theory and Intelligent Informatics, vol. 20, no. 5, pp. 18-22, Apr. 2007.
[11] Weifeng Su, Jiying Wang, and Frederick H. Lochovsky, “Record Matching over Query Results from Multiple Web Databases,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 4, April 2010.
[12] R. Baeza-Yates and B. Ribeiro-Neto, “Modern Information Retrieval,” ACM Press, 1999.

Cyju Elizabeth Varghese received the B.E. degree in Computer Science and Engineering from CSI Institute of Technology, Thovalai, India, in 2001 and has been working since. Currently she is pursuing an M.Tech in Computer Science and Engineering at Karunya University, Coimbatore. Her research interests include Web Mining and areas related to databases.

John Blesswin received the B.Tech degree in Information Technology from Karunya University, Coimbatore, India, in 2009. He passed the B.Tech examination with a gold medal. He is pursuing an M.Tech in Computer Science and Engineering at Karunya University. His research interests include visual cryptography, visual secret sharing schemes, image hiding, and information retrieval.

Navitha Varghese received the B.Tech degree in Computer Science and Engineering from Model Engineering College, Ernakulam, India, in 2009. Currently she is pursuing an M.Tech in Computer Science at Karunya University, Coimbatore. Her research interests include Web Mining and Web technology.

Sonia Singha received the B.Tech degree in Computer Science and Engineering from Calcutta Institute of Technology, Kolkata, India, in 2009. Currently she is pursuing an M.Tech in Computer Science at Karunya University, Coimbatore. Her research interests include Data Mining and Image Processing.