Ijciet 10 02_043

http://www.iaeme.com/IJCIET/index.asp 407 editor@iaeme.com
International Journal of Civil Engineering and Technology (IJCIET)
Volume 10, Issue 02, February 2019, pp. 407-423, Article ID: IJCIET_10_02_043
Available online at http://www.iaeme.com/ijciet/issues.asp?JType=IJCIET&VType=10&IType=02
ISSN Print: 0976-6308 and ISSN Online: 0976-6316
© IAEME Publication Scopus Indexed
GEO-LOCALIZATION OF VIDEO BASED ON
PROPOSED LBP-SVD METHOD.
Abdulkadhem Abdulkareem Abdulkadhem*
College of Information Technology, University of Babylon, Babil, 51002, Iraq
Tawfiq A. Al-Assadi
College of Information Technology, University of Babylon, Babil, 51002, Iraq
*corresponding author
ABSTRACT
In this work, we define a new method for indexing and retrieving non-geotagged
video sequences based on visual content only, using the Local Binary Pattern
(LBP) and Singular Value Decomposition (SVD) techniques. The main question our
system addresses is: is it possible to determine the geographic location of a video on the
GIS-map from the pixels of its frames alone? The proposed system is introduced to answer
questions of this kind. The GIS database is constructed by storing reference images
at the intersections between road segments on the map. The Local Binary Pattern
(LBP) is used to extract features from the images. The Singular Value Decomposition
(SVD) technique is used to compress the feature vectors and to index the images
in the database. The input to the system is a video taken by a forward-facing camera
mounted on a vehicle. The output of the proposed system is the geo-location of the
video keyframes, obtained from the corresponding geo-tagged images retrieved
from the GIS database.
Key words: Geo-localization, Video, SVD, LBP, Keyframe, GIS-map.
Cite this Article: Abdulkadhem Abdulkareem Abdulkadhem and Tawfiq A. Al-
Assadi, Geo-Localization of Video Based on Proposed Lbp-Svd Method,
International Journal of Civil Engineering and Technology, 10(02), 2019, pp. 407–
423
http://www.iaeme.com/IJCIET/issues.asp?JType=IJCIET&VType=10&IType=02
1. INTRODUCTION
Videos can be geotagged at acquisition time, which makes effective management of video
files possible. Ideally, each video frame can be labeled with the spatial extent of
the area it covers. The challenging video management problem is thereby transformed
into a spatial database problem [1]. Today, video recording devices such as smartphones,
cameras and car black boxes are equipped with GPS receivers and are able to capture videos
with spatiotemporal information such as time, location and camera orientation. We call such
videos geo-referenced videos [2,3]. Determining the location of a video is a basic and
important element of numerous event analysis tasks such as cross-camera person tracking
and scene reconstruction. On the other hand, unlike images with EXIF metadata, most
videos shared online do not come with GPS information [4,5]. Is it possible to
estimate the location of generic scenes? Humans and computers can identify specific
physical scenes that they have seen before, but what about more generic scenes that may be
difficult to localize specifically? We observe that our world is self-similar not just locally
but across the globe [6,7].
Geo-localization is the problem of finding the location of an image or video using only the
visual information in its pixels, without any associated meta-information. Images
often contain useful visual and contextual hints which allow us to
infer the location of an image with varying confidence. The foremost of these hints are
architectural details, landmarks, building colors and textures. To handle this task
successfully, a model needs to capture and retain visual hints from across the globe
carefully [8, 9]. Estimating the geographical location at which an image or video was taken
is an interesting problem in computer vision, and the last decade has witnessed growing
research effort toward its solution. Such geographic information has become a
crucial component of systems that offer personalized and context-aware multimedia services.
The objective of this paper is to propose a new methodology for a visual content-based location
estimation system for video frames taken in urban environments, which automatically derives
geographical information from the multimedia content only. The system should be applicable in
both the geo-constrained scenario, in which the multimedia item is taken at one of a previously
defined set of locations, and the geo-unconstrained scenario, in which the multimedia item could
have been taken anywhere in the world [10,11].
The paper is organized as follows. Section 2 reviews the background of the
local binary pattern and the singular value decomposition. Section 3
describes the proposed framework and its methodology. Section 4,
Experimental Results, explains the output of the proposed system. Section 5,
Conclusions, reflects on the method used. Appendix A lists the final
output of LBP-SVD and the similarity of each key frame to the images in the GIS database.
2. BACKGROUND
2.1. Local Binary Pattern (LBP)
The image descriptors (feature vectors) used in image retrieval applications are
relatively large. This high dimensionality of the feature vectors causes problems in creating
efficient data structures for search and retrieval. For this reason, there is considerable
interest in reducing the size of the descriptors while preserving the original topology of the
high-dimensional space [12].
The local binary pattern (LBP) feature has emerged as a promising approach in the field of
texture classification and retrieval [13]. LBP is among the most popular dense local
descriptors, which extract local information from an image pixel by pixel. It was
initially introduced by Ojala et al. [14, 15] for texture classification.
The main advantages of this operator are its invariance to rotation, its robustness
against monotonic gray-level transformations, and its low computational
complexity, which is a significant advantage over earlier approaches. The success of
the LBP has inspired further studies in numerous applications, including face recognition,
facial expression recognition, face authentication, face detection, smart guns, fingerprint
identification, automated cell phenotype image classification, and others [16]. A
comprehensive survey of different LBP variants can be found in [17].
2.2. Singular Value Decomposition (SVD)
Singular value decomposition (SVD), together with its statistical form, principal component
analysis (PCA), and the Karhunen-Loeve transform in signal processing, is one of the most
widely used mathematical decompositions in machine learning, data mining, pattern
recognition, artificial intelligence, computer vision, signal processing, etc. Mathematically,
the truncated SVD gives the best low-rank approximation to a rectangular matrix. The left
singular vectors are mutually orthogonal, as are the right singular vectors, and they provide
orthogonal bases for the column and row subspaces, respectively. When the data matrix is
centered, as in most statistical analysis, the
singular vectors become eigenvectors of the covariance matrix and provide mutually
uncorrelated subspaces which are much easier to use for statistical analysis
[18].
The singular value decomposition is the optimal matrix decomposition in the least-squares
sense: it packs the maximum signal energy into as few coefficients as
possible. SVD is a stable and effective technique for splitting a system into a set of linearly
independent components, each carrying its own energy contribution. Singular value
decomposition is an attractive algebraic transform for image processing because of its many
benefits, such as maximum energy packing, which is usually exploited in compression, and the
ability to manipulate the image in terms of two distinct subspaces, the data and noise subspaces,
which is typically used in noise filtering and has also been utilized in watermarking applications.
Singular value decomposition is a robust and reliable orthogonal matrix decomposition
method which, owing to its conceptual and stability properties, is becoming more and more
popular in signal processing [19].
3. THE PROPOSED SYSTEM
The proposed system consists of several steps. Figure 1 shows the block diagram of the
proposed system.

Figure 1 The block diagram of the proposed system
3.1. Input Video film
In this step, the user chooses a video taken by a Forward Facing Camera (FFC) mounted on a
vehicle moving through the streets. The video is divided into
frames. Each frame is converted from RGB (Red, Green, Blue) color to
grayscale.
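The grayscale conversion can be sketched as follows. This is a minimal illustration, assuming each frame is a NumPy array of shape (height, width, 3) in RGB order and using the ITU-R BT.601 luma weights; the paper does not state which conversion formula was used, and the function name `rgb_to_gray` is hypothetical.

```python
import numpy as np


def rgb_to_gray(frame_rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) RGB frame to an (H, W) uint8 grayscale image."""
    # Assumed weights (ITU-R BT.601); the paper does not specify them.
    weights = np.array([0.299, 0.587, 0.114])
    gray = frame_rgb.astype(np.float64) @ weights
    # Round to the nearest gray level and keep the result in 0..255.
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)
```

Any standard RGB-to-gray conversion could be substituted here, since the later LBP step only compares neighboring gray levels.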
3.2. Key frame Extraction
Before passing to the geo-location retrieval stage, it is important to extract the keyframes,
the set of important frames of the video that serve as the query images. Shots of a video can be
clustered by similarity such that similar shots (e.g. similar camera angles or subjects) are
considered to be one shot or cluster [20]. Extracting keyframes greatly improves the
efficiency of the method, as the number of query images is reduced from the total number of
frames in the video to only a few. In this step, the system selects some frames of the
video as keyframes, which are used for matching against the
database [21]. A keyframe is identified when there is an abrupt change in the direction of
camera movement to the left or to the right. Following our previous work [22], we use the
phase correlation method to detect the change in camera direction based on a
threshold. A frame at which the camera changes its direction from forward to
left or right is considered a keyframe of the video. The input to this step is a
collection of grayscale video frames and the output is a set of frames selected as
keyframes. Figure 2 illustrates keyframe extraction from the trajectory of a video.
Figure 2 The key frames extraction.
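The idea of keyframe selection by phase correlation can be sketched as follows. This is not the implementation of [22]; it is a simplified illustration that assumes camera turns show up as a horizontal image shift between consecutive grayscale frames. The function names, the threshold value, and the mapping from the sign of the shift to "left" or "right" are all assumptions.

```python
import numpy as np


def phase_correlation_shift(prev, curr):
    """Estimate the integer (dy, dx) translation that maps prev onto curr."""
    f_prev = np.fft.fft2(np.asarray(prev, dtype=np.float64))
    f_curr = np.fft.fft2(np.asarray(curr, dtype=np.float64))
    cross = f_curr * np.conj(f_prev)
    cross /= np.maximum(np.abs(cross), 1e-12)  # keep only the phase
    corr = np.fft.ifft2(cross).real            # peak marks the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape
    # Peak positions past the midpoint correspond to negative shifts.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)


def select_keyframes(frames, dx_threshold=8):
    """Return indices of frames where motion switches from forward to a turn."""
    keyframes = []
    previous = "forward"
    for i in range(1, len(frames)):
        _, dx = phase_correlation_shift(frames[i - 1], frames[i])
        if dx >= dx_threshold:
            direction = "left"    # assumed sign convention
        elif dx <= -dx_threshold:
            direction = "right"   # assumed sign convention
        else:
            direction = "forward"
        # A keyframe is the first frame of a turn that follows forward motion.
        if previous == "forward" and direction != "forward":
            keyframes.append(i)
        previous = direction
    return keyframes
```

In this sketch only the frame at which a turn begins is reported, which mirrors the description of a keyframe as the frame where the direction changes from forward to left or right.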
3.3. Construction the GIS-map Database
Determining the location of a key frame is an instance of image geo-localization,
which is a well-developed area. One of the most common practices consists of
building a reference database in which the image locations are known and searching for the
reference images most similar to a query image. The image locations are usually
given in geodetic coordinates: latitude, longitude, and altitude [21]. In the GIS
environment, there are three types of feature classes (points, lines, and polygons). In the
street view, each point represents the intersection of two or more roads, the
line feature represents the street or road between two points, and the polygon class represents
an area bounded by a number of polylines in which the first line is connected to the last line.
In our system, we focus on the point and line feature classes. In this step, the database is
constructed manually from a set of reference images, each taken by a
camera at the corner between two road segments of the area of interest and saved in a
point feature class (shapefile) on the map. Each image represents a view of the intersection
point between two connected road segments. The reference images are taken according to the
layout of the roads on the map. In the example area of interest in Figure 3,
there are a number of roads and intersection points. Each point may have 2, 4 or 8 associated
reference images. Point 1 has two reference images: the first represents
movement from Road 1 to Road 2, while the second represents
movement from Road 2 to Road 1. The layout of the database for the example of
Figure 3 is shown in Table 1.
Figure 3 The example of the area-of-interest map

Table 1 The shape of our database construction.
Points    Longitude(x)  Latitude(y)  Images    Orientation
Point 1   X1            Y1           Image1    Road 1 to Road 2
Point 1   X1            Y1           Image2    Road 2 to Road 1
Point 2   X2            Y2           Image3    Road 1 to Road 3
Point 2   X2            Y2           Image4    Road 3 to Road 1
Point 3   X3            Y3           Image5    Road 2 to Road 4
Point 3   X3            Y3           Image6    Road 4 to Road 2
Point 3   X3            Y3           Image7    Road 2 to Road 5
Point 3   X3            Y3           Image8    Road 5 to Road 2
Point 3   X3            Y3           Image9    Road 4 to Road 6
Point 3   X3            Y3           Image10   Road 6 to Road 4
Point 3   X3            Y3           Image11   Road 5 to Road 6
Point 3   X3            Y3           Image12   Road 6 to Road 5
Point 4   X4            Y4           Image13   Road 3 to Road 4
Point 4   X4            Y4           Image14   Road 4 to Road 3
Point 4   X4            Y4           Image15   Road 4 to Road 7
Point 4   X4            Y4           Image16   Road 7 to Road 4
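The database layout of Table 1 can be mirrored in a small in-memory structure, sketched below. This is only an illustration: the paper stores the points in a GIS shapefile, whereas the class names, the `locate_image` helper, and the coordinate values used here are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceImage:
    image_id: str
    from_road: int  # road the vehicle is leaving
    to_road: int    # road the vehicle is turning onto


@dataclass
class IntersectionPoint:
    name: str
    longitude: float
    latitude: float
    images: list = field(default_factory=list)


def locate_image(points, image_id):
    """Return (longitude, latitude) of the point that owns image_id, or None."""
    for point in points:
        for image in point.images:
            if image.image_id == image_id:
                return (point.longitude, point.latitude)
    return None


# Placeholder coordinates, not taken from the paper (Table 1 lists X1, Y1 symbolically).
database = [
    IntersectionPoint(
        "Point 1", longitude=10.0, latitude=20.0,
        images=[ReferenceImage("Image1", 1, 2), ReferenceImage("Image2", 2, 1)],
    ),
]
```

Calling `locate_image(database, "Image2")` returns the coordinates of Point 1, which corresponds to the final geo-location lookup performed once a reference image has been retrieved.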
3.4. Applying the LBP for each Image in DB
In this step, the system extracts the Local Binary Pattern (LBP) for each reference image in
the database and for every key frame extracted in the previous steps. Each image has a huge
number of pixels, so we need a method to extract a feature vector that represents the image
instead of its pixels. Suppose we have an image of size 300*300 pixels; that means
90,000 pixels represent the image. The goal of using the LBP is to reduce the representation
from 90,000 pixels to a feature vector of length 256, 1024 or 2304. The LBP of an image is
extracted according to the following steps:
1. Scan the image of size (m*n) using a sliding window of size 3*3.
2. For each window, threshold each neighbor against the center pixel to obtain a binary
code. The code is (0) if the neighbor pixel is less than or equal to the center
pixel, and (1) if the neighbor pixel is larger than the center pixel.
3. Read the binary codes of the 8 neighbors of the window in a clockwise or
counterclockwise direction to form an 8-bit binary code.
4. Convert the 8-bit binary code to a decimal number.
5. Put the decimal number at the center of the window.
The final result of this step is an image of size (m*n) whose values differ from those of
the original image and represent the Local Binary Pattern of the image. Figure 4
shows an example of applying the Local Binary Pattern (LBP) to an image.
Figure 4 An example of applying Local Binary Pattern (LBP) on the image.
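The five steps above can be sketched in NumPy as follows. The thresholding rule (1 only when the neighbor is strictly larger than the center) follows the text; the starting neighbor (top-left), the clockwise reading order, the choice of the first neighbor as the most significant bit, and the edge-replicating border padding are assumptions made for this illustration, since the paper does not fix them.

```python
import numpy as np

# Clockwise (row, column) offsets of the 8 neighbors, starting at the top-left.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_image(gray):
    """Return the LBP code image (same m*n size as the input, values 0..255)."""
    gray = np.asarray(gray, dtype=np.int32)
    m, n = gray.shape
    # Assumed border handling: replicate edge pixels so every pixel has 8 neighbors.
    padded = np.pad(gray, 1, mode="edge")
    codes = np.zeros((m, n), dtype=np.uint8)
    for bit, (dr, dc) in enumerate(OFFSETS):
        neighbor = padded[1 + dr:1 + dr + m, 1 + dc:1 + dc + n]
        # Bit is 1 when the neighbor is larger than the center, else 0.
        weight = np.uint8(1 << (7 - bit))
        codes |= (neighbor > gray).astype(np.uint8) * weight
    return codes
```

For example, for the window [[6, 5, 4], [7, 5, 3], [8, 9, 1]] the clockwise bits are 1 0 0 0 0 1 1 1, so the center pixel receives the decimal code 135 under these conventions.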

3.5. Dividing the LBP Image to Number of Regions
In this step, the proposed system can use the whole LBP image as one region or divide the LBP
image into (2*2), (3*3), (4*4), ... or (n*n) regions. Figure 5 illustrates dividing the LBP
image into (1, 4, 9, 16) regions.
Figure 5 Dividing the LBP image to regions.
3.6. Extracting the Histogram for each Region.
A histogram represents the distribution of pixel values in an image.
It is an estimate of the probability distribution of the gray-level values in the
image. After the number of regions for the LBP image has been chosen, the system
computes a histogram for each region. Each histogram has bins for the values 0-255, which
cover the range of gray-scale values, and each bin holds the number of occurrences of its
value in that region of the LBP image.
3.7. Concatenating all Histograms as a Feature Vector
In this step, the system concatenates the histograms of the regions into a single histogram.
The first region's histogram occupies bins 0 to 255, the second occupies bins 256 to
511, the third occupies bins 512 to 767, and so on. Figure 6 shows the
concatenation of the histograms into one histogram, which is taken as the feature vector of a
particular image.
Figure 6 Concatenation histograms.
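Steps 3.5 to 3.7 can be sketched together as follows. This is a minimal illustration, assuming the LBP image is a NumPy array of values 0..255, that regions are visited in row-major order, and that the grid size does not exceed the image dimensions; the function name `regional_histogram` is hypothetical.

```python
import numpy as np


def regional_histogram(lbp, grid):
    """Concatenate the 256-bin histograms of a grid*grid partition of an LBP image."""
    lbp = np.asarray(lbp)
    # Split row and column indices into `grid` nearly equal blocks.
    row_blocks = np.array_split(np.arange(lbp.shape[0]), grid)
    col_blocks = np.array_split(np.arange(lbp.shape[1]), grid)
    histograms = []
    for rows in row_blocks:
        for cols in col_blocks:
            region = lbp[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
            # One bin per LBP code value 0..255.
            histograms.append(np.bincount(region.ravel(), minlength=256))
    return np.concatenate(histograms)
```

With `grid=1` the result has 256 entries, with `grid=2` it has 1024, and with `grid=3` it has 2304, matching the feature lengths quoted in the text.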
3.8. Constructing the Feature Matrix
In this step, we build the feature matrix from the database and one keyframe of the video.
Each row of the feature matrix is the feature vector of one image (the
concatenated histogram from the previous step). The first row is the feature vector of
one keyframe of the video, while the other rows are the feature vectors of the geo-tagged
reference images stored in the GIS database. The length of each feature vector is 256, 1024,
2304, 4096, and so on, depending on the region grid used to divide the LBP images in
step 3.5: 256 when the whole image is one region, 1024
when the image is divided into four regions, 2304 when the
image is divided into nine regions, 4096 when the image is divided into
sixteen regions, and so on. This step is repeated for each keyframe extracted from the video.
The size of the feature matrix is (N+1) * (length of the concatenated histogram), where N is
the number of reference images stored in the GIS database. The layout of the result of
this step is shown in Table 2.
Table 2 The feature matrix representation of the GIS database and one keyframe of video.
No.   Images                 Feature vector
0     Keyframe               Keyframe feature vector
1     Reference Image 1_DB   Reference Image (1) feature vector
.     .                      .
.     .                      .
N     Reference Image N_DB   Reference Image (N) feature vector
3.9. Applying the SVD Indexing method
In linear algebra, the singular value decomposition (SVD) is a factorization of a
rectangular real or complex matrix, analogous to the diagonalization of symmetric or
Hermitian square matrices using a basis of eigenvectors. Singular value decomposition is a
stable and effective way to split a system into a set of linearly independent
components, each carrying its own energy contribution. Given any (m * n) matrix A, the
algorithm finds matrices U, W, and V such that
$A = U W V^{T}$    (1)
where U is an m*n matrix with orthonormal columns,
W is an n*n diagonal matrix of singular values, and
V is an n*n orthonormal matrix.
In this step, the system applies the singular value decomposition (SVD) to the feature
matrix built in the previous step. The goal of using SVD is to compress the
feature vector of each image in the feature matrix. The system uses the left
matrix (U) of the SVD as the new feature representation, keeping only the first six columns
of U, i.e. the left singular vectors associated with the six largest singular values. The
output of this step is a matrix of size (N+1) * 6.
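The compression step can be sketched with NumPy as follows. This is an illustration under assumptions: it uses `numpy.linalg.svd` with `full_matrices=False`, which returns U with min(m, n) columns, so for a feature matrix with fewer rows than columns (for example 38 rows of length 1024) U has 38 columns rather than the n columns written in equation (1). The function name `svd_compress` is hypothetical.

```python
import numpy as np


def svd_compress(feature_matrix, k=6):
    """Return the first k left singular vectors (one k-dimensional row per image)."""
    a = np.asarray(feature_matrix, dtype=np.float64)
    # NumPy returns singular values in descending order, so the first k
    # columns of U belong to the k largest singular values.
    u, _singular_values, _vt = np.linalg.svd(a, full_matrices=False)
    return u[:, :k]
```

Row 0 of the returned matrix is the reduced feature vector of the keyframe and rows 1 to N are those of the reference images, following the row order of Table 2.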
3.10. Image Retrieving for Keyframe
In this step, the system computes the similarity between the first row (the feature vector of
the keyframe) of the reduced feature matrix from the previous step and each of the other rows
(the feature vectors of the reference images in the GIS database) using a similarity measure
such as cosine similarity. The output of this step is the geo-tagged reference image that is
most similar to the keyframe of the video.
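The retrieval step can be sketched as follows, assuming the reduced matrix has the keyframe in row 0 and the reference images in rows 1 to N, and that no row is the zero vector. The function name `most_similar` is hypothetical.

```python
import numpy as np


def most_similar(reduced):
    """Return (row number of the best reference image, all cosine similarities)."""
    r = np.asarray(reduced, dtype=np.float64)
    query, references = r[0], r[1:]
    # Cosine similarity: dot product divided by the product of the norms.
    norms = np.linalg.norm(references, axis=1) * np.linalg.norm(query)
    similarities = references @ query / norms
    best = int(np.argmax(similarities))
    # Add 1 so the result matches the row numbering of Table 2.
    return best + 1, similarities
```

The returned row number identifies the reference image whose stored longitude and latitude are then reported as the location of the keyframe.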
3.11. Geo-location of keyframe
In this step, the system retrieves the geo-location (longitude x, latitude y) of the reference
image returned by the previous step, which is taken as the location of the keyframe
on the map.
4. EXPERIMENTAL RESULTS
The experiments in this paper were performed on a small region of the
area of interest in the city of Karlsruhe, Germany. VB.NET with the ArcObjects SDK and ArcGIS
version 10.2.2 were used to build the proposed system. The road map information was
extracted from the OpenStreetMap platform [23], and we then built our database by manually
marking the important points that represent intersections between road
segments. The geotagged images (reference images) are stored at the intersection points,
where each reference image represents the change of direction between two connected road
segments. Figure 7 illustrates the database constructed following the layout of Table 1.
Figure 7 The database construction of interested zone.
To test our system, we chose as a case study a video without geo-tag information, consisting of
950 frames, taken by a forward-facing camera mounted on a car, and
extracted the trajectory of the camera using our previous method [22], which relies on phase
correlation to determine when the camera moves forward, left or right; other visual
odometry methods could also be used. From the trajectory, the system is able to determine the
keyframes (important frames) of the video. Figure 8 shows the key frame selection from the
trajectory of the video.

Figure 8 The key frames selection of video film based on phase correlation
Next, for each key frame, we apply the proposed LBP-SVD method to build the
feature matrix, dividing the images into four regions, and the proposed system then
searches for the most similar geotagged reference image in the database. We stored 37
geotagged reference images in the GIS database. The results of LBP-SVD are given in
Appendix A, which lists the first six columns of the left matrix U of the SVD.
The most similar geo-tagged image in the database is retrieved for each key frame of the video
by applying a similarity measure to the feature vectors taken from the left matrix of the SVD.
The similarity measure (cosine similarity) is computed between each key frame and the
images in the GIS database.
From the above figure, three key frames (important frames) are selected
from the trajectory of the video. For each key frame, the proposed system searches for
the most similar reference image in the database by matching feature vectors that
depend only on visual content, and returns the geo-tag information
associated with that image as the geographic location of the key frame. Figure 9
illustrates the final output of our proposed system.
Figure 9 The final result of our proposed system.
5. CONCLUSIONS
Visual geo-localization, the problem of automatically estimating the location where
an image or video was captured on the GIS-map, has attracted much interest over the
past few years. An efficient geo-location estimation method for video taken by a
forward-facing camera mounted on a vehicle was built and demonstrated
using the proposed LBP-SVD method. In this work, we reduce the large number of features of each
image to only 6 features and use these features for matching and
retrieval. One key component of the proposed system is the extraction of key frames,
which can then be analyzed with known image processing algorithms (LBP and SVD). The
resulting features can be indexed and used for further retrieval.

APPENDIX A
Appendix A presents the feature vectors of the GIS database images together with those of
one keyframe of the video at a time. Three key frames are selected in Figure 8. The feature
vectors (U matrix) for the first keyframe and the similarity measure between the first keyframe
and all reference images are given in Table A1. From Table A1, the
most similar reference image to the first keyframe is image (5).
Table A1 The feature vectors and similarity measure for the 1st keyframe of video.
Eigen values      1st      2nd      3rd      4th      5th      6th  Similarity measure
keyframe1      0.2425  -0.4636   0.0194   0.0067   0.1823  -0.1205   1.0000
Image 1        0.1428   0.1369  -0.0490  -0.1713   0.0126  -0.2232   0.0459
Image 2        0.1553   0.0794  -0.1832   0.0178   0.2915  -0.1307   0.2940
Image 3        0.2014  -0.1531  -0.5280  -0.4206  -0.2940  -0.0573   0.1601
Image 4        0.1916   0.0124  -0.2453  -0.0086   0.4281  -0.3045   0.3537
Image 5        0.2461  -0.4405   0.1376  -0.0177   0.0446  -0.0186   0.9241
Image 6        0.1668   0.1634   0.0380  -0.1148   0.1459   0.0014  -0.0086
Image 7        0.2101   0.2421   0.4528  -0.6138   0.0000   0.0986  -0.1562
Image 8        0.1455   0.1663   0.1874  -0.1712   0.1298   0.0072  -0.0538
Image 9        0.2334  -0.4048   0.1817  -0.0347   0.0194  -0.0047   0.8704
Image 10       0.2010  -0.2252   0.2257  -0.0624   0.0000  -0.0910   0.6724
Image 11       0.1319   0.0467  -0.3750  -0.0896   0.2122  -0.0065   0.0844
Image 12       0.1502   0.0813  -0.2824  -0.1608   0.1873   0.6454  -0.1317
Image 13       0.1483   0.0578  -0.0156   0.0267  -0.1055  -0.0774  -0.0621
Image 14       0.1483   0.0509   0.0172   0.0321  -0.1889  -0.0294  -0.1531
Image 15       0.1497   0.0477   0.0316   0.0964  -0.1663  -0.0008  -0.1345
Image 16       0.1454   0.0921  -0.0175   0.1058  -0.0755  -0.0190  -0.1708
Image 17       0.1416   0.0996  -0.0521   0.1025  -0.0390  -0.0213  -0.1628
Image 18       0.1405   0.0972  -0.0567   0.0913  -0.0744  -0.0215  -0.2013
Image 19       0.1387   0.1009  -0.0470   0.0654  -0.1106  -0.0347  -0.2492
Image 20       0.1400   0.1039  -0.0325   0.0589  -0.1032  -0.0390  -0.2377
Image 21       0.1420   0.0948  -0.0230   0.0700  -0.1015  -0.0462  -0.1899
Image 22       0.1425   0.1049  -0.0231   0.0314  -0.1041  -0.0876  -0.1926
Image 23       0.1417   0.1023  -0.0251   0.0469  -0.1216  -0.0528  -0.2253
Image 24       0.1417   0.0944  -0.0192   0.0542  -0.1395  -0.1009  -0.1709
Image 25       0.1418   0.0941  -0.0057   0.0730  -0.1213  -0.1221  -0.1227
Image 26       0.1451   0.0538  -0.0032   0.0086  -0.2811  -0.0924  -0.1531
Image 27       0.1470   0.0384   0.0162   0.0183  -0.2689  -0.0453  -0.1516
Image 28       0.1741  -0.2050  -0.0651   0.1167  -0.2334   0.5223   0.0789
Image 29       0.1448   0.0045  -0.0456   0.2772  -0.1060  -0.0054   0.0694
Image 30       0.1396   0.0628   0.0320   0.2370   0.1466   0.0751   0.1572
Image 31       0.1443   0.0819   0.0830   0.2541   0.1504   0.0921   0.1175
Image 32       0.1437   0.0887   0.0743   0.0495   0.0906   0.0439   0.0808
Image 33       0.1481   0.0998   0.0946   0.0826   0.1279   0.0795   0.0422
Image 34       0.1502   0.0788   0.0709   0.1059   0.1102   0.1376   0.0600
Image 35       0.1495   0.0880   0.0896   0.0966   0.0933   0.1195   0.0357
Image 36       0.1456   0.0805   0.0512   0.1317   0.0659  -0.0005   0.1261
Image 37       0.1449   0.0739   0.0458   0.1044   0.0129  -0.0008   0.0951
The feature vectors (U matrix) for the second keyframe of the video and the similarity
measure between it and all reference images in the GIS database are given in
Table A2. From Table A2, the most similar reference image to the second
keyframe is image (2).

Table A2 The feature vectors and similarity measure for the 2nd keyframe of video.
Eigen values      1st      2nd      3rd      4th      5th      6th  Similarity measure
keyframe 2     0.1605  -0.0390   0.3061  -0.2271   0.1833  -0.3034   1.0000
Image 1        0.1458  -0.1355   0.0313   0.1429   0.1033  -0.1732   0.5269
Image 2        0.1584  -0.0733   0.2282  -0.1785   0.1745  -0.1243   0.9303
Image 3        0.2042   0.2078   0.4398   0.5736   0.0272  -0.0842   0.2679
Image 4        0.1950   0.0057   0.2858  -0.2179   0.2654  -0.1716   0.5206
Image 5        0.2481   0.5344  -0.1234  -0.0893   0.1110  -0.0497   0.0990
Image 6        0.1704  -0.1641  -0.0223  -0.0088   0.1775  -0.0076   0.4820
Image 7        0.2147  -0.2433  -0.4569   0.3758   0.4731   0.0538  -0.2633
Image 8        0.1487  -0.1692  -0.1723   0.0198   0.2223   0.0161   0.1179
Image 9        0.2353   0.4928  -0.1680  -0.0681   0.1103  -0.0482   0.0383
Image 10       0.2033   0.2851  -0.2136  -0.0293   0.1073  -0.1268  -0.0161
Image 11       0.1344  -0.0361   0.3644   0.0168   0.1412   0.1147   0.1545
Image 12       0.1532  -0.0709   0.2668   0.0700   0.1845   0.6658  -0.1977
Image 13       0.1510  -0.0423  -0.0090   0.0569  -0.0985  -0.0624  -0.1518
Image 14       0.1510  -0.0328  -0.0522   0.0989  -0.1540  -0.0343  -0.3593
Image 15       0.1525  -0.0291  -0.0620   0.0359  -0.1810  -0.0031  -0.3788
Image 16       0.1483  -0.0821  -0.0036  -0.0130  -0.1380   0.0026  -0.1897
Image 17       0.1445  -0.0921   0.0360  -0.0278  -0.1153   0.0020  -0.0332
Image 18       0.1433  -0.0889   0.0355   0.0026  -0.1319  -0.0075  -0.0090
Image 19       0.1415  -0.0928   0.0213   0.0420  -0.1372  -0.0276  -0.0492
Image 20       0.1429  -0.0960   0.0097   0.0376  -0.1241  -0.0389   0.0522
Image 21       0.1449  -0.0857   0.0018   0.0258  -0.1285  -0.0475   0.0626
Image 22       0.1454  -0.0971  -0.0003   0.0574  -0.1056  -0.0794   0.0404
Image 23       0.1446  -0.0940   0.0011   0.0554  -0.1267  -0.0599   0.0789
Image 24       0.1446  -0.0847  -0.0061   0.0590  -0.1420  -0.1056   0.0676
Image 25       0.1446  -0.0846  -0.0166   0.0321  -0.1410  -0.1201   0.1145
Image 26       0.1478  -0.0353  -0.0453   0.1771  -0.2039  -0.1148  -0.1469
Image 27       0.1497  -0.0177  -0.0634   0.1593  -0.1995  -0.0646  -0.2993
Image 28       0.1760   0.2650   0.0125   0.0652  -0.2361   0.4626  -0.4721
Image 29       0.1473   0.0178   0.0286  -0.1256  -0.2688  -0.0129   0.0475
Image 30       0.1423  -0.0537  -0.0067  -0.2688  -0.0586   0.0681   0.4255
Image 31       0.1471  -0.0749  -0.0549  -0.2906  -0.0628   0.0821   0.3561
Image 32       0.1465  -0.0812  -0.0650  -0.0986   0.0320   0.0558   0.2364
Image 33       0.1510  -0.0935  -0.0812  -0.1462   0.0346   0.1048   0.0692
Image 34       0.1531  -0.0690  -0.0607  -0.1490   0.0048   0.1525   0.0778
Image 35       0.1524  -0.0793  -0.0826  -0.1324  -0.0002   0.1393   0.0464
Image 36       0.1484  -0.0717  -0.0465  -0.1354  -0.0473   0.0226   0.3697
Image 37       0.1477  -0.0633  -0.0496  -0.0819  -0.0650   0.0118   0.3284
The feature vectors (U matrix) for the third keyframe of the video and the similarity measure
between it and all reference images in the GIS database are given in Table A3. From
Table A3, the most similar reference image to the third keyframe is image (8).

Table A3 The feature vectors and similarity measure for the 3rd keyframe of video.
Eigen values      1st      2nd      3rd      4th      5th      6th  Similarity measure
keyframe 3     0.1498  -0.1626   0.1901  -0.2124   0.1014  -0.0061   1.0000
Image 1        0.1462  -0.1320  -0.0415  -0.1605  -0.0565  -0.2138   0.5669
Image 2        0.1585  -0.0676  -0.1748  -0.0805   0.2854  -0.1461   0.3302
Image 3        0.2043   0.2120  -0.4870  -0.3882  -0.3664  -0.0496   0.0003
Image 4        0.1952   0.0106  -0.2234  -0.1552   0.4194  -0.3442   0.1105
Image 5        0.2483   0.5306   0.1635  -0.0418   0.1081  -0.0694   0.0178
Image 6        0.1708  -0.1594   0.0375  -0.1227   0.0826   0.0068   0.8982
Image 7        0.2155  -0.2428   0.4717  -0.4825  -0.2354   0.1100   0.6823
Image 8        0.1491  -0.1683   0.1973  -0.1661   0.0532   0.0082   0.9858
Image 9        0.2355   0.4891   0.2037  -0.0404   0.0713  -0.0522   0.0367
Image 10       0.2035   0.2833   0.2350  -0.0384   0.0134  -0.1271   0.1246
Image 11       0.1345  -0.0313  -0.3513  -0.1914   0.1937  -0.0290  -0.0690
Image 12       0.1534  -0.0658  -0.2620  -0.2309   0.1470   0.6427   0.0772
Image 13       0.1513  -0.0381  -0.0299   0.0612  -0.0963  -0.0784  -0.1825
Image 14       0.1513  -0.0290  -0.0008   0.0918  -0.1729  -0.0265  -0.2461
Image 15       0.1528  -0.0252   0.0093   0.1466  -0.1299  -0.0001  -0.3010
Image 16       0.1485  -0.0774  -0.0381   0.1269  -0.0462  -0.0172  -0.2199
Image 17       0.1447  -0.0868  -0.0715   0.1111  -0.0136  -0.0188  -0.1891
Image 18       0.1436  -0.0836  -0.0763   0.1095  -0.0506  -0.0160  -0.1899
Image 19       0.1418  -0.0877  -0.0664   0.0967  -0.0946  -0.0268  -0.1639
Image 20       0.1432  -0.0911  -0.0505   0.0892  -0.0892  -0.0313  -0.0386
Image 21       0.1451  -0.0807  -0.0428   0.1018  -0.0850  -0.0385  -0.0730
Image 22       0.1457  -0.0921  -0.0413   0.0688  -0.1026  -0.0796  -0.0345
Image 23       0.1448  -0.0889  -0.0443   0.0862  -0.1128  -0.0425  -0.0225
Image 24       0.1449  -0.0798  -0.0394   0.0981  -0.1273  -0.0923  -0.1034
Image 25       0.1449  -0.0799  -0.0265   0.1117  -0.1040  -0.1157  -0.0733
Image 26       0.1481  -0.0312  -0.0233   0.0934  -0.2707  -0.0802  -0.1652
Image 27       0.1500  -0.0142  -0.0027   0.0989  -0.2527  -0.0356  -0.2111
Image 28       0.1762   0.2659  -0.0707   0.1461  -0.1318   0.5096  -0.3282
Image 29       0.1475   0.0224  -0.0742   0.2815  -0.0056  -0.0090  -0.4608
Image 30       0.1425  -0.0496   0.0130   0.1882   0.2127   0.0731   0.2580
Image 31       0.1474  -0.0707   0.0593   0.2127   0.2162   0.0932   0.2969
Image 32       0.1468  -0.0781   0.0655   0.0378   0.0950   0.0424   0.6517
Image 33       0.1513  -0.0903   0.0839   0.0611   0.1406   0.0748   0.5403
Image 34       0.1534  -0.0657   0.0596   0.0822   0.1360   0.1347   0.4481
Image 35       0.1528  -0.0762   0.0777   0.0808   0.1146   0.1185   0.5046
Image 36       0.1487  -0.0683   0.0375   0.1139   0.0993   0.0011   0.4194
Image 37       0.1480  -0.0600   0.0326   0.1019   0.0410   0.0040   0.3966
REFERENCES
[1] Lu Y; Shahabi C, Kim S. Efficient indexing and retrieval of large-scale geo-tagged video
databases. GeoInformatica,20, 2016, pp. 829-857
[2] Youngwoo Kim, Jinha Kim, Hwanjo Yu. GeoSearch: Georeferenced Video Retrieval
System. ACM, 2012, Beijing, China.
[3] Sanjay.S, Nalina M.E and H Ajith Hebbar, RS & GIS Based Site Suitability Analysis for
the Disposal of Solid Waste for Moodbidri and Karkala Region, International Journal of
Civil Engineering and Technology, 9(5), 2018, pp. 594–601.
[4] H. Shi, J. Chen, and A. G. Hauptmann, Joint Saliency Estimation and Matching using
Image Regions for Geo-Localization of Online Video, Proc. 2017 ACM Int. Conf.
Multimed. Retr 2017, pp. 383–391.
[5] SS. Asadi, M. Satish Kumar, B. Ramyaa Sree, M. Sujatha, Remote Sensing and GIS
Based Water Quality Estimation for Thimmapally Watershed. International Journal of
Civil Engineering and Technology, 8(8), 2017, pp. 1626– 1635.
[6] J. Choi and G. Friedland . Multimodal location estimation of videos and images.
Multimodal Locat. Estim. Videos Images 2015, pp. 1–191.

[7] Kallakunta Ravi Kumar, SS. Asadi and Venkata Ratnam Kolluru, Remote Sensing and
Gis Based Land Utilization Analysis: A Model Study from Vamsadhara River Basin,
International Journal of Mechanical Engineering and Technology 8(11), 2017, pp. 866–
873.
[8] E. Z. Mequanint, Y. T. Tesfaye, H. Idrees, A. Prati, M. Pelillo, and M. Shah, Large-scale
Image Geo-Localization Using Dominant Sets. IEEE Trans. Pattern Anal. Mach 2017,
pp. 1–15.
[9] P. H. Seo, T. Weyand, J. Sim, and B. Han. CPlaNet : Enhancing Image Geolocalization
by Combinatorial Partitioning of Maps. arXiv preprint arXiv:1808.08779 2018.
[10] Jie Huang and Sio-Long Lo. Location Estimation of Urban Images Based on
Geographical. Journal of Physics: Conference Series 2018, 1004, conference 1.
[11] Li, X. Large Scale Image Retrieval for Location Estimation. Master of Science in
Information Science and Engineering, Shandong University geboren te Jinan, China
2016.
[12] P. Wu, B. S. Manjunath, H. D. Shin, and S. Barbara, Dimensionality Reduction for Image
Retrieval. IEEE Proceedings 2000 International Conference on Image Processing 2012,
Canada, and DOI: 10.1109/ICIP.2000.899557.
[13] P. Pawar and P. P. Belagali, Image Retrieval Technique Using Local Binary Pattern (
LBP ), International Journal of Science and Research (IJSR), 4, 7, 2015, pp. 2013–2016.
[14] T. Ojala, M. Pietikainen, D. Harwood. A comparative study of texture measures with
classification based on featured distributions. Pattern Recogn, 29(1), 1996, pp. 51–59.
[15] T. Ojala, M. Pietikainen, and T. Mäenpää . Multiresolution gray-scale and rotation
invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal.
Mach. Intell 2002, 24, 7, pp. 971–987.
[16] M. Lamine Mekhalfi and M. Guermoui . A Sparse Representation of Complete Local
Binary Pattern Histogram for Human Face Recognition. arXiv preprint 2016.
[17] M. Pietikainen, A. Hadid, G. Zhao, T. Ahonen. Computer Vision Using Local Binary
Patterns. Springer-Verlag London Limited, 2011, DOI: 10.1007/978-0-85729-748-8.
[18] S. Zheng, C. Ding, and F. Nie . Regularized Singular Value Decomposition and
Application to Recommender System. arXiv Prepr 2018. DOI: arXiv: 1804.05090v1.
[19] Rowayda A. Sadek. SVD Based Image Processing Applications: State of The Art,
Contributions and Research Challenges, International Journal of Advanced Computer
Science and Applications, 3, 7, 2012.
[20] Shingo Uchihashi and Jonathan Foote. Summarizing Video Using a Shot Importance
Measure and a Frame-Packing Algorithm. IEEE International Conference on Acoustics,
Speech, and Signal Processing. Proceedings 2002, DOI: 10.1109/ICASSP.1999.757482.
[21] S. Medina, Z. Dai, and Y. Gao . Where Is This? Video Geolocation Based On Neural
Network Features. arXiv Prepr 2018.
[22] Abdulkadhem Abdulkadhem, A., and A. Al-Assadi, T. . Camera Motion Estimation
based on Phase Correlation. International Journal of Engineering & Technology, 8(1.5),
2018, pp. 257-265
[23] Website https://www.openstreetmap.org/ , accessed October 2,2018 .
