An Algorithm for In-Place Vision-Based Skewed 1D Barcode Scanning in the Cloud
Department of Computer Science
Utah State University
Logan, UT, USA
Department of Computer Science
Utah State University
Logan, UT, USA
Abstract—An algorithm is presented for in-place vision-based
skewed 1D barcode scanning that requires no smartphone
camera alignment. The algorithm is in-place in that it performs
no rotation of input images to align localized barcodes for
scanning. The algorithm is cloud-based, because image
processing is done in the cloud. The algorithm is implemented in
a distributed, cloud-based system. The system’s front end is a
smartphone application that runs on Android 4.3 or higher. The
system’s back end is currently deployed on a four node Linux
cluster used for image recognition and data storage. The
algorithm was evaluated on a set of 506 video recordings of
common grocery products. The videos had a 1280 x 720
resolution, an average duration of 15 seconds, and were recorded
on an Android Galaxy Nexus smartphone in a local supermarket.
The results of the experiments are presented and discussed.
Keywords—computer vision; barcode detection; barcode
scanning; mobile computing; skewed barcodes
The World Health Organization (www.who.int) reports that
obesity causes such diseases as diabetes, kidney failure, and
strokes and predicts that these diseases will be a major cause
of death worldwide. Berreby [2] points out that, for the first
time in human history, obese people outnumber underfed
ones. Such chronic illnesses as diabetes threaten many
individuals with numerous complications that include but are
not limited to blindness and amputations [2, 3]. The U.S.
Academy of Nutrition and Dietetics (www.eatright.org)
estimates that approximately twenty-six million Americans
have diabetes and that seven million people in the U.S. are
unaware of their condition. It is estimated that by
2030 the prevalence of diabetes in the world will reach 4.4%,
which will amount to approximately 366 million people.
While there is currently no cure for type 1 or type 2 diabetes, many
experts agree that it can be successfully managed. Successful
diabetes management has three integral components: healthy
diet, blood glucose management, and physical exercise . In
this paper, we focus on healthy diet. An important component
of a healthy diet is the patients’ comprehension and retention
of nutritional information and understanding of how different
foods and nutritional components affect their bodies. In the
U.S. and many other countries, nutritional information is
primarily conveyed to consumers through nutritional labels
(NLs). Unfortunately, even highly motivated consumers, who
deliberately look for NLs to make healthy food choices, find it
difficult to locate them on many products .
One way to improve the comprehension and retention of
nutritional information by consumers is to use computer vision
to scan barcodes in order to retrieve NLs from databases.
Unfortunately, a common weakness of many barcode
scanners, both open source and commercial, is the camera
alignment requirement: the smartphone camera must be
aligned with a target barcode to obtain at least one complete
scanline for successful barcode recognition . This
requirement is acceptable for sighted users but presents a
serious accessibility barrier to visually impaired and blind
users or to users who may not have good physical command of
their hands. Skewed barcode scanning is also beneficial for
sighted smartphone users, because it makes barcode scanning
faster due to the absence of the camera alignment requirement.
Another weakness of current mobile smartphone scanners
is the lack of coupling of barcode scanning to comprehensive NL
databases from which nutritional information can be retrieved.
To address the problem of skewed barcode scanning, we
developed an algorithm for skewed barcode localization on
mobile smartphones [7]. In this paper, an algorithm is
presented for in-place vision-based skewed barcode scanning
that no longer requires smartphone camera alignment. The
algorithm is in-place in that it performs no rotation of input
images to align localized barcodes for scanning. The algorithm
is cloud-based, because image processing is done in the cloud.
The algorithm is implemented in a distributed, cloud-based
system. The system’s front end is a smartphone application
that runs on Android 4.3 or higher. The system’s back end is
currently deployed on a four node Linux cluster used for
image recognition and nutritional data storage.
The front end smartphone sends captured frames to the
back end cluster across a wireless data channel (e.g.,
3G/4G/Wi-Fi) where barcodes, both skewed and aligned, are
recognized. Corresponding NLs are retrieved from a cloud
database, where they are stored as HTML documents, and sent
across the data channel back to the smartphone where the
HTML documents are displayed on the touchscreen.
Wikipedia links to important nutrition terms are embedded for
better comprehension. Consumers can use standard touch
gestures (e.g., zoom in/out, swipe) available on mainstream
smartphone platforms to manipulate the NL’s surface size.
The NL database currently includes approximately 230,000
products compiled from public web sites by a custom crawler.
The remainder of this paper is organized as follows. In
Section II, some background information is given on the
related work as well as on the research of our laboratory on
proactive nutrition management and mobile vision-based
nutritional information extraction from product packages. In
Section III, we outline the details of our algorithm for in-place
skewed barcode scanning in the cloud. In Section IV, we
describe our four node Linux cluster for image processing and
data storage. Section V presents several experiments with the
system and discusses the results. Section VI summarizes our
findings, presents our conclusions, and outlines several
research venues for the future.
II. Related Work
A. Barcode Localization and Scanning
Much research has been done on mobile barcode scanning.
Tekin and Coughlan [8] describe a vision-based algorithm to
guide visually impaired smartphone users to center target
barcodes in the camera frame via audio instructions.
Wachenfeld et al. [9] present another vision-based algorithm
that detects barcodes on a mobile phone via image analysis
and pattern recognition methods. A barcode is assumed to be
present in the image. The algorithm overcomes typical
distortions, such as inhomogeneous illumination, reflections,
or blurriness due to camera movement. Unfortunately, the
algorithm does not appear to address the localization and
scanning of skewed barcodes. Adelmann et al. [10] have
developed a vision-based algorithm for scanning barcodes on
mobile phones. The algorithm relies on the fact that, if
multiple scanlines are drawn across the barcode in various
arbitrary orientations, one of them might cover the whole
length of the barcode and result in a successful barcode scan.
This recognition scheme does not appear to handle distorted
images, because it is not always possible to obtain the
scanlines that cover the entire barcode. Gallo and Manduchi
[11] present an algorithm for 1D barcode reading in blurred,
noisy, and low-resolution images. However, the algorithm
detects barcodes only if they are slanted by less than 45
degrees. Lin et al. [12] have developed an automatic barcode
detection and recognition algorithm for multiple and rotation
invariant barcode decoding. The proposed system is
implemented and optimized on a DM6437 DSP EVM board, a
custom embedded system built specifically for barcode
A common weakness of many barcode scanners, both open
source and commercial (e.g., [13]), is the camera alignment
requirement: the smartphone camera must be aligned with a
target barcode to obtain at least one complete scanline for
successful barcode recognition. This requirement is acceptable
for sighted users but presents a serious accessibility barrier to
visually impaired shoppers or to shoppers who may not have
good dexterity. Skewed barcode scanning is also beneficial for
sighted smartphone users: it may make barcode scanning
faster, because the camera alignment requirement no longer
needs to be satisfied.
B. Proactive Nutrition Management
Many nutritionists and dieticians consider proactive nutrition
management to be a key factor in managing diabetes. As more
and more individuals start managing their daily activities with
smartphones and other mobile devices, such devices hold great
potential to become self-management tools for diabetes and
other chronic ailments. Unfortunately, modern nutrition
management systems assume that users understand how to
collect nutritional data and can be persuaded to perform
necessary data collection with emails, SMS’s or other digital
prompts. Such systems often underperform, because many
users find it difficult to integrate nutritional data collection
into their daily activities due to lack of time, motivation, or
training. Eventually they turn off or ignore digital stimuli .
To overcome these challenges, we have begun to develop a
Persuasive NUTrition Management System (PNUTS).
PNUTS seeks to shift current research and clinical practices in
nutrition management toward persuasion, automated
nutritional information extraction and processing, and context-
sensitive nutrition decision support. The system is based on a
nutrition management approach inspired by the Fogg Behavior
Model (FBM) [14], which states that motivation alone is
insufficient to stimulate target behaviors. Even a motivated
user must have both the ability to execute a behavior and a
trigger to engage in that behavior at an appropriate place and
time. The algorithm presented in this paper is one of the
algorithms used by PNUTS for vision-based nutritional
information extraction from product packages.
III. Skewed Barcode Scanning
The algorithm for skewed 1D barcode scanning uses our
algorithm for skewed barcode localization [7]. The algorithm
for skewed barcode localization localizes skewed barcodes in
captured frames by computing dominant orientations of
gradients (DOGs) of image segments and collecting smaller
segments with similar dominant gradient orientations into
larger connected components. In Fig. 1, the output of the DOG
localization algorithm is shown as a white rectangle around
the skewed barcode.
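To illustrate the idea of a dominant orientation of gradients, the following Python sketch estimates the dominant gradient orientation of a single grayscale image segment with a magnitude-weighted orientation histogram. It is a simplified illustration, not the DOG implementation itself: the function name, the central-difference gradient, the 18-bin histogram, and the magnitude threshold are all assumptions made for this example.

```python
import math


def dominant_orientation(patch, bins=18, min_magnitude=1e-6):
    """Estimate the dominant gradient orientation of a grayscale patch.

    `patch` is a list of rows of luminosity values. The result is the
    centre (in degrees, within [0, 180)) of the orientation bin that
    collected the most gradient magnitude, or None for a flat patch.
    All parameter names and defaults are hypothetical.
    """
    height, width = len(patch), len(patch[0])
    bin_width = 180.0 / bins
    histogram = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            magnitude = math.hypot(gx, gy)
            if magnitude < min_magnitude:
                continue
            # Fold opposite directions together: a bar edge has the same
            # orientation whether it goes dark-to-light or light-to-dark.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            histogram[int(angle / bin_width) % bins] += magnitude
    if max(histogram) == 0.0:
        return None
    best = histogram.index(max(histogram))
    return (best + 0.5) * bin_width
```

The gradient orientation is perpendicular to the bars themselves. Under this sketch, neighboring segments whose dominant orientations agree within a tolerance would be the candidates for merging into a larger connected component.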
Fig. 2 shows the control flow of our algorithm for skewed
barcode scanning. The algorithm takes as input an image
captured from the smartphone camera’s video stream. This
image is given to the DOG algorithm. If the barcode is not
localized, another frame is grabbed from the video stream. If
the DOG algorithm localizes a barcode, as shown in Fig. 1, the
coordinates of the detected region are passed to the line grower
component. The line grower component selects the center of
the localized region, which is always a rectangle, and starts
growing scanlines. For an example of how the line growing
component works, consider Fig. 3. In Fig. 3, the horizontal
and vertical white lines intersect in the center of the localized
region. The skew angle of the localized barcode computed by
the DOG algorithm is 120 degrees.
The line that passes through the localized region’s center at the
skew angle detected by the DOG algorithm is referred to as
the skew line. After the center of the region and the skew angle
are determined, the line growing component begins to grow
scanlines that are orthogonal to the skew line. In Fig. 3, the
skew line is drawn as the black line that passes through the region’s
center at 120 degrees. A scanline is grown on both sides of the
skew line. In Fig. 3, the upper half of the scanline is shown as
a red arrow and the lower half of the scanline is shown as a
blue arrow. Each half is extended until it reaches the portion
of the image where the barcode lines are no longer detectable.
A five pixel buffer region is subsequently added after the
scanline’s end to improve subsequent scanning.
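The growth of a single scanline can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the system: the text does not specify how the line grower decides that barcode lines are no longer detectable, so that test is passed in as a hypothetical predicate `still_in_barcode(x, y)`, and the function name, the `max_steps` guard, and the rounding of coordinates are likewise assumptions.

```python
import math


def grow_scanline(origin, skew_deg, still_in_barcode, buffer_px=5, max_steps=2000):
    """Return the pixel coordinates of one scanline through `origin`.

    The scanline is orthogonal to the skew line and is grown in both
    directions until `still_in_barcode(x, y)` (a hypothetical predicate
    supplied by the caller) returns False. `buffer_px` extra pixels are
    then appended to each half. Coordinates are not clipped to the image
    bounds; the caller is assumed to do that.
    """
    # The scanline direction is the skew direction rotated by 90 degrees.
    theta = math.radians(skew_deg + 90.0)
    dx, dy = math.cos(theta), math.sin(theta)

    def pixel(sign, step):
        return (int(round(origin[0] + sign * step * dx)),
                int(round(origin[1] + sign * step * dy)))

    def grow_half(sign):
        points = []
        step = 1
        while step <= max_steps:
            x, y = pixel(sign, step)
            if not still_in_barcode(x, y):
                break
            points.append((x, y))
            step += 1
        # Buffer region added after the end of the grown half.
        for extra in range(buffer_px):
            points.append(pixel(sign, step + extra))
        return points

    upper = grow_half(-1)
    lower = grow_half(+1)
    # Order the pixels from the far end of one half to the far end of the other.
    return upper[::-1] + [tuple(origin)] + lower
```

Reading the luminosity value at each returned coordinate yields the array that is later analyzed for line widths. Growing several scanlines would amount to calling this function with origins shifted along the skew line.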
Figure 1. Localization of a skewed barcode
The number of scanlines grown on both sides of the skew
line is an adjustable parameter. In the current implementation
of the algorithm, the value of this parameter is set to 10. The
scanlines are arrays of luminosity values for each pixel in their
growth path. For each grown scanline, the Line Widths (LW)
for the barcode are then computed by finding two points that
are on the intensity curve but lie on the opposite sides of the
mean intensity. By modelling the curve between these points
as a straight line, we obtain the intersection point between the
intensity curve and the mean intensity. The distance between
consecutive intersection points gives the width of one line.
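The line-width computation can be sketched in Python as follows. The function names are hypothetical, and treating a sample that equals the mean exactly as lying on the bright side is an assumption of this sketch.

```python
def mean_crossings(luminosity):
    """Return (mean, crossings) for one scanline.

    Each crossing is the sub-pixel position at which the straight line
    between two neighboring samples on opposite sides of the mean
    intersects the mean intensity.
    """
    if not luminosity:
        return 0.0, []
    mean = sum(luminosity) / len(luminosity)
    crossings = []
    for i in range(len(luminosity) - 1):
        a = luminosity[i] - mean
        b = luminosity[i + 1] - mean
        # Samples equal to the mean count as bright (assumption).
        if (a < 0) != (b < 0):
            # Linear interpolation: a + t * (b - a) = 0  =>  t = a / (a - b)
            crossings.append(i + a / (a - b))
    return mean, crossings


def line_widths(luminosity):
    """Widths of the bars and spaces between consecutive mean crossings."""
    _, crossings = mean_crossings(luminosity)
    return [right - left for left, right in zip(crossings, crossings[1:])]
```

For the scanline `[200, 200, 0, 0, 0, 200, 200]` the mean is 800/7, the two crossings fall at 1 + 3/7 and 4 + 4/7, and the single dark bar therefore has an estimated width of 22/7 pixels rather than the integer 3.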
Figure 2. Skewed barcode scanning algorithm
Line Colors (LC) are classified to be either black or white
based on whether the pixel intensity is less than or greater than
the mean intensity of the scanline. Once the LW and LC are
known, each scanline is decoded using the standard EAN
decoding scheme. Since UPC is a subset of EAN, this
scanning algorithm can decode both EAN and UPC
barcodes. The number of scanlines is currently not dynamically
adjusted. In other words, ten scanlines are grown and then
each of them is passed to the barcode scanner. As soon as a
successful scan is obtained, the remaining scanlines, if there
are any left, are not scanned. If no scanline results in a
successful scan, control returns to capturing frames
from the video stream.
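The first-success control flow can be sketched in Python as follows. The sketch assumes a hypothetical `decode` callable that returns a decoded digit string or `None`; the actual decoder is not reproduced here. The EAN-13 check-digit rule shown is the standard one, and it also illustrates why a UPC-A code is handled by the same scheme: a 12-digit UPC-A code is an EAN-13 code with a leading zero.

```python
def ean13_check_digit(first_twelve):
    """Standard EAN-13 check digit for a string of 12 digits."""
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(first_twelve))
    return (10 - total % 10) % 10


def is_valid_ean13(code):
    """True if `code` is a 13-digit string with a correct check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    return ean13_check_digit(code[:12]) == int(code[12])


def scan_first_success(scanlines, decode):
    """Try each scanline in order and stop at the first successful decode.

    `decode` is a hypothetical callable: scanline -> digit string or None.
    Returns (index, code) on success, or None so that the caller can go
    back to grabbing frames from the video stream.
    """
    for index, scanline in enumerate(scanlines):
        code = decode(scanline)
        if code is not None and is_valid_ean13(code):
            return index, code
    return None
```

For example, the UPC-A code 036000291452 passes the check as the EAN-13 string `0036000291452`.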
Figure 3. Growth of a scanline orthogonal to the skewed
angle of a barcode
Figure 4. Sequence of images that demonstrates how the
Fig. 4 shows a sequence of images that gives a visual
demonstration of how the algorithm processes a captured
frame. The top image in Fig. 4 is a frame captured from the
smartphone camera’s video stream. The second image in Fig.
4 shows the result of the clustering stage of the DOG
algorithm that clusters small subimages with similar dominant
gradient orientations. The third image in Fig. 4 shows a white
rectangle that denotes the localized barcode. The fourth image
in Fig. 4 shows the ten scanlines, one of which results in a
successful skewed barcode scan.
IV. Linux Cluster for Image Processing and Data Storage
We built a Linux cluster out of four Dell computers for cloud-
based computer vision and data storage. Each computer has an
Intel Core i5-650 3.2 GHz dual-core processor that supports
64-bit computing. The processors have 3MB of cache
memory. The machines are equipped with 6GB DDR3
SDRAM and have Intel integrated GMA 4500 Dynamic Video
Memory Technology 5.0. All machines have 320 GB of hard
disk space. Ubuntu 12.04 LTS was installed on each machine.
We used JBoss (http://www.jboss.org) to build and
configure the cluster and the Apache mod_cluster module
(http://www.jboss.org/mod_cluster) to configure the cluster
for load balancing. Our cluster has one master node and three
slaves. The master node is the domain controller. The master
node also runs mod_cluster and httpd. All four machines are
part of a local area network and have high-speed Internet
connectivity. We have installed JDK 7 in each node.
The JBoss Application Server (JBoss AS) is a free open-
source Java EE-based application server. In addition to
providing a full implementation of a Java application server, it
also implements the Java EE part of Java. The JBoss AS is
maintained by jboss.org, a community that provides free
support for the server. JBoss is licensed under the GNU
Lesser General Public License (LGPL).
The Apache mod_cluster module is an httpd-based load
balancer. The module is implemented with httpd as a set of
modules for httpd with mod_proxy enabled. This module uses
a communication channel to send requests from httpd to a set
of designated application server nodes. An additional
communication channel is established between the server
nodes and httpd. The nodes use the additional channel to
transmit server-side load balance factors and lifecycle events
back to httpd via a custom set of HTTP methods collectively
referred to as the Mod-Cluster Management Protocol (MCMP).
The mod_cluster module provides dynamic configuration
of httpd workers. The proxy's configuration is on the
application servers. The application server sends lifecycle
events to the proxies, which enables the proxies to auto-
configure themselves. The mod_cluster module provides
accurate load metrics, because the load balance factors are
calculated by the application servers, not the proxies.
All nodes in our cluster run JBoss AS 7.1.1. Apache httpd
runs on the master with the mod_cluster-1.2.0 module
enabled. The JBoss AS 7.1.1 instances on the master and the
slaves are discovered by httpd.
A Java servlet for image recognition is deployed on the
master node as a web archive file. The servlet’s URL is
hardcoded in every front end smartphone. The servlet receives
images uploaded with HTTP POST requests, recognizes barcodes,
and sends an HTML response back to front end smartphones.
No data caching is currently done on the servlet or the front end.
The skewed barcode scanning experiments were conducted on
a set of 506 video recordings of common grocery products that
we have made publicly available [15]. The videos have a
1280x720 resolution, an average duration of 15 seconds and
were recorded on an Android 4.2.2 Galaxy Nexus smartphone
in a supermarket in Logan, UT. All videos were taken by an
operator who held a grocery product in one hand and a
smartphone in the other. The videos covered four different
categories of products: bags, boxes, bottles, and cans.
Colored RGB frames were extracted from each video at
the rate of one frame per second and grouped together into
different categories of products. Each frame was automatically
classified as blurred or sharp by the blur detection scheme
using Haar wavelet transforms [16, 17] that we implemented
in Python. Each frame was manually classified as having a
barcode or not.
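As a rough illustration of how Haar wavelet coefficients relate to blur, the following Python sketch computes the mean absolute value of the one-level Haar detail coefficients over 2x2 pixel blocks and compares it with a threshold. This is a deliberately simplified proxy, not the scheme of [16, 17] that was used in the experiments; the function names and the threshold value of 4.0 are assumptions made for this example.

```python
def haar_detail_energy(gray):
    """Mean absolute one-level Haar detail response of a grayscale image.

    `gray` is a list of rows of luminosity values. Each 2x2 block
    contributes its horizontal, vertical, and diagonal detail
    coefficients; sharp edges produce large values, smooth regions
    produce values near zero.
    """
    height = len(gray) - len(gray) % 2
    width = (len(gray[0]) - len(gray[0]) % 2) if gray else 0
    total, count = 0.0, 0
    for y in range(0, height, 2):
        for x in range(0, width, 2):
            a, b = gray[y][x], gray[y][x + 1]
            c, d = gray[y + 1][x], gray[y + 1][x + 1]
            horizontal = (a - b + c - d) / 4.0  # change across columns
            vertical = (a + b - c - d) / 4.0    # change across rows
            diagonal = (a - b - c + d) / 4.0
            total += abs(horizontal) + abs(vertical) + abs(diagonal)
            count += 1
    return total / count if count else 0.0


def is_blurred(gray, threshold=4.0):
    """Classify a frame as blurred when its detail energy is low.

    The threshold is a hypothetical value that would need tuning on
    labeled frames.
    """
    return haar_detail_energy(gray) < threshold
```

A frame of one-pixel-wide black and white stripes has a detail energy of 127.5 under this measure, whereas a smooth one-level-per-pixel ramp scores only 0.5 and would be classified as blurred.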
Figure 5. Average request-response times in milliseconds
After all classifications were completed (blurred vs. sharp,
barcode vs. no barcode), the classified frames were stored in
the smartphone's SD card. The evaluation procedure was
implemented as follows. A started Android service took
one frame at a time and uploaded it to the Linux cluster via an
HTTP POST request over the USU Wi-Fi network. The network
has a download speed of 72.31 Mbps and an upload speed of
Each captured frame was processed on the cluster as
follows. The DOG localization algorithm [7] was executed
and, if a barcode was successfully localized, the barcode was
scanned within the localized region in place with ten
scanlines. The detection result was sent back to the
smartphone and recorded on the smartphone’s SD card. The
average request-response time for each session was calculated.
Fig. 5 gives the graph of the node cluster’s request-response
times. The lowest average was 712 milliseconds; the highest
average was 1813 milliseconds.
Figure 6. Skewed barcode scanning in blurred and sharp
Fig. 6 shows the performance of the skewed barcode
scanning algorithm on blurred and sharp images for each of
the four product categories. As mentioned above, the
blurriness coefficient was computed automatically through the
Haar transform .
Figure 7. Average barcode scan times for each product
category in a local supermarket in seconds
We also conducted skewed barcode scanning experiments
in a local supermarket. A sighted user was given a Galaxy
Nexus 4 smartphone with an AT&T 4G connection. Our front
end application was installed on the smartphone. The user was
asked to scan ten products of his choice in each of the four
categories: box, can, bottle, and bag. The user was told that he
could choose which products to scan. A research assistant
accompanied the user and recorded the scan times for each
product. Each scan time started from the moment the user
began scanning and ended when the response was received
from the server. Fig. 7 shows the average times in seconds
for each category.
Images that contained barcodes for all four product categories
had no false positives. As shown in Fig. 6, for each product
category, the sharp images had a significantly better true
positive percentage than the blurred images. A comparison of
the bar charts in Fig. 6 reveals that the true positive percentage
of the sharp images is more than double that of the blurry
ones. Images without any barcode for all categories produced
100% accurate results with all true negatives, irrespective of
the blurriness. In other words, the algorithm is highly specific
in that it does not detect barcodes in images that do not
contain them.
Another observation on Fig. 6 is that the algorithm showed
its best performance on boxes. The algorithm’s performance
on bags, bottles, and cans was worse because of crumpled,
curved, or shiny surfaces. These surfaces caused many light
reflections and hindered performance of the skewed barcode
localization and scanning. The percentages of the skewed
barcode localization and scanning were better on boxes due to
smoother surfaces. Quite expectedly, the sharpness of a frame
also makes a positive difference in that the algorithm
performed much better on sharp images in each product
category. Specifically, on sharp images, the algorithm
performed best on boxes with a true positive percentage of
54%, followed by bags at 44%, bottles at 32%, and cans at
As Fig. 7 shows, the average scan time was lowest for
boxes and highest for cans. This finding is in line with the
results in Fig. 6. The average scan times for cans and bags
were significantly longer than for boxes. The average scan
time for bottles was longer than for boxes but shorter than for
cans and bags. However, the explanation is not limited to
crumpled, curved, and shiny surfaces. Another reason for the
slower scan times on individual products in each product
category is the availability of Internet connectivity at various
locations in the supermarket. During our scan experiments in
the supermarket, we noticed that in some areas of the
supermarket there was no Internet connection, which
caused delays in barcode scanning. For several products, a 10-
or 15-step change in location within a supermarket resulted in
a successful barcode scan.
The skewed barcode scanning algorithm presented in this
paper targets medium- to high-end mobile devices with single
or quad-core ARM systems. Since cameras on these devices
capture several frames per second, the algorithm is designed to
minimize false positives rather than maximize true ones,
because, at such frequent frame capture rates, it is far more
important to minimize the processing time per frame. In other
words, the algorithm is highly specific, where specificity is the
percentage of true negative matches out of all possible
Figures 8, 9, 10, and 11 demonstrate the specificity of the
algorithm on all four categories of products. In all product
categories, the true negative and false positive percentages are
0, which means that the algorithm correctly refrains from
recognizing barcodes in images that do not contain them. In all
categories, the false negative percentages are relatively high.
This is done by design. The algorithm is designed to be
conservative in that it rejects a frame on the slightest
chance that it does not contain a barcode. While this
increases false negatives, it keeps both true negatives and false
positives close to zero.
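For reference, specificity and sensitivity can be computed from frame counts as in the following Python sketch. These are the standard definitions; the function names are hypothetical, and the counts in the usage note are illustrative only and are not taken from the experiments.

```python
def specificity(true_negatives, false_positives):
    """Fraction of frames without a barcode that were correctly rejected.

    Returns None when there are no barcode-free frames to evaluate.
    """
    negatives = true_negatives + false_positives
    return true_negatives / negatives if negatives else None


def sensitivity(true_positives, false_negatives):
    """Fraction of frames with a barcode that were successfully scanned.

    Returns None when there are no barcode frames to evaluate.
    """
    positives = true_positives + false_negatives
    return true_positives / positives if positives else None
```

With illustrative counts of 50 barcode-free frames that were all rejected, `specificity(50, 0)` is 1.0. With 54 successful scans out of 100 barcode frames, `sensitivity(54, 46)` is 0.54. A conservative scanner trades a lower sensitivity for a specificity close to 1.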
Figure 8. Performance on sharp box images
Figure 9. Performance on sharp can images
Figure 10. Performance on sharp bag images
A limitation of the current implementation of this
algorithm, discovered during our field experiments
in a supermarket, is that there is no run-time checking for the
availability of Internet connectivity. If such checking is
implemented, the user can be quickly notified that there is no
Internet connectivity and the frame grabbing from the video
stream can be temporarily halted.
Figure 11. Performance on sharp bottle images
Another limitation of the current implementation is that it
does not compute the blurriness of the captured frame before
sending it to the cluster for barcode localization and detection.
As shown in Fig. 6, the scanning results are substantially
higher on sharp images than on blurred images. This limitation
points to a potential improvement that we plan to implement
in the future. When a frame is captured, its blurriness
coefficient can be computed on the smartphone and, if it is
high, the frame should not even be sent to the cluster. This
improvement will also reduce the load on the cluster and may
shorten response times.
Another approach to handling blurred inputs is to improve
camera focus and stability, both of which are outside the scope
of this algorithm, because they are, technically speaking,
hardware problems. Both are likely to be better in later models of
smartphones. The current implementation on the Android
platform attempts to force focus at the image center but this
ability to request camera focus is not present in older Android
versions. Over time, as device cameras improve and more
devices run newer versions of Android, this limitation will
have less impact on recall but it will never be fixed entirely.
Finally, we would like to implement a tighter integration
of the current system with proactive nutrition management.
Specifically, we plan to integrate skewed barcode scanning
with a wireless glucometer so that nutrition intake recording
can be coupled with glucometer readings. This will allow
users to actively monitor the impact that various foods have on
the level of glucose in their blood.
[1] Anding, R. Nutrition Made Clear. The Great Courses,
Chantilly, VA, 2009.
[2] Berreby, D. “The obesity era.” The Aeon Magazine. June
[3] Rubin, A. L. Diabetes for Dummies. 3rd
Publishing, Inc. Hoboken, New Jersey, 2008.
[4] Årsand, E., Tatara, N., Østengen, G., and Hartvigsen, G.
“Mobile phone-based self-management
tools for type 2 diabetes: the few touch application.”
Journal of Diabetes Science and Technology, Vol. 4,
Issue 2, March 2010.
[5] Graham, D. J., Orquin, J. L., and Visschers, V. H. M. “Eye
tracking and nutritional label use: a review of the
literature and recommendations for label enhancement.”
Food Policy, vol. 32, pp. 378-382, 2012.
[6] Kulyukin, V., Kutiyanawala, A., and Zaman, T. "Eyes-
Free Barcode Detection on Smartphones with Niblack's
Binarization and Support Vector Machines." In
Proceedings of the 16-th International Conference on
Image Processing, Computer Vision, and Pattern
Recognition (IPCV 2012), Vol. I, pp. 284-290, CSREA
Press, July 16-19, 2012, Las Vegas, Nevada, USA. ISBN:
[7] Kulyukin, V. and Zaman, T. "Vision-Based Localization
of Skewed UPC Barcodes on Smartphones." In
Proceedings of the International Conference on Image
Processing, Computer Vision, & Pattern Recognition
(IPCV 2013), pp. 344-350, pp. 314-320, ISBN 1-60132-
252-6, CSREA Press, Las Vegas, NV, USA.
[8] Tekin, E. and Coughlan, J. “A mobile phone application
enabling visually impaired users to find and read product
barcodes.” In Proceedings of the 12th International
Conference on Computers Helping People with Special
Needs (ICCHP'10), Klaus Miesenberger, Joachim Klaus,
Wolfgang Zagler, and Arthur Karshmer (Eds.). Springer-
Verlag, Berlin, Heidelberg, pp. 290-295, 2010.
[9] Wachenfeld, S., Terlunen, S., and Jiang, X. "Robust
recognition of 1-D barcodes using camera phones." In
Proceedings of the 19th International Conference on
Pattern Recognition (ICPR 2008), pp. 1-4, Dec. 8-11,
2008. ISSN: 1051-4651, IEEE.
[10] Adelmann, R., Langheinrich, M., and Floerkemeier, C. A
Toolkit for BarCode Recognition and Resolving on
Camera Phones - Jump Starting the Internet of Things.
Workshop on Mobile and Embedded Information Systems
(MEIS’06) at Informatik 2006, Dresden, Germany, Oct
[11] Gallo, O. and Manduchi, R. "Reading 1D barcodes with
mobile phones using deformable templates." IEEE
Transactions on Pattern Analysis and Machine
Intelligence, Vol. 33, no. 9, pp. 1834-1843, Sept. 2011.
[12] Lin, D. T., Lin, M. C., and Huang, K. Y. “Real-time
automatic recognition of omnidirectional multiple
barcodes and DSP implementation.” Mach. Vision Appl.
Vol. 22, no. 2, pp. 409-419, March 2011.
[13] ZXing, http://code.google.com/p/zxing/.
[14] Fogg, B.J. "A behavior model for persuasive design." In
Proc. 4th International Conference on Persuasive
Technology, Article 40, ACM, New York, USA, 2009.
[15] Mobile Supermarket Barcode Videos.
[16] Tong, H., Li, M., Zhang, H., and Zhang, C. "Blur
detection for digital images using wavelet transform." In
Proceedings of the IEEE International Conference on
Multimedia and Expo, Vol. 1, pp. 27-30, June 2004.
[17] Nievergelt, Y. Wavelets Made Easy. Birkhäuser, Boston,