Identifying shapes with the Digital Foam surface
by
Barry James Smalldon
A thesis submitted for the degree of
Bachelor of Information Technology (Honours)
Supervisor: Dr Ross T. Smith
Wearable Computer Lab
School of Computer and Information Science
May 2014
Abstract
Digital Foam is a new computer input device that gives developers of prototyping, wearable or other innovative computer systems the opportunity to interface with a computer in a novel fashion. The input device can be deformed, moulded or pressed, with the resultant physical shape captured by a computer.
This dissertation presents a shape recognition process for objects pushed into the Foam sensor. A sample set of plastic shapes was created and used with a small, square-array prototype of Digital Foam to test this work. The push, or press, of an object into the foam sensor was converted to an image for the purposes of shape recognition. Using these images to create a library of shapes, and then using this library to identify other shapes, is the main body of my study. Initially this was realised with simple image comparison processes, but for a more flexible application, Machine Learning techniques are also discussed. Suitable off-the-shelf Machine Learning methods were analysed for this application, and one type of machine learning process was adapted and applied in software. The realisation of the thesis was the design and implementation of image comparison software in C#.
The first outcome of the thesis was the application of a shape identification algorithm, by way of image comparisons, to capture a sample shape, convert it to a grey-scale image and recognise it from a library.
The second outcome was the implementation of Machine Learning. This involved not only recognising known shapes successfully, but also recognising an unknown shape, adding that shape to the library (expanding shape knowledge) and then successfully identifying the shape henceforth. The OpenCV machine learning library was found to be a suitable and easily adaptable machine learning system, particularly as it already includes many algorithms for image recognition, and many examples exist of OpenCV being used for object detection.
The two variants of learning and expanding shape knowledge were measured. The image template comparator showed correct recognition when two shapes were similar, as expected, but when shape orientation or placement on the Foam Sensor differed somewhat, it failed to identify the shape correctly.
The Machine Learning algorithm showed mixed results. Evaluations showed average image comparison matches when orientation and alignment to the Foam Sensor were identical to the library image, but interestingly some image recognition did occur when orientation or alignment differed between sample and library image. These results suggest that both simple comparisons and Machine Learning processes have strengths when performing image comparisons. More importantly, my dissertation shows that shape identification with Digital Foam by image comparison is entirely possible.
Contents
1 Introduction.................................................................................................................................10
2 Research Question ......................................................................................................................16
3 Literature Review .......................................................................................................................17
3.1 Tangible User Interfaces .....................................................................................................17
3.2 Organic User Interfaces.......................................................................................................19
3.3 Deformable User Interfaces.................................................................................................21
3.4 Digital Foam........................................................................................................................22
3.5 Machine Learning................................................................................................................23
3.6 Machine Learning with OpenCV ........................................................................................27
4 Research Design .........................................................................................................................28
4.1 Digital Foam Hardware Overview and Apparatus..............................................................28
4.2 Shape Selection ...................................................................................................................30
4.3 Design of Simple Image Comparator..................................................................................32
4.4 Design and Theory of Machine Learning Algorithm..........................................................34
4.4.1 Initial Process - the Creation of a Library of Knowledge............................................35
4.4.2 Application of Suitable Machine Learning Algorithm ................................................36
4.4.3 Application of K-Nearest Neighbours to Image Recognition......................................40
4.5 Selection of Software and Framework................................................................................41
5 Implementation...........................................................................................................................42
5.1 Simple Image Comparator...................................................................................................46
5.1.1 Evaluation ....................................................................................................................46
5.1.2 Results..........................................................................................................................47
5.1.3 Discussion....................................................................................................................48
5.2 Machine Learning Image Comparator ................................................................................50
5.2.1 Evaluation ....................................................................................................................51
5.2.2 Results..........................................................................................................................51
5.2.3 Discussion....................................................................................................................52
6 Conclusion..................................................................................................................................53
7 References...................................................................................................................................54
8 Appendix.....................................................................................................................................60
8.1 Software Requirements .......................................................................................................60
8.2 Software Installation............................................................................................................61
8.2.1 .NET Framework on a Windows 7 Computer .............................................................61
8.2.2 OpenCV 2.4.6 or Later.................................................................................................62
8.2.3 Arduino UNO USB Driver for Windows ....................................................................62
8.2.4 The Digital Foam Reader Application.........................................................................63
8.3 Method of Operation ...........................................................................................................64
8.3.1 Run SerialFoam.exe....................................................................................................64
8.3.2 Connect to the Digital Foam Sensor via USB port......................................................64
8.3.3 Load or Create Image Library......................................................................................65
8.3.4 Sampling a New Shape for Recognition Purposes.......................................................68
8.3.5 Compare the Push Sample to the Image Library .........................................................68
8.3.6 The Image Comparison Results...................................................................................70
8.4 DVD Contents.....................................................................................................................71
List of Figures
Figure 1: (a) A tablet computer touchscreen and (b) an Android Phone touchscreen in use.............10
Figure 2: Lumen - an interactive display (permission requested)......................................................11
Figure 3: Direct manipulation and Gestural interaction with Recompose (Image courtesy of M.
Blackshaw).........................................................................................................................................12
Figure 4: An example of Digital Foam..............................................................................................12
Figure 5: (a) Planar and (b) Hemispherical Digital Foam. ................................................................13
Figure 6: (a) and (b) Spherical Digital Foam.....................................................................................13
Figure 7: Digital Foam being used to sculpt a 3D model like clay....................................................14
Figure 8: The FANUC Robot M-410iB Palletizing Industrial Robot.1 .............................................15
Figure 9: A user interacting with the Sandscape TUI (Image courtesy of H. Ishii). .........................18
Figure 10: The Illuminating Clay TUI – The user is directly forming/sculpting the geographic
landscape (Image courtesy of H. Ishii). .............................................................................................18
Figure 11: The ReacTable TUI – Creates sounds by the location and proximity of phicons on the
surface (permission requested)...........................................................................................................19
Figure 12: The Nokia Kinetic Prototype Mobile Phone (Image courtesy of J. Kildal). ....................20
Figure 13: The Gummi bendable handheld computer (permission requested)..................................20
Figure 14: Murakami and Nakajima’s concept of a 3D Shape Deformation device (Image courtesy
of T. Murakami).................................................................................................................................21
Figure 15: The conductive sensors of Digital Foam..........................................................................22
Figure 16: Demonstration of Digital Foam........................................................................................23
Figure 17: The basic Machine Learning process.3 .............................................................24
Figure 18: Machine Learning Algorithm classes (image adapted from
http://bitsearch.blogspot.com.au/2011/02/supervised-unsupervised-and-semi.html)........................25
Figure 19: Pushing a shape into Digital Foam and recognising the shape by Digital Foam Reader.28
Figure 20: (a) Digital Foam Prototype - constructed as an Arduino Shield, and (b) the 4 x 4 Digital
Foam apparatus used in this application. ...........................................................................................28
Figure 21: Each Conductive Foam Tube produces a variable output according to depth of
deformation........................................................................................................................................29
Figure 22: The Output from each Conductive Foam Tube is converted to a Grey Scale value by
Digital Foam Reader, according to depth of push, (a) is an ideal scale, (b) is the actual scale used. 29
Figure 23: (a) 'L' Shape, (b) 'T' Shape, (c) ‘Sphere’ and (d) ‘Flat Cylinder’ Test Shape designs. ....30
Figure 24: (a) - ‘L’ Shape, (b) - ‘T’ Shape, (c) - ‘Sphere’ and (d) - ‘Flat Cylinder’ plastic samples.31
Figure 25: The Image Library of the four sample shapes from Figure 24.........................................32
Figure 26: The Push Sample Image that will be compared to the Image Library. ............................32
Figure 27: The Image Comparison process .......................................................................................33
Figure 28: The Library Sample chosen to be the correct image. .......................................................33
Figure 29: Flowchart of the Shape Recognition Software.................................................................34
Figure 30: (c) The resultant image created by Digital Foam Reader.................................................35
Figure 31: (a) 8 Images in the image library and (b) a 15 image library stored by Digital Foam
Reader. ...............................................................................................................................................36
Figure 32: Diagram of an Artificial Neural Network (ANN).5 ..........................................39
Figure 33: Microsoft Visual Studio early Development of Digital Foam Reader.............................42
Figure 34: Digital Foam Reader application showing a sample push image.....................................43
Figure 35: (a) - the ‘L’ Shape test object, (b) - the sample being pushed into foam and, (c) - the
output grey-scale image capture of this push sample. .......................................................................44
Figure 36: (a) - the ‘T’ Shape test object, (b) - the sample being pushed into foam and, (c) - the
output grey-scale image capture of this push sample. .......................................................................44
Figure 37: (a) - the ‘Sphere’ Shape test object, (b) - the sample being pushed into foam and, (c) - the
output grey-scale image capture of this push sample. .......................................................................44
Figure 38: (a) - the ‘Flat Cylinder’ Shape test object, (b) - the sample being pushed into foam and,
(c) - the output grey-scale image capture of this push sample...........................................................45
Figure 39: Steps to Train a learning machine. ...................................................................................45
Figure 40: (a) The Push Sample Image, (b) Closest matching Library Image and (c) Result of the
Comparator process............................................................................................................................46
Figure 41: (a) The four Test Shapes and (b) The four Unknown Shapes to be compared in this
evaluation...........................................................................................................................................47
Figure 42: (a) a soft Push Sample, (b) a suitable Push Sample and (c) a Push Sample with excessive
depth...................................................................................................................................................49
Figure 43 Poor Shape orientation to Foam Sensor ............................................................................49
Figure 44: Image match accuracy from (a) poor, to (b) average, to (c) good and (d) excellent........50
Figure 45: The 4x4 Digital Foam Sensor used in this Dissertation. ..................................................60
Figure 46: Microsoft .NET Framework Version 4.5.1 is installed on this computer. .......................61
Figure 47: The Arduino Uno is connected to Serial Port COM14.....................................................62
Figure 48: A typical Digital Foam Reader PC environment.............................................................63
Figure 49: Digital Foam Reader application in use...........................................................................64
Figure 50: Serial Port textbox and Connect to Digital Foam Button (com Port may vary)..............64
Figure 51: A successful connection to the device will display in the received data box..................65
Figure 52: Load a previously saved image library with this Button.................................................65
Figure 53: Loading a saved image library ........................................................................................65
Figure 54: (a) a four image library, (b) an eight image library and (c) a fifteen image library. ........66
Figure 55: A list of all commands Digital Foam accepts..................................................................66
Figure 56: (a) The command that is sent to Digital Foam with Send Command Button (b).............66
Figure 57: Numerical Data received from Digital Foam; 8 push samples are displayed. ................67
Figure 58: Grey Count. The sum of grey scale data in the image......................................................67
Figure 59: A push sample image........................................................................................................67
Figure 60: Save Sampled Shape button .............................................................................................68
Figure 61: The two types of comparison processes ...........................................................................68
Figure 62: Results of the Comparator processes................................................................................69
Figure 63: Machine Learning comparison considers this an excellent image match. .......................69
Figure 64: Machine Learning comparison showing a poor image match..........................................69
Figure 65: Compare a single image from any library of shapes to the current Sampled Shape........70
Figure 66: Results from both image comparison methods as displayed on Digital Foam Reader....70
Figure 67: Contents of DVD..............................................................................................................71
Figure 68: The SerialFoam Application.............................................................................................71
List of Tables
Table 5-1 Test results for the Comparator Evaluation.......................................................................48
Table 5-2: Test results for the Machine Learning Comparator Evaluation. ......................................51
1 Introduction
Computer input devices support the interface between human and computer. This field of study has
evolved (along with the evolution in computing power, design technology and materials) into what
is known today as Human Computer Interaction (HCI). The term HCI has been in widespread use
since the early 1980s (Dix et al. 2004). An exciting and ongoing area of research in the HCI field is
exploring new devices that allow humans to interact with computer systems, beyond the ubiquitous
mouse (Engelbart & English 1968), the keyboard and the Graphic User Interface (GUI) (Myers
1998).
New improvements in technology and materials (e.g. touch sensitive surfaces) have given rise to
new ways of interacting with a computer or electronic device. Recently, developments in touch
screen technology have made tablet PCs (Figure 1(a)) and smartphones (Figure 1(b)) immensely
popular. New terms have been introduced - such as Tangible Interfaces (Ishii & Ullmer 1997),
Organic Interfaces (Vertegaal & Poupyrev 2008) and even more recently Deformable User
Interfaces (Kildal 2012).
Figure 1: (a) A tablet computer touchscreen and (b) an Android Phone touchscreen in use.
These new approaches to computer interaction represent a paradigm shift in how the sense of touch
and feel are incorporated into the GUI. Expanding from solid touch sensitive surfaces, new
innovations are supporting interactions in three dimensions; now it is possible to interact with
computers using non-rigid, malleable, pliable or deformable surfaces. We push, pull, twist, turn,
shape things and mould surfaces naturally, and these actions can be directly integrated into a
computer system.
One area of interest to this dissertation is haptic interfaces. A haptic interface is a feedback device
that generates sensation to the skin and muscles, including a sense of touch, weight and rigidity
(Iwata et al. 2001). There are several examples of research in this field. For instance, Poupyrev et al.
(2004) developed Lumen (Figure 2). Lumen is an interactive display that presents visual images and
physical, moving shapes, both controlled independently. The smooth, organic physical motions
provide aesthetically pleasing, calm displays for ambient computing environments. Users interact
with Lumen directly, forming shapes and images with their hands.
Figure 2: Lumen - an interactive display (permission requested).
Another related area is shape displays. For example, Blackshaw et al. (2011) developed Recompose
(Figure 3), a new system for manipulation of an actuated surface:
Recompose is a framework allowing direct and gestural manipulation of the physical
environment. Recompose complements the highly precise, yet concentrated affordance of direct
manipulation with a set of gestures allowing functional manipulation of an actuated surface
(Blackshaw et al. 2011).
An important note mentioned by this paper is that direct manipulation of an actuated surface allows
us to precisely affect the material world, where the user is guided throughout the interaction by
natural haptic feedback (Blackshaw et al. 2011). This is relevant to my research: our ability to express ourselves with our hands, using direct manipulation in more than two dimensions, provides excellent feedback to the user and a pleasing 'hands-on' feeling of interaction.
Figure 3: Direct manipulation and Gestural interaction with Recompose (Image courtesy of M. Blackshaw).
Ishii (2008) describes how, when interacting with the (2D) GUI world, users do not utilise their evolved dexterity or their skill in directly manipulating physical objects with their hands (such as building blocks or clay models). Ishii reasons that this is where the fields of Tangible, Organic and Deformable User Interfaces could provide a user with a 'seamless coupling' between the physical environment and the computer-generated environment.
Digital Foam (Figure 4) is a device which fits into the category of a Deformable User Interface
(DUI). This Foam interface is active and the interactions with it are recordable (Smith, Thomas &
Piekarski 2008a). It provides developers of prototyping, wearable or innovative computer systems
the opportunity to interface with a computer in a new and unique fashion. This is an example of a
computer interface where the user has a more natural, or non-rigid interaction with it; like
interacting with clay or sponge material.
Figure 4: An example of Digital Foam.
Digital Foam can be pushed, shaped and moulded into shapes that are recorded in 3D space with a
computer. Smith, Thomas & Piekarski (2008b) developed six methods of HCI with Digital Foam; all of these methods involved the use of Digital Foam as a standalone input device without the need for a keyboard or mouse. Some examples of Digital Foam's versatile uses include free-form digital sculpting, controlling a video camera's aspect (zoom, pan & tilt) and driving a custom-made on-screen menu (Smith, Thomas & Piekarski 2008b). Digital Foam can be a very versatile input device
as its construction can be altered to fit different form factors such as planar (Figure 5(a)),
hemispherical (Figure 5(b)) and spherical (Figure 6), or any shape that polyurethane foam can be
moulded into. In Figure 7 the Digital Foam input device is manufactured in a spherical shape – the
user holds the device with both hands like a ball of clay and pushes and shapes with it.
Figure 5: (a) Planar and (b) Hemispherical Digital Foam.
Figure 6: (a) and (b) Spherical Digital Foam.
As Digital Foam is a relatively new invention, interaction techniques and devices that incorporate
the Foam sensor in their design are being actively developed. Figure 7 displays an example of a
computer modelling interface, showing Digital Foam's versatility in interaction with a dynamic
shaping environment, where the user is sculpting a digital model using the spherical input device.
Figure 7: Digital Foam being used to sculpt a 3D model like clay.
This dissertation is interested in exploring a new use of the Digital Foam Sensor to identify
characteristics of objects that touch its surface.
Consider the scenario of covering the entire surface of a robotic arm with flexible Digital Foam,
producing a sensor that is similar to human skin, enabling a robot to inherit a sense of touch. This
will enable a robotic arm to detect physical objects or collisions (e.g. Figure 8), creating environmental awareness without the use of computer vision techniques, which is useful in environments with low or poor visibility such as factory production lines or mining spaces. If particular shapes and characteristics can be recognised, the robot could respond accordingly. That is, if a robot could tell the difference
between a human body and the chassis of a new car, the robot could change its behaviour and cease
all actions, potentially stopping a workplace accident or injury from occurring. My research into
shape identification with the Foam surface is motivated by this example.
Figure 8: The FANUC Robot M-410iB Palletizing Industrial Robot.1
Previous interaction techniques with Digital Foam did not involve any shape recognition or learning
methods. The aim of this study is to create a form of shape learning application for Digital Foam.
Included in this is some machine learning intelligence to recognise shape pushes into the Foam Sensor, giving it the ability to learn about its environment and expand the knowledge of its
surroundings over time.
1 http://www.abelwomack.com/warehouse-products/industrial-robots/fanuc-robots/m-410ib/
(Annotation to Figure 8: if the metal surrounds of this robot arm were covered in Digital Foam, any unexpected physical contact could be considered an error, causing the robot to react accordingly, without the need for vision sensors that could be hindered by environmental conditions.)
2 Research Question
Can different shaped objects be detected when pushed into the Digital Foam Sensor?
The aim of this dissertation is to develop a shape recognition system for Digital Foam. The Foam sensor consists of a deformable surface; when an area of the surface is pushed, the surrounding sensors each send a depth reading with respect to the force applied. From these readings, the aim is to identify the rounded or smooth edges of geometric shapes. Two promising methods of identification are image comparison and Machine Learning.
Image comparison template matching algorithms are a potential solution that can be adapted for
recognition of shapes. Another possible approach is to adapt existing Machine Learning methods to
identify shapes in the Foam Sensor. The use of the supervised learning process with training data is
a means by which I will explore the recognition of shapes pushed into Digital Foam.
3 Literature Review
A review of relevant literature for this dissertation is provided in this section.
Given the research question, the literature review is broken into six sections:
1. Tangible User Interfaces
2. Organic User Interfaces
3. Deformable User Interfaces
4. Digital Foam
5. Machine Learning
6. Machine Learning with OpenCV
3.1 Tangible User Interfaces
Conventionally, the computer mouse and keyboard have clear boundaries between user input and
the resultant output on a computer screen or Graphical Environment: the user moves a mouse which
in turn moves a cursor on a screen; similarly the keyboard is pressed which in turn inputs text onto a
page. Tangible User Interfaces begin to bridge the gap between virtual systems and the physical
environment by employing props to enhance user interactions, moving beyond a mere dictation or pointer-based interface. This is the goal of the Tangible Media Lab at MIT, run by
Professor Hiroshi Ishii.
Ishii and Ullmer (1997) introduced the term 'Tangible User Interfaces' (TUIs). Their view is of a
computer interface or environment that is immersive for humans, because we interact with our
natural world in more ways than just pointing and clicking on a 2D surface. TUIs are user interfaces
employing real world objects as physical interfaces to digital information (Ullmer & Ishii 1997).
This is a description of a physical tool, or prop, which is the interface. For example, a real eraser
could be used to delete virtual objects or a real pencil could be used to draw in a virtual
environment. These physical props came to be known by the term 'phicon' – a physical icon.
A major point of note about TUIs is that they are relatively specific interfaces tailored to certain
types of applications in order to increase the directness and intuitiveness of interactions (Ishii
2008a). The following are three such examples of TUIs.
The Sandscape User Interface was created by Ishii's Tangible Media Group at MIT (Ishii 2008b).
Users alter the form of the landscape model by manipulating sand while seeing the resultant effects
of computational analysis generated and projected onto the surface of the sand in real time. As shown in Figure 9, the user directly manipulates the sand, and their movements are mapped and projected onto the surface.
Figure 9: A user interacting with the Sandscape TUI (Image courtesy of H. Ishii).
Another example of a TUI is the Illuminating Clay User Interface. The topography of a clay
landscape model can be sculpted, shaped and expanded, while the changing geometry is captured in
real-time by a ceiling-mounted laser scanner (Piper, Ratti, & Ishii 2002b). Users of this TUI would typically work in the fields of Geographic Information Systems, environmental engineering and landscape design
(Ratti et al. 2004; Ishii et al. 2004; Piper, Ratti, & Ishii 2002a). Figure 10 displays a user
interacting with Illuminating Clay.
Figure 10: The Illuminating Clay TUI – The user is directly forming/sculpting the geographic landscape (Image
courtesy of H. Ishii).
The ReacTable User Interface is a novel, multi-user electro-acoustic musical instrument with a
tabletop Tangible User Interface. Several simultaneous performers share complete control over the
instrument by moving physical artefacts on the table surface while constructing different audio
topologies in a kind of tangible modular synthesizer (Kaltenbrunner et al. 2006; Jorda et al. 2007). As shown in Figure 11, the ReacTable has become a commercial product used by musicians in
studios and live performances.
Figure 11: The ReacTable TUI – Creates sounds by the location and proximity of phicons on the surface
(permission requested).
3.2 Organic User Interfaces
Holman et al. (2013) state that the art of user interface design is on the cusp of a revolutionary
change; one that will require designers to think about the effect of a material and a form on a
design. This change also includes the parts of our body we use for interaction. That is, not just the
fingers but the palms, hand, arm or even the entire body are potentially usable (Rekimoto 2008).
Additionally, there has been a recent increase in interest toward using physical motion of real
objects as a communication medium (Parkes, Poupyrev & Ishii 2008).
These two factors – using more parts of the body for interaction, and using physical motion as a medium – have combined into a term known as 'Organic User Interfaces'.
Organic User Interface (OUI) is a more recent phrase in the evolution of user interfaces; it was first suggested by Vertegaal and Poupyrev (2008). The authors chose the term Organic because of the
technologies that underpin some of the most important developments in this area, that is, organic
electronics, and also because of the inspiration provided by millions of organic shapes that can be
observed in nature; forms of amazing variety, forms that are often transformable and flexible.
Another explanation of the term ‘organic’ is suggested by Schwesig (2008); he simply puts it as an
interface that feels 'natural'. Vertegaal and Poupyrev also introduced three main design themes that
Organic Interfaces should follow:
1. Input equals output – input actions from the user can be output onto the same object
2. Function equals form – the form of an object clearly determines its ability to be used as an input
3. Form follows flow – the context in which the interaction takes place defines the action.
An excellent example of an Organic User Interface is the Kinetic Prototype Mobile Phone (Figure 12), produced by Nokia researchers and demonstrated for the first time at Nokia World 2011 at London's Excel Centre.2 Kildal, Paasovaara & Aaltonen (2012) created the device as a proof of concept; they state: 'This prototype is functional as a deformable user interface, meaning that it detects the deformation input from the user, which can be used to design and test interactions'.
Figure 12: The Nokia Kinetic Prototype Mobile Phone (Image courtesy of J. Kildal).
Another OUI example is the Gummi credit card sized computing device (Schwesig, Poupyrev &
Mori 2003) (Figure 13). The creators of Gummi describe it as:
An interaction technique and device concept based on physical deformation of a handheld
device. The device consists of several layers of flexible electronic components, including
sensors measuring deformation of the device. Users interact with this device by a combination
of bending and 2D position control (Schwesig, Poupyrev & Mori 2003).
The authors indicate how the conventional Windows, Icons, Mouse, Pointer (WIMP) interface is
not practical on smaller, mobile devices. As devices and screens become smaller, pointing and
clicking on small interface elements becomes increasingly difficult. This paper indicates that researchers were already considering the limitations of the window, icon and mouse interface, and the possibility that hands-on manipulation of a device would allow types of functionality not possible in a WIMP interface.
Figure 13: The Gummi bendable handheld computer (permission requested).
2 http://conversations.nokia.com/2011/10/28/nokia-kinetic-bendy-phone-is-the-next-big-thing-video/
3.3 Deformable User Interfaces
The term Deformable User Interface (DUI) was first suggested by Kildal (2012); he describes the
DUI as being placed in the intersection between Organic and Tangible User Interfaces. They consist
of physical objects that are intended to be grasped and manipulated with the hands in order to
interact with a system. The manipulation of a DUI results in the physical deformation of the
material the object is made of. Thus, deforming the interface elastically or plastically is the
distinctive form of input to the system when using a DUI. Such deformations are designed to give
physical form to the interaction with information. The DUI could also be considered a subset of the
Organic User Interface, as both terms involve the manipulation of raw material to convey
computational information.
The term free-form deformation (FFD) can be thought of as a method for sculpting solid models, first suggested by Sederberg & Parry (1986). Due to the limitations in materials and
manufacturing at the time, their study was an entirely software based application with no external
input devices for shaping and moulding. This study indicates that a sculpting metaphor for
geometric modelling has been a topic of interest for some time.
Further developments in transducer technology enabled Murakami & Nakajima (1994) to create an
elastic object as an input device. The interface system consists of a real elastic object as a shape
deformation input device and real-time computer graphics. By deforming the object with bare hands, with tactile feedback, users can manipulate a 3-D shape modelled and displayed on a computer
screen directly and intuitively (Murakami & Nakajima 1994) (Figure 14). This is an early example showing that the idea of manipulating a soft, compressible foam object could be useful as an interface device.
Figure 14: Murakami and Nakajima’s concept of a 3D Shape Deformation device (Image courtesy of T.
Murakami).
Murakami's 12 cm cube was made from both electrically nonconductive and conductive polyurethane foam. To measure device shape deformation, complex calculations were implemented; the authors discovered that, due to the inaccuracy of the conductive foam as a sensor, measured lengths could be geometrically impossible (Murakami & Nakajima 1994). It was later demonstrated by the Digital Foam Sensor that it is possible to capture geometric shapes using a variant of the conductive foam materials (Smith, Thomas & Piekarski 2008b).
3.4 Digital Foam
An input device that meets the requirements of a Deformable User Interface is Digital Foam. The development of this DUI is described as a new type of input device that can be used to support natural sculpting operations similar to those used when sculpting clay (Smith, Thomas, & Piekarski 2008a). Digital Foam consists of a mixture of conducting and insulated polyurethane foam, arranged in an evenly distributed matrix structure, as shown in Figure 15. Each conductive sensor produces a digital output when pressed that varies in value depending upon the depth of the press. This output is repeatable, making Digital Foam suitable for continual use.
Figure 15: The conductive sensors of Digital Foam.
Further explanation and utilisation of Digital Foam, in the domain of user interfaces, is described by Smith, Thomas & Piekarski (2008b). Specifically, the authors detail activities that Digital Foam
could be used for, such as free-form digital sculpting, controlling a video camera's aspect (zoom,
pan & tilt) and driving a custom made on-screen menu. This paper also describes the advances
made by the original inventors with higher resolution Digital Foam (adding more sensors on a ball-shaped foam input), additional input shapes such as a half-round graspable sensor for
demonstrations (Figure 16), and the results of a user evaluation with this newer high resolution
version.
Figure 16: Demonstration of Digital Foam.
3.5 Machine Learning
To define 'Machine Learning', one must define learning with respect to an organic being. Michalski,
Carbonell, & Mitchell describe the learning process as:
the acquisition of new declarative knowledge, the development of motor and cognitive skills
through instruction or practice, the organization of new knowledge into general, effective
representations, and the discovery of new facts and theories through observation and
experimentation (Michalski, Carbonell, & Mitchell 1984, p.12).
There are two basic forms of learning: knowledge acquisition and skill refinement. Knowledge
acquisition describes the process by which an entity obtains new symbolic information and then
applies that information in an effective manner. Skill refinement can be considered as the gradual
improvement of motor and cognitive skills through practice, such as learning to ride a bicycle, play
the piano, or a newborn horse taking its first shaky steps.
Therefore, knowledge acquisition is the type of learning that belongs in the field of
artificial intelligence, as this form of learning can be considered an intellectual endeavour, whereas
skill refinement can be seen as a motor coordination task performed by living creatures trying to
move or perform some action. The authors consider skill refinement to be a more non-symbolic
process, such as those studied in adaptive control systems. Samuel (1959) described machine learning as a 'field of study that gives computers the ability to learn without being explicitly programmed'. Samuel showed how, even in the early computer era of the late 1950s, he used a form of decision tree in his software program to 'learn' how to play the board game Checkers and compete well against a human opponent (and often win).
Hence, knowledge acquisition using machine learning is the field of study of this thesis. The goal of
machine learning is to design and develop algorithms that allow systems to use empirical data,
experience, and training to evolve and adapt to changes that occur in their environment (Kapitanova
& Son 2012). This is ideally what we will be hoping to achieve – a learning application for the
user’s benefit.
Another explanation of Machine Learning is as follows: given a training set, we feed it into a learning algorithm (such as an SVM, Artificial Neural Network, Logistic Regression or Linear Regression). The learning algorithm then outputs a function, which for historical reasons is called the hypothesis and denoted by h.3 As shown in Figure 17, the hypothesis' job is to take a new input and give out an estimated output or class. Put simply, the hypothesis can be thought of as a machine that gives a prediction y on some unseen input x.
Figure 17: The basic Machine Learning process.3
3 http://onionesquereality.wordpress.com/2009/03/22/why-are-support-vectors-machines-called-so/
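To make this concrete, here is a toy, self-contained sketch in C# (my own illustration, not code from the thesis or any library): 'training' trivially fixes a threshold, and the function it returns plays the role of the hypothesis h, mapping an unseen input x to a predicted class y.

    using System;

    class HypothesisDemo
    {
        // The "learning algorithm": consumes training knowledge (here just a
        // threshold) and outputs the hypothesis h as a function.
        static Func<double, string> Train(double threshold) =>
            x => x < threshold ? "class A" : "class B";

        static void Main()
        {
            Func<double, string> h = Train(0.5); // learning step outputs h
            Console.WriteLine(h(0.3));           // h predicts y for unseen x: "class A"
        }
    }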
The parameters that define the hypothesis are what are 'learned', using a training set of either labelled, unlabelled or partially labelled data. Machine learning falls into three main categories: Supervised, Unsupervised and Semi-Supervised learning (Figure 18).
Figure 18: Machine Learning Algorithm classes (image adapted from
http://bitsearch.blogspot.com.au/2011/02/supervised-unsupervised-and-semi.html).
Unsupervised Learning
Unsupervised learning is when there is no labelled data available for training. This method is used
when a general 'structure' to the data is sought; rather than an exact name for each item. Examples
of this are often clustering methods.
Supervised Learning
In this case the training data consists of labelled data: all samples in the training set have a label, or name. The problem solved here is often predicting the labels for data points that have no label.
Semi-Supervised Learning
In this case both labelled and unlabelled data are used. Generally a small amount of labelled data is associated with a large amount of unlabelled data. This method is used, for example, when the
training set is large but has many examples of a similar type; a time/cost saving benefit can be
achieved by not labelling every single item in the dataset.
Supervised Machine Learning is the process by which an algorithm uses manually labelled data to build a training set, which is then used in the validation (or hypothesis) process to describe, or classify, the input variable.
For this thesis' application, Identifying Shapes with the Digital Foam surface, I have selected Supervised Learning as the appropriate method of Machine Learning. This is because the training set is relatively small in size and a definite result, or description, of each new shape sample is wanted: I don't want to cluster the data, I want to identify it. Some examples of existing frameworks, all of which contain versions of Supervised Learning Algorithms that could be suitable for this thesis, are:
• Torch7: A Matlab-like Environment for Machine Learning – a versatile numeric computing framework and machine learning library that extends Lua (Collobert, Kavukcuoglu & Farabet 2011).
• Bob: A free signal processing and machine learning toolbox for researchers – a researcher-friendly Python environment for rapid development, which remains efficient for processing large amounts of multimedia data through the use of fast C++ implementations (Anjos et al. 2012).
• OpenCV: An open source computer vision library created by INTEL Research in 1999 (Bradski & Kaehler 2008); it contains implementations of more than 500 optimized algorithms. New functionality and algorithms for object detection and general machine learning have been added (Druzhkov et al. 2011).
• Emgu CV: A cross platform .NET wrapper to the OpenCV image processing library, allowing OpenCV functions to be called from .NET compatible languages such as C#, VB, VC++ and Python. The wrapper can be compiled in Mono and run on Windows, Linux, Mac OS X, iPhone, iPad and Android devices.4
4 http://www.Emgu.com/wiki/index.php/Main_Page
3.6 Machine Learning with OpenCV
There are a number of 'off the shelf' Machine Learning environments available that could be suitably applied to this dissertation. My studies have found several instances where OpenCV has
been adapted for intelligent system learning scenarios such as image recognition, human feature
detection and environment awareness for robots or automated systems. Flynn, De Hoog & Cameron
(2009) presented a paper on Machine Learning Applied to Object Recognition in Robot Search and
Rescue Systems. This paper details how they were successful in adapting an OpenCV machine
learning algorithm, known as a Decision Tree, for the detection of human faces and common
obstacles in disaster zones, enabling a robot to traverse a disaster zone and provide as much
information as possible on the location and status of survivors. Pahalawatta & Green (2013) detail
the use of OpenCV and Machine Learning to detect the diffusion of a photoelectric beam caused by
small airborne particles. This application was suggested as a possible improvement in household
smoke detectors. As a silicon photodiode receives the infrared beam of light scattered by smoke and
dust particles, the received infrared light is then converted to an RGB image, and a histogram
sequence from the sampled images is created. This histogram sequence is then compared to seven
known smoke particle type histogram sequences to determine the presence of smoke. The authors
applied the results to two widely used supervised classification algorithms (Multiple Discriminant
Analysis and K-nearest neighbours), and concluded that their process was an improvement over
current photoelectric smoke detection methods.
4 Research Design
This section details the research into shape recognition with Digital Foam. Figure 19 describes the
process of recognising a shape pushed into Digital Foam. To accomplish the task of shape
recognition, I explain each of the aspects described in Figure 19:
1. The Digital Foam apparatus and its operation
2. The selection of Test Shapes
3. Machine Learning theory - specifically Image Recognition by Machine Learning
4. The selection of a Software language and Framework.
Figure 19: Pushing a shape into Digital Foam and recognising the shape by Digital Foam Reader
4.1 Digital Foam Hardware Overview and Apparatus
Other samples of Digital Foam are constructed with larger or non-square arrays of sensors; the 4x4 example is used in this thesis as a prototype version, and it is envisaged that the software code and application can be reconfigured for these larger Digital Foam examples. I will use the 4x4 Digital Foam Prototype (shown in Figure 20(a) as an Arduino Shield, and in Figure 20(b) housed in a plastic case) to prove my software application is capable of the required results.
Figure 20: (a) Digital Foam Prototype - constructed as an Arduino Shield, and (b) the 4 x 4 Digital Foam
apparatus used in this application.
When the Digital Foam sensor is physically pressed (or deformed), it produces an output of numerical data for each conductive tube in the foam construction. As shown in Figure 21, a 4 x 4 array of conductive foam sensors (dark grey tubes of foam) is housed within a 50mm square block of non-conductive foam (light grey foam in Figure 21), producing 16 digitised outputs. The depth of the foam in Figure 21 is 25mm. Ideally, a depth reading of 0mm would return 1024, and a full-depth press of 25mm would return 0 (Figure 22(a)). Due to physical limitations when compressing Digital Foam, a full-depth press produced a numerical value anywhere from 0 to 512, so 512 was taken as the cut-off for a full press, with lighter presses scaled linearly back to 1024 for a no-press reading (Figure 22(b)).
Figure 21: Each Conductive Foam Tube produces a variable output according to depth of deformation.
(a) Pushing a shape into Foam with an Ideal depth reading (b) Scaled reading
Figure 22: The Output from each Conductive Foam Tube is converted to a Grey Scale value by Digital Foam
Reader, according to depth of push, (a) is an ideal scale, (b) is the actual scale used.
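As a minimal sketch of this scaling (my illustration, not the thesis' actual code), a raw sensor reading can be mapped to a grey scale byte as follows; the assumption that deeper presses map to darker pixels is mine:

    using System;

    static class FoamScaling
    {
        // Map a raw Digital Foam reading (0-1024) to a grey scale byte using
        // the scale of Figure 22(b): readings at or below 512 count as a
        // full-depth press, and lighter presses scale linearly up to 1024.
        public static byte RawToGrey(int raw)
        {
            int clamped = Math.Min(Math.Max(raw, 512), 1024);
            double noPress = (clamped - 512) / 512.0; // 0.0 full press, 1.0 no press
            return (byte)(noPress * 255);             // 0 = black, 255 = white
        }
    }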
4.2 Shape Selection
The aim is to be able to discern between some simple example shapes. For testing purposes, I have
chosen four different shaped objects (Figure 23). These are the 'Test Shapes' used in this
dissertation. The shapes have been designed with Autodesk Inventor (Shih, R 2012; Waguespack, C
2012) and produced physically with a 3D printer. Each shape is approximately 40mm x 40mm in size, similar to the Digital Foam array in Figure 21. Each shape has a handle to facilitate
pressing into the Foam Sensor for sampling purposes. These shapes provide an excellent test bed to
identify shapes, with a mixture of sharp edges, rounded edges and a spherical specimen.
The shapes I chose for the purposes of object recognition are:
• L-Shape
• T-Shape
• Sphere
• Flat Cylinder
Figure 23: (a) 'L' Shape, (b) 'T' Shape, (c) ‘Sphere’ and (d) ‘Flat Cylinder’ Test Shape designs.
The test shapes were produced in a 3D printer from the Autodesk designs, specifically for this
dissertation:
Figure 24: (a) - ‘L’ Shape, (b) - ‘T’ Shape, (c) - ‘Sphere’ and (d) - ‘Flat Cylinder’ plastic samples.
These four shapes provide a mixture of right-angle, intersection, straight-edged and spherical samples.
It is hoped that with these shapes, and multiple push samples of each shape, an excellent test library
will be created.
The orientation of the four test objects will remain constant; no image comparison of rotations of the sample objects is initially required. The possibility of orientation matching will be examined with the Machine Learning process: some unidentified shapes will be angular rotations of the sample shapes, to confirm whether rotated orientations of a shape can be recognised by software.
4.3 Design of Simple Image Comparator
Initially, a simple comparison algorithm is used by the application. Utilising some simple methods in OpenCV, we can subtract one image from another, hopefully finding a 'best fit' image with which to identify a new shape push. This process is shown below:
1. The image library is loaded; in this instance the library contained only one sample of each of
the four test shapes (as displayed in Figure 24), namely:
Figure 25: The Image Library of the four sample shapes from Figure 24.
2. A Push Sample is taken:
A shape is pushed into Digital Foam. The deformation of the Foam Sensor by the shape is
captured in the form of a grey scale image.
Figure 26: The Push Sample Image that will be compared to the Image Library.
3. The Sample Image is compared with all images in the Library:
The application cycles through all Library Images, subtracting the Push Sample Image from
each Library Image, using the OpenCV function:
// Per-pixel absolute difference between the push sample and a library image.
Image<Gray, Byte> resultComparison = Image1.AbsDiff(Image2[arrayItemCount]);
For each resultComparison grey scale image, the quantity of grey scale data is summed numerically using the scale in Figure 22(b); the lowest sum indicates the closest image comparison. For the Push Sample Image in Figure 26, compared against the four-image library in Figure 25, the calculations were:
Library Image 1: grey count = 806
Library Image 2: grey count = 966
Library Image 3: grey count = 1015
Library Image 4: grey count = 317
Figure 27: The Image Comparison process
The lowest number is closest to the Push Sample Image; in this case 317 was the best result. The result is displayed in Figure 28.
Figure 28: The Library Sample chosen to be the correct image.
Furthermore, if the application does not return a suitable image comparison, it gives the user the option to save the image to the image library. This is the process by which a user builds the library of shapes they believe are useful and relevant for their particular application.
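The whole comparison loop can be sketched as follows, using Emgu CV types; the method and variable names here are illustrative (a sketch of the process described above under my own naming assumptions, not the thesis' actual implementation):

    using Emgu.CV;
    using Emgu.CV.Structure;

    static class Comparator
    {
        // Subtract the push sample from each library image, sum the remaining
        // grey data, and choose the library image with the lowest sum.
        public static int FindBestMatch(Image<Gray, byte> pushSample,
                                        Image<Gray, byte>[] library)
        {
            int bestIndex = -1;
            double bestGreyCount = double.MaxValue;
            for (int i = 0; i < library.Length; i++)
            {
                // Per-pixel absolute difference; identical images give all zeros.
                Image<Gray, byte> diff = pushSample.AbsDiff(library[i]);
                double greyCount = diff.GetSum().Intensity; // lower = closer match
                if (greyCount < bestGreyCount)
                {
                    bestGreyCount = greyCount;
                    bestIndex = i;
                }
            }
            return bestIndex; // index of the closest library image
        }
    }

For the push sample above, this loop would return the index of Library Image 4, whose grey count of 317 was the lowest.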
4.4 Design and Theory of Machine Learning Algorithm
To describe Machine Learning, I explain it as 'a method of teaching a computer to
make a prediction or to identify something'. In order for a computer to predict or identify, there are
two main steps to machine learning:
1. Create a library of knowledge: A certain level of knowledge about a particular topic must be
created. The way a computer builds knowledge on something is by the building of a library (or
database); this library is the equivalent of a person's memory - this is what they know about a
certain subject or situation.
2. Apply a suitable algorithm to obtain an outcome: The computer then uses this library
(knowledge) in a logical manner (algorithm) to either classify a situation or to predict an
outcome.
In the context of push recognition for Digital Foam, this process is displayed in the following
Flowchart:
Figure 29: Flowchart of the Shape Recognition Software.
(Flowchart summary: connect to the Digital Foam device → load the labelled dataset → get a push image from Digital Foam and compare it to the library of shapes → match found? Yes: describe the image; No: add the push image to the library of shapes.)
This flowchart is a common approach to Machine Learning techniques. This enables me to utilize
existing methods and algorithms for this dissertation, without the need to invent new algorithms,
only the requirement to adapt a suitable existing algorithm for the above process.
4.4.1 Initial Process - the Creation of a Library of Knowledge
If the user pushes a spherical object into the Foam (Figure 30(a)), Digital Foam produces numerical data corresponding to push depth, as shown in Figure 30(b), and Digital Foam Reader produces an output grey scale image, as per Figure 30(c). The differing grey scale squares equate to the depth of push received from each conductive tube on the Foam device.
(a) Pushing a sphere test sample into Digital Foam.
1022 1022 1022 1022
1022 930 905 1022
1022 724 843 1022
1022 1022 1022 1022
(b) The numerical values output from Digital Foam equivalent to depth of push.
Figure 30: (c) The resultant image created by Digital Foam Reader.
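As a sketch of this conversion (again my illustration, reusing the RawToGrey mapping sketched in Section 4.1), the 4x4 readings of Figure 30(b) could be turned into a grey scale Emgu CV image, one pixel per conductive foam tube:

    using Emgu.CV;
    using Emgu.CV.Structure;

    static class PushImage
    {
        // Convert the 4x4 raw readings into a 4x4 grey scale image.
        public static Image<Gray, byte> FromReadings(int[,] readings) // [row, col]
        {
            var img = new Image<Gray, byte>(4, 4); // width, height
            for (int row = 0; row < 4; row++)
                for (int col = 0; col < 4; col++)
                    img.Data[row, col, 0] = FoamScaling.RawToGrey(readings[row, col]);
            return img;
        }
    }

The displayed image in Figure 30(c) is simply these 16 grey values drawn as larger squares.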
In the early stages of the application, it can be assumed that the image library is empty - to compare
an image at this stage would be worthless, as the computer has no 'memory' of shapes to select an
appropriate match from. A library of images can be created by the user at this time, for as many
different types of push images they believe are suitable for their situation. This image library is the
training set for the Machine Learning process. Some examples of image libraries are displayed
below, with an 8-image library (Figure 31(a)) and a 15-image library (Figure 31(b)).
Figure 31: (a) 8 Images in the image library and (b) a 15 image library stored by Digital Foam Reader.
As this library grows, Machine Learning becomes more useful in identifying further push shapes in the Foam sensor, because the Machine Learning Algorithm has a greater working knowledge of the situation.
4.4.2 Application of Suitable Machine Learning Algorithm
OpenCV has a number of Machine Learning Algorithms already implemented as C++ classes.5 The OpenCV Machine Learning Library is a set of classes and functions for statistical classification, regression, and clustering of data.
The Machine Learning Algorithms implemented by OpenCV in C++ are:
• Normal (Naive) Bayes Classifier
• K-Nearest Neighbours
• Support Vector Machines (SVM)
• Decision Trees
• Boosting
• Gradient Boosted Trees
• Random Forest Trees
• Extremely Randomized Trees
• Expectation Maximization
• Artificial Neural Networks
As Emgu CV is a wrapper library (a layer of code that enables cross-language interoperability), not all the C++ Machine Learning Algorithms have been adapted to C#. The list of Emgu CV C# Machine Learning Algorithms is smaller,4 and includes:
• Normal Bayes Classifier
• K-Nearest Neighbours
• Support Vector Machine (SVM)
• Expectation-Maximization (EM)
• Neural Network (ANN MLP)
• Mushroom Poisonous Prediction (Decision Tree)
A brief explanation of each of these Algorithms, and the decision whether to apply each one in this
dissertation, follows:
1. Normal (or Naive) Bayes Classifier: The naive Bayes classifier is a term in Bayesian statistics for a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions.5 A more descriptive term for the underlying probability model would be 'independent feature model'. In image recognition, a Bayes classifier usually works on separate, unconnected features to determine the identity of an object (such as a face). It is a widely used classifier in the field of image recognition, and possibly suitable for my application. However, as the features are not pronounced enough in my shape detection application (there are simply not enough features when using the 4x4 digital foam array prototype), the ability to adapt this classifier was limited; with larger Digital Foam arrays, a Naive Bayes Classifier could be considered a solution to shape push identification.
2. K-Nearest Neighbours (K-NN): A useful algorithm for classification; a method for classifying objects based on the closest training examples in the feature space. K-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. It can also be used for regression (Bradski & Kaehler 2008, Page 460). The K-NN algorithm is considered one of the simplest machine learning algorithms; a test data point is classified according to the majority vote of its K nearest other data points, in a Euclidean sense of nearness (Bradski & Kaehler 2008, Page 463). This algorithm could be very useful for Shape Recognition with Digital Foam. K-NN also performs all computation at the classification stage; this could slow down image processing, but should be sufficient for my application, as the images are small in terms of the number of pixel comparisons to be made.
3. Support Vector Machine (SVM): A Support Vector Machine is used mostly for classification between two data sets.5 The data library can be considered to be two sets of vectors in an n-dimensional space. An SVM will construct a separating hyperplane in that space, one which maximizes the margin between the two data sets. To calculate the margin, two parallel hyperplanes are constructed, one on each side of the separating hyperplane, which are 'pushed up against' the two data sets. As there was only a single data value to classify images with (in my case, grey-scale pixel values), I found I was unable to adapt an SVM to my particular application.
4. Expectation Maximization (EM): The expectation maximization algorithm enables parameter estimation in probabilistic models with incomplete data.6 EM is used in statistical calculations where the equation cannot be solved directly. Latent variables, unknown parameters or missing data values can be formulated with the assistance of additional theoretical data points. As I will be working with complete data (a push sample has no missing values), this algorithm would not be entirely suitable for my application.
5. Artificial Neural Network (ANN): An ANN is an algorithm that is loosely modelled on the neuronal structure of the mammalian cerebral cortex, but on a much smaller scale.7 A neural network typically consists of a number of interconnected nodes which contain an 'activation function'. Patterns are presented to the network via the input layer, which communicates to one or more hidden layers where the actual processing is done via a system of weighted connections. The hidden layers then link to an 'output layer' where the classification or prediction is output (see Figure 32). As I was able to find many examples of image recognition using ANNs, adapting an ANN algorithm for this dissertation must be considered. Therefore, I will not entirely rule out the use of an ANN algorithm for Foam sensor shape recognition.
5 http://docs.opencv.org/modules/refman.html
6 http://ai.stanford.edu/~chuongdo/papers/em_tutorial.pdf
7 http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html
Figure 32: Diagram of an Artificial Neural Network (ANN).5
6. Decision Trees: A decision tree is a binary tree (a tree where each non-leaf node has two child nodes). It can be used either for classification or for regression. For classification, each tree leaf is marked with a class label, and multiple leaves may have the same label. For regression, a constant is assigned to each tree leaf, so the approximation function is piecewise constant.5 Decision Trees are mostly used in the field of decision analysis, where the end result of the problem is the solution of a puzzle or the reaching of a goal. Decision tree learning is a very successful technique for supervised classification learning. A Decision Tree is a flowchart-like structure, with multiple internal binary nodes where a test is made. As Decision Trees work in a top-down structure, starting from a single point and splitting at each node's decision, I was unable to adapt a Decision Tree to my image recognition process.
From the above analysis of the Machine Learning Algorithms available in the Emgu CV Library, I concluded that the most suitable Algorithms for this dissertation, in order, are:
1. K-Nearest Neighbours (KNN)
2. Artificial Neural Network (ANN)
3. Normal (or Naive) Bayes Classifier.
Initially I will attempt to adapt a KNN Machine Learning method of image recognition to my
dissertation.
4.4.3 Application of K-Nearest Neighbours to Image Recognition
To identify a shape in an image, I chose keypoint matching, or feature detection, as the method of comparing two images. This approach was taken as I found sufficient theory on feature detection and description within the OpenCV documentation to validate my choice.5 Some of the common and widely implemented feature (or object) detection algorithms available in the Emgu CV library are (in order of date of implementation):
 SIFT Detector - Scale Invariant Feature Transform (Lowe 1999)
 HOG Descriptor - Histogram of Oriented Gradients (Dalal & Triggs, 2005)
 SURF Detector - Speeded-up Robust Features (Bay, Tuytelaars & Van Gool, 2006)
 FAST Detector - Features from Accelerated Segment Test (Rosten, Porter & Drummond, 2010)
 ORB Detector - Oriented BRIEF keypoint detector and descriptor extractor (Rublee, Rabaud,
Konolige & Bradski, 2011).
The exact manner in which each feature detection method operates will not be discussed in this Dissertation. Each feature detection algorithm has its benefits and costs; e.g. one algorithm may be faster at feature extraction and comparison, while another may be more accurate but time-consuming. As the speed of comparison was not an issue (the grey-scale image has very little detail to be examined), the only deciding factor was the accuracy and repeatability of the algorithm in finding good feature matches between sample and library images. Discovering the best feature detection method for my dissertation was achieved by applying each algorithm in software, observing the accuracy of the output and making comparisons with all other methods. The SURF Algorithm was found to produce the most promising results for the grey-scale push sample comparisons, when compared to the other detection methods listed.
The K-Nearest Neighbours algorithm works as follows:
Using n features extracted from the unknown image by the feature detection algorithm (chosen previously), the feature points are placed on a graph, making them feature vectors. The same process is performed on a known (labelled and classified) image, producing a graph also with n feature vectors. Each of the n feature vectors on the unknown image is then measured to find the k nearest neighbours in the labelled image. This measurement is performed by way of inverse Euclidean distance (d) between the sampled image's feature vector and the labelled image's k nearest feature vectors, producing (k x d) for all n features. (The value of k is an integer manually selected by the user, typically a low odd number such as 3 or 5; an odd number ensures a majority decision is possible when classifying each feature vector.)
This calculation is repeated for all images in the labelled set (or library). The sum of n x (k x d) produces a class of images that has the highest majority vote; the unknown image is considered to be a member of this class, and is labelled accordingly (Cover & Hart, 1967).
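To make this procedure concrete, the following minimal C# sketch implements the voting step, assuming each image has already been reduced to a set of feature vectors (float arrays) by a detector such as SURF. Pooling the library features under their source image's label is a simplification of the per-image sums described above, and all names here are illustrative.

using System;
using System.Collections.Generic;
using System.Linq;

// Minimal sketch of k-NN classification by majority vote over feature vectors.
// Each library feature is tagged with the label of the image it came from.
static string ClassifyKnn(float[][] unknownFeatures,
                          List<Tuple<string, float[]>> libraryFeatures,
                          int k)
{
    var votes = new Dictionary<string, int>();
    foreach (float[] feature in unknownFeatures)
    {
        // The k nearest library features, by Euclidean distance.
        foreach (var neighbour in libraryFeatures
                     .OrderBy(lf => Distance(feature, lf.Item2)).Take(k))
        {
            int count;
            votes.TryGetValue(neighbour.Item1, out count);
            votes[neighbour.Item1] = count + 1;   // one vote per neighbour
        }
    }
    // The label with the highest majority vote classifies the unknown image.
    return votes.OrderByDescending(p => p.Value).First().Key;
}

static double Distance(float[] a, float[] b)
{
    double sum = 0;
    for (int i = 0; i < a.Length; i++)
    {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return Math.Sqrt(sum);   // Euclidean distance
}

A typical call would use k = 3 or 5, as noted above, so that a majority decision is always possible.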
4.5 Selection of Software and Framework
I found sufficient articles and applications of the OpenCV software library for machine learning to
convince me to apply this library to my research (refer section 3.6). Many articles can be found on OpenCV face recognition; although their writers often employ machine learning algorithms from the OpenCV library, face recognition is not a requirement for this dissertation, so these articles are less relevant to this study. More relevant were articles on robotic machine vision, such as Kao & Huy (2013), and on environment learning and object recognition, such as Flynn (2009) and German et al. (2013). As these papers discussed the use of OpenCV for robotic image processing in poor visibility and dynamic, hostile environments (e.g. earthquake sites, underground mines or battlefields), I found them particularly informative and inspiring, reaffirming my acceptance of OpenCV for this study.
OpenCV was originally written in C++; a library called Emgu CV is available for applications in C#. As stated in the Emgu CV documentation,4 Emgu CV is a cross platform .NET wrapper to the OpenCV image processing library, allowing OpenCV functions to be called from .NET compatible languages such as C#, VB and VC++. Emgu CV is an additional, non-standard C# library, and the installation of the Emgu CV libraries is required for image processing with this application. Shi (2013) performed a comparison of three Computer Vision processing libraries: OpenCV, Emgu CV and AForge.NET. His findings were that, although Emgu CV was not the fastest in terms of software performance (C++ OpenCV was faster), Emgu CV had better overall results in documentation and ease of use, which compensates for any performance shortfall and sets Emgu CV ahead of the other two libraries. Furthermore, my choice of the Emgu OpenCV Library is simply to take advantage of the Microsoft .NET Framework in simplifying GUI development using the C# programming language; OpenCV itself does not have the ability to create GUI applications on its own. Finally, I found the quality and quantity of online documentation and support available for C# Emgu CV to be better than for Torch7 and Bob (refer section 3.5).
5 Implementation
The software application, 'Digital Foam Reader', was designed and coded in Microsoft Visual
Studio 2012, using the C#.NET language and utilising a version of OpenCV written for C# called
EMGU-OpenCV (Emgu CV).5
The software application’s tasks can be broken down as follows:
1. Connect to the Digital Foam device via any Serial Port on a standard PC running Windows
software.
2. Read ‘Push Data’ from Digital Foam and convert this data into a grey scale image.
3. Compare this grey scale image with other saved images from a library of shapes, and find the closest matching image by two processes: A. use a simple image 'comparison' method, and B. if A is not successful, use a more complex Machine Learning algorithm to choose the closest matching shape.
4. If there is no close comparison to the pushed action, give the application the ability to realise this and offer the user the option to save this 'Push Data' as an image, with a label name. This image is then added to the array of images - hence increasing the application's knowledge of shapes by supervised learning.
5. Have the ability to save ‘Push Data’ as an image, then either load these saved images or
possibly store these images in a library that can be used for further shape recognition.
Figure 33: Microsoft Visual Studio early Development of Digital Foam Reader.
The software application has the ability to read from Digital Foam, capture the numerical data and
convert this data into a 4x4 grey scale image.
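As a sketch of this step, the following C# fragment reads one push sample over a serial port using the standard .NET SerialPort class. The port name, baud rate and line format (sixteen space-separated readings per line) are assumptions made for illustration; the actual Digital Foam firmware protocol is not reproduced here.

using System.IO.Ports;

// Minimal sketch: read one 4x4 push sample from the Digital Foam device.
// Port name, baud rate and line format are assumed values.
static int[] ReadPushSample(string portName, int baudRate)
{
    using (var port = new SerialPort(portName, baudRate))
    {
        port.Open();
        string line = port.ReadLine();          // one sample per line (assumed)
        string[] parts = line.Split(' ');
        var depths = new int[16];               // sixteen conductive tubes
        for (int i = 0; i < depths.Length; i++)
            depths[i] = int.Parse(parts[i]);
        return depths;
    }
}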
Figure 34: Digital Foam Reader application showing a sample push image.
This is the start of the Library building phase, as the local folder becomes the repository of the push image library. It is important to note here that each time 'Push Data' is received, it will differ slightly, even with the same shape pushed into Digital Foam by the same user, due to the amount of downward force on the object and any slight variation in vertical alignment between the object and the Foam Sensor. This complicates the process of shape recognition greatly. If the depth and angle at which the shape is pushed into the foam were constant, or even if the output from Digital Foam were constant for the same push (which it isn't - the digital values can differ slightly each time), shape recognition would be much simpler. These factors, noted whilst testing the software, mean that part of the complexity of this study comes from the inability to repeat the exact same push sample into Digital Foam. The variation resembles a normal (bell curve) distribution: most push observations fall in the middle of the depth scale, but some samples result from either a light push or a heavy push into the Foam. Given all the above variables, it would be rare for any two push samples to produce the exact same grey-scale image.
Eventually the user builds a library of images that will be used for comparison techniques; further into the application, this same library of push images will be used as the training set for Machine Learning techniques - for when a simple image comparison algorithm does not produce a sufficient image match. Figure 35 - Figure 38 display the shape push and image creation trials for the four test objects constructed for this dissertation:
Figure 35: (a) - the ‘L’ Shape test object, (b) - the sample being pushed into foam and, (c) - the output grey-scale
image capture of this push sample.
Figure 36: (a) - the ‘T’ Shape test object, (b) - the sample being pushed into foam and, (c) - the output grey-scale
image capture of this push sample.
Figure 37: (a) - the ‘Sphere’ Shape test object, (b) - the sample being pushed into foam and, (c) - the output grey-
scale image capture of this push sample.
Figure 38: (a) - the ‘Flat Cylinder’ Shape test object, (b) - the sample being pushed into foam and, (c) - the
output grey-scale image capture of this push sample.
Once a deformation is successfully and satisfactorily captured, the user can save the grey-scale image as a jpeg file to the selected local folder; the filename becomes the 'label' for the shape. This is the 'Train a machine' step in Figure 39. For example, with the shape in Figure 37, the user could enter 'small-sphere' as the shape. If this shape were push-sampled again, the Machine Learning application should be able to recognize it as 'small-sphere' from its learned library.
Figure 39: Steps to Train a learning machine.
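A minimal C# sketch of this filename-as-label convention follows; the folder path, method names and the use of System.Drawing are illustrative assumptions rather than the actual Digital Foam Reader code.

using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

// Minimal sketch: save a push image under its label, and rebuild the
// library by treating every jpeg filename in the folder as a label.
static void SaveLabelledShape(Bitmap pushImage, string folder, string label)
{
    pushImage.Save(Path.Combine(folder, label + ".jpg"), ImageFormat.Jpeg);
}

static Dictionary<string, Bitmap> LoadLibrary(string folder)
{
    var library = new Dictionary<string, Bitmap>();
    foreach (string file in Directory.GetFiles(folder, "*.jpg"))
        library[Path.GetFileNameWithoutExtension(file)] = new Bitmap(file);
    return library;
}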
5.1 Simple Image Comparator
The Simple Image Comparator implements a grey-scale image comparison between the sample image and a library image. The comparison is performed on each of the sixteen shaded squares, which are depth conversions of a sample shape pushed into the Foam Sensor (see Figure 22). A library image is a product of the same process, saved previously. An interesting point to note here is that virtually every push sample of the same shape produces a slightly different grey-scale image, due to slight variations in axis and depth of push while the shape is held down by the user and sampled. This lack of repeatability between two push samples of the same shape greatly affects image recognition.
Figure 40: (a) The Push Sample Image, (b) the closest matching Library Image and (c) the result of the Comparator process.
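A minimal C# sketch of this comparison follows: it sums the per-square grey-level differences between the sample and each library image, taking the smallest total as the match. The threshold for declaring a shape unknown is an illustrative value, not the one used in Digital Foam Reader.

using System;
using System.Collections.Generic;

// Minimal sketch of the Simple Image Comparator over sixteen grey values.
static string FindClosestShape(byte[] sample,
                               Dictionary<string, byte[]> library,
                               int maxDifference)   // assumed tolerance
{
    string best = null;
    int bestScore = int.MaxValue;
    foreach (var entry in library)
    {
        int score = 0;
        for (int i = 0; i < sample.Length; i++)            // 4x4 grid
            score += Math.Abs(sample[i] - entry.Value[i]); // per-square difference
        if (score < bestScore) { bestScore = score; best = entry.Key; }
    }
    // A poor best match is reported as an unknown shape.
    return bestScore <= maxDifference ? best : "Unknown";
}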
5.1.1 Evaluation
To determine the accuracy and repeatability of the Simple Image Comparator, a test recording of shapes was undertaken (Figure 40). This test involved 25 push samples of each of the four test shapes, as well as 25 push samples of each of four unknown shapes; 200 push samples in total. The unknown shapes consisted of rotations of three of the test shapes, rotated either 45 degrees or 180 degrees, and the Sphere sampled in the bottom right corner of the Foam Sensor (see Figure 41(b)).
The test was conducted with:
 25 push samples of each of the four test shapes, 100 total samples.
 25 push samples of each of four unknown shapes, 100 total samples.
The unknown shapes were chosen as they were as different as possible to the original test shapes,
within the constraint of a 4x4 sample foam array.
Figure 41: (a) The four known Test Shapes and (b) the four Unknown Shapes to be compared in this evaluation.
5.1.2 Results
The results of the test are displayed as a Confusion Matrix, as shown in Table 5-1. The Image Comparator results are displayed in the columns from left to right, with the percentage of correct identifications for each shape in the final column. The actual physical shapes are displayed in the rows from top to bottom, separated into known and unknown shapes. As stated in the Evaluation, the unknown shapes were rotations of the Test Shapes (except the Sphere sample, which was repositioned). These results show that the Simple Image Comparator will only identify a match if the position and orientation of the push sample are exactly the same as in the library image. Additionally, the depth of push and angle of incidence to the Foam Sensor also affect the accuracy of the Simple Comparator's results.
Table 5-1: Test results for the Comparator Evaluation (rows: actual physical shape; columns: predicted class from the Comparator Software).

                      Predicted from Comparator Software
Actual Shape       L-shape  Cylinder  Sphere  T-shape  Unknown  Percent Correct
Known Shapes
  L-Shape             25       0        0       0        0          100
  Cylinder             0      24        0       0        1           96
  Sphere               0       0       25       0        0          100
  T-Shape              0       0        0      25        0          100
  Total               25      24       25      25        1          99%
Unknown Shapes
  L-Shape 180°         0       0        0       6       18           72
  Cylinder -45°        0       0        2       0       23           96
  Sphere Moved         0       0        0       0       25           72
  T-Shape 180°         1       0        0       0       24           96
  Total                1       0        2       6       90          90%

Total accuracy for Simple Image Comparator: 94.5%
5.1.3 Discussion
The results obtained from the Comparator Experiment demonstrate that, with some algorithm refinement, a simple image comparator can produce excellent results: 99% accuracy when identifying a known shape from its library, 90% accuracy when acknowledging an unidentified image, and an overall success rate of 94.5%. However, the objects must be in the exact same position and orientation on the Foam Sensor in both the sample and library images. Of particular note when I performed the test was the indication that depth of push was relevant to the comparator's accuracy; all the incorrectly identified shape comparisons took place when a push was either very light or very heavy in pressure. This indicates that a working range of push depth is very important to shape recognition with image comparisons. From the test results, I consider a working Grey Count range to be from 300 - 900 (see Figure 58); beyond 900 the Digital Foam Sensor is generally deforming heavily, which also affects the Foam Sensors alongside those deformed by the shape, as the shape is forced further into the Foam and pushes the surface fabric down with it (see Figure 42). Finally, the angle of incidence when pushing a shape into the Foam Sensor affects the resulting grey-scale image, lessening the chances of identification (Figure 43).
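Such a working-range guard could be expressed as follows. This C# sketch is illustrative only: it assumes 'Grey Count' is the sum of the sixteen grey values of a sample, which is a hypothetical reading of the measure, and the 300 - 900 bounds are taken from the observation above.

// Minimal sketch of a push-depth guard for the working range noted above.
// 'Grey Count' is assumed (hypothetically) to be the sum of the sixteen
// grey values of a push sample.
static bool IsUsablePush(byte[] grey)
{
    int greyCount = 0;
    foreach (byte g in grey)
        greyCount += g;
    return greyCount >= 300 && greyCount <= 900;  // reject too-light or too-heavy pushes
}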
Figure 42: (a) a soft Push Sample - too little deformation, with the Foam Sensors not deforming enough and producing a low-intensity grey-scale image; (b) a suitable Push Sample; and (c) a Push Sample with excessive depth, where surrounding fabric of the Foam Sensor not touched by the shape is also deformed.
Figure 43: Poor shape orientation to the Foam Sensor.
5.2 Machine Learning Image Comparator
The realisation of the Machine Learning Image Comparator was the implementation of the SURF
Feature Detection algorithm used to detect features in the image, then the execution of a K-Nearest
Neighbours algorithm to find the closest image from the library.
This Comparator returns an image, similar to Figure 44, displaying the features found in each image (small yellow circles), the matches between features (blue lines joining two yellow circles) and the size of the Region of Interest (if any) in the form of an orange quadrilateral. The closer the orange quadrilateral is to the full shape of the sampled image, the better the match (see Figure 44). In Figure 44, the shape sampled from the Digital Foam Sensor is on the left, and the library image deemed the best match by the k-Nearest Neighbours algorithm is displayed on the right.
Figure 44: Image match accuracy from (a) poor, to (b) average, to (c) good and (d) excellent.
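The matching step behind these images can be sketched as follows, assuming the Emgu CV 2.4 API as used in its published SURF examples. The hessian threshold (400), k (2) and uniqueness threshold (0.8) are illustrative values only; as discussed in section 5.2.3, the ideal values are yet to be determined.

using Emgu.CV;
using Emgu.CV.Features2D;
using Emgu.CV.Structure;
using Emgu.CV.Util;

// Minimal sketch: SURF features + k-nearest-neighbour matching between a
// sampled push image and one library image, returning the number of
// matches that survive the uniqueness (ratio) test.
static int CountGoodMatches(Image<Gray, byte> libraryImage,
                            Image<Gray, byte> sampleImage)
{
    var surf = new SURFDetector(400, false);        // hessian threshold assumed

    VectorOfKeyPoint libKeys = surf.DetectKeyPointsRaw(libraryImage, null);
    Matrix<float> libDesc = surf.ComputeDescriptorsRaw(libraryImage, null, libKeys);
    VectorOfKeyPoint obsKeys = surf.DetectKeyPointsRaw(sampleImage, null);
    Matrix<float> obsDesc = surf.ComputeDescriptorsRaw(sampleImage, null, obsKeys);

    const int k = 2;                                // neighbours per feature
    var matcher = new BruteForceMatcher<float>(DistanceType.L2);
    matcher.Add(libDesc);
    var indices = new Matrix<int>(obsDesc.Rows, k);
    var dist = new Matrix<float>(obsDesc.Rows, k);
    matcher.KnnMatch(obsDesc, indices, dist, k, null);

    // Keep only matches whose nearest neighbour is clearly better than the
    // second nearest (the 0.8 uniqueness threshold is assumed).
    var mask = new Matrix<byte>(dist.Rows, 1);
    mask.SetValue(255);
    Features2DToolbox.VoteForUniqueness(dist, 0.8, mask);
    return CvInvoke.cvCountNonZero(mask.Ptr);       // count of 'good' matches
}

Running this against every library image and voting on the labels of the best-scoring images corresponds to the procedure described in section 4.4.3.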
It is important to note here that, with the execution of this algorithm, an image match is always found, meaning the Machine Learning Comparator will never return an unknown image. The calculation of a cut-off, or threshold value, below which an image comparison is deemed unidentified is still required. But, as tests show (see Table 5-2), the Machine Learning Image Comparator's accuracy is limited, and the addition of an unknown-shape response produced a larger error rate during tests, so it was omitted to ease algorithm improvement.
5.2.1 Evaluation
To determine the accuracy of the Machine Learning Image Comparator, a test recording of shapes was undertaken. This test involved 25 push samples of each of the four test shapes only (as displayed in Figure 41(a)), 100 shape samples in total. The evaluation was conducted with push samples of the four Test Shapes compared to the library of shapes as displayed in Figure 25.
5.2.2 Results
The results of the test are displayed in a Confusion Matrix, as shown in Table 5-2. The Machine Learning Image Comparator results are displayed in the columns from left to right, with the percentage of correct identifications for each shape in the final column. The actual physical shapes are displayed in the rows from top to bottom. All four Test Shapes should be found in the library of shapes.
Table 5-2: Test results for the Machine Learning Comparator Evaluation (rows: actual physical shape; columns: predicted class from the Machine Learning Algorithm).

                 Predicted Class from Machine Learning Algorithm
Actual Shape    L-shape  Cylinder  Sphere  T-shape  Number Correct  Percent Correct
  L-Shape          9        0        0       16         9/25             36
  Cylinder         1       24        0        0        24/25             96
  Sphere           8        4        4        9         4/25             16
  T-Shape          3        6        0       16        16/25             64
  Total           21       34        4       41        53/100            53%

Total accuracy for Machine Learning Comparator: 53%
The Machine Learning Image Comparator had great difficulty in identifying the Sphere, with only
16% accuracy, but excellent results for the Cylinder object, with 96% accuracy. The L-Shape and
T-Shape objects produced mixed results with 36% and 64% accuracy respectively. Overall, the
Machine Learning Comparator produced an image identification accuracy of 53%.
5.2.3 Discussion
Considering the results obtained from the evaluation, it would appear that the Machine Learning Image Comparator's ability to recognise a shape is not much greater than chance. This would indicate that either the Machine Learning algorithm is not performing correctly, or my interpretation of the algorithm's output is imprecise.
To improve the accuracy of the Machine Learning image recognition process, there are some variables in the algorithm that require adjustment, particularly the values of:
 k (the number of nearest neighbours), an integer from 1 upward
 the number of SURF feature detection points, an integer from 1 upward
 the uniqueness threshold, the point at which two features are considered a match (a value between 0 and 1).
Implementing the Machine Learning Image Comparator proved to be a challenging task. The many algorithm variables, and the analysis of the information returned, proved to be the main sources of error. Specifically, it is the evaluation and assessment of the results returned by the Machine Learning algorithm that determines its accuracy. One example is the determination of a cut-off for the uniqueness threshold; this value determines whether uniqueness between two points holds. Another important value is the value of k. This value acts like an expanding circle: the higher the value of k, the further the circle expands outwards around a sample point, until k library points are consumed by the circle. So the number of SURF feature points, the uniqueness threshold between two points and the value of k nearest neighbours all affect the accuracy of the algorithm. The ideal value for each variable is yet to be determined, and has proven very difficult to define. Therefore, I did not set a cut-off value below which a shape was deemed unidentified. With further analysis and refinement, I believe the accuracy of the Machine Learning Image Comparator can be improved.
6 Conclusion
In this thesis I have introduced techniques to identify shapes pushed into the Digital Foam
apparatus. Four unique shapes were manufactured to suit the prototype 4x4 Digital Foam Sensor
that was used throughout this study. The size of the test shapes was equivalent to the surface area of
the Foam Sensor to facilitate ease of testing and confirmation of results. The test shapes were
pushed into the Foam Sensor and a resulting deformation sample was converted to a grey scale
image. These images were compared to a previously saved image library of shapes, to confirm
whether it was possible to identify any shapes and label them accordingly. Two methods of image
identification were studied; a grey scale image comparator and a more elaborate Machine Learning
algorithm. Both methods displayed strengths and weaknesses when comparing images. The grey
scale image comparator was easily implemented in software, but was limited in the variation of
image it could identify; the sampled image had to be in the same position and orientation as the
library image to be considered a match. The Machine Learning image comparison algorithm
produced mixed results, and was indeed outperformed by the simple comparator in my study. It was
noted that the Machine Learning image comparator could withstand slight variances in shape
position and orientation. This observation was unexpected but interesting, indicating this is where
the strength of Machine Learning lies in the field of image recognition.
There was a limitation caused by the size of the Foam Sensor producing a low resolution image
with little detail. A solution to this problem would be to use a larger Digital Foam array to produce
images with more interesting features to compare. The K-Nearest Neighbour Machine Learning
algorithm may not be ideal for image matching with Digital Foam; and consideration should be
given to other common Machine Learning algorithms that have been adapted for image matching
such as Artificial Neural Networks, Naive Bayes Classifiers and Random Forests. Furthermore, the
K-Nearest Neighbours algorithm was run on the SURF Detector feature set; this resulted in only
average image comparison results. Future directions could include the implementation of K-nearest
neighbours on other image attributes such as the histogram attribute, SIFT Features, or a colour
attribute if we were to create colour sample images instead of grey-scale images.
In summary, I found the capacity for Digital Foam to recognise shapes does exist, and the
application of Machine Learning could be refined to improve the quality of shape recognition with
the Digital Foam Sensor.
7 References
1. Anjos, A, El Shafey, L, Wallace, R, Gunther, M, McCool, C, Marcel, S 2012, ‘Bob: a free signal
processing and machine learning toolbox for researchers’, in Proceedings of the 20th ACM
international conference on Multimedia, Nara, Japan, pp. 1449–1452.
2. Bay, H, Tuytelaars, T & Van Gool, L 2006, ‘SURF: Speeded Up Robust Features’, Lecture Notes in
Computer Science, vol. 3951, pp. 404–417.
3. Blackshaw, M, Devincenzi, A, Lakatos, D, Leithinger, D & Ishii, H 2011, ‘Recompose: direct and
gestural interaction with an actuated surface’, in CHI ’11 Extended Abstracts on Human Factors in
Computing Systems, Vancouver, BC, Canada, pp. 1237–1242.
4. Bradski, G & Kaehler, A 2008, Learning OpenCV: Computer Vision with the OpenCV Library, 1st Edn,
O’Reilly Media, Inc., Canada.
5. Collobert, R, Kavukcuoglu, K and Farabet, C 2011, ‘Torch7: A Matlab-like Environment for Machine
Learning’, In Big Learning 2011: NIPS 2011 Workshop on Algorithms, Systems, and Tools for
Learning at Scale.
6. Cover, T & Hart, P 1967, ‘Nearest neighbor pattern classification’, IEEE Transactions on Information
Theory, vol. 13, no. 1, pp. 21 – 27.
7. Dalal, N & Triggs, B 2005, ‘Histograms of oriented gradients for human detection’, in Proceedings of
the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, San
Diego, CA, USA, pp. 886 - 893.
8. Dix, A, Finlay, J, Abowd, G & Beale, R 2004, Human-Computer Interaction, 3rd Edn, Prentice Hall,
London.
9. Druzhkov, V, Erukhimov, V, Zolotykh, N, Kozinov, E, Kustikova, V, Meerov, I and Polovinkin, A 2011,
‘New object detection features in the OpenCV library’, Pattern Recognition and Image Analysis, vol.
21, no. 3, pp. 384–386.
10. Engelbart, D & English, WK 1968, 'A research center for augmenting human intellect', in AFIPS '68: Proceedings of the December 9-11, 1968, Fall Joint Computer Conference, Part I, pp. 395–410.
11. Flynn, H, De Hoog, J & Cameron, S 2009, 'Integrating Automated Object Detection into Mapping in USARSim', in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009), St. Louis, USA.
12. Hibernik, K, Ghrairi, Z, Hans, C & Thoben, K-D 2011, ‘Co-creating the Internet of Things First
experiences in the participatory design of Intelligent Products with Arduino’, Proceedings of the
17th International Conference on Concurrent Enterprising (ICE 2011), Aachen, Germany, pp. 1 – 9.
13. Holman, D & Vertegaal, R 2008, ‘Organic user interfaces: designing computers in any way, shape, or
form’, Communications of the ACM - Organic user interfaces, vol. 51, no. 6, pp. 48–55.
14. Holman, D, Girouard, A, Benko, H, & Vertegall, R 2013, ‘The Design of Organic User Interfaces:
Shape, Sketching and Hyper context’, Interacting with Computers, vol. 25, no. 2, pp. 133–142.
15. Ishii, H & Ullmer, B 1997a, 'Tangible bits: towards seamless interfaces between people, bits and
atoms', in Proceedings of the ACM SIGCHI Conference on Human factors in computing systems,
Atlanta, Georgia, USA, pp. 234–241.
16. Ishii, H 2008a, ‘Tangible bits: beyond pixels’, in 2nd international conference on Tangible and
embedded interaction, Bonn, Germany, pp. 15–25.
17. Ishii, H 2008b, ‘The tangible user interface and its evolution’, Communications of the ACM - Organic
user interfaces, vol. 51, no. 6, pp. 32–36.
18. Ishii, H, Ratti, C, Piper, B, Wang, Y, Biderman, A, Ben-Joseph E 2004, ‘Bringing Clay and Sand into
Digital Design — Continuous Tangible user Interfaces’, BT Technology Journal, vol. 22, no. 4, pp.
287–299.
19. Iwata, H, Yano, H, Nakaizumi, F & Kawamura, R 2001, ‘Project FEELEX: adding haptic surface to
graphics’, in Proceedings of the 28th annual conference on Computer graphics and interactive
techniques, Los Angeles, CA, USA, pp. 469–476.
20. Jorda, S, Geiger, G, Alonso, M & Kaltenbrunner, M 2007, ‘The ReacTable: exploring the synergy
between live music performance and tabletop tangible interfaces’, in Proceedings of the 1st
international conference on Tangible and embedded interaction, Baton Rouge, LA, USA, pp. 139–
146.
21. Kaltenbrunner, M, Jorda, S, Geiger, G, & Alonso, M 2006, ‘The ReacTable: A Collaborative Musical
Instrument’, in Proceedings of the 15th IEEE International Workshops on Enabling Technologies:
Infrastructure for Collaborative Enterprises, Manchester, UK, pp. 406 - 411.
22. Kapitanova, K & Son, S 2012, 'Machine Learning Basics', in Intelligent Sensor Networks: Across Sensing, Signal Processing, and Machine Learning, Taylor & Francis LLC, CRC Press, ISBN 9781439892817.
23. Kildal, J 2012, ‘Interacting with Deformable User Interfaces: Effect of Material Stiffness and Type of
Deformation Gesture’, in Lecture Notes in Computer Science, vol. 7468, Lund, Sweden, pp. 71–80.
24. Kildal, J, Paasovaara, S & Aaltonen, V 2012, ‘Kinetic device: designing interactions with a
deformable mobile interface’, in CHI ’12 Extended Abstracts on Human Factors in Computing
Systems, Austin, Texas, USA, pp. 1871–1876.
25. Lowe, DG 1999, ‘Object recognition from local scale-invariant features’, in The Proceedings of the
Seventh IEEE International Conference on Computer Vision, vol. 2, Kerkyra, Greece, pp. 1150 - 1157.
26. Mao, Z, Zeng, C, Gong, H & Li, S 2010, ‘A new method of virtual reality based on Unity3D’, paper
presented at Geoinformatics, 2010 18th International Conference on, Beijing, China, pp. 1–5.
27. Michalski, R, Carbonell, J & Mitchell, T 1984, ‘Machine Learning: An Artificial Intelligence Approach -
Volume 1’, Springer-Verlag, Berlin.
28. Murakami, T & Nakajima, N 1994, ‘Direct and intuitive input device for 3-D shape deformation’, in
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Boston,
Massachusetts USA, pp. 465–470.
29. Murakami, T, Hayashi, K, Oikawa, K & Nakajima, N 1995, ‘DO-IT: deformable objects as input tools’,
in CHI ’95 Conference Companion on Human Factors in Computing Systems, Denver, USA, pp. 87–
88.
30. Myers, B 1998, ‘A brief history of human-computer interaction technology’, Interactions Magazine,
vol. 5, no. 2, pp. 44–54.
31. Pahalawatta, K. & Green, R 2013, ‘Particle Detection and Classification in Photoelectric Smoke
Detectors Using Image Histogram Features’, in International Conference on Digital Image
Computing: Techniques and Applications (DICTA 2013), Hobart, Tasmania, pp. 1 – 8.
32. Parkes, A, Poupyrev, I & Ishii, H 2008, ‘Designing kinetic interactions for organic user interfaces’,
Communications of the ACM - Organic user interfaces, vol. 51, no. 6, pp. 58–65.
33. Piper, B, Ratti, C & Ishii, H 2002a, ‘Illuminating Clay: A Tangible Interface with potential GRASS
applications’, Proceedings of the Open source GIS - GRASS users conference 2002, Trento, Italy.
34. Piper, B, Ratti, C & Ishii, H 2002b, ‘Illuminating Clay: A 3-D Tangible Interface for Landscape
Analysis’, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems,
Minneapolis, Minnesota, USA, pp. 355–362.
35. Poupyrev, I, Nashida, T, Maruyama, S, Rekimoto, J & Yamaji, Y 2004, ‘Lumen: interactive visual and
shape display for calm computing’, in ACM SIGGRAPH 2004 Emerging technologies, Los Angeles, CA,
USA, Page 17.
36. Ratti, C, Wang, Y, Ishii, H, Piper, B and Frenchman, D 2004, 'Tangible User Interfaces (TUIs): A Novel
Paradigm for GIS'. Transactions in GIS, vol. 8, no. 4, pp. 407–421.
37. Rekimoto, J 2008, ‘Organic interaction technologies: from stone to skin’, Communications of the
ACM - Organic user interfaces, vol. 51, no. 6, pp. 38–44.
38. Rosten, E, Porter, R & Drummond, T 2010, ‘Faster and Better: A Machine Learning Approach to
Corner Detection’, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 1,
pp. 105–119.
39. Rublee, E, Rabaud, V, Konolige, K & Bradski, G 2011, 'ORB: An efficient alternative to SIFT or SURF', in 2011 IEEE International Conference on Computer Vision, Barcelona, Spain, pp. 2564 – 2571.
40. Samuel, A L 2000, ‘Some studies in machine learning using the game of checkers’, IBM Journal of
Research and Development, vol. 44, no. 1/2.
41. Schwesig, C 2008, ‘What makes an interface feel organic?’, Communications of the ACM - Organic
user interfaces, vol. 51, no. 6, pp. 67–69.
42. Schwesig, C, Poupyrev, I & Mori, E 2003, ‘Gummi: user interface for deformable computers’, in CHI
’03 Extended Abstracts on Human Factors in Computing Systems, Ft Lauderdale, USA, pp. 954–955.
43. Schwesig, C, Poupyrev, I & Mori, E 2004, ‘Gummi: a bendable computer’, in Proceedings of the
SIGCHI Conference on Human Factors in Computing Systems, Vienna, Austria, pp. 263–270.
44. Sederberg, T & Parry, S 1986, ‘Free-form deformation of solid geometric models’, in Proceedings of
the 13th annual conference on Computer graphics and interactive techniques, Dallas, USA, pp.
151–160.
45. Sellen, A, Rogers, Y, Harper, R & Rodden, T 2009, 'Reflecting human values in the digital age', Communications of the ACM - Being Human in the Digital Age, vol. 52, no. 3, pp. 58–66.
46. Shi, S 2013, Emgu CV Essentials, Packt Publishing, http://www.packtpub.com/.
47. Shih, R 2012, Learning Autodesk Inventor 2012, SDC Publications, Kansas.
48. Smith, R T, Thomas, B H & Piekarski, W 2008b, 'Digital Foam Interaction Techniques for 3D
Modelling', ACM symposium on Virtual Reality Software Technology (VRST), Bordeaux, France, 27-
29 October 2008.
49. Smith, R T, Thomas, B H, Piekarski, W 2008a, 'Tech Note: Digital Foam', IEEE Symposium on 3D User
Interfaces (3DUI), Reno, Nevada, USA, 8-9 Mar 2008.
50. Ullmer, B & Ishii, H 1997b, ‘The metaDESK: models and prototypes for tangible user interfaces’, in
Proceedings of the 10th annual ACM symposium on User interface software and technology, Banff,
Alberta, Canada, pp. 223 – 232.
51. Vertegaal, R & Poupyrev, I 2008, ‘Introduction : Organic User Interfaces’, Communications of the
ACM - Organic user interfaces, vol. 51, no. 6, pp. 26–30.
52. Waguespack, C 2012, Mastering Autodesk Inventor 2013 and Autodesk Inventor LT 2013, Sybex; 1st
edition.
8 Appendix
8.1 Software Requirements
To load and run the application ‘Digital Foam Reader’ (filename SerialFoam.exe), the user
requires a PC with the following software and hardware:
Software Requirements:
1. The Microsoft Windows 7 SP1 64-bit, Windows 8 64-bit or Windows 8.1 64-bit Operating
Systems. Note: it is not guaranteed that this software will execute correctly on a 32-bit
version of the above Operating Systems.
2. The Microsoft .NET Framework (version 4.0 or later) - this is a stand-alone Framework. Version 3.5 is included with a standard Windows 7 installation, so an upgrade to .NET 4.0 or later is required. The .NET Framework 4.5 is included with the Windows 8 Operating System, and the .NET Framework 4.5.1 is included with Windows 8.1, so no upgrade is required on a computer running Windows 8 or 8.1.
3. The OpenCV computer vision and machine learning software library - the latest version of this library can be downloaded from www.opencv.org; the latest version available for free download from this site (as of 25-04-2014) is OpenCV 2.4.9. The minimum version required to run this software is OpenCV 2.4.6.0; this version is included in the Software package as a self-extracting installer (see below for installation instructions if it is not already installed on your PC).
4. The Arduino UNO USB Driver for Windows 7.
5. A copy of the Digital Foam Reader software executable – included in the Software package.
Hardware Requirements:
1. The minimum hardware and Operating System requirements for the .NET Framework 4.5
or later can be found at:
http://msdn.microsoft.com/library -> .NET Development -> .NET Framework 4.5 -> .NET
Framework System Requirements
2. A PC with at least one free USB 2.0 port available.
3. The 4x4 Prototype Digital Foam Sensor (see Figure 45), connected to the computer via a
USB cable.
Figure 45: The 4x4 Digital Foam Sensor used in this Dissertation.
8.2 Software Installation
8.2.1 .NET Framework on a Windows 7 Computer
On Windows 7, it is necessary to check which version of the Microsoft .NET Framework is installed. On Windows 8 or later, a suitable version of the Microsoft .NET Framework is already installed.
On a Windows 7 computer, check the current .NET version installed by selecting:
Start Menu -> Control Panel -> Programs -> Programs and Features
In the list of installed programs, navigate through the list to find the Name:
Microsoft .NET Framework x.x.x
If this version is 4.0.0 or later, you do not need to upgrade the .NET Framework (see Figure 46).
Figure 46: Microsoft .NET Framework Version 4.5.1 is installed on this computer.
Alternatively, if you have administration rights to your computer and are familiar with running
regedit, you can investigate further with reference to web site:
http://support.microsoft.com/kb/318785
(How to determine which versions and service pack levels of the Microsoft .NET Framework are
installed).
If this version is 3.5.1 or earlier (the default installation on Windows 7 SP1), you MUST upgrade to
the latest version of the .NET Framework (at least version 4.0.0). Microsoft .NET Framework 4.0
installation for Windows 7 can be found at:
http://www.microsoft.com/en-au/download/details.aspx?id=17851
This is a self-installing package for Windows 7. Download and install this update to the .NET
Framework. A restart may be necessary upon completion.
Honours Research Thesis B Smalldon
Honours Research Thesis B Smalldon
Honours Research Thesis B Smalldon
Honours Research Thesis B Smalldon
Honours Research Thesis B Smalldon
Honours Research Thesis B Smalldon
Honours Research Thesis B Smalldon
Honours Research Thesis B Smalldon
Honours Research Thesis B Smalldon
Honours Research Thesis B Smalldon

More Related Content

Viewers also liked

Aes
AesAes
Jay.Conin_e's.Resume
Jay.Conin_e's.ResumeJay.Conin_e's.Resume
Jay.Conin_e's.Resume
Jay Conin-e
 
Koc_Arkut
Koc_ArkutKoc_Arkut
Chloe Lee Vassallo's Marketing Portfolio
Chloe Lee Vassallo's Marketing PortfolioChloe Lee Vassallo's Marketing Portfolio
Chloe Lee Vassallo's Marketing Portfolio
Chloe Lee Vassallo
 
Aja-Feingold-Portfolio
Aja-Feingold-PortfolioAja-Feingold-Portfolio
Aja-Feingold-Portfolio
Aja Feingold
 
Learning disabilities power point ma
Learning disabilities power point maLearning disabilities power point ma
Learning disabilities power point ma
Lucky Rana
 
SFC
SFCSFC
HTL Group presentation
HTL Group presentationHTL Group presentation
HTL Group presentation
Ian Lamb
 
Information
InformationInformation
Information
Ian Lamb
 
Kεφ 11. Μαθηματικά
Kεφ 11. ΜαθηματικάKεφ 11. Μαθηματικά
Kεφ 11. Μαθηματικά
Γενοβέφα Μαυροπούλου
 
FRANTZIA
FRANTZIAFRANTZIA
FRANTZIA
zz
 
Buen trato
Buen tratoBuen trato

Viewers also liked (12)

Aes
AesAes
Aes
 
Jay.Conin_e's.Resume
Jay.Conin_e's.ResumeJay.Conin_e's.Resume
Jay.Conin_e's.Resume
 
Koc_Arkut
Koc_ArkutKoc_Arkut
Koc_Arkut
 
Chloe Lee Vassallo's Marketing Portfolio
Chloe Lee Vassallo's Marketing PortfolioChloe Lee Vassallo's Marketing Portfolio
Chloe Lee Vassallo's Marketing Portfolio
 
Aja-Feingold-Portfolio
Aja-Feingold-PortfolioAja-Feingold-Portfolio
Aja-Feingold-Portfolio
 
Learning disabilities power point ma
Learning disabilities power point maLearning disabilities power point ma
Learning disabilities power point ma
 
SFC
SFCSFC
SFC
 
HTL Group presentation
HTL Group presentationHTL Group presentation
HTL Group presentation
 
Information
InformationInformation
Information
 
Kεφ 11. Μαθηματικά
Kεφ 11. ΜαθηματικάKεφ 11. Μαθηματικά
Kεφ 11. Μαθηματικά
 
FRANTZIA
FRANTZIAFRANTZIA
FRANTZIA
 
Buen trato
Buen tratoBuen trato
Buen trato
 

Similar to Honours Research Thesis B Smalldon

Report
ReportReport
Report
Chris Watts
 
Research: Developing an Interactive Web Information Retrieval and Visualizati...
Research: Developing an Interactive Web Information Retrieval and Visualizati...Research: Developing an Interactive Web Information Retrieval and Visualizati...
Research: Developing an Interactive Web Information Retrieval and Visualizati...
Roman Atachiants
 
BachelorThesis 5.3
BachelorThesis 5.3BachelorThesis 5.3
BachelorThesis 5.3
Nguyen Huy
 
thesis_submitted
thesis_submittedthesis_submitted
thesis_submitted
Alex Streit
 
Report
ReportReport
IRJET- Automatic Suggestion of Outfits using Image Processing
IRJET- Automatic Suggestion of Outfits using Image ProcessingIRJET- Automatic Suggestion of Outfits using Image Processing
IRJET- Automatic Suggestion of Outfits using Image Processing
IRJET Journal
 
A Real-time Classroom Attendance System Utilizing Viola–Jones for Face Detect...
A Real-time Classroom Attendance System Utilizing Viola–Jones for Face Detect...A Real-time Classroom Attendance System Utilizing Viola–Jones for Face Detect...
A Real-time Classroom Attendance System Utilizing Viola–Jones for Face Detect...
Nischal Lal Shrestha
 
Z suzanne van_den_bosch
Z suzanne van_den_boschZ suzanne van_den_bosch
Z suzanne van_den_bosch
Hoopeer Hoopeer
 
FinalReport
FinalReportFinalReport
FinalReport
Jiawen Zhou
 
2013 Lecture 5: AR Tools and Interaction
2013 Lecture 5: AR Tools and Interaction 2013 Lecture 5: AR Tools and Interaction
2013 Lecture 5: AR Tools and Interaction
Mark Billinghurst
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
IJERD Editor
 
[Evaldas Taroza - Master thesis] Schema Matching and Automatic Web Data Extra...
[Evaldas Taroza - Master thesis] Schema Matching and Automatic Web Data Extra...[Evaldas Taroza - Master thesis] Schema Matching and Automatic Web Data Extra...
[Evaldas Taroza - Master thesis] Schema Matching and Automatic Web Data Extra...
Evaldas Taroza
 
DILE CSE SEO DIGITAL GGGTECHNICAL INTERm.pdf
DILE CSE SEO DIGITAL GGGTECHNICAL INTERm.pdfDILE CSE SEO DIGITAL GGGTECHNICAL INTERm.pdf
DILE CSE SEO DIGITAL GGGTECHNICAL INTERm.pdf
DiamondZ3
 
Face Recognition & Detection Using Image Processing
Face Recognition & Detection Using Image ProcessingFace Recognition & Detection Using Image Processing
Face Recognition & Detection Using Image Processing
paperpublications3
 
Sona project
Sona projectSona project
Sona project
Jagannath Swain
 
Computer vision
Computer visionComputer vision
Computer vision
Sheikh Hussnain
 
Content Based Image Retrieval
Content Based Image RetrievalContent Based Image Retrieval
Content Based Image Retrieval
Léo Vetter
 
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...
Nóra Szepes
 
Final Year Project-Gesture Based Interaction and Image Processing
Final Year Project-Gesture Based Interaction and Image ProcessingFinal Year Project-Gesture Based Interaction and Image Processing
Final Year Project-Gesture Based Interaction and Image Processing
Sabnam Pandey, MBA
 
A Wireless Network Infrastructure Architecture for Rural Communities
A Wireless Network Infrastructure Architecture for Rural CommunitiesA Wireless Network Infrastructure Architecture for Rural Communities
A Wireless Network Infrastructure Architecture for Rural Communities
AIRCC Publishing Corporation
 

Similar to Honours Research Thesis B Smalldon (20)

Report
ReportReport
Report
 
Research: Developing an Interactive Web Information Retrieval and Visualizati...
Research: Developing an Interactive Web Information Retrieval and Visualizati...Research: Developing an Interactive Web Information Retrieval and Visualizati...
Research: Developing an Interactive Web Information Retrieval and Visualizati...
 
BachelorThesis 5.3
BachelorThesis 5.3BachelorThesis 5.3
BachelorThesis 5.3
 
thesis_submitted
thesis_submittedthesis_submitted
thesis_submitted
 
Report
ReportReport
Report
 
IRJET- Automatic Suggestion of Outfits using Image Processing
IRJET- Automatic Suggestion of Outfits using Image ProcessingIRJET- Automatic Suggestion of Outfits using Image Processing
IRJET- Automatic Suggestion of Outfits using Image Processing
 
A Real-time Classroom Attendance System Utilizing Viola–Jones for Face Detect...
A Real-time Classroom Attendance System Utilizing Viola–Jones for Face Detect...A Real-time Classroom Attendance System Utilizing Viola–Jones for Face Detect...
A Real-time Classroom Attendance System Utilizing Viola–Jones for Face Detect...
 
Z suzanne van_den_bosch
Z suzanne van_den_boschZ suzanne van_den_bosch
Z suzanne van_den_bosch
 
FinalReport
FinalReportFinalReport
FinalReport
 
2013 Lecture 5: AR Tools and Interaction
2013 Lecture 5: AR Tools and Interaction 2013 Lecture 5: AR Tools and Interaction
2013 Lecture 5: AR Tools and Interaction
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
[Evaldas Taroza - Master thesis] Schema Matching and Automatic Web Data Extra...
[Evaldas Taroza - Master thesis] Schema Matching and Automatic Web Data Extra...[Evaldas Taroza - Master thesis] Schema Matching and Automatic Web Data Extra...
[Evaldas Taroza - Master thesis] Schema Matching and Automatic Web Data Extra...
 
DILE CSE SEO DIGITAL GGGTECHNICAL INTERm.pdf
DILE CSE SEO DIGITAL GGGTECHNICAL INTERm.pdfDILE CSE SEO DIGITAL GGGTECHNICAL INTERm.pdf
DILE CSE SEO DIGITAL GGGTECHNICAL INTERm.pdf
 
Face Recognition & Detection Using Image Processing
Face Recognition & Detection Using Image ProcessingFace Recognition & Detection Using Image Processing
Face Recognition & Detection Using Image Processing
 
Sona project
Sona projectSona project
Sona project
 
Computer vision
Computer visionComputer vision
Computer vision
 
Content Based Image Retrieval
Content Based Image RetrievalContent Based Image Retrieval
Content Based Image Retrieval
 
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...
 
Final Year Project-Gesture Based Interaction and Image Processing
Final Year Project-Gesture Based Interaction and Image ProcessingFinal Year Project-Gesture Based Interaction and Image Processing
Final Year Project-Gesture Based Interaction and Image Processing
 
A Wireless Network Infrastructure Architecture for Rural Communities
A Wireless Network Infrastructure Architecture for Rural CommunitiesA Wireless Network Infrastructure Architecture for Rural Communities
A Wireless Network Infrastructure Architecture for Rural Communities
 

Honours Research Thesis B Smalldon

  • 1. Identifying shapes with the Digital Foam surface by Barry James Smalldon A thesis submitted for the degree of Bachelor of Information Technology (Honours) Supervisor Dr Ross T. Smith Wearable Computer Lab School of Computer and Information Science May 2014
  • 2. 2 Abstract Digital Foam is a new computer input device that provides developers of prototyping, wearable or innovative computer systems the opportunity to interface with a computer in a novel and unique fashion. The input device can be deformed, moulded or pressed, with the resultant physical shape captured by a computer. This dissertation presents a shape recognition process for objects pushed in the Foam sensor. A sample set of plastic shapes was created and used with a small, square array prototype of the Digital Foam for testing of this work. The push, or press, of an object into the foam sensor was converted to an image for the purposes of shape recognition. Using these images to create a library of shapes, and then utilizing this library to identify other shapes is the main body of my study. Initially, this was realised with simple image comparison processes, but for a more enhanced and flexible application, Machine Learning techniques are also discussed. Suitable, off the shelf type Machine Learning methods were analysed for this application; with the adaptation of one type of machine learning process applied in software. The realisation of the thesis was the design and implementation of image comparison software in C#. The first outcome of the thesis was the application of a shape identification algorithm, by way of image comparisons, to capture a sample shape, convert it to a grey-scale image and recognize it from a library. The second outcome was the implementation of Machine Learning. This involved not only recognising known shapes successfully; but also the ability to recognise an unknown shape, add this shape to the library (expanding shape knowledge) and then successfully identify the shape henceforth. The OpenCV machine learning library was found to be a suitable and easily adaptable machine learning system; particularly as it already includes many algorithms for image recognition and there were many examples of OpenCV for object detection. The two variants of learning and expanding shape knowledge were measured. The results of the image template comparator showed correct recognition when two shapes are similar, as expected, but if shape orientation or placement on the Foam Sensor differed somewhat; it failed to identify the shape correctly.
  • 3. 3 The results for the Machine Learning algorithm showed mixed results. Evaluations showed average image comparison matches when orientation and alignment to the Foam Sensor was identical to the library image, but interestingly some image recognition did occur when orientation or alignment differed between sample and library image. These results suggest that both simple comparisons and Machine Learning processes have strengths when performing image comparisons. More importantly, my dissertation shows that shape identification with Digital Foam by image comparison is entirely possible.
  • 4. 4 Contents 1 Introduction.................................................................................................................................10 2 Research Question ......................................................................................................................16 3 Literature Review .......................................................................................................................17 3.1 Tangible User Interfaces .....................................................................................................17 3.2 Organic User Interfaces.......................................................................................................19 3.3 Deformable User Interfaces.................................................................................................21 3.4 Digital Foam........................................................................................................................22 3.5 Machine Learning................................................................................................................23 3.6 Machine Learning with OpenCV ........................................................................................27 4 Research Design .........................................................................................................................28 4.1 Digital Foam Hardware Overview and Apparatus..............................................................28 4.2 Shape Selection ...................................................................................................................30 4.3 Design of Simple Image Comparator..................................................................................32 4.4 Design and Theory of Machine Learning Algorithm..........................................................34 4.4.1 Initial Process - the Creation of a Library of Knowledge............................................35 4.4.2 Application of Suitable Machine Learning Algorithm ................................................36 4.4.3 Application of K-Nearest Neighbours to Image Recognition......................................40 4.5 Selection of Software and Framework................................................................................41 5 Implementation...........................................................................................................................42 5.1 Simple Image Comparator...................................................................................................46 5.1.1 Evaluation ....................................................................................................................46 5.1.2 Results..........................................................................................................................47 5.1.3 Discussion....................................................................................................................48 5.2 Machine Learning Image Comparator ................................................................................50 5.2.1 Evaluation ....................................................................................................................51 5.2.2 Results..........................................................................................................................51
5.2.3 Discussion ..... 52
6 Conclusion ..... 53
7 References ..... 54
8 Appendix ..... 60
8.1 Software Requirements ..... 60
8.2 Software Installation ..... 61
8.2.1 .NET Framework on a Windows 7 Computer ..... 61
8.2.2 OpenCV 2.4.6 or Later ..... 62
8.2.3 Arduino UNO USB Driver for Windows ..... 62
8.2.4 The Digital Foam Reader Application ..... 63
8.3 Method of Operation ..... 64
8.3.1 Run SerialFoam.exe ..... 64
8.3.2 Connect to the Digital Foam Sensor via USB port ..... 64
8.3.3 Load or Create Image Library ..... 65
8.3.4 Sampling a New Shape for Recognition Purposes ..... 68
8.3.5 Compare the Push Sample to the Image Library ..... 68
8.3.6 The Image Comparison Results ..... 70
8.4 DVD Contents ..... 71
List of Figures

Figure 1: (a) A tablet computer touchscreen and (b) an Android Phone touchscreen in use ..... 10
Figure 2: Lumen - an interactive display (permission requested) ..... 11
Figure 3: Direct manipulation and Gestural interaction with Recompose (Image courtesy of M. Blackshaw) ..... 12
Figure 4: An example of Digital Foam ..... 12
Figure 5: (a) Planar and (b) Hemispherical Digital Foam ..... 13
Figure 6: (a) and (b) Spherical Digital Foam ..... 13
Figure 7: Digital Foam being used to sculpt a 3D model like clay ..... 14
Figure 8: The FANUC Robot M-410iB Palletizing Industrial Robot.1 ..... 15
Figure 9: A user interacting with the Sandscape TUI (Image courtesy of H. Ishii) ..... 18
Figure 10: The Illuminating Clay TUI - The user is directly forming/sculpting the geographic landscape (Image courtesy of H. Ishii) ..... 18
Figure 11: The ReacTable TUI - Creates sounds by the location and proximity of phicons on the surface (permission requested) ..... 19
Figure 12: The Nokia Kinetic Prototype Mobile Phone (Image courtesy of J. Kildal) ..... 20
Figure 13: The Gummi bendable handheld computer (permission requested) ..... 20
Figure 14: Murakami and Nakajima's concept of a 3D Shape Deformation device (Image courtesy of T. Murakami) ..... 21
Figure 15: The conductive sensors of Digital Foam ..... 22
Figure 16: Demonstration of Digital Foam ..... 23
Figure 17: The basic Machine Learning process.3 ..... 24
Figure 18: Machine Learning Algorithm classes (image adapted from http://bitsearch.blogspot.com.au/2011/02/supervised-unsupervised-and-semi.html) ..... 25
Figure 19: Pushing a shape into Digital Foam and recognising the shape by Digital Foam Reader ..... 28
Figure 20: (a) Digital Foam Prototype - constructed as an Arduino Shield, and (b) the 4 x 4 Digital Foam apparatus used in this application ..... 28
Figure 21: Each Conductive Foam Tube produces a variable output according to depth of deformation ..... 29
Figure 22: The Output from each Conductive Foam Tube is converted to a Grey Scale value by Digital Foam Reader, according to depth of push, (a) is an ideal scale, (b) is the actual scale used ..... 29
Figure 23: (a) 'L' Shape, (b) 'T' Shape, (c) 'Sphere' and (d) 'Flat Cylinder' Test Shape designs ..... 30
Figure 24: (a) 'L' Shape, (b) 'T' Shape, (c) 'Sphere' and (d) 'Flat Cylinder' plastic samples ..... 31
Figure 25: The Image Library of the four sample shapes from Figure 24 ..... 32
Figure 26: The Push Sample Image that will be compared to the Image Library ..... 32
Figure 27: The Image Comparison process ..... 33
Figure 28: The Library Sample chosen to be the correct image ..... 33
Figure 29: Flowchart of the Shape Recognition Software ..... 34
Figure 30: (c) The resultant image created by Digital Foam Reader ..... 35
Figure 31: (a) 8 Images in the image library and (b) a 15 image library stored by Digital Foam Reader ..... 36
Figure 32: Diagram of an Artificial Neural Network (ANN).5 ..... 39
Figure 33: Microsoft Visual Studio early Development of Digital Foam Reader ..... 42
Figure 34: Digital Foam Reader application showing a sample push image ..... 43
Figure 35: (a) the 'L' Shape test object, (b) the sample being pushed into foam and (c) the output grey-scale image capture of this push sample ..... 44
Figure 36: (a) the 'T' Shape test object, (b) the sample being pushed into foam and (c) the output grey-scale image capture of this push sample ..... 44
Figure 37: (a) the 'Sphere' Shape test object, (b) the sample being pushed into foam and (c) the output grey-scale image capture of this push sample ..... 44
Figure 38: (a) the 'Flat Cylinder' Shape test object, (b) the sample being pushed into foam and (c) the output grey-scale image capture of this push sample ..... 45
Figure 39: Steps to Train a learning machine ..... 45
Figure 40: (a) The Push Sample Image, (b) Closest matching Library Image and (c) Result of the Comparator process ..... 46
Figure 41: (a) The four Test Shapes and (b) The four Unknown Shapes to be compared in this evaluation ..... 47
Figure 42: (a) a soft Push Sample, (b) a suitable Push Sample and (c) a Push Sample with excessive depth ..... 49
Figure 43: Poor Shape orientation to Foam Sensor ..... 49
Figure 44: Image match accuracy from (a) poor, to (b) average, to (c) good and (d) excellent ..... 50
Figure 45: The 4x4 Digital Foam Sensor used in this Dissertation ..... 60
Figure 46: Microsoft .NET Framework Version 4.5.1 is installed on this computer ..... 61
Figure 47: The Arduino Uno is connected to Serial Port COM14 ..... 62
Figure 48: A typical Digital Foam Reader PC environment ..... 63
Figure 49: Digital Foam Reader application in use ..... 64
Figure 50: Serial Port textbox and Connect to Digital Foam Button (COM port may vary) ..... 64
Figure 51: A successful connection to the device will display in the received data box ..... 65
Figure 52: Load a previously saved image library with this Button ..... 65
Figure 53: Loading a saved image library ..... 65
Figure 54: (a) a four image library, (b) an eight image library and (c) a fifteen image library ..... 66
Figure 55: A list of all commands Digital Foam accepts ..... 66
Figure 56: (a) The command that is sent to Digital Foam with Send Command Button (b) ..... 66
Figure 57: Numerical Data received from Digital Foam; 8 push samples are displayed ..... 67
Figure 58: Grey Count. The sum of grey scale data in the image ..... 67
Figure 59: A push sample image ..... 67
Figure 60: Save Sampled Shape button ..... 68
Figure 61: The two types of comparison processes ..... 68
Figure 62: Results of the Comparator processes ..... 69
Figure 63: Machine Learning comparison considers this an excellent image match ..... 69
Figure 64: Machine Learning comparison showing a poor image match ..... 69
Figure 65: Compare a single image from any library of shapes to the current Sampled Shape ..... 70
Figure 66: Results from both image comparison methods as displayed on Digital Foam Reader ..... 70
Figure 67: Contents of DVD ..... 71
Figure 68: The SerialFoam Application ..... 71
List of Tables

Table 5-1: Test results for the Comparator Evaluation ..... 48
Table 5-2: Test results for the Machine Learning Comparator Evaluation ..... 51
1 Introduction

Computer input devices support the interface between human and computer. This field of study has evolved (along with the evolution in computing power, design technology and materials) into what is known today as Human Computer Interaction (HCI). The term HCI has been in widespread use since the early 1980s (Dix et al. 2004). An exciting and ongoing area of research in the HCI field is exploring new devices that allow humans to interact with computer systems beyond the ubiquitous mouse (Engelbart & English 1968), the keyboard and the Graphical User Interface (GUI) (Myers 1998).

Improvements in technology and materials (e.g. touch sensitive surfaces) have given rise to new ways of interacting with a computer or electronic device. Recently, developments in touch screen technology have made tablet PCs (Figure 1(a)) and smartphones (Figure 1(b)) immensely popular. New terms have been introduced, such as Tangible Interfaces (Ishii & Ullmer 1997), Organic Interfaces (Vertegaal & Poupyrev 2008) and, more recently, Deformable User Interfaces (Kildal 2012).

Figure 1: (a) A tablet computer touchscreen and (b) an Android Phone touchscreen in use.

These new approaches to computer interaction represent a paradigm shift in how the senses of touch and feel are incorporated into the GUI. Expanding from solid touch sensitive surfaces, new innovations are supporting interactions in three dimensions; it is now possible to interact with computers using non-rigid, malleable, pliable or deformable surfaces. We push, pull, twist, turn, shape and mould surfaces naturally, and these actions can be directly integrated into a computer system.
One area of interest to this dissertation is haptic interfaces. A haptic interface is a feedback device that generates sensation to the skin and muscles, including a sense of touch, weight and rigidity (Iwata et al. 2001). There are several examples of research in this field. For instance, Poupyrev et al. (2004) developed Lumen (Figure 2), an interactive display that presents visual images and physical, moving shapes, both controlled independently. The smooth, organic physical motions provide aesthetically pleasing, calm displays for ambient computing environments. Users interact with Lumen directly, forming shapes and images with their hands.

Figure 2: Lumen - an interactive display (permission requested).

Another related area is shape displays. For example, Blackshaw et al. (2011) developed Recompose (Figure 3), a new system for manipulation of an actuated surface:

Recompose is a framework allowing direct and gestural manipulation of the physical environment. Recompose complements the highly precise, yet concentrated affordance of direct manipulation with a set of gestures allowing functional manipulation of an actuated surface (Blackshaw et al. 2011).

An important point made in this paper is that direct manipulation of an actuated surface allows us to precisely affect the material world, where the user is guided throughout the interaction by natural haptic feedback (Blackshaw et al. 2011). This is relevant to my research: our ability to express ourselves with our hands, using direct manipulation in more than two dimensions, provides excellent feedback to the user and a pleasing 'hands-on' feeling of interaction.
Figure 3: Direct manipulation and Gestural interaction with Recompose (Image courtesy of M. Blackshaw).

Ishii (2008) describes how, when interacting with the (2D) GUI world, the user does not utilize their evolved dexterity or skill in directly manipulating physical objects with their hands (such as building blocks or clay models). Ishii reasons that this is where the fields of Tangible, Organic and Deformable User Interfaces could provide a user with a 'seamless coupling' between the physical environment and the computer generated environment.

Digital Foam (Figure 4) is a device which fits into the category of a Deformable User Interface (DUI). This Foam interface is active and the interactions with it are recordable (Smith, Thomas & Piekarski 2008a). It provides developers of prototyping, wearable or innovative computer systems the opportunity to interface with a computer in a new and unique fashion. It is an example of a computer interface with which the user has a more natural, non-rigid interaction, like working with clay or sponge material.

Figure 4: An example of Digital Foam.
Digital Foam can be pushed, shaped and moulded into shapes that are recorded in 3D space with a computer. Smith, Thomas & Piekarski (2008b) developed six methods of HCI with Digital Foam; all of these methods involved the use of Digital Foam as a standalone input device without the need for a keyboard or mouse. Examples of Digital Foam's versatility include free-form digital sculpting, controlling a video camera's aspect (zoom, pan & tilt) and driving a custom made on-screen menu (Smith, Thomas & Piekarski 2008b).

Digital Foam is a very versatile input device, as its construction can be altered to fit different form factors such as planar (Figure 5(a)), hemispherical (Figure 5(b)) and spherical (Figure 6), or any shape that polyurethane foam can be moulded into. In Figure 7 the Digital Foam input device is manufactured in a spherical shape; the user holds the device with both hands like a ball of clay and pushes and shapes it.

Figure 5: (a) Planar and (b) Hemispherical Digital Foam.

Figure 6: (a) and (b) Spherical Digital Foam.
As Digital Foam is a relatively new invention, interaction techniques and devices that incorporate the Foam sensor in their design are being actively developed. Figure 7 displays an example of a computer modelling interface, showing Digital Foam's versatility in a dynamic shaping environment, where the user is sculpting a digital model using the spherical input device.

Figure 7: Digital Foam being used to sculpt a 3D model like clay.

This dissertation is interested in exploring a new use of the Digital Foam Sensor: identifying characteristics of objects that touch its surface. Consider the scenario of covering the entire surface of a robotic arm with flexible Digital Foam, producing a sensor that is similar to human skin and enabling the robot to inherit a sense of touch. This would enable a robotic arm to detect physical objects or collisions (e.g. Figure 8), creating environment awareness without the use of computer vision techniques; this is valuable where vision quality is low or poor, such as on factory production lines or in mining spaces. If particular shapes and characteristics can be recognised, the robot could respond accordingly. That is, if a robot could tell the difference between a human body and the chassis of a new car, it could change its behaviour and cease all actions, potentially stopping a workplace accident or injury from occurring. My research into shape identification with the Foam surface is motivated by this example.
Figure 8: The FANUC Robot M-410iB Palletizing Industrial Robot.1

Previous interaction techniques with Digital Foam did not involve any shape recognition or learning methods. The aim of this study is to create a form of shape learning application for Digital Foam. This includes some machine learning intelligence to recognize shapes pushed into the Foam Sensor, giving the system the ability to learn about its environment and expand its knowledge of its surroundings over time.

1 http://www.abelwomack.com/warehouse-products/industrial-robots/fanuc-robots/m-410ib/ E.g. if the metal surrounds of this Robot Arm were covered in Digital Foam, any unexpected physical contact could be considered an error, causing the Robot to react accordingly (without the need for vision sensors that could be hindered by environmental conditions).
2 Research Question

Can different shaped objects be detected when pushed into the Digital Foam Sensor?

The aim of this dissertation is to develop a shape recognition system for Digital Foam. The Foam sensor is a deformable surface: when an area of the surface is pushed, each of the surrounding sensors sends a depth reading with respect to the force applied. From these readings, the aim is to identify the rounded or smooth edges of geometric shapes. Two promising methods of identification are image comparison and Machine Learning. Image comparison template matching algorithms are a potential solution that can be adapted for recognition of shapes. Another possible approach is to adapt existing Machine Learning methods to identify shapes in the Foam Sensor. The use of the supervised learning process with training data is the means by which I will explore the recognition of shapes pushed into Digital Foam.
3 Literature Review

A review of relevant literature for this dissertation is provided in this section. Given the research question, the literature review is broken into six sections:

1. Tangible User Interfaces
2. Organic User Interfaces
3. Deformable User Interfaces
4. Digital Foam
5. Machine Learning
6. Machine Learning with OpenCV

3.1 Tangible User Interfaces

Conventionally, the computer mouse and keyboard have clear boundaries between user input and the resultant output on a computer screen or graphical environment: the user moves a mouse which in turn moves a cursor on a screen; similarly, the keyboard is pressed which in turn inputs text onto a page. Tangible User Interfaces begin to bridge the gap between virtual systems and the physical environment by employing props to enhance user interactions, becoming more than a mere dictation or pointer based interface. This is the goal of the Tangible Media Lab at MIT, run by Professor Hiroshi Ishii.

Ishii and Ullmer (1997) introduced the term 'Tangible User Interfaces' (TUIs). Their view is of a computer interface or environment that is immersive for humans, because we interact with our natural world in more ways than just pointing and clicking on a 2D surface. TUIs are user interfaces employing real world objects as physical interfaces to digital information (Ullmer & Ishii 1997). This describes a physical tool, or prop, which is the interface. For example, a real eraser could be used to delete virtual objects, or a real pencil could be used to draw in a virtual environment. These physical props came to be known by the term 'phicon' - a physical icon. A major point of note about TUIs is that they are relatively specific interfaces, tailored to certain types of applications in order to increase the directness and intuitiveness of interactions (Ishii 2008a). The following are three such examples of TUIs.
The Sandscape User Interface was created by Ishii's Tangible Media Group at MIT (Ishii 2008b). Users alter the form of the landscape model by manipulating sand while seeing the resultant effects of computational analysis generated and projected onto the surface of the sand in real time. As shown in Figure 9, the user directly manipulates the sand, and their movements are mapped and projected onto the surface.

Figure 9: A user interacting with the Sandscape TUI (Image courtesy of H. Ishii).

Another example of a TUI is the Illuminating Clay User Interface. The topography of a clay landscape model can be sculpted, shaped and expanded, while the changing geometry is captured in real time by a ceiling-mounted laser scanner (Piper, Ratti & Ishii 2002b). Users of this TUI would be in the fields of Geographic Information Systems, environmental engineering and landscape design (Ratti et al. 2004; Ishii et al. 2004; Piper, Ratti & Ishii 2002a). Figure 10 displays a user interacting with Illuminating Clay.

Figure 10: The Illuminating Clay TUI - The user is directly forming/sculpting the geographic landscape (Image courtesy of H. Ishii).

The ReacTable User Interface is a novel, multi-user electro-acoustic musical instrument with a tabletop Tangible User Interface. Several simultaneous performers share complete control over the instrument by moving physical artefacts on the table surface while constructing different audio topologies in a kind of tangible modular synthesizer (Kaltenbrunner et al. 2006; Jorda et al. 2007). As shown in Figure 11, the ReacTable has become a commercial product used by musicians in studios and live performances.
Figure 11: The ReacTable TUI - Creates sounds by the location and proximity of phicons on the surface (permission requested).

3.2 Organic User Interfaces

Holman et al. (2013) state that the art of user interface design is on the cusp of a revolutionary change, one that will require designers to think about the effect of a material and a form on a design. This change also includes the parts of our body we use for interaction. That is, not just the fingers but the palms, hand, arm or even the entire body are potentially usable (Rekimoto 2008). Additionally, there has been a recent increase in interest in using the physical motion of real objects as a communication medium (Parkes, Poupyrev & Ishii 2008). These factors (using more body parts for interaction, and physical motion) have combined into a term known as 'Organic User Interfaces'.

Organic User Interface (OUI) is a more recent phrase in the evolution of user interfaces; it was first suggested by Vertegaal and Poupyrev (2008). The authors chose the term Organic because of the technologies that underpin some of the most important developments in this area, that is, organic electronics, and also because of the inspiration provided by the millions of organic shapes that can be observed in nature: forms of amazing variety, forms that are often transformable and flexible. Another explanation of the term 'organic' is suggested by Schwesig (2008); he simply puts it as an interface that feels 'natural'. Vertegaal and Poupyrev also introduced three main design themes that Organic Interfaces should follow:

1. Input equals output - input actions from the user can be output onto the same object.
2. Function equals form - the form of an object clearly determines its ability to be used as an input.
3. Form follows flow - the context in which the interaction takes place defines the action.
An excellent example of an Organic User Interface is the Kinetic Prototype Mobile Phone (Figure 12), produced by Nokia researchers and demonstrated for the first time at Nokia World 2011 at London's Excel Centre.2 Kildal, Paasovaara & Aaltonen (2012) created the device as a proof of concept; they state: 'This prototype is functional as a deformable user interface, meaning that it detects the deformation input from the user, which can be used to design and test interactions'.

Figure 12: The Nokia Kinetic Prototype Mobile Phone (Image courtesy of J. Kildal).

Another OUI example is the Gummi credit card sized computing device (Schwesig, Poupyrev & Mori 2003) (Figure 13). The creators of Gummi describe it as:

An interaction technique and device concept based on physical deformation of a handheld device. The device consists of several layers of flexible electronic components, including sensors measuring deformation of the device. Users interact with this device by a combination of bending and 2D position control (Schwesig, Poupyrev & Mori 2003).

The authors indicate how the conventional Windows, Icons, Mouse, Pointer (WIMP) interface is not practical on smaller, mobile devices. As devices and screens become smaller, pointing and clicking on small interface elements becomes increasingly difficult. This paper shows that researchers were considering the limitations of the window, icon and mouse interface, and the possibility that hands-on manipulation of a device would allow types of functionality that would not be possible in a WIMP interface.

Figure 13: The Gummi bendable handheld computer (permission requested).

2 http://conversations.nokia.com/2011/10/28/nokia-kinetic-bendy-phone-is-the-next-big-thing-video/
3.3 Deformable User Interfaces

The term Deformable User Interface (DUI) was first suggested by Kildal (2012); he describes the DUI as sitting at the intersection of Organic and Tangible User Interfaces. DUIs consist of physical objects that are intended to be grasped and manipulated with the hands in order to interact with a system. The manipulation of a DUI results in the physical deformation of the material the object is made of. Thus, deforming the interface elastically or plastically is the distinctive form of input to the system when using a DUI. Such deformations are designed to give physical form to the interaction with information. The DUI could also be considered a subset of the Organic User Interface, as both terms involve the manipulation of raw material to convey computational information.

The term free-form deformation (FFD) can be thought of as a method for sculpturing solid models, first suggested by Sederberg & Parry (1986). Due to the limitations in materials and manufacturing at the time, their study was an entirely software based application with no external input devices for shaping and moulding. This study indicates that a sculpturing metaphor for geometric modelling has been a topic of interest for some time.

Further developments in transducer technology enabled Murakami & Nakajima (1994) to create an elastic object as an input device. The interface system consists of a real elastic object as a shape deformation input device and real-time computer graphics. By deforming the object with bare hands, with tactile feedback, users can manipulate a 3D shape modelled and displayed on a computer screen directly and intuitively (Murakami & Nakajima 1994) (Figure 14). This is an early example showing that the idea of manipulating a soft, compressible foam object could be useful as an interface device.

Figure 14: Murakami and Nakajima's concept of a 3D Shape Deformation device (Image courtesy of T. Murakami).
Murakami's 12 cm cube was made from both electrically nonconductive and conductive polyurethane foam. To measure device shape deformation, complex calculations were implemented; the authors discovered that, due to the inaccuracy of the conductive foam as a sensor, measured lengths could be geometrically impossible (Murakami & Nakajima 1994). It was later found that it is possible to capture geometric shapes using a variant of the conductive foam materials, as demonstrated by the Digital Foam Sensor (Smith, Thomas & Piekarski 2008b).

3.4 Digital Foam

An input device that meets the requirements of a Deformable User Interface is Digital Foam. This DUI is described as a new type of input device that can be used to support natural sculpting operations similar to those used when sculpting clay (Smith, Thomas & Piekarski 2008a). Digital Foam consists of a mixture of conducting and insulated polyurethane foam, arranged in an evenly distributed matrix structure as shown in Figure 15. Each conductive sensor produces a digital output when pressed that increases in value depending upon the depth of the press. This output is repeatable, making Digital Foam suitable for continual use.

Figure 15: The conductive sensors of Digital Foam.

Further explanation and utilisation of Digital Foam, in the domain of user interfaces, is described by Smith, Thomas & Piekarski (2008b). Specifically, the authors detail activities that Digital Foam could be used for, such as free-form digital sculpting, controlling a video camera's aspect (zoom, pan & tilt) and driving a custom made on-screen menu. This paper also describes the advances made by the original inventors with higher resolution Digital Foam (adding more sensors on a ball shaped foam input), additional input shapes such as a half round graspable sensor for demonstrations (Figure 16), and the results of a user evaluation with this newer high resolution version.
Figure 16: Demonstration of Digital Foam.

3.5 Machine Learning

To define 'Machine Learning', one must first define learning with respect to an organic being. Michalski, Carbonell & Mitchell describe the learning process as:

the acquisition of new declarative knowledge, the development of motor and cognitive skills through instruction or practice, the organization of new knowledge into general, effective representations, and the discovery of new facts and theories through observation and experimentation (Michalski, Carbonell & Mitchell 1984, p.12).

There are two basic forms of learning: knowledge acquisition and skill refinement. Knowledge acquisition describes the process by which an entity obtains new symbolic information and then applies that information in an effective manner. Skill refinement can be considered the gradual improvement of motor and cognitive skills through practice, such as learning to ride a bicycle, playing the piano, or a newborn horse taking its first shaky steps shortly after being born. Therefore, the knowledge acquisition type of learning is the one that belongs in the field of artificial intelligence, as this form of learning can be considered an intellectual endeavour, whereas skill refinement can be seen as a motor coordination task performed by living creatures trying to move or perform some action. The authors consider skill refinement to be a more non-symbolic process, such as those studied in adaptive control systems. Samuel (1959) described machine learning as a 'field of study that gives computers the ability to learn without being explicitly programmed'. Samuel showed how, even in the early computer era of the late 1950s, he
used a form of decision tree in his software program to 'learn' how to play the board game of checkers and compete well against a human opponent (and often win). Hence, knowledge acquisition using machine learning is the field of study of this thesis.

The goal of machine learning is to design and develop algorithms that allow systems to use empirical data, experience and training to evolve and adapt to changes that occur in their environment (Kapitanova & Son 2012). This is ideally what I hope to achieve: a learning application for the user's benefit. Another explanation of Machine Learning is: given a training set, we feed it into a learning algorithm (such as an SVM, Artificial Neural Network, Logistic Regression or Linear Regression). The learning algorithm then outputs a function which, for historical reasons, is called the hypothesis and denoted by h.3 As shown in Figure 17, the hypothesis' job is to take a new input and give out an estimated output or class. Put simply, the hypothesis can be thought of as a machine that gives a prediction y on some unseen input x.

Figure 17: The basic Machine Learning process.3

3 http://onionesquereality.wordpress.com/2009/03/22/why-are-support-vectors-machines-called-so/
The parameters that define the hypothesis are what are 'learned' by using a training set of either labelled, unlabelled or partially labelled data. Machine learning falls into three main categories: Supervised, Unsupervised and Semi-Supervised learning (Figure 18).

Figure 18: Machine Learning Algorithm classes (image adapted from http://bitsearch.blogspot.com.au/2011/02/supervised-unsupervised-and-semi.html).

Unsupervised Learning
Unsupervised learning is used when there is no labelled data available for training. This method is used when a general 'structure' in the data is sought, rather than an exact name for each item. Examples of this are often clustering methods.

Supervised Learning
In this case the training data consists of labelled data: all samples in the training set have a label, or name. The problem solved here is often predicting the labels for data points without a label.

Semi-Supervised Learning
In this case both labelled data and unlabelled data are used. Generally a small amount of labelled data is associated with a large amount of unlabelled data. This method is used, for example, when the training set is large but has many examples of a similar type; a time/cost saving benefit can be achieved by not labelling every single item in the dataset.

Supervised Machine Learning is the process by which an algorithm uses manually labelled data to build a training set, which is then used in the validation (or hypothesis) process to describe, or classify, the input variable.
For this thesis' application, identifying shapes with the Digital Foam surface, I have selected Supervised Learning as the appropriate method of Machine Learning. This is because the training set is relatively small in size and a definite result or description of the new shape sample is wanted; I don't want to cluster the data, I want to identify it.

Some examples of existing frameworks, all of which contain versions of Supervised Learning Algorithms that could be suitable for this thesis, are:

- Torch7: A Matlab-like environment for Machine Learning - a versatile numeric computing framework and machine learning library that extends Lua (Collobert, Kavukcuoglu & Farabet 2011).
- Bob: A free signal processing and machine learning toolbox for researchers - a researcher-friendly Python environment for rapid development, which remains efficient for processing large amounts of multimedia data through the use of fast C++ implementations (Anjos et al. 2012).
- OpenCV: An open source computer vision library created by Intel Research in 1999 (Bradski & Kaehler 2008); it contains implementations of more than 500 optimized algorithms. New functionality and algorithms for object detection and general machine learning have been added (Druzhkov et al. 2011).
- Emgu CV: A cross platform .NET wrapper to the OpenCV image processing library, allowing OpenCV functions to be called from .NET compatible languages such as C#, VB, VC++ and Python. The wrapper can be compiled in Mono and run on Windows, Linux, Mac OS X, iPhone, iPad and Android devices.4

4 http://www.Emgu.com/wiki/index.php/Main_Page
3.6 Machine Learning with OpenCV

There are a number of 'off the shelf' Machine Learning environments available that could be suitably applied for this dissertation. My studies have found several instances where OpenCV has been adapted for intelligent system learning scenarios such as image recognition, human feature detection and environment awareness for robots or automated systems.

Flynn, De Hoog & Cameron (2009) presented a paper on Machine Learning applied to object recognition in robot search and rescue systems. This paper details how they successfully adapted an OpenCV machine learning algorithm, known as a Decision Tree, for the detection of human faces and common obstacles in disaster zones, enabling a robot to traverse a disaster zone and provide as much information as possible on the location and status of survivors.

Pahalawatta & Green (2013) detail the use of OpenCV and Machine Learning to detect the diffusion of a photoelectric beam caused by small airborne particles. This application was suggested as a possible improvement for household smoke detectors. A silicon photodiode receives the infrared beam of light scattered by smoke and dust particles; this received infrared light is then converted to an RGB image, and a histogram sequence from the sampled images is created. This histogram sequence is then compared to seven known smoke particle type histogram sequences to determine the presence of smoke. The authors applied the results to two widely used supervised classification algorithms (Multiple Discriminant Analysis and K-Nearest Neighbours), and concluded that their process was an improvement over current photoelectric smoke detection methods.
4 Research Design

This section details the research into shape recognition with Digital Foam. Figure 19 describes the process of recognising a shape pushed into Digital Foam. To accomplish the task of shape recognition, I explain each of the aspects described in Figure 19:

1. The Digital Foam apparatus and its operation
2. The selection of Test Shapes
3. Machine Learning theory - specifically Image Recognition by Machine Learning
4. The selection of a software language and framework.

Figure 19: Pushing a shape into Digital Foam and recognising the shape by Digital Foam Reader.

4.1 Digital Foam Hardware Overview and Apparatus

Other samples of Digital Foam are constructed with larger or non-square arrays of sensors; the 4x4 example is used in this thesis as a prototype version, and it is envisaged that the software code and application can be reconfigured for these larger Digital Foam examples. I will use the 4x4 Digital Foam Prototype (shown in Figure 20(a) as an Arduino Shield, and in Figure 20(b) housed in a plastic case) to prove my software application is capable of the required results.

Figure 20: (a) Digital Foam Prototype - constructed as an Arduino Shield, and (b) the 4 x 4 Digital Foam apparatus used in this application.
When the Digital Foam sensor is physically pressed (or deformed), an output of numerical data for each conductive tube in the foam construction is produced. As shown in Figure 21, there is a 4 x 4 array of conductive foam sensors (dark grey tubes of foam) housed within a 50mm square block of non-conductive foam (light grey foam in Figure 21), producing 16 digitised outputs. The depth of the 25mm thick Digital Foam Sensor in Figure 21 is 25mm.

Ideally, a depth reading of 0mm would return 1024, and a full depth press reading of 25mm would return 0 (Figure 22(a)). Due to physical limitations when compressing Digital Foam, a full depth press produced a numerical value anywhere from 0 to 512, hence 512 was considered the cut-off point for a full press, with lighter presses scaled linearly back to 1024 for a no press reading (Figure 22(b)). A sketch of this mapping is given below.

Figure 21: Each Conductive Foam Tube produces a variable output according to depth of deformation.

Figure 22: The Output from each Conductive Foam Tube is converted to a Grey Scale value by Digital Foam Reader, according to depth of push; (a) is an ideal scale, (b) is the actual scale used.
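To make the scaling in Figure 22(b) concrete, the following C# fragment maps a raw sensor reading onto an 8-bit grey value. This is a minimal sketch only, not the Digital Foam Reader source: the method name is hypothetical, and the polarity (whether a deeper press maps to a darker or lighter grey) is an implementation detail not fixed by the text; here a full press is assumed to map to black.

    // Hypothetical sketch of the Figure 22(b) scaling (assumes: using System;).
    // Readings of 512 or below count as a full-depth press; 1024 means no press.
    static byte DepthToGrey(int rawReading)
    {
        const int FullPress = 512;   // cut-off: at or below this is a full press
        const int NoPress = 1024;    // reading from an untouched sensor

        int clamped = Math.Min(Math.Max(rawReading, FullPress), NoPress);

        // Linear scale: 512 -> 0 (assumed black, full press), 1024 -> 255 (white, no press).
        return (byte)((clamped - FullPress) * 255 / (NoPress - FullPress));
    }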
4.2 Shape Selection

The aim is to be able to discern between some simple example shapes. For testing purposes, I have chosen four differently shaped objects (Figure 23); these are the 'Test Shapes' used in this dissertation. The shapes were designed with Autodesk Inventor (Shih 2012; Waguespack 2012) and produced physically with a 3D printer. Each shape is approximately 40mm x 40mm in size, similar in size to the Digital Foam array in Figure 21, and each has a handle to facilitate pressing into the Foam Sensor for sampling purposes. These shapes provide an excellent test bed for identifying shapes, with a mixture of sharp edges, rounded edges and a spherical specimen. The shapes I chose for the purposes of object recognition are:

- L - Shape
- T - Shape
- Sphere
- Flat Cylinder

Figure 23: (a) 'L' Shape, (b) 'T' Shape, (c) 'Sphere' and (d) 'Flat Cylinder' Test Shape designs.
The test shapes were produced on a 3D printer from the Autodesk designs, specifically for this dissertation:

Figure 24: (a) 'L' Shape, (b) 'T' Shape, (c) 'Sphere' and (d) 'Flat Cylinder' plastic samples.

These four shapes provide a mixture of right angle, intersection, straight and spherical samples. It is hoped that with these shapes, and multiple push samples of each shape, an excellent test library will be created. The orientation of the four test objects will remain constant; no image comparison of a rotation of the sample objects is initially required. The possibility of orientation matching will be examined with the Machine Learning process, as some unidentified shapes will be angular rotations of the sample shapes, to confirm whether orientations of a shape can be recognised by software.
4.3 Design of Simple Image Comparator

Initially, a simple comparison algorithm is used by the application. Utilising some simple methods in OpenCV, we can subtract one image from another to find a 'best fit' image by which to identify a new shape push. This process is shown below:

1. The image library is loaded; in this instance the library contained only one sample of each of the four test shapes (as displayed in Figure 24):

Figure 25: The Image Library of the four sample shapes from Figure 24.

2. A Push Sample is taken: a shape is pushed into Digital Foam. The deformation of the Foam Sensor by the shape is captured in the form of a grey scale image.

Figure 26: The Push Sample Image that will be compared to the Image Library.

3. The Sample Image is compared with all images in the library: the application cycles through all Library Images, subtracting the Push Sample Image from each Library Image, using the OpenCV function:

    Image<Gray, Byte> resultComparison = Image1.AbsDiff(Image2[arrayItemCount]);
For each resultComparison grey scale image, the quantity of grey scale data is summed numerically using the scale in Figure 22(b); the lowest amount of data indicates the closest image comparison. For the Push Sample Image in Figure 26 compared against the four image library in Figure 25, the calculations were:

    Library Image 1 compared: grey count = 806
    Library Image 2 compared: grey count = 966
    Library Image 3 compared: grey count = 1015
    Library Image 4 compared: grey count = 317

Figure 27: The Image Comparison process.

The lowest number is the closest to the Sample Push Image; in this case 317 was the best result. The result is displayed in Figure 28.

Figure 28: The Library Sample chosen to be the correct image.

Furthermore, if the application does not return a suitable image comparison, it provides the user with the option to save this image to the image library. This is the process by which a user builds the library of shapes they believe are useful and relevant for their particular application. A sketch of the complete best-fit search is given below.
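For concreteness, the loop described in steps 1 to 3 can be sketched in C# with Emgu CV as follows. This is an illustration under stated assumptions, not the thesis implementation: the variable names (pushSample, libraryImages) are hypothetical, and the per-image sum is taken with Emgu CV's GetSum(); any per-pixel summation of the difference image would serve the same purpose.

    // Minimal sketch of the best-fit search; names are illustrative only.
    // Assumes: using Emgu.CV; using Emgu.CV.Structure;
    int bestIndex = -1;
    double bestGreyCount = double.MaxValue;

    for (int i = 0; i < libraryImages.Length; i++)
    {
        // Absolute per-pixel difference between the push sample and a library image.
        Image<Gray, Byte> diff = pushSample.AbsDiff(libraryImages[i]);

        // Sum the remaining grey data; identical images would give a sum of zero.
        double greyCount = diff.GetSum().Intensity;

        if (greyCount < bestGreyCount)
        {
            bestGreyCount = greyCount;
            bestIndex = i;   // the lowest grey count is the closest match
        }
    }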
4.4 Design and Theory of Machine Learning Algorithm

I describe Machine Learning as 'a method of teaching a computer to make a prediction or to identify something'. In order for a computer to predict or identify, there are two main steps to machine learning:

1. Create a library of knowledge: a certain level of knowledge about a particular topic must be created. The way a computer builds knowledge is by building a library (or database); this library is the equivalent of a person's memory - what they know about a certain subject or situation.

2. Apply a suitable algorithm to obtain an outcome: the computer then uses this library (knowledge) in a logical manner (algorithm) to either classify a situation or predict an outcome.

In the context of push recognition for Digital Foam, this process is displayed in the following flowchart:

Figure 29: Flowchart of the Shape Recognition Software. [Flowchart: connect to the Digital Foam device; load the labelled dataset; get a push image from Digital Foam and compare it to the library of shapes; if a match is found, describe the image; if not, add the push image to the library of shapes.]
This flowchart is a common approach to Machine Learning techniques. This enables me to utilize existing methods and algorithms for this dissertation, without the need to invent new algorithms; the only requirement is to adapt a suitable existing algorithm to the above process.

4.4.1 Initial Process - the Creation of a Library of Knowledge

If the user pushes a spherical object into the Foam (Figure 30(a)), Digital Foam produces numerical data corresponding to push depth as shown in Figure 30(b), and an output grey scale image results from Digital Foam Reader as per Figure 30(c). The differing grey scale squares equate to the depth of push received from each Conductive Tube on the Foam device.

(a) Pushing a sphere test sample into Digital Foam.

(b) The numerical values output from Digital Foam, equivalent to depth of push:

    1022 1022 1022 1022
    1022  930  905 1022
    1022  724  843 1022
    1022 1022 1022 1022

Figure 30: (c) The resultant image created by Digital Foam Reader.
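The conversion from the readings in Figure 30(b) to the image in Figure 30(c) can be sketched as below. This is an assumption-laden illustration, not the Digital Foam Reader source: the helper name is hypothetical, and it reuses the DepthToGrey mapping sketched in section 4.1.

    // Hypothetical sketch: turn the 16 raw readings (row-major, as in Figure 30(b))
    // into a 4x4 grey scale image of the kind shown in Figure 30(c).
    // Assumes: using Emgu.CV; using Emgu.CV.Structure;
    static Image<Gray, Byte> ReadingsToImage(int[] readings)
    {
        var image = new Image<Gray, Byte>(4, 4);   // width x height in pixels
        for (int row = 0; row < 4; row++)
            for (int col = 0; col < 4; col++)
                image.Data[row, col, 0] = DepthToGrey(readings[row * 4 + col]);
        return image;   // scaled up for display by the application
    }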
In the early stages of the application, it can be assumed that the image library is empty; comparing an image at this stage would be worthless, as the computer has no 'memory' of shapes from which to select an appropriate match. A library of images can be created by the user at this time, for as many different types of push images as they believe are suitable for their situation. This image library is the training set for the Machine Learning process. Some examples of image libraries are displayed below, with an 8 image library (Figure 31(a)) and a 15 image library (Figure 31(b)).

Figure 31: (a) 8 Images in the image library and (b) a 15 image library stored by Digital Foam Reader.

As this library grows, Machine Learning becomes more useful in identifying further shapes pushed into the Foam sensor, because the Machine Learning Algorithm has a greater working knowledge of the situation.

4.4.2 Application of Suitable Machine Learning Algorithm

As mentioned in the OpenCV documentation,5 OpenCV has a number of Machine Learning Algorithms already implemented as C++ classes. The Machine Learning Library is a set of classes and functions for statistical classification, regression and clustering of data. The Machine Learning Algorithms implemented by OpenCV in C++ are:

- Normal (Naive) Bayes Classifier
- K-Nearest Neighbours
- Support Vector Machines (SVM)
- Decision Trees
- Boosting
- Gradient Boosted Trees
- Random Forest Trees
- Extremely Randomized Trees
- Expectation Maximization
- Artificial Neural Networks

As Emgu CV is a wrapper library (a layer of code that enables cross language interoperability), not all the C++ Machine Learning Algorithms have been adapted to C#. The list of Emgu CV C# Machine Learning Algorithms is smaller,4 and includes:

- Normal Bayes Classifier
- K-Nearest Neighbours
- Support Vector Machine (SVM)
- Expectation-Maximization (EM)
- Neural Network (ANN MLP)
- Mushroom Poisonous Prediction (Decision Tree)

A brief explanation of each of these algorithms, and the decision whether to apply each one in this dissertation, follows:

1. Normal (or Naive) Bayes Classifier: The naive Bayes classifier is a term in Bayesian statistics for a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions.5 A more descriptive term for the underlying probability model would be 'independent feature model'. In image recognition, a Bayes classifier usually works on separate, unconnected features to determine the identity of an object (such as a face). It is a widely used classifier in the field of image recognition, and possibly suitable for my application. However, as the features are not pronounced enough in my shape detection application (there are simply not enough features when using the 4x4 Digital Foam array prototype), the ability to adapt this classifier was limited; with larger Digital Foam arrays, a Naive Bayes Classifier could be considered a solution to shape push identification.

2. K-Nearest Neighbours (K-NN): A useful algorithm for classification; a method for classifying objects based on the closest training examples in the feature space. K-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification. It can also be used for regression (Bradski & Kaehler 2008, p. 460). The K-NN algorithm is considered one of the simplest machine learning algorithms; a test data point is classified according to the majority vote of its K nearest other data points, in a Euclidean sense of nearness (Bradski & Kaehler 2008, p. 463). This algorithm could be very useful for shape recognition with Digital Foam. K-NN also performs all computation at the classification stage; this could slow down image
processing, but should be sufficient for my application, as the images are small in terms of the number of pixel comparisons to be made.

3. Support Vector Machine (SVM): A Support Vector Machine is used mostly for classification between two data sets.5 The data library can be considered as two sets of vectors in an n-dimensional space. An SVM constructs a separating hyperplane in that space, one which maximizes the margin between the two data sets. To calculate the margin, two parallel hyperplanes are constructed, one on each side of the separating hyperplane, which are 'pushed up against' the two data sets. As there was only a single data value to classify images with (in my case, grey scale pixel values), I found I was unable to adapt an SVM to my particular application.

4. Expectation Maximization (EM): The expectation maximization algorithm enables parameter estimation in probabilistic models with incomplete data.6 EM is used in statistical calculations where the equation cannot be solved directly. Latent variables, unknown parameters or missing data values can be formulated with the assistance of additional theoretical data points. As I will be working with finite, complete data, this algorithm is not entirely suitable for my application.

5. Artificial Neural Network (ANN): An ANN is an algorithm that is loosely modelled on the neuronal structure of the mammalian cerebral cortex, but on a much smaller scale.7 A neural network typically consists of a number of interconnected nodes which contain an 'activation function'. Patterns are presented to the network via the input layer, which communicates with one or more hidden layers where the actual processing is done via a system of weighted connections. The hidden layers then link to an 'output layer' where the classification or prediction is output (see Figure 32). As I was able to find many examples of image recognition using ANNs, one must consider adapting an ANN algorithm for this dissertation. Therefore, I will not entirely rule out the use of an ANN algorithm for Foam sensor shape recognition.

5 http://docs.opencv.org/modules/refman.html
6 http://ai.stanford.edu/~chuongdo/papers/em_tutorial.pdf
7 http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html
Figure 32: Diagram of an Artificial Neural Network (ANN).5

6. Decision Trees: A decision tree is a binary tree (a tree where each non-leaf node has two child nodes). It can be used either for classification or for regression. For classification, each tree leaf is marked with a class label; multiple leaves may have the same label. For regression, a constant is also assigned to each tree leaf, so the approximation function is piecewise constant.5 Decision Trees are mostly used in the field of decision analysis, where the end result of the problem is the solution of a puzzle or the reaching of a goal, and decision tree learning is a very successful technique for supervised classification learning. A Decision Tree is a flowchart-like structure, with multiple internal binary nodes where a test is made. As Decision Trees work in a top-down structure, starting from a single point and splitting multiple times on each node's decision, I was unable to adapt a Decision Tree to my image recognition process.

From the above analysis of the Machine Learning Algorithms available in the Emgu CV Library, I concluded that the most suitable algorithms for this dissertation, in order, are:

1. K-Nearest Neighbours (KNN)
2. Artificial Neural Network (ANN)
3. Normal (or Naive) Bayes Classifier.

Initially I will attempt to adapt a KNN Machine Learning method of image recognition to my dissertation; a minimal sketch of the core K-NN voting step follows.
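The following self-contained C# sketch illustrates the K-NN voting step in isolation. It is not the Emgu CV KNearest class, and the type and method names are hypothetical; it also uses an unweighted majority vote, whereas the process described in section 4.4.3 weights votes by inverse Euclidean distance.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class LabelledSample
    {
        public float[] Features;   // a feature vector extracted from a library image
        public string Label;       // its class, e.g. "Sphere" or "L-Shape"
    }

    static class KnnSketch
    {
        // Classify an unknown feature vector by majority vote of its k nearest
        // labelled neighbours; k is typically a small odd number such as 3 or 5.
        public static string Classify(float[] unknown, List<LabelledSample> library, int k)
        {
            return library
                .OrderBy(s => Distance(s.Features, unknown))   // sort by nearness
                .Take(k)                                       // keep the k nearest
                .GroupBy(s => s.Label)                         // tally the votes
                .OrderByDescending(g => g.Count())
                .First().Key;                                  // majority label wins
        }

        static double Distance(float[] a, float[] b)
        {
            double sum = 0;
            for (int i = 0; i < a.Length; i++)
                sum += (a[i] - b[i]) * (a[i] - b[i]);
            return Math.Sqrt(sum);                             // Euclidean distance
        }
    }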
4.4.3 Application of K-Nearest Neighbours to Image Recognition

To identify the shape in an image, I chose keypoint matching, or feature detection, as the method of comparing two images. This approach was taken as I found sufficient theory on feature detection and description within the OpenCV documentation to validate my choice.5 Some of the common and widely implemented feature (or object) detection algorithms available in the Emgu CV library are (in order of date of implementation):

- SIFT Detector - Scale Invariant Feature Transform (Lowe 1999)
- HOG Descriptor - Histogram of Oriented Gradients (Dalal & Triggs 2005)
- SURF Detector - Speeded-Up Robust Features (Bay, Tuytelaars & Van Gool 2006)
- FAST Detector - Features from Accelerated Segment Test (Rosten, Porter & Drummond 2010)
- ORB Detector - Oriented BRIEF keypoint detector and descriptor extractor (Rublee, Rabaud, Konolige & Bradski 2011).

The exact manner in which each feature detection method operates will not be discussed in this dissertation. Each feature detection algorithm has its benefits and costs; e.g. one algorithm could be faster at feature extraction and comparison, while another could be more accurate but time consuming. As the speed of comparison was not an issue (the grey scale image has very little detail to be examined), the only deciding factor was the accuracy and repeatability of the algorithm in finding good feature matches between sample and library images. The best feature detection method for my dissertation was discovered by applying each algorithm in software, observing the accuracy of the output and making comparisons with all other methods. The SURF algorithm was found to produce the most promising results for the grey-scale push sample comparisons, when compared to the other detection methods listed.

The K-Nearest Neighbours algorithm works as follows. Using n features extracted from the unknown image by the feature detection algorithm (chosen previously), the feature points are placed on a graph, making them feature vectors. The same process is performed on a known (labelled and classified) image, also producing a graph with n feature vectors. Each of the n feature vectors on the unknown image is then measured to find the k nearest neighbours in the labelled image. This measurement is performed by way of inverse Euclidean distance (d) between the sampled image's feature vector and the labelled image's k nearest feature vectors, producing (k x d) for all n features. (The value of k is an integer manually selected by the user, typically a low odd number such as 3 or 5; an odd number ensures a majority decision is possible when classifying each feature vector.)
This calculation is repeated for all images in the labelled set (or library). The sum of n x (k x d) produces the class of images with the highest majority vote; the unknown image is considered a member of this class and is labelled accordingly (Cover & Hart, 1967).

4.5 Selection of Software and Framework

I found sufficient articles on, and applications of, the OpenCV software library for machine learning to convince me to apply this library to my research (refer to section 3.6). Many articles can be found on OpenCV face recognition; although their writers often employ machine learning algorithms from the OpenCV library, face recognition is not a requirement for this dissertation, so they are less relevant to this study. More relevant were articles on robotic machine vision, such as Kao & Huy (2013), and on environment learning and object recognition, such as Flynn (2009) and German et al. (2013). As these papers discuss the use of OpenCV for robotic image processing in poor visibility and in dynamic, hostile environments (e.g. earthquake sites, underground mines or battlefields), I found them particularly informative, and they reaffirmed my acceptance of OpenCV for this study.

OpenCV was originally written in C++; a library called Emgu CV makes it available to C# applications. As its documentation states: "Emgu CV is a cross platform .NET wrapper to the OpenCV image processing library, allowing OpenCV functions to be called from .NET compatible languages such as C#, VB, VC++." Emgu CV is an additional, non-standard C# library, and installation of the Emgu CV libraries is required for image processing with this application. Shi (2013) compared three computer vision processing libraries: OpenCV, Emgu CV and AForge.NET. His findings were that although Emgu CV was not the fastest in terms of software performance (C++ OpenCV was faster), its better overall documentation and ease of use compensate for the performance gap and set Emgu CV ahead of the other two. Furthermore, my choice of the Emgu CV library takes advantage of the Microsoft .NET Framework to simplify GUI development in the C# programming language; OpenCV itself has no ability to create GUI applications on its own. Finally, I found the quality and quantity of online documentation and support available for C# Emgu CV to be better than for Torch7 and Bob (refer to section 3.5). To make the preceding description concrete, a minimal C# sketch of the k-Nearest Neighbours voting scheme is given below.
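The following is a minimal, self-contained sketch of the voting scheme described in section 4.4.3. It assumes feature descriptors have already been extracted (for example by SURF) as float vectors; the type and method names are illustrative, not the Digital Foam Reader's actual code.

// Sketch only: every feature of the unknown image casts k votes, one for
// each of its k nearest labelled features; the class with the most votes wins.
using System;
using System.Collections.Generic;
using System.Linq;

class LabelledFeature
{
    public float[] Vector;   // one descriptor taken from a library image
    public string Label;     // the shape label of that library image
}

static class KnnClassifier
{
    // Euclidean distance between two descriptors of equal length.
    static double Distance(float[] a, float[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.Sqrt(sum);
    }

    public static string Classify(
        IEnumerable<float[]> unknownFeatures,
        IList<LabelledFeature> library,
        int k = 3)   // k: a small odd number chosen by the user
    {
        var votes = new Dictionary<string, int>();
        foreach (var f in unknownFeatures)
        {
            // The k labelled features closest to this feature vector.
            var nearest = library.OrderBy(l => Distance(f, l.Vector)).Take(k);
            foreach (var n in nearest)
            {
                votes.TryGetValue(n.Label, out int v);
                votes[n.Label] = v + 1;
            }
        }
        if (votes.Count == 0) return "unknown";
        return votes.OrderByDescending(p => p.Value).First().Key;   // majority vote
    }
}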
5 Implementation

The software application, 'Digital Foam Reader', was designed and coded in Microsoft Visual Studio 2012, using the C#.NET language and a version of OpenCV written for C# called Emgu CV. The application's tasks can be broken down as follows:

1. Connect to the Digital Foam device via any serial port on a standard PC running Windows.
2. Read 'Push Data' from Digital Foam and convert this data into a grey-scale image (a sketch of this step follows the list).
3. Compare this grey-scale image with the saved images in a library of shapes and find the closest matching image by two processes: (A) a simple image comparison method and, (B) if A is not successful, a more complex Machine Learning algorithm to choose the closest matching shape.
4. If there is no close match to the pushed action, recognise this and offer the user the option of saving the 'Push Data' as an image, with a label name. The image is then added to the array of images, increasing the application's knowledge of shapes by supervised learning.
5. Save 'Push Data' as an image, and either load these saved images or store them in a library that can be used for further shape recognition.

Figure 33: Microsoft Visual Studio early development of Digital Foam Reader.
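The following is a hedged sketch of tasks 1 and 2: open a serial port, read one frame of sixteen sensor depth values from Digital Foam, and render them as a 4x4 grey-scale bitmap. The port name, baud rate, frame format and sensor value range are assumptions for illustration; the real device protocol may differ.

// Sketch only: assumed protocol of one comma-separated line of 16 readings.
using System;
using System.Drawing;
using System.IO.Ports;

class FoamReaderSketch
{
    static void Main()
    {
        using (var port = new SerialPort("COM3", 9600))   // assumed settings
        {
            port.Open();
            string[] fields = port.ReadLine().Split(',');
            var depths = Array.ConvertAll(fields, s => int.Parse(s.Trim()));

            using (var image = new Bitmap(4, 4))
            {
                for (int i = 0; i < 16; i++)
                {
                    // Map an assumed 0..1023 sensor range to a 0..255 grey level.
                    int grey = Math.Min(255, depths[i] * 255 / 1023);
                    image.SetPixel(i % 4, i / 4, Color.FromArgb(grey, grey, grey));
                }
                image.Save("push-sample.jpg");   // becomes one library image
            }
        }
    }
}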
The software application has the ability to read from Digital Foam, capture the numerical data and convert this data into a 4x4 grey-scale image.

Figure 34: Digital Foam Reader application showing a sample push image.

This is the start of the library building phase, as the local folder becomes the repository of the push image library. It is important to note that each time 'Push Data' is received it will differ slightly, even with the same shape pushed into Digital Foam by the same user, due to the amount of downward force on the object and any slight variation in vertical alignment between the object and the Foam Sensor. This complicates the process of shape recognition greatly. If the depth and angle at which the shape is pushed into the foam were constant, or even if the output from Digital Foam were constant for the same push (which it is not; the digital values can differ slightly each time), shape recognition would be much simpler. Because of these factors, which were noted while testing the software, part of the complexity of this study is the inability to repeat the exact same push sample into Digital Foam. Statistically, the push depths form a bell-curve distribution: most push observations fall in the middle of the depth scale, but some samples come from a light push or a heavy push into the Foam. Given all the above variables, it would be rare for any two push samples to produce the exact same grey-scale image.

Eventually the user builds a library of images that will be used for comparison techniques, and further into the application this same library of push images will be used as the training set for
Machine Learning techniques, for when a simple image comparison algorithm does not produce a sufficient image match. Figures 35-38 display the shape push and image creation trials for the four test objects constructed for this dissertation:

Figure 35: (a) the 'L' Shape test object, (b) the sample being pushed into the foam and (c) the output grey-scale image capture of this push sample.

Figure 36: (a) the 'T' Shape test object, (b) the sample being pushed into the foam and (c) the output grey-scale image capture of this push sample.

Figure 37: (a) the 'Sphere' Shape test object, (b) the sample being pushed into the foam and (c) the output grey-scale image capture of this push sample.
Figure 38: (a) the 'Flat Cylinder' Shape test object, (b) the sample being pushed into the foam and (c) the output grey-scale image capture of this push sample.

Once a deformation is successfully and satisfactorily captured, the user can save the grey-scale image as a jpeg file to the selected local folder; the filename becomes the 'label' for the shape. This is the 'Train a machine' step in Figure 39. For example, for the shape in Figure 37, the user could enter 'small-sphere' as the shape label. If this shape were push-sampled again, the Machine Learning application should be able to recognise it as 'small-sphere' from its learned library. A sketch of this library building step is given below.

Figure 39: Steps to train a learning machine.
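The following is a minimal sketch of the library building step described above: every jpeg in the chosen folder becomes one library entry, and its file name (without extension) is the shape's label. The folder path, type names and return type are illustrative assumptions.

// Sketch only: load the push image library from a local folder.
using System.Collections.Generic;
using System.Drawing;
using System.IO;

static class ShapeLibrary
{
    public static List<(string Label, Bitmap Image)> Load(string folder)
    {
        var library = new List<(string, Bitmap)>();
        foreach (string file in Directory.GetFiles(folder, "*.jpg"))
            library.Add((Path.GetFileNameWithoutExtension(file),   // e.g. "small-sphere"
                         new Bitmap(file)));
        return library;
    }
}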
5.1 Simple Image Comparator

The Simple Image Comparator implements a grey-scale image comparison between the sample image and a library image. The comparison is performed on each of the sixteen shaded squares, which are depth conversions of a sample shape pushed into the Foam Sensor (see Figure 22). A library image is a product of the same process, saved previously. An interesting point to note is that virtually every push sample of the same shape produces a slightly different grey-scale image, due to slight variations in the axis and depth of the push while the shape is held down by the user and sampled. This unrepeatability of two push samples of the same shape greatly affects image recognition. A sketch of the cell-by-cell comparison appears after the evaluation description below.

Figure 40: (a) The Push Sample Image, (b) the closest matching Library Image and (c) the result of the Comparator process.

5.1.1 Evaluation

To determine the accuracy and repeatability of the Simple Image Comparator, a test recording of shapes was undertaken (Figure 40). The test was conducted with:

• 25 push samples of each of the four test shapes, 100 samples in total, and
• 25 push samples of each of four unknown shapes, a further 100 samples,

giving 200 push samples overall. The unknown shapes consisted of three of the test shapes rotated by either 45 degrees or 180 degrees, plus the Sphere sampled in the bottom right corner of the Foam Sensor (see Figure 41(b)). The unknown shapes were chosen to be as different as possible from the original test shapes, within the constraint of a 4x4 sample foam array.
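The following is a minimal sketch of the cell-by-cell grey-scale comparison. The scoring (sum of absolute differences over the sixteen cells, smallest sum wins) and the cut-off value are assumptions for illustration; the implemented comparator's exact metric and threshold may differ.

// Sketch only: compare a push sample against every library image.
using System;
using System.Drawing;

static class SimpleComparator
{
    // Total absolute grey-level difference between two 4x4 images.
    static int Difference(Bitmap a, Bitmap b)
    {
        int total = 0;
        for (int y = 0; y < 4; y++)
            for (int x = 0; x < 4; x++)
                total += Math.Abs(a.GetPixel(x, y).R - b.GetPixel(x, y).R);
        return total;
    }

    // Return the label of the closest library image, or "unknown" when even
    // the best match differs by more than the (assumed) threshold.
    public static string Match(Bitmap sample,
                               (string Label, Bitmap Image)[] library,
                               int threshold = 400)   // illustrative value
    {
        string best = "unknown";
        int bestDiff = int.MaxValue;
        foreach (var entry in library)
        {
            int d = Difference(sample, entry.Image);
            if (d < bestDiff) { bestDiff = d; best = entry.Label; }
        }
        return bestDiff <= threshold ? best : "unknown";
    }
}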
Figure 41: (a) The four Test Shapes and (b) the four Unknown Shapes compared in this evaluation.
5.1.2 Results

The results of the test are displayed as a confusion matrix in Table 5-1. The Image Comparator's predictions run across the columns, with the percentage of correct identifications for each shape in the final column. The actual physical shapes run down the rows, separated into known and unknown shapes. As stated in the Evaluation, the unknown shapes were rotations of the test shapes (except the Sphere sample, which was repositioned). The results show that the Simple Image Comparator will only identify a match if the position and orientation of the push sample are exactly the same as in the library image. Additionally, the depth of the push and its angle of incidence to the Foam Sensor also affect the accuracy of the Simple Comparator's results.

Table 5-1: Test results for the Comparator Evaluation (predicted class from the Comparator software).

Known Shapes    | L-shape | Cylinder | Sphere | T-shape | Unknown | Percent Correct
L-Shape         |   25    |    0     |   0    |    0    |    0    | 100
Cylinder        |    0    |   24     |   0    |    0    |    1    |  96
Sphere          |    0    |    0     |  25    |    0    |    0    | 100
T-Shape         |    0    |    0     |   0    |   25    |    0    | 100
Total           |   25    |   24     |  25    |   25    |    1    |  99%

Unknown Shapes  | L-shape | Cylinder | Sphere | T-shape | Unknown | Percent Correct
L-Shape 180°    |    0    |    0     |   0    |    6    |   18    |  72
Cylinder -45°   |    0    |    0     |   2    |    0    |   23    |  92
Sphere (moved)  |    0    |    0     |   0    |    0    |   25    | 100
T-Shape 180°    |    1    |    0     |   0    |    0    |   24    |  96
Total           |    1    |    0     |   2    |    6    |   90    |  90%

Total accuracy for the Simple Image Comparator: 94.5%

5.1.3 Discussion

The results obtained from the Comparator experiment demonstrate that, with some algorithm refinement, a simple image comparator can produce excellent results: 99% accuracy when identifying a known shape from its library, 90% accuracy when acknowledging an unidentified image, and an overall success rate of 94.5%. However, the objects must be in exactly the same position and orientation on the Foam Sensor in both the sample and library images. Of particular note when I performed the test was the indication that the depth of the push was relevant to the comparator's accuracy: all the incorrectly identified shape comparisons took place when a push was either very light or very heavy in pressure. This indicates that a working range of push depth is very important to shape recognition with image comparisons. From the test results, I consider a working Grey Count range to be from 300 to 900 (see Figure 58); beyond 900 the Digital Foam is generally deforming heavily, so that the foam sensors alongside those under the shape are also deformed as the shape is forced further into the Foam, pushing the surface fabric down with it (see Figure 42). A sketch of such a range check follows the figures below. Finally, the angle of incidence when pushing a shape into the Foam Sensor affects the resulting grey-scale image, lessening the chances of identification (Figure 43).
Figure 42: (a) a soft Push Sample with too little deformation, (b) a suitable Push Sample and (c) a Push Sample with excessive depth.

Figure 43: Poor shape orientation to the Foam Sensor. Callouts: the surrounding fabric of the Foam Sensor, not touched by the shape, is deformed; the foam sensors do not deform enough, producing a low-intensity grey-scale image.
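The following is a small sketch of the working-range check suggested in section 5.1.3: sum the sixteen grey values of a push sample and accept it only when the total Grey Count falls within the observed working range of roughly 300 to 900. The helper name and image handling are illustrative, not the application's actual code.

// Sketch only: reject push samples outside the usable deformation range.
using System.Drawing;

static class PushValidator
{
    public static bool IsUsablePush(Bitmap sample, int min = 300, int max = 900)
    {
        int greyCount = 0;
        for (int y = 0; y < 4; y++)
            for (int x = 0; x < 4; x++)
                greyCount += sample.GetPixel(x, y).R;   // 0..255 per cell

        // Below min: too little deformation (Figure 42a).
        // Above max: excessive deformation spreading to neighbours (Figure 42c).
        return greyCount >= min && greyCount <= max;
    }
}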
5.2 Machine Learning Image Comparator

The Machine Learning Image Comparator was realised by implementing the SURF feature detection algorithm to detect features in the image, followed by a K-Nearest Neighbours algorithm to find the closest image from the library. (A hedged sketch of this pipeline, written against the Emgu CV API, is given at the end of section 5.2.3.) The Comparator returns an image, similar to Figure 44, displaying the features found in each image (small yellow circles), the matches between features (blue lines joining two yellow circles) and the size of the region of interest, if any, in the form of an orange quadrilateral. The closer the orange quadrilateral is to the full shape of the sampled image, the better the match (see Figure 44). In Figure 44 the sampled shape from the Digital Foam Sensor is on the left, and the library image deemed the best match by the k-Nearest Neighbours algorithm is displayed on the right.

Figure 44: Image match accuracy from (a) poor, to (b) average, to (c) good and (d) excellent.

It is important to note that with this algorithm an image match is always found, meaning the Machine Learning Comparator will never return an unknown image. The calculation of a cut-off, or threshold, value below which an image comparison is considered unidentified is still
required. However, as the tests show (see Table 5-2), the Machine Learning Image Comparator's accuracy is limited, and adding the ability to return an unknown shape produced a larger error rate during tests, so it was omitted to ease algorithm improvement.

5.2.1 Evaluation

To determine the accuracy of the Machine Learning Image Comparator, a test recording of shapes was undertaken. This test involved 25 push samples of each of the four test shapes only (as displayed in Figure 41(a)), 100 shape samples in total. The evaluation was conducted by comparing push samples of the four test shapes with the library of shapes displayed in Figure 25.

5.2.2 Results

The results of the test are displayed in a confusion matrix in Table 5-2. The Machine Learning Image Comparator's predictions run across the columns, with the percentage of correct identifications for each shape in the final column. The actual physical shapes run down the rows. All four test shapes should be found in the library of shapes.

Table 5-2: Test results for the Machine Learning Comparator Evaluation (predicted class from the Machine Learning algorithm).

Known Shapes | L-shape | Cylinder | Sphere | T-shape | Number Correct | Percent Correct
L-Shape      |    9    |    0     |   0    |   16    |  9/25          |  36
Cylinder     |    1    |   24     |   0    |    0    | 24/25          |  96
Sphere       |    8    |    4     |   4    |    9    |  4/25          |  16
T-Shape      |    3    |    6     |   0    |   16    | 16/25          |  64
Total        |   21    |   34     |   4    |   41    | 53/100         |  53%

Total accuracy for the Machine Learning Comparator: 53%

The Machine Learning Image Comparator had great difficulty identifying the Sphere, with only 16% accuracy, but excellent results for the Cylinder object, with 96% accuracy. The L-Shape and T-Shape objects produced mixed results, with 36% and 64% accuracy respectively. Overall, the Machine Learning Comparator produced an image identification accuracy of 53%.
5.2.3 Discussion

Considering the results obtained from the evaluation, it would appear that the Machine Learning Image Comparator's ability to recognise a shape is not much better than a 50/50 guess. This indicates that either the Machine Learning algorithm is not performing correctly, or my interpretation of the algorithm's output is imprecise. To improve the accuracy of the Machine Learning image recognition process, some variables in the algorithm require adjustment, particularly the values of:

• k (the number of nearest neighbours), an integer from 1 upward;
• the number of SURF feature detection points, an integer from 1 upward;
• the uniqueness threshold, the point at which two features are considered a match (a value between 0 and 1).

Implementing the Machine Learning Image Comparator proved to be a challenging task. The number of algorithm variables, and the analysis of the information they return, were the main sources of error. Specifically, it is the evaluation and assessment of the results returned by the Machine Learning algorithm that determines its accuracy. One example is the determination of a cut-off for the uniqueness threshold; this value determines whether a match between two feature points is considered unique. Another important value is k, which behaves like an expanding circle: the higher the value of k, the further the circle expands outwards around a sample point, until k library points fall inside it. Together, the number of SURF feature points, the uniqueness threshold between two points and the value of k all affect the accuracy of the algorithm. The ideal value for each variable is yet to be determined, and has proven very difficult to define; for this reason I did not set a cut-off value below which a shape was deemed unidentified. With further analysis and refinement, I believe the accuracy of the Machine Learning Image Comparator can be improved. The sketch below shows where these three variables sit in the pipeline.
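The following is a hedged sketch of the Machine Learning Comparator pipeline, written against the Emgu CV 2.4-era API (SURFDetector, BruteForceMatcher, Features2DToolbox); exact signatures vary between Emgu CV versions, and the parameter values shown are illustrative rather than the tuned ones. The three variables discussed above appear as the SURF hessian threshold, k, and the uniqueness ratio.

// Sketch only: SURF features + k-NN descriptor matching + uniqueness vote.
using Emgu.CV;
using Emgu.CV.Features2D;
using Emgu.CV.Structure;
using Emgu.CV.Util;

static class MlComparatorSketch
{
    // Returns the number of "good" feature matches between one library image
    // and the push sample; the library image with the most good matches wins.
    public static int CountGoodMatches(Image<Gray, byte> libraryImage,
                                       Image<Gray, byte> sampleImage)
    {
        var surf = new SURFDetector(300, false);          // SURF feature points
        VectorOfKeyPoint libKeys = surf.DetectKeyPointsRaw(libraryImage, null);
        Matrix<float> libDesc = surf.ComputeDescriptorsRaw(libraryImage, null, libKeys);
        VectorOfKeyPoint smpKeys = surf.DetectKeyPointsRaw(sampleImage, null);
        Matrix<float> smpDesc = surf.ComputeDescriptorsRaw(sampleImage, null, smpKeys);

        int k = 2;                                        // nearest neighbours per feature
        var matcher = new BruteForceMatcher<float>(DistanceType.L2);
        matcher.Add(libDesc);
        var indices = new Matrix<int>(smpDesc.Rows, k);
        var dist = new Matrix<float>(smpDesc.Rows, k);
        matcher.KnnMatch(smpDesc, indices, dist, k, null);

        // Uniqueness: keep a match only when its best distance is clearly
        // smaller than the second best (a ratio between 0 and 1).
        var mask = new Matrix<byte>(dist.Rows, 1);
        mask.SetValue(255);
        Features2DToolbox.VoteForUniqueness(dist, 0.8, mask);

        int good = 0;
        for (int i = 0; i < mask.Rows; i++)
            if (mask[i, 0] != 0) good++;                  // surviving matches
        return good;
    }
}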
6 Conclusion

In this thesis I have introduced techniques to identify shapes pushed into the Digital Foam apparatus. Four unique shapes were manufactured to suit the prototype 4x4 Digital Foam Sensor used throughout this study; their size was matched to the surface area of the Foam Sensor to facilitate testing and confirmation of results. The test shapes were pushed into the Foam Sensor and the resulting deformation sample was converted to a grey-scale image. These images were compared with a previously saved image library of shapes, to confirm whether it was possible to identify shapes and label them accordingly.

Two methods of image identification were studied: a grey-scale image comparator and a more elaborate Machine Learning algorithm. Both methods displayed strengths and weaknesses when comparing images. The grey-scale image comparator was easily implemented in software, but was limited in the variation of image it could identify; the sampled image had to be in the same position and orientation as the library image to be considered a match. The Machine Learning image comparison algorithm produced mixed results, and was indeed outperformed by the simple comparator in my study. It was noted, however, that the Machine Learning comparator could withstand slight variances in shape position and orientation. This observation was unexpected but interesting, indicating that this is where the strength of Machine Learning lies in the field of image recognition.

A limitation was the size of the Foam Sensor, which produces a low-resolution image with little detail. A solution to this problem would be to use a larger Digital Foam array to produce images with more interesting features to compare. The K-Nearest Neighbours Machine Learning algorithm may not be ideal for image matching with Digital Foam, and consideration should be given to other common Machine Learning algorithms that have been adapted for image matching, such as Artificial Neural Networks, Naive Bayes Classifiers and Random Forests. Furthermore, the K-Nearest Neighbours algorithm was run on the SURF detector's feature set, which produced only average image comparison results. Future directions could include applying K-Nearest Neighbours to other image attributes, such as histogram features, SIFT features, or colour attributes if colour sample images were created instead of grey-scale images.

In summary, I found that the capacity for Digital Foam to recognise shapes does exist, and that the application of Machine Learning could be refined to improve the quality of shape recognition with the Digital Foam Sensor.
7 References

1. Anjos, A, El Shafey, L, Wallace, R, Gunther, M, McCool, C & Marcel, S 2012, 'Bob: a free signal processing and machine learning toolbox for researchers', in Proceedings of the 20th ACM International Conference on Multimedia, Nara, Japan, pp. 1449–1452.

2. Bay, H, Tuytelaars, T & Van Gool, L 2006, 'SURF: Speeded Up Robust Features', Lecture Notes in Computer Science, vol. 3951, pp. 404–417.

3. Blackshaw, M, Devincenzi, A, Lakatos, D, Leithinger, D & Ishii, H 2011, 'Recompose: direct and gestural interaction with an actuated surface', in CHI '11 Extended Abstracts on Human Factors in Computing Systems, Vancouver, BC, Canada, pp. 1237–1242.

4. Bradski, G & Kaehler, A 2008, Learning OpenCV: Computer Vision with the OpenCV Library, 1st edn, O'Reilly Media, Inc., Canada.

5. Collobert, R, Kavukcuoglu, K & Farabet, C 2011, 'Torch7: A Matlab-like Environment for Machine Learning', in Big Learning 2011: NIPS 2011 Workshop on Algorithms, Systems, and Tools for Learning at Scale.

6. Cover, T & Hart, P 1967, 'Nearest neighbor pattern classification', IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21–27.

7. Dalal, N & Triggs, B 2005, 'Histograms of oriented gradients for human detection', in Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, San Diego, CA, USA, pp. 886–893.

8. Dix, A, Finlay, J, Abowd, G & Beale, R 2004, Human-Computer Interaction, 3rd edn, Prentice Hall, London.

9. Druzhkov, V, Erukhimov, V, Zolotykh, N, Kozinov, E, Kustikova, V, Meerov, I & Polovinkin, A 2011, 'New object detection features in the OpenCV library', Pattern Recognition and Image Analysis, vol. 21, no. 3, pp. 384–386.
10. Engelbart, D & English, WK 1968, 'A research center for augmenting human intellect', in AFIPS '68: Proceedings of the December 9–11, 1968, Fall Joint Computer Conference, Part I, pp. 395–410.

11. Flynn, H, De Hoog, J & Cameron, S 2009, 'Integrating Automated Object Detection into Mapping in USARSim', in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009), St. Louis, USA.

12. Hibernik, K, Ghrairi, Z, Hans, C & Thoben, K-D 2011, 'Co-creating the Internet of Things: First experiences in the participatory design of Intelligent Products with Arduino', in Proceedings of the 17th International Conference on Concurrent Enterprising (ICE 2011), Aachen, Germany, pp. 1–9.

13. Holman, D & Vertegaal, R 2008, 'Organic user interfaces: designing computers in any way, shape, or form', Communications of the ACM - Organic user interfaces, vol. 51, no. 6, pp. 48–55.

14. Holman, D, Girouard, A, Benko, H & Vertegaal, R 2013, 'The Design of Organic User Interfaces: Shape, Sketching and Hypercontext', Interacting with Computers, vol. 25, no. 2, pp. 133–142.

15. Ishii, H & Ullmer, B 1997a, 'Tangible bits: towards seamless interfaces between people, bits and atoms', in Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, Atlanta, Georgia, USA, pp. 234–241.

16. Ishii, H 2008a, 'Tangible bits: beyond pixels', in Proceedings of the 2nd International Conference on Tangible and Embedded Interaction, Bonn, Germany, pp. 15–25.

17. Ishii, H 2008b, 'The tangible user interface and its evolution', Communications of the ACM - Organic user interfaces, vol. 51, no. 6, pp. 32–36.

18. Ishii, H, Ratti, C, Piper, B, Wang, Y, Biderman, A & Ben-Joseph, E 2004, 'Bringing Clay and Sand into Digital Design — Continuous Tangible User Interfaces', BT Technology Journal, vol. 22, no. 4, pp. 287–299.

19. Iwata, H, Yano, H, Nakaizumi, F & Kawamura, R 2001, 'Project FEELEX: adding haptic surface to graphics', in Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, Los Angeles, CA, USA, pp. 469–476.
20. Jorda, S, Geiger, G, Alonso, M & Kaltenbrunner, M 2007, 'The ReacTable: exploring the synergy between live music performance and tabletop tangible interfaces', in Proceedings of the 1st International Conference on Tangible and Embedded Interaction, Baton Rouge, LA, USA, pp. 139–146.

21. Kaltenbrunner, M, Jorda, S, Geiger, G & Alonso, M 2006, 'The ReacTable: A Collaborative Musical Instrument', in Proceedings of the 15th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises, Manchester, UK, pp. 406–411.

22. Kapitanova, K & Son, S 2012, 'Machine Learning Basics', in Intelligent Sensor Networks: Across Sensing, Signal Processing, and Machine Learning, CRC Press / Taylor & Francis, ISBN 9781439892817.

23. Kildal, J 2012, 'Interacting with Deformable User Interfaces: Effect of Material Stiffness and Type of Deformation Gesture', Lecture Notes in Computer Science, vol. 7468, Lund, Sweden, pp. 71–80.

24. Kildal, J, Paasovaara, S & Aaltonen, V 2012, 'Kinetic device: designing interactions with a deformable mobile interface', in CHI '12 Extended Abstracts on Human Factors in Computing Systems, Austin, Texas, USA, pp. 1871–1876.

25. Lowe, DG 1999, 'Object recognition from local scale-invariant features', in Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2, Kerkyra, Greece, pp. 1150–1157.

26. Mao, Z, Zeng, C, Gong, H & Li, S 2010, 'A new method of virtual reality based on Unity3D', paper presented at the 18th International Conference on Geoinformatics, Beijing, China, pp. 1–5.

27. Michalski, R, Carbonell, J & Mitchell, T 1984, Machine Learning: An Artificial Intelligence Approach - Volume 1, Springer-Verlag, Berlin.

28. Murakami, T & Nakajima, N 1994, 'Direct and intuitive input device for 3-D shape deformation', in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Boston, Massachusetts, USA, pp. 465–470.
29. Murakami, T, Hayashi, K, Oikawa, K & Nakajima, N 1995, 'DO-IT: deformable objects as input tools', in CHI '95 Conference Companion on Human Factors in Computing Systems, Denver, USA, pp. 87–88.

30. Myers, B 1998, 'A brief history of human-computer interaction technology', Interactions Magazine, vol. 5, no. 2, pp. 44–54.

31. Pahalawatta, K & Green, R 2013, 'Particle Detection and Classification in Photoelectric Smoke Detectors Using Image Histogram Features', in International Conference on Digital Image Computing: Techniques and Applications (DICTA 2013), Hobart, Tasmania, pp. 1–8.

32. Parkes, A, Poupyrev, I & Ishii, H 2008, 'Designing kinetic interactions for organic user interfaces', Communications of the ACM - Organic user interfaces, vol. 51, no. 6, pp. 58–65.

33. Piper, B, Ratti, C & Ishii, H 2002a, 'Illuminating Clay: A Tangible Interface with potential GRASS applications', in Proceedings of the Open Source GIS - GRASS Users Conference 2002, Trento, Italy.

34. Piper, B, Ratti, C & Ishii, H 2002b, 'Illuminating Clay: A 3-D Tangible Interface for Landscape Analysis', in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Minneapolis, Minnesota, USA, pp. 355–362.

35. Poupyrev, I, Nashida, T, Maruyama, S, Rekimoto, J & Yamaji, Y 2004, 'Lumen: interactive visual and shape display for calm computing', in ACM SIGGRAPH 2004 Emerging Technologies, Los Angeles, CA, USA, p. 17.

36. Ratti, C, Wang, Y, Ishii, H, Piper, B & Frenchman, D 2004, 'Tangible User Interfaces (TUIs): A Novel Paradigm for GIS', Transactions in GIS, vol. 8, no. 4, pp. 407–421.

37. Rekimoto, J 2008, 'Organic interaction technologies: from stone to skin', Communications of the ACM - Organic user interfaces, vol. 51, no. 6, pp. 38–44.

38. Rosten, E, Porter, R & Drummond, T 2010, 'Faster and Better: A Machine Learning Approach to Corner Detection', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 1, pp. 105–119.
39. Rublee, E, Rabaud, V, Konolige, K & Bradski, G 2011, 'ORB: An efficient alternative to SIFT or SURF', in 2011 IEEE International Conference on Computer Vision, Barcelona, Spain, pp. 2564–2571.

40. Samuel, AL 2000, 'Some studies in machine learning using the game of checkers', IBM Journal of Research and Development, vol. 44, no. 1/2.

41. Schwesig, C 2008, 'What makes an interface feel organic?', Communications of the ACM - Organic user interfaces, vol. 51, no. 6, pp. 67–69.

42. Schwesig, C, Poupyrev, I & Mori, E 2003, 'Gummi: user interface for deformable computers', in CHI '03 Extended Abstracts on Human Factors in Computing Systems, Ft Lauderdale, USA, pp. 954–955.

43. Schwesig, C, Poupyrev, I & Mori, E 2004, 'Gummi: a bendable computer', in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Vienna, Austria, pp. 263–270.

44. Sederberg, T & Parry, S 1986, 'Free-form deformation of solid geometric models', in Proceedings of the 13th Annual Conference on Computer Graphics and Interactive Techniques, Dallas, USA, pp. 151–160.

45. Sellen, A, Rogers, Y, Harper, R & Rodden, T 2009, 'Reflecting human values in the digital age', Communications of the ACM - Being Human in the Digital Age, vol. 52, no. 3, pp. 58–66.

46. Shi, S 2013, Emgu CV Essentials, Packt Publishing, http://www.packtpub.com/.

47. Shih, R 2012, Learning Autodesk Inventor 2012, SDC Publications, Kansas.

48. Smith, RT, Thomas, BH & Piekarski, W 2008b, 'Digital Foam Interaction Techniques for 3D Modelling', in ACM Symposium on Virtual Reality Software and Technology (VRST), Bordeaux, France, 27–29 October 2008.

49. Smith, RT, Thomas, BH & Piekarski, W 2008a, 'Tech Note: Digital Foam', in IEEE Symposium on 3D User Interfaces (3DUI), Reno, Nevada, USA, 8–9 March 2008.
50. Ullmer, B & Ishii, H 1997b, 'The metaDESK: models and prototypes for tangible user interfaces', in Proceedings of the 10th Annual ACM Symposium on User Interface Software and Technology, Banff, Alberta, Canada, pp. 223–232.

51. Vertegaal, R & Poupyrev, I 2008, 'Introduction: Organic User Interfaces', Communications of the ACM - Organic user interfaces, vol. 51, no. 6, pp. 26–30.

52. Waguespack, C 2012, Mastering Autodesk Inventor 2013 and Autodesk Inventor LT 2013, 1st edn, Sybex.
8 Appendix

8.1 Software Requirements

To load and run the application 'Digital Foam Reader' (filename SerialFoam.exe), the user requires a PC with the following software and hardware.

Software requirements:

1. The Microsoft Windows 7 SP1 64-bit, Windows 8 64-bit or Windows 8.1 64-bit operating system. Note: correct execution of this software on a 32-bit version of these operating systems is not guaranteed.
2. The Microsoft .NET Framework (version 4.0 or later), a stand-alone framework. Version 3.5 is included with a standard Windows 7 installation, so an upgrade to .NET 4.0 or later is required there. The .NET Framework 4.5 is included with Windows 8, and 4.5.1 with Windows 8.1, so no upgrade is required on a computer running Windows 8 or 8.1.
3. The OpenCV computer vision and machine learning software library. The latest version can be downloaded from www.opencv.org; as of 25-04-2014 the latest version available for free download is OpenCV 2.4.9. The minimum version required to run this software is OpenCV 2.4.6.0, which is included in the software package as a self-extracting installer; see below for installation instructions if it is not already installed on your PC.
4. The Arduino UNO USB driver for Windows 7.
5. A copy of the Digital Foam Reader software executable, included in the software package.

Hardware requirements:

1. The minimum hardware and operating system requirements for the .NET Framework 4.5 or later can be found at http://msdn.microsoft.com/library -> .NET Development -> .NET Framework 4.5 -> .NET Framework System Requirements.
2. A PC with at least one free USB 2.0 port.
3. The 4x4 prototype Digital Foam Sensor (see Figure 45), connected to the computer via a USB cable.

Figure 45: The 4x4 Digital Foam Sensor used in this dissertation.
8.2 Software Installation

8.2.1 .NET Framework on a Windows 7 Computer

On Windows 7, the installed version of the Microsoft .NET Framework must be checked before running the application. On Windows 8 or later, a suitable version of the .NET Framework is already installed.

On a Windows 7 computer, check the currently installed .NET version by selecting:

Start Menu -> Control Panel -> Programs -> Programs and Features

In the list of installed programs, navigate to the name Microsoft .NET Framework x.x.x. If this version is 4.0.0 or later, you do not need to upgrade the .NET Framework (see Figure 46).

Figure 46: Microsoft .NET Framework version 4.5.1 is installed on this computer.

Alternatively, if you have administration rights to your computer and are familiar with running regedit, you can investigate further with reference to http://support.microsoft.com/kb/318785 (How to determine which versions and service pack levels of the Microsoft .NET Framework are installed); a small programmatic version of this check is sketched at the end of this section.

If the installed version is 3.5.1 or earlier (the default installation on Windows 7 SP1), you MUST upgrade to the latest version of the .NET Framework (at least version 4.0.0). The Microsoft .NET Framework 4.0 installation for Windows 7 can be found at:

http://www.microsoft.com/en-au/download/details.aspx?id=17851

This is a self-installing package for Windows 7. Download and install this update to the .NET Framework. A restart may be necessary upon completion.
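The following is a small C# sketch of the registry check referred to above (see Microsoft KB318785): the .NET 4.x setup state lives under the NDP\v4\Full registry key. The console-application wrapper is illustrative; reading the key does not require administrator rights.

// Sketch only: report whether .NET 4.0 or later is installed.
using System;
using Microsoft.Win32;

class DotNetVersionCheck
{
    static void Main()
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(
            @"SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full"))
        {
            if (key != null && key.GetValue("Version") != null)
                Console.WriteLine(".NET 4.x installed: " + key.GetValue("Version"));
            else
                Console.WriteLine(".NET 4.0 or later not found - upgrade required.");
        }
    }
}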