Helsinki University of Technology
Department of Automation and Systems Technology
Automation Technology Laboratory
Tele-presence aided teleoperation of semi-autonomous work vehicles
CONTENTS
1. Introduction
1.1 Background
1.2 Motivation of this work
2. Teleoperation (and supervisory control)
2.1 History
2.2 Applications
2.2.1 Space
2.2.2 Underwater
2.2.3 Military and antiterrorist
2.2.4 Medicine
2.2.5 Heavy Work Vehicles
3. Definitions
4. Teleoperation Interfaces
4.1 Direct
4.2 Multimodal / multisensor
4.3 Supervisory control
4.4 Novel
5. Problems in Teleoperation
5.1 Direct teleoperation (short delay)
5.2 Move and wait teleoperation (long delay)
6. Tele-presence
6.1 Vision
6.2 Hearing
6.3 Touch
6.3.1 Force feedback (kinesthetic information)
6.3.2 Haptic feedback (tactile information)
6.4 Vestibular sensors
6.4 Virtual and augmented presence
6.5 Enhancement Level of Tele (virtual)-presence
6.6 Problems of Tele (virtual) presence
7. Semi-autonomous Work vehicles
7.1 Motivation
7.2 Related Work
8. Teleoperation experiments
8.1 Introduction
8.2 Experiments in an unstructured environment (Papers IV and VI)
8.2.1 Test Equipment
8.2.2 Experiments
8.2.3 Results
8.3 Experiments in a structured environment (Papers V and VI)
8.3.1 Test vehicle and equipment
8.3.2 Tele-existence equipment
8.3.3 Test persons
8.3.4 Tele-existence configuration
8.3.5 Test runs
8.3.6 Evaluation methods
8.4 Results
8.4.1 Servo camera and monitor
8.4.2 HMD with mono vision
8.4.3 HMD with stereo
9. Visual flow in teleoperation
9.1 Test Setup
9.2 Test drives
9.3 Preliminary results
10. Conclusions
11. References
12. Appendices
1. Introduction
1.1 Background
Traditionally teleoperation has been used in applications where normal on-board
manual operation/control cannot be used or where it would be too hazardous or
expensive. Typical examples are the handling of nuclear materials (dangerous),
control of small models (impossible) and space and underwater exploration (too
hazardous and expensive). The history of modern teleoperation began at the end of
the 1940s, when the first master-slave manipulator was developed at the Argonne
National Laboratory for chemical and nuclear material handling [Vertut and Coiffet,
1985]. After that, teleoperation developed rapidly. The adoption of video
technology and force feedback made the first telepresence systems
possible. Computer technology brought advanced control loops to the remote
(teleoperator) end of the system and eventually brought virtual reality into teleoperation.
Despite this technical progress, teleoperation traditionally assumed that
the human operator would at all times be available to exercise more or
less direct control.
Meanwhile, computer technology made it possible to automate complicated factory
processes like those performed in chemical plants, by paper machines and in
different batch processes. Little by little, automation technology spread almost
unnoticed into mines, farms, forestry, and construction sites also. However, unlike
the factory tasks, the difficult and dynamic work tasks of work machines operating in
changing outdoor environments were not easy to automate, and, even today, the huge
majority of work machines are still manually controlled from the cockpit of the
vehicle. Automation technology is still used to assist the driver only. However, some
of the tasks can be automated. Typical examples can be found in mining, where
automation is perhaps at the highest level among heavy work vehicles. The easiest
tasks to be automated in mining are the hauling and dumping tasks that are
performed during the work cycle of an LHD machine. The only work for which a
driver is needed is the loading task. Another example is a modern drilling machine.
The driver only brings the machine to the drilling place; the actual drilling is done
automatically. The driver mainly monitors the task and intervenes in the case of
malfunction. These machines are called semi-autonomous.
In work-vehicle automation – like in all automation technology – the target is to
reduce production costs and improve quality as much as possible. Normally – in the
long run – this leads to a fully automatic system where the human only supervises
the process. This has already happened in the process industry. Figure 1 illustrates the
history and a future vision of the automation level in forestry vehicles.
This vision can be adapted to any task where heavy mobile machines are used.
Teleoperation is, more or less, only an intermediate phase in the development of
fully automated systems. However, the need for teleoperation will last at least for the
few decades before fully autonomous work vehicles emerge, and even then human
supervision will always be needed.
[Figure 1 diagram: autonomy/intelligence plotted against time (1950, 1970, 2000, 2010).
Labels include man and tractor, wheel/track and walking harvesters, remote operated and
teleoperated machines, semi-autonomous multi-machine forest harvesting work site,
autonomous forwarder, autonomous thinning and brushing, and robot societies.]
Figure 1: A vision for future development in forestry. Adapted from [Halme and
Teleoperation is rare in industrial work vehicles for a clear reason: in
industrial tasks it offers practically no advantage. There is
no financial difference between a driver sitting in the cockpit of
the vehicle and one teleoperating from a control room if one machine takes his full
attention. Eliminating the cockpit would, of course, bring savings, but these
would be offset by the cost of the teleoperation equipment and the
high-bandwidth data transmission infrastructure. There are, however, some applications
where direct full time teleoperation is used. For example, in some mines there are
areas that are productive but – for some reason – not totally safe. These areas, where
human operators are not allowed to go, can be mined by teleoperated machines.
Semi-autonomous work vehicles have created a commercial need for teleoperation. When
the simple tasks of a work vehicle are automated, the human operator can
focus his efforts on the more demanding tasks. This means that, whenever the
machine is running autonomously, the operator is free to do something else, for
example to control another machine. In traditional operation this does not work,
because changing over from one vehicle to another takes too long and
is, in most cases, unsafe. If the tasks that need manual control are teleoperated, the
driver can easily, quickly and safely operate two or even several machines. This
so-called part time teleoperation combines the advantages of traditional direct
teleoperation with those of advanced automation to decrease the need for
manpower in work-vehicle operation.
Part time teleoperation, as a time and money saving method, is a relatively old
invention. Especially in unmanned electric distribution stations, simple tasks like
reconnecting switches, etc. are usually teleoperated in order to speed up the process
and to save the time of service people.
Figure 2: Principle of semi-autonomous work vehicles
Partially teleoperated semi-autonomous working machines are without doubt the
next step on the way to developing automation in worksites like mines or
construction sites. Simple tasks like drilling or hauling ore in mines can be
automated, whereas more sophisticated tasks like selecting the start point of the drill
or loading the ore need the hand of an experienced operator. When most of the
work cycle of the machines is automated, one operator can easily handle
from two to five machines, assuming that all the machines can be teleoperated from
one place. Figure 2 shows a typical layout of a semi-autonomous work machine system.
In the future, the development of robotics will bring service robots among
humans, not only in industry but also in homes and other places where service work
is needed. Typical examples of these new service robots are Honda’s humanoid
Asimo [www.hondabeat.com] and HUT’s centauroid WorkPartner [Halme et al.,
2000], [Halme et al., 2001]. These robots set totally new demands for teleoperation.
It will take years or decades before service robots are able to perform
even the simplest tasks autonomously. Meanwhile, teleoperation
methods are needed to teach new tasks to the robot, or to help it in faulty or exceptional
situations. In the case of commercial services, where there are several robots
working under the same “employer”, the situation can be more or less similar to part
time teleoperation. One human can control a group of robots from a control room
utilizing all the practical teleoperation methods like tele-presence and high
bandwidth closed loop control.
Figure 3: Asimo and WorkPartner robots
[Asimo adopted from: http://www.hondabeat.com/news/asimo.cfm]
What can be done when there is only one service robot? This is the typical situation in homes
and in most cases where operator and robot work together. The operator
is then usually relatively near the robot and can carry only very light and small
equipment for robot control, or none at all. Novel control methods like speech and
gestures [Paper III] or even brainwave control [Amai et al., 2001] provide humanlike
interaction with the robot, but the autonomy of the robot has to be improved.
1.2 Motivation of this work
The evolution of teleoperation has generated sophisticated tele-presence systems
where the operator can really feel that he is present in the teleoperation site. When
looking at the related research in the area of teleoperation, it may be noted that most
of the research has been done in order to provide better and more effective
teleoperation methods for difficult work and manipulation tasks where stereo vision
and anthropomorphic manipulators with force feedback are needed in order to
perform the task. This development is, of course, very natural for any technical area.
The (wretched) motivation in the industry is to earn more money. This usually means
that not the best system in terms of technology is chosen, but rather the most suitable
system, or the most cost efficient. In heavy work vehicles the subsystems are usually
not the most state-of-the-art, but simple, well-tested and reliable systems that will
work for thousands of hours without failure.
Teleoperation systems are still so rare among work vehicles that there has not been
much research on the level of teleoperation or telepresence they require. Clearly,
the required level of telepresence depends on the task. An operator driving a road
roller needs a different amount of information from one operating a harvester in an
unknown forest.
The aim of this study is to test and compare different levels of telepresence
equipment in common work-vehicle tasks. When an adequate and/or
the best configuration is found, the properties of the task that determine
the level of presence needed are analyzed. The effect of learning and the differences
in performance between drivers are also studied. The study is based on tests with both an
experimental and a real-work vehicle. The test results are evaluated both objectively
2. Teleoperation (and supervisory control)
2.1 History
Poking a fire (Figure 4) might have been one of the first general teleoperation
tasks in the history of mankind. To be exact, poking a fire is tele- or remote
manipulation, which was the earliest type of teleoperation. The task is also a good
example with which to demonstrate the difference between teleoperation and tool
use. A human hand is a perfect tool for arranging firewood and, in fact,
the unlit wood is usually set in the fireplace by hand. Once the fire has been lit,
the environment is so hostile that a bulkier tool must be used in order to protect
the hand. Ordinary tools, in contrast, make it possible to perform a task like cutting
(a knife), or to improve work like digging (a spade).
Figure 4: Primitive teleoperation task
The first modern master-slave teleoperators were mechanical pantographs. The
group working under R. Goertz developed these manipulators in the late 1940s at the
Argonne National Laboratory, where Enrico Fermi developed the first nuclear
reactor [Vertut and Coiffet, 1985]. The need was obvious: radioactive nuclear
material had to be manipulated safely. The nuclear material was placed in a “hot cell”
where the operator could manipulate it from outside the cell by remote handling (Figure
5). The visual contact with the target was through a protective window and/or a
Figure 5: R. Goertz and the first mechanical master-slave manipulator. Chemicals
are manipulated remotely behind a protective glass. [Adopted from: Vertut and
The mechanical manipulators were soon replaced by electromechanical servos. In
1954, Goertz’s team developed the first electromechanical manipulator with
feedback servo control. After this, the teleoperation of manipulators and vehicles
extended rapidly to new branches where the advantages of teleoperation techniques
could be utilized.
2.2 Applications
One of the first areas where teleoperation techniques were utilized was deep-sea
exploration. The deep oceans are even today regarded as so hostile that most
deep-sea operations are carried out with teleoperated submarines. These submarines are
still called Remotely Operated Vehicles (ROVs), though the term could
equally well be understood as referring to ground, or even flying, vehicles. Often ROVs
are equipped with telemanipulators in order to perform underwater work tasks.
Teleoperation is also typically used for space and military applications. In both
cases, the environment is hostile for humans, while in space applications there is the
additional point that the extra equipment needed for the human pilot is more
expensive than a sophisticated teleoperation system.
2.2.1 Space
Space applications provide several good reasons for teleoperation.
Safety – in all space operations there are big risks, which can lead to the loss of
astronauts’ lives. There have also been scenarios in which teleoperated mining sites
have been built in space.
Costs – in space operations the equipment needed for human passengers is much
more expensive and weighs more than teleoperation technology.
Time – long space missions can take several years, which is not possible for manned
The first successfully teleoperated vehicle on the moon was the Russian Lunokhod 1
(Figure 6) at the start of the 1970s. Lunokhod 1 operated for 11 lunar days and covered
over 10 kilometers. The mission also faced the problem of teleoperation over a long
time delay, which is typical of space missions. Even the delay of several seconds to
the moon and back made fast closed loop control impractical because of the resulting
instability. Instead of closed loop control, a “move and wait” method is used. The
vehicle is operated by open loop commands without immediate feedback. After one
or more commands have been executed, the operator waits for the confirmation and
A much longer delay was faced in the late 1990s, when NASA’s Sojourner landed
on Mars. Despite the 10 - 20 min. control delay, Sojourner was successfully operated
over the planned 7-sol (Martian day) period.
Figure 6: Teleoperated space rovers: Lunokhod 1 and Sojourner
Paper I presents a teleoperated rover for planetary soil sampling missions.
This experimental RoSA2 is designed to be controlled with a “move and wait”
strategy improved with a 3D-model of the environment and a virtual model of the
vehicle, which can be closed loop controlled.
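The “move and wait” strategy described above can be sketched in a few lines of Python. This is only an illustrative toy, not the control software of any rover mentioned here: the one-dimensional Rover class, the move_and_wait function, the step size and the 600-second delay per command cycle are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Rover:
    """Toy one-dimensional rover (hypothetical model); position in metres."""
    position_m: float = 0.0

    def execute(self, step_m: float) -> float:
        # Open loop execution: the rover moves without operator feedback
        # and reports its new position only after the move is finished.
        self.position_m += step_m
        return self.position_m


def move_and_wait(rover, target_m, max_step_m, cycle_delay_s, tolerance_m=0.05):
    """Drive to target_m in discrete steps, waiting one full delay per step.

    cycle_delay_s is the assumed time from sending a command to receiving
    its confirmation. Returns (command cycles, total waiting time in s).
    """
    cycles = 0
    waited_s = 0.0
    known_position_m = rover.position_m  # last confirmed position
    while abs(target_m - known_position_m) > tolerance_m:
        error_m = target_m - known_position_m
        # Limit each command to a step the operator considers safe.
        step_m = max(-max_step_m, min(max_step_m, error_m))
        known_position_m = rover.execute(step_m)
        waited_s += cycle_delay_s  # operator waits for the confirmation
        cycles += 1
    return cycles, waited_s


cycles, waited_s = move_and_wait(Rover(), target_m=2.0, max_step_m=0.5,
                                 cycle_delay_s=600.0)
print(cycles, waited_s)
```

Under these assumptions, four half-metre steps already cost 40 minutes of waiting, which suggests why a local virtual model that can be controlled in closed loop is attractive.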
2.2.2 Underwater
As mentioned before, underwater operations were one of the first mobile applications
where teleoperation techniques were adopted. Today these ROVs probably represent
the largest commercial market for mobile vehicle teleoperation. ROVs are used in
surveying, inspections, oceanography and various simple manipulation and work
tasks, which were traditionally performed by divers. ROVs are generally tethered to
a surface ship and controlled using video monitors and joysticks. The most recent
systems can also perform some autonomous tasks such as station keeping or track
The French research submersible “Victor” (Figure 7) can dive down to 6000 m.
Figure 7: Submersible Victor [adopted from: www.ifremer.fr]
2.2.3 Military and antiterrorist
The military field provides endless possibilities for teleoperated systems. It is no
surprise that one of the first automotive teleoperators was developed for military
applications. Mobile military teleoperators and robots cover the whole range from ROVs
to Unmanned Air Vehicles (UAVs). In between, there is a wide range of
teleoperated ground vehicles.
Modern UAVs like the US Air Force Predator (Figure 8) are remotely piloted over radio or
satellite links. They can also fly autonomously with the help of
GPS and inertial navigation. Their typical tasks are reconnaissance and target
Unmanned Ground Vehicles (UGVs) have a wide application field in military operations.
Typical tasks are reconnaissance, surveillance, target acquisition, route clearing,
ordnance disposal and landmine detection. The first UGVs were fully teleoperated
with closed loop control. The newest models like SARGE (Figure 9) are equipped
with vehicle localization (GPS, inertial navigation) and supervisory control to
improve performance. Military UGVs are often supplied with state-of-the-art
teleoperation equipment, such as stereovision telepresence, to provide the best possible
feedback in fast and dangerous operations. UGV development has been led by the
Figure 8: US Air Force Predator [adopted from: www.airforce-technology.com]
Figure 9: The Surveillance And Reconnaissance Ground Equipment (SARGE) by
Sandia National Laboratories [adopted from: http://www.sandia.gov/]
Increasing crime and terrorism have created a new variety of military-type
teleoperators [Davies, 2001], [Hewish, 2001]. These so-called terrobots are used for
bomb disposal, surveillance in police operations and even assaults on dangerous
targets. The vehicles are teleoperated with closed loop control over a radio or cable
connection. Typical vehicle equipment includes color cameras, infrared cameras,
manipulators, hydraulic guns, shotguns and non-lethal guns.
Figure 10: Terrobots [adopted from: Davies, 2001]
2.2.4 Medicine
In medicine, teleoperation usually takes the form of micromanipulation. In
endoscopic operations, the cutting equipment and endoscope are taken to the target
through a small hole, so the operator can cut the target while causing only minimal
damage to the surrounding tissue. The risks are smaller and the recovery time
considerably shorter than in traditional open surgery.
Figure 11: Endoscopic nose operation
[Adopted from: toffelcenter.com]
Micromanipulation is also a common tool in biochemistry, especially in genetic
manipulation. A typical example is cloning, in which the genetic material located in
the nucleus of a cell is replaced.
2.2.5 Heavy Work Vehicles
In mining, teleoperation has already been in use for two decades in cases where the
mining area was not totally safe. Drill vehicles and loaders are driven manually in
the safe parts of the mine, but teleoperated in areas where safety can’t be guaranteed
Teleoperated rescue vehicles have also been developed for mines. Ralston and
Hainsworth [Ralston and Hainsworth, 1998] present a mine emergency response
robot called Numbat. It operates in conjunction with rescue teams as they enter a
mine after an emergency. It moves ahead of the teams under remote control through
the mine and transmits to the surface video or infrared images of the hazardous areas,
as well as data on the atmospheric conditions.
3. Definitions
These definitions are intended to help the reader understand what the writer means by
each term in this document. Most of them are generally accepted by the
robotics research community but, as is the case with all definitions, there are usually
Robot 1: A robot is a re-programmable, multi-functional manipulator designed to
move material, parts, tools, or specialized devices through variable programmed
motions for the performance of a variety of tasks. [Robot Institute of America]
Robot 2: Any automatically operated machine that replaces human effort, though it
may not resemble human beings in appearance or perform functions in a humanlike
manner. The term is derived from the Czech word robota, meaning “forced labor.”
An autonomous robot is something that is not available now and will be extremely
difficult to build in the future. Animals and humans are autonomous. To be
autonomous, a robot has to have consciousness, which cannot be created by existing
computer and software technology. (This is the writer’s own, rather pessimistic,
Usually the term autonomous robot is used for a robot that can execute its task(s)
autonomously without an operator's help. The degree of difficulty of the task does
not affect this.
Operator: A human operator is the person who monitors the operated machine and
takes the control actions needed.
Teleoperator is the teleoperated machine. A sophisticated teleoperator can also be
called a telerobot.
Teleoperation means simply to operate a vehicle or a system over a distance [Fong
and Thorpe, 2001]. However, more exact definitions are needed to separate the
poking of fire from high-level supervisory control. The first teleoperation tasks like
poking fire or manipulating nuclear material can be classified as remote operation or
remote manipulation. The word "remote" emphasizes that the controlled vehicle or
system is, for most of the time, in the view area of the operator. Today, in “normal
teleoperation” there is no visual contact with the controlled machine. The visual
feedback is made (usually) by a camera – monitor combination. Control commands
are sent electrically by wire or radio. Where the connection between the manipulator
and operator is mechanical, the term "remote manipulation" means mechanical
manipulation. In tele-manipulation, this connection is electrical.
Here the word "teleoperation" has been used both in its widest sense covering all
meanings, and in the sense meaning just the “normal teleoperation” defined above.
Between the simple mechanical manipulation and high-level supervisory control,
there are several systems of different technical levels included under the term
Mechanical manipulation: The control commands are transmitted mechanically or
hydraulically to the teleoperator. Visual feedback can be direct or via a monitor.
Remote operation/control: The operator has direct visual contact most of the time
with the controlled target. Control commands are sent electrically by wire or radio
Figure 13: Remote control of a drilling machine
To clarify the wide concept of “normal teleoperation”, this can be divided easily into
three different sublevels:
Closed loop control (Direct teleoperation): The operator controls the actuators of the
teleoperator by direct (analog) signals and gets real-time feedback. This is possible
only when the delays in the control loop are minimal. A typical example of this is a
radio controlled toy car (Figure 14).
Figure 14: RC toy car (Tamiya) and controller (Futaba), an example of closed loop
teleoperation [adopted from: http://www.tamiya.com/ and http://www.futaba.com/]
Coordinated teleoperation: The operator again controls the actuators, but now
some internal control, a remote loop (the blue line in Figure 15), is included.
However, there is no autonomy at the remote end. The remote loops are
used only to close those control loops that the operator is unable to control because
of the delay. A typical example is a teleoperator in which the speed control
has a remote loop and, instead of controlling the throttle position, the operator gives
a speed set point. Digital closed loop control systems almost always fall into this
In supervisory control [Sheridan, 1992], a considerable part of the control is
located at the teleoperator end (compare coordinated teleoperation). The teleoperator
can now perform part of the tasks more or less autonomously, while the operator
mainly monitors and gives high-level commands. The term task based teleoperation
is sometimes used here, but it is more limited than "supervisory control".
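The difference between controlling the throttle directly and giving a speed set point, as in the coordinated teleoperation described above, can be illustrated with a small simulation. The sketch below is a hypothetical example: the first-order vehicle model, the PI gains and the function name simulate_remote_speed_loop are assumptions made for illustration, not parameters of any machine discussed in this work. The operator sends only the set point; the loop that adjusts the throttle runs entirely at the teleoperator end, so the transmission delay does not enter it.

```python
def simulate_remote_speed_loop(setpoint_mps, steps=200, dt_s=0.05, kp=0.8, ki=2.0):
    """Simulate a PI speed loop running on the teleoperator's computer.

    The vehicle is an assumed first-order model:
        d(speed)/dt = (gain * throttle - speed) / tau
    Returns the vehicle speed after steps * dt_s seconds.
    """
    tau_s = 1.0   # assumed vehicle time constant in seconds
    gain = 10.0   # assumed steady-state speed (m/s) at full throttle
    speed = 0.0
    integral = 0.0
    for _ in range(steps):
        error = setpoint_mps - speed          # measured locally, no link delay
        integral += error * dt_s
        throttle = kp * error + ki * integral
        throttle = max(0.0, min(1.0, throttle))  # throttle is limited to 0..1
        speed += (gain * throttle - speed) / tau_s * dt_s
    return speed


# The operator's only command is the set point, here 3 m/s.
final_speed = simulate_remote_speed_loop(3.0)
print(round(final_speed, 2))
```

In this sketch the operator's command changes rarely, so it tolerates link delay far better than a continuously adjusted throttle signal would.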
In Figure 15 the red loop - operator loop - demonstrates feedback from the HMI
computer. This can be a virtual model, estimated parameters, etc.
[Figure 15 diagram: three block diagrams side by side, each with the chain OPERATOR
(display, controls), HMI computer, transmission, teleoperator’s computer, sensors and
actuators, TASK.]
Figure 15: The first figure demonstrates closed loop control; the second and third,
Telepresence (tele-existence): When a sufficient amount of sensor information
(vision, sound, force) is brought from the teleoperator site to the operator, then he or
she feels physically present in the site.
Virtual presence (or virtual reality) is similar to telepresence, except that the
environment in which the operator feels present (and the sensor information) is
artificially generated by a computer (red loop in Figure 15).
Augmented presence (or augmented reality) is a combination of real world and
virtual reality. A typical example of this is a real camera image with additional
computer generated virtual information.
4. Teleoperation Interfaces
Interfaces partly overlap the definitions presented in the previous chapter. However,
interfaces are an essential part of teleoperation and a more profound definition is
justified. The presented classification of vehicle teleoperation interfaces, direct,
multimodal/multisensor, supervisory control and novel, is adopted from [Fong and
Thorpe, 2001]. Often, the applied system can be clearly classified according to these
classes, but frequently there are features included from two or more classes. In part
time teleoperation, typically at least direct and supervisory controls are included.
Despite these problems, this classification clarifies the concepts in the field of
4.1 Direct
The traditional and most common method of vehicle teleoperation is direct control.
The operator controls the vehicle via hand controllers (joysticks, or steering wheel)
and watches the feedback video from vehicle-mounted cameras. The operator feels
as if he is inside the teleoperator, looking out. Direct control is appropriate when the
real-time decision making of a human operator is needed continuously. The
restrictive feature is the requirement of high-bandwidth and low-delay
communications. Even though communication techniques have developed a lot in
recent years, delay still exists, especially in digital signal transmissions. The
presence of delay is tedious and fatiguing for the operator. Also, the mismatch
between different senses during control can create simulator sickness, especially
when the operator’s feeling of presence is improved by using telepresence.
4.2 Multimodal / multisensor
When a complex robot moves into a dynamic situation, the operator can have
difficulties in perceiving the environment and robot’s state, or in performing control
A multimodal interface provides the operator with a variety of control modes. Typical
examples are separate control of individual actuators with graphical feedback and
coordinated motion. Feedback displays also contain multimodal information in
graphics and text. A multimodal interface of a legged robot is illustrated in Figure
Multisensor interfaces collect information from several sensors and combine it into
one integrated display. In vehicle teleoperation, these displays are often used to
improve the operator’s depth judgment or sense of vehicle attitude.
Figure 16: Multimodal control interface of WorkPartner robot
4.3 Supervisory control
Supervisory control has already been defined in Chapter 3.
Nowadays, when, in practice, all teleoperation is based on computers and digital data
transfer, almost all teleoperation cases can be regarded as cases of supervisory
4.4 Novel
The term “novel” is somewhat misleading because it is relative. Most
teleoperation interfaces, like every technical invention, were once novel.
Thus, it is most likely that the “novel interfaces” presented here will not be called
novel in the future. Nevertheless, at the moment they can be classified as such.
Some interfaces are novel because the input method is unconventional. [Amai et al.,
2001] present a vehicle-driving controller based on brainwave and muscle
movement monitoring. Paper III describes a “cognitive teleoperation interface”
where control commands are given by speech or by gestures that are recognized
either by image processing or by a hand-tracker. Direct control of both the vehicle
and a two-hand manipulator is performed with the hand-tracker interface (Figure 17).
[Fong et al. 2000] describe the Haptic driver, which enables the drive-by-feel
control, and the Gesture driver, which is based on gestures (mapped with the robot
camera). [Heinzmann and Zelinsky, 2001] also use gestures, but, instead of hands,
they use face gestures and gaze control.
The web-based teleoperation interfaces (Figure 18) can also be classified as novel,
although they can be classified under the multimodal interfaces or supervisory
control as well. The Web is very interesting because it provides global low-cost data
transfer for long distance teleoperation [Schilling et al., 1997]. There are also
problems, like unpredictable, varying bandwidth and delay, which are characteristic
of the Internet.
Figure 17: Hand-tracker interface for mobile machine control in outdoor conditions
Figure 18: Teleoperation interface in Web [adopted from: http://redrover.ars.fh-
5. Problems in Teleoperation
The main problems in teleoperation are related to delays and the human-machine
interface (HMI). In semiautonomous machines, the HMI is even more problematic
because, in addition to the direct teleoperation state, there is also a supervision state,
and – most difficult – the transition between these two states. The delay problem can
be divided into two different cases: direct teleoperation (short delay) and move-and-
wait control (long delay).
5.1 Direct teleoperation (short delay)
There are always delays in a teleoperation loop, which is like any control loop with
a controller, a process and feedback measurement(s). According to Shannon's sampling
theorem, a process can be measured (and controlled) only when the measurement
frequency is at least twice the nominal frequency of the measured process. In
teleoperation this means that the delay in the control loop, from control action to the
feedback of the action effect, should be at most half the period of the controlled
process; otherwise the process frequency has to be decreased.
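The sampling condition above implies an upper bound on the loop delay, which can be made concrete with a small calculation. The following is only a sketch under the simplified reading used in the text, i.e. that one control-feedback cycle counts as one sample of the process; the function names are illustrative and are not taken from any of the papers referred to here.

```python
def max_loop_delay_s(process_frequency_hz):
    """Largest loop delay that still gives two samples per process period."""
    if process_frequency_hz <= 0:
        raise ValueError("process frequency must be positive")
    return 1.0 / (2.0 * process_frequency_hz)


def max_process_frequency_hz(loop_delay_s):
    """Highest process frequency that a loop with the given delay can follow."""
    if loop_delay_s <= 0:
        raise ValueError("loop delay must be positive")
    return 1.0 / (2.0 * loop_delay_s)


if __name__ == "__main__":
    # A process varying at 1 Hz tolerates at most 0.5 s of loop delay.
    print(max_loop_delay_s(1.0))
    # With 0.25 s of loop delay the process must stay at or below 2 Hz.
    print(max_process_frequency_hz(0.25))
```

In practice the second function is the more useful one: the delay is usually given by the equipment, and it is the driving speed, and with it the process frequency, that has to be adapted.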
The delay in teleoperation equipment (the human delay is not included) consists of
several parts (Figure 19). Nowadays, the digital signal processing causes the major
part of the delay. However, the advantages of digital processing are so great that
there is no use for analog technology.
Figure 19: Delays in teleoperation
At the operator end, the control delay is computational and consists of digitizing the
values of the control equipment, i.e. of the steering wheel, joystick and pedals. This
delay, a maximum of tens of milliseconds, is usually not significant. Even an electric
signal travels at no more than about the speed of light, so there is always some delay
in the transmission. Digital transmission contains extra delays compared to analog
transmission. Typical delays in digital transmission are between 10 and 100 ms.
In addition to the control information, the feedback also has to be transmitted back to
the operator. Usually this information contains both image and data. Image
information is the most critical. Even today, in most cases, analog video links are
used and they work with practically no delay. The image compression and
decompression delays are included in the feedback delay to separate them from the
transmission delay. They can be significant (>100 ms) in digital video
transmission. Together with the transmission, the biggest delay occurs at the
teleoperator end. The control of robot actuators (remote loop), such as the steering and
throttle, includes both the control delay and the delays characteristic of the controlled
process, like mechanical time constants. In most cases, the remote loop (closed control
loop in the teleoperator) helps the teleoperation, but one must remember that in all
cases where the remote loop is included in the direct teleoperation loop it increases
the delay.
A very good example of the remote loop delay can be found in Paper V. The loader
used has articulated steering with a hydraulic actuator. The manual control of the
loader is performed with a joystick, which directly controls the servo valve of the
steering actuators, i.e. the position of the joystick is proportional to the angular
velocity of the steering, not to the angle. In the experiments, the loader had three
different (teleoperation) steering configurations: joystick, orbitroll and “servo”.
Joystick control was identical to the manual control. In orbitroll control, the angular
speed of the steering wheel (frequency of encoder pulses) was transformed to the
control current of the valve, i.e. the angular speed of the steering wheel was
proportional to the angular speed of the steering. Servo steering imitated normal car
steering, where the steering wheel position is proportional to the steering angle. To
achieve this, a remote controller was needed in the loader: a PID controller was added
to the system to control the steering angle according to the set point from the steering
wheel. At the start of the experiments, the drivers were offered the possibility of
testing all configurations, and each chose one for the rest of the tests. None of the
drivers chose the servo steering. The main reason given was “too sensitive steering”,
but all drivers also noticed a small but perceptible delay compared to the other two
methods.
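The extra lag introduced by a remote position loop can be illustrated with a short simulation. This is only a sketch under stated assumptions, not the controller of Paper V: the steering actuator is modeled as a pure integrator (the valve command sets the angular velocity), and the gains, time step and all names are arbitrary illustrative choices.

```python
class PID:
    """Textbook discrete PID controller (illustrative, not tuned for any loader)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        # No derivative action on the very first sample (avoids a start-up kick).
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_servo_steering(target_angle_rad, duration_s=40.0, dt=0.01, valve_gain=1.0):
    """Return the steering angle history for a step in the steering wheel set point."""
    pid = PID(kp=2.0, ki=0.5, kd=0.05, dt=dt)  # assumed gains
    angle = 0.0
    history = []
    for _ in range(int(round(duration_s / dt))):
        valve_command = pid.update(target_angle_rad, angle)
        # Assumed plant: the valve command is proportional to angular velocity.
        angle += valve_gain * valve_command * dt
        history.append(angle)
    return history


if __name__ == "__main__":
    history = simulate_servo_steering(0.3)
    print("angle after 0.1 s:", history[9])
    print("final angle:", history[-1])
```

The angle reaches the set point only after the closed loop has settled, whereas in joystick and orbitroll control the operator's input acts on the valve directly. This settling time is one plausible source of the delay the drivers noticed.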
As mentioned before, the acceptable delay depends on the frequency of the
controlled process. In vehicle control, the process frequency depends on the vehicle
kinematics and the driving speed. In most work vehicles, the driving speed is low
(<40 km/h), and dynamic driving is utilized only in winter conditions. In the
experiments made by the author, vehicle control was trivial when the total delay was
less than 100 ms and relatively easy when it was below 0.5 s.
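The delay components discussed in this chapter can be combined into a simple delay budget and compared against the two thresholds reported above (100 ms and 0.5 s). The sketch below is illustrative only: the component values in the example are hypothetical and are not measurements from the experiments, and the function names are assumptions.

```python
TRIVIAL_LIMIT_MS = 100.0  # from the text: control is trivial below 100 ms
EASY_LIMIT_MS = 500.0     # from the text: relatively easy below 0.5 s


def total_delay_ms(components):
    """Sum the delay contributions (in ms) of one teleoperation loop."""
    return float(sum(components.values()))


def control_difficulty(total_ms):
    """Classify a total loop delay using the thresholds quoted in the text."""
    if total_ms < TRIVIAL_LIMIT_MS:
        return "trivial"
    if total_ms < EASY_LIMIT_MS:
        return "relatively easy"
    return "not covered by the reported thresholds"


if __name__ == "__main__":
    # Hypothetical budget; none of these figures are measured values.
    budget = {
        "control digitizing": 20.0,
        "transmission": 50.0,
        "video compression and decompression": 120.0,
        "remote loop": 100.0,
    }
    total = total_delay_ms(budget)
    print(total, control_difficulty(total))
```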
The human part, i.e. cognitive and decision-making ability, is the most important in
the control loop. In case of delay, human learning ability plays the main role. It
seems that the control delay is something that humans are used to. In all
experiments, the control delay was noticeable (0.3 to 1 s), and test drivers learned to
compensate for it in only a few minutes. However, it has to be stressed that
compensation is possible only when the process is controllable with the existing
delay.
Virtual models of a telerobot and its environment can also be used for short delay
compensation. These so called predictor displays [Sheridan, 1995] present the
estimated movements of the robot resulting from a control action in real-time. In this
case, the operator can see both the estimated immediate response and the real
delayed response (augmented reality). This model-based compensation is more
typical in cases of long delay (see next chapter).
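The idea of a predictor display can be sketched with a simple kinematic model. The example below is an illustration under strong assumptions and not the method of [Sheridan, 1995]: the vehicle is treated as a planar unicycle, the current commands are assumed to stay constant over the delay, and all names are hypothetical.

```python
import math


def predict_pose(x_m, y_m, heading_rad, speed_mps, yaw_rate_rps, delay_s, step_s=0.01):
    """Integrate a unicycle model forward over the loop delay.

    The returned pose is where a predictor display would draw the estimated
    vehicle, while the delayed video still shows the old pose.
    """
    steps = int(round(delay_s / step_s))
    for _ in range(steps):
        x_m += speed_mps * math.cos(heading_rad) * step_s
        y_m += speed_mps * math.sin(heading_rad) * step_s
        heading_rad += yaw_rate_rps * step_s
    return x_m, y_m, heading_rad


if __name__ == "__main__":
    # Driving straight at 2 m/s with a 0.5 s loop delay.
    print(predict_pose(0.0, 0.0, 0.0, 2.0, 0.0, 0.5))
```

The prediction is only as good as the model, so the estimate has to be re-anchored to the real feedback whenever a new measurement arrives.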
5.2 Move and wait teleoperation (long delay)
When the transmission delay increases enough, there is no possibility of direct
teleoperation. This is a typical situation in space applications where distances are so
long that the speed of light is the limiting factor in the delay. The only possibility in
these conditions is to increase the autonomy of the robot and use task-based move-
and-wait methods. From the operator's point of view, it would be easiest to make the
teleoperator highly autonomous and give it only long and demanding tasks to avoid
unnecessary operator control. However, in space conditions, the vehicle should be as
simple as possible, and only 100% sure tasks can be commanded because errors
cannot be allowed. The positive thing in space operations is that time is usually not a
limiting factor.
In Figure 20 (and Paper I), a configuration of a teleoperated Mars robot is shown. The
robot is connected by a tether to the lander. All communication between the operator
and the robot goes through the lander. The feedback of the robot movements and the
environment comes from the cameras, which are located in the lander. The control
sequence is as follows:
1. A stereo image of the rover and the environment is transferred to the operator.
2. A 3-dimensional environment model is created from the image data. This model can
be created either in the lander or on the ground.
3. The operator looks at the image and the model and plans a short trajectory for the
robot. This can be, for example, “drive 10 cm forward” or “turn 30 deg”. In the
trajectory planning, the most important thing is to be sure that the task can be
executed without problems.
4. The robot executes the task and a new image is taken and transferred to the
operator.
5. The operator gets visual feedback from the image, and plans a new task.
One must remember that in the case of Mars, the transmission delay between the robot
and operator is at least 15 min, and that the low speed of the transmission (about
1200 bit/s) increases the delay, especially in the case of visual feedback.
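The duration of one move-and-wait cycle can be estimated from the figures given above. The sketch below is a rough illustration, not a mission calculation: it assumes that the 15 min figure is a one-way delay, ignores the (short) transmission time of the command itself, and uses a hypothetical image size and execution time. The function names are illustrative.

```python
def image_transfer_time_s(image_bytes, link_bits_per_s):
    """Time needed to push one image through the link, ignoring protocol overhead."""
    return image_bytes * 8.0 / link_bits_per_s


def move_and_wait_cycle_s(one_way_delay_s, image_bytes, link_bits_per_s, execution_s):
    """Command uplink + task execution + image transfer + image downlink."""
    return (2.0 * one_way_delay_s
            + image_transfer_time_s(image_bytes, link_bits_per_s)
            + execution_s)


if __name__ == "__main__":
    one_way_delay_s = 15 * 60   # from the text, assumed here to be one-way
    link_bits_per_s = 1200      # from the text
    image_bytes = 30000         # hypothetical compressed image size
    execution_s = 60            # hypothetical time for "drive 10 cm forward"
    print(image_transfer_time_s(image_bytes, link_bits_per_s))
    print(move_and_wait_cycle_s(one_way_delay_s, image_bytes, link_bits_per_s, execution_s))
```

Even with these modest assumptions a single short movement costs over half an hour, which is why each commanded task has to be safe to execute without intervention.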
Figure 20: Operational scenario of a Mars robot
The models of the telerobot and its environment can be used in task planning like
that presented in the previous chapter.
6. Tele-presence
Tele-presence simply means that the operator feels that he is present at the
teleoperator site. Even a simple camera-monitor combination creates some level
of presence, but usually a more sophisticated system is required before it is called
telepresence. The most typical ways to create telepresence are cameras that follow
the operator’s head movements, stereovision, sound feedback, force feedback and
tactile sensing. To provide a perfect telepresence, all human senses should be
transmitted from the teleoperator site to the operator site. A good example of multi-
sense telepresence is presented by Caldwell [Caldwell, 1996]. His system provides
both hearing and vision in stereo mode, head tracking, tactile, force, temperature and
even pain feedback.
Vision, hearing and touch are relatively easy to transmit, but smell and taste are
more complicated. Fortunately, these two senses are rarely important in machine
teleoperation.
6.1 Vision
Humans get more than 90% of their perception information via vision. The human
vision sensors – eyes – are very complex opto-mechanical systems. They allow
stereovision, focusing, fast pointing, and a very wide field of view. The human field
of view is 180 deg horizontally and 120 deg vertically. The focused area is only a
few degrees, but movements and other interesting targets can be noticed from the
whole field. It is extremely difficult to manufacture a teleoperation system that can
imitate human vision and provide the operator with the same amount of information
as he could get at the teleoperator site.
In most visual feedback configurations, from simple monitor to complex telepresence
systems, the field of view is reduced because of the camera and monitor technology
used. In all monovision systems, the perception of distances is limited because of the
lack of depth view. In most cases, there is no need to build up any complex
telepresence systems. The simple vision feedback with static camera and monitor is
enough for most cases. As with delay compensation, learning also
compensates for the limitations in visual feedback.
In some cases, an advanced telepresence is needed. To create the presence, a human
operator has to be cheated into feeling that he is present at the teleoperator site. To
“cheat” a person is primarily to cheat his or her vision: to see is to believe. It was
Goertz who first showed that when the monitor is fixed relative to the operator’s
head, and the pan and tilt movements of the head drive the pan and tilt of the camera,
the operator feels as if he were present at the location of the camera. Even a
head-mounted display with tracked pan and tilt provides clear telepresence for the
operator. If the roll, and even the eye movements [Sharkey, Murray, 1997], are
tracked, the feeling is even more real.
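The coupling between the head tracker and the camera head can be sketched as a direct mapping with mechanical limits. This is a minimal illustration only: the one-to-one mapping, the limit values and the function names are assumptions and do not describe any particular servo head.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def head_to_camera(head_yaw_deg, head_pitch_deg, pan_limit_deg=90.0, tilt_limit_deg=45.0):
    """Map tracked head yaw and pitch to camera pan and tilt set points.

    The camera follows the head one-to-one until it reaches its assumed
    mechanical limits.
    """
    pan = clamp(head_yaw_deg, -pan_limit_deg, pan_limit_deg)
    tilt = clamp(head_pitch_deg, -tilt_limit_deg, tilt_limit_deg)
    return pan, tilt


if __name__ == "__main__":
    print(head_to_camera(30.0, -10.0))   # within limits: followed exactly
    print(head_to_camera(120.0, 60.0))   # beyond limits: camera saturates
```

When the camera saturates, the image no longer follows the head, which is one way the feeling of presence can break down.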
[Tachi et al., 1989] were amongst the first who developed a high performance
hardware system for telepresence (Tachi called it tele-existence) experiments.
Tachi’s system had a very fast 3 DoF head tracking which – together with a high
class HMD - provided a very good feeling of presence [http://www.star.t.u-
6.2 Hearing
The total range of human hearing is from 16 to 20 000 Hz. The smallest audible
intensity depends on the frequency; the minimum is between 1000 and 6000 Hz and
increases for lower and higher frequencies. In the control of a heavy work vehicle,
the noise of the machine is usually so high that the driver uses hearing protectors,
and can in general observe only the sounds of his vehicle. Despite damping, these
sounds are extremely valuable for the driver. In Paper II, a teleoperation experiment
with a mine drill machine is described. It was amazing how the operator could
operate the drill by hearing almost only the sounds of the machine. In the
experiments of Papers III – VI, it was also noticed that sound was even more
important to teleoperation than to manual driving because the drivers got no touch
response from the vehicle. The loading of the engine during a loading task, for
example, could be felt during manual loading, but only heard while teleoperating.
In teleoperation, the electrical transmission of sounds also makes it possible to tune
the intensity, and filter the non-informative noise away. It is difficult to create a tele-
presence without sounds.
6.3 Touch
From a very fundamental point of view, the touch or feel is the most important
human sense. Without vision or hearing, the human being can survive amazingly
well, but without the sense of touch, he would die relatively soon.
Human touch sensors – mechanoreceptors – are activated by touch, i.e. by pressure
on the tissues. These sensors are located throughout the human body. They sense the
positions and movements of joints, tension in muscles and touch on the skin. These
tactile sensors can be divided into two basic classes [Durlach, Mavor, 1995]:
1. tactile information, referring to the sense of contact with the object, mediated
by the responses of low-threshold mechanoreceptors innervating the skin
(say, the finger pad) within and around the contact region and
2. kinesthetic information, referring to the sense of position and motion of limbs
along with the associated forces conveyed by the sensory receptors in the
skin around the joints, joint capsules, tendons, and muscles, together with
neural signals derived from motor commands.
Touch is needed in all kinds of work where the human being is mechanically
interfacing with tools and the environment, i.e. in practically all work except thinking.
Even computer work is difficult without the sense of touch in the fingertips.
However, in the case of tools or machines, like heavy work vehicles, touch is not
focused on the actual task but on the control equipment of the machine. In some
teleoperation tasks like manipulation, the feedback of touch can help the operator to
a remarkable degree. Touch feedback can be divided into two types: force feedback
and haptic feedback.
6.3.1 Force feedback (kinesthetic information)
Force feedback means that the force generated by the teleoperator, usually a
manipulator, is fed back to the operator in order to generate a real response in
gripping and manipulation tasks. In mechanical manipulators, this feature was
inbuilt because it was the force of the operator that moved the manipulator. When
hydraulic and electrical servos replaced the direct mechanical contact, this inherent
force feedback was lost. The feedback then had to be generated artificially by
measuring the force at the actuator of the robot and reproducing it with an
additional actuator in the control equipment. In manipulation tasks, force
feedback is essential for good telepresence. Force feedback can also be used in
virtual environments to generate the feeling of presence.
6.3.2 Haptic feedback (tactile information)
In the wide sense, both the force and tactile feedback come under the term "haptic
feedback". In teleoperation, the main difference between a haptic interface and a
force feedback interface is the touch point. In force feedback, the muscular
(kinesthetic) sensors give the response to the operator. In the haptic feedback, the
tactile skin sensors have the main role. Usually in haptic interfaces, the tactile
sensing of the robot manipulator is fed back to the fingers of the operator. But it can
also be the vibration of the vehicle or the intensity of the camera that is fed back to
the human skin.
6.4 Vestibular sensors
Vestibular sensors are located inside the inner ear, and they are sensitive either to
angular acceleration and thus rotation, or to linear acceleration in the horizontal and
vertical plane, i.e. to gravity. This allows the position and movements of the head to
be detected. Vestibular sensing is important in all dynamic work tasks. The driver of
a heavy work vehicle gets a lot of information via his vestibular sensors, but also a
lot of annoying movements like vibrations. In teleoperation, the vestibular feedback
is not used because the control can be made without feedback in almost all situations,
and the vestibular feedback needs expensive mechanical structures. The lack of
vestibular feedback in the case where the operator is using a head-mounted display
generates a conflict of senses, which can generate simulator sickness (see Ch. 6.7).
Vestibular feedback is usually used in simulators to provide a very natural feeling of
presence (see Ch. 6.5).
6.5 Virtual and augmented presence
In virtual presence, the operator feels he is present in an environment that has been
artificially generated by a computer. Pure virtual environments are usually used only
in simulators and games. The most typical examples are flight simulators, which are
used both for entertainment and the real training of pilots. Flight simulators were
also the first systems where virtual reality was utilized. In flight simulators, the
accelerations of the “plane” are also simulated by moving the simulator (and
operator) with hydraulic actuators (Figure 21).
Figure 21: Finnair’s DC-10 flight simulator
In teleoperation, usually the virtual reality is used to augment the telepresence. This
is called "augmented presence" (or "augmented reality"). Augmented reality can be
used for prediction and planning, for example, in cases where the long time delay
disturbs the teleoperation. In prediction, the existing environment is modeled, and,
when the operator is operating, the estimated actions of the teleoperator are shown
virtually. The real actions are shown in the same display after the delay. This way the
operator can do direct teleoperation tasks despite the time delay. However, the
estimated actions must be corrected with the real feedback information every now
and then.
In Paper II and [Halme et al., 1997], an augmented reality system is presented where
the operator can create and correct the virtual model in telepresence. The
telepresence system consists of an HMD, a head-tracker and a 3 DoF servo head with
stereo cameras and a laser pointer. The virtual model is created in a PC with the WTK
program. A virtual (3D) model is fixed within the real world by updating it according
to the head-tracker information. The real image and virtual model are overlaid in a
video mixer, which makes it possible to blend the two sources steplessly.
In addition to this model, the virtual image also contains a graphical user
interface (GUI), which can be used with a mouse.
When a new, unmodeled object is noticed, the operator divides it into basic pieces,
which are modeled one by one. The basic pieces are: box, sphere, cylinder and cone.
Modeling is performed by naming the object and measuring a group of points from
the surface of the object with the laser pointer. The modeling software calculates and
draws a model of the object and places it in the overlaid image. If the size or the
position of the object does not match exactly, the user can move, rotate and scale the
object by using a mouse and the user interface included in the model.
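How a basic piece can be computed from measured surface points is illustrated below for the simplest case, a box. The sketch assumes that the box is aligned with the coordinate axes and that the measured points include its extreme corners; the actual fitting method of the modeling software is not described here, and the function names are hypothetical.

```python
def fit_axis_aligned_box(points):
    """Return (center, size) of the smallest axis-aligned box containing the points.

    points: iterable of (x, y, z) tuples, e.g. laser-pointer measurements.
    """
    points = list(points)
    if not points:
        raise ValueError("at least one measured point is required")
    xs, ys, zs = zip(*points)
    mins = (min(xs), min(ys), min(zs))
    maxs = (max(xs), max(ys), max(zs))
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(mins, maxs))
    size = tuple(hi - lo for lo, hi in zip(mins, maxs))
    return center, size


def scale_box(center, size, factor):
    """Manual correction step: scale the box about its center."""
    return center, tuple(s * factor for s in size)


if __name__ == "__main__":
    measured = [(0, 0, 0), (2, 0, 0), (0, 4, 0), (0, 0, 6), (2, 4, 6)]
    center, size = fit_axis_aligned_box(measured)
    print(center, size)
```

Spheres, cylinders and cones would need a proper least-squares fit, and in all cases the manual move, rotate and scale step remains the final correction.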
Figure 22: a) Real image, measured points and GUI. b) Real image and the model
based on the measured points
6.6 Enhancement Level of Tele (virtual)-presence
In this work, the manner in which the level of the telepresence affects the
performance of the operator in different work-vehicle tasks is studied. The word
“level” here refers to the level to which the telepresence system is advanced. In the
experiments, the telepresence concentrated on vision. The level was varied
from a static camera-monitor combination to servo cameras and an HMD via three
sublevels. Does an increase in the level also mean an increase in quality? How
can the quality be measured objectively?
Schloerb [Schloerb, 1995] presents an evaluation method for telepresence systems
where the objective evaluation is to indicate how well the defined task is performed.
This leads inevitably to a situation in which the defined task also affects the
evaluation result. In some tasks, it is possible that the lack of presence
actually improves the performance. Schloerb’s subjective evaluation is based on the
operator’s own feeling of how good the presence is. It seems that if
there is a clear mismatch between the objective and subjective evaluations, the
evaluated task makes the difference. Despite its slight limitations, Schloerb’s
evaluation uses the best criteria for measuring the quality of the telepresence.
Drascic [Drascic, 1991] compares the performances of mono- and stereovision in a
simple robot teleoperation task. In the experiments, the camera is static but
stereovision clearly provides a higher enhancement level (feeling of presence).
6.7 Problems of Tele (virtual) presence
If technical problems like delay and lack of bandwidth are set aside, the biggest
problem in telepresence-based and virtual-aided teleoperation is simulator sickness
(SS). Simulator sickness is very similar to motion sickness, and its symptoms
resemble those of motion sickness: apathy, general discomfort, headache, stomach
awareness, nausea, etc. The difference is that SS can occur without any actual motion
of the operator. SS problems are encountered especially when HMD-type displays are
used.
Individual                 Simulator               Task
age                        binocular viewing       altitude above terrain
concentration level        calibration             degree of control
ethnicity                  color                   duration
experience with real-      contrast                global visual flow
experience with            field of view           head movements
flicker fusion             flicker                 luminance level
gender                     inter-pupillary         unusual maneuvers
illness and personal       motion platform         method of movement
mental rotation ability    phosphor lag            rate of linear or
perceptual style           position-tracking       self-movement speed
postural stability         refresh rate            sitting vs. standing
                           scene content           vection
                           time lag (transport     type of application
                           update rate (frame
Table 1: Potential Factors Associated with Simulator Sickness in Virtual
Environments [adapted from Kolasinski, et al., 1995]
Kolasinski [Kolasinski, et al., 1995] has researched SS especially in simulator
(virtual) environments, but the results can also be transferred to telepresence
environments. The most typical cause of SS is cue conflict, in which different
senses receive conflicting information from the environment. A typical case, which
also occurs in teleoperation, is the conflict between visual and vestibular inputs. Other
possible causes are the quality of the displays, especially when an HMD is used, and
the time lags in vision and control. Potential factors associated with SS are shown in
Table 1.
7. Semi-autonomous Work vehicles
The automation level of work vehicles is still far removed from the level of factory
automation. The machines are still designed around the driver, and any automation is
only to improve performance, while the idea of replacing the driver has not been
taken seriously by the industry.
As described before, the main motivation in semiautonomous work vehicles is
money. Full autonomy is still more or less a pipe dream in the case of most
machines. Increasing the automation level is sensible as long as it boosts the
performance, either time or quality, of the work task. When the part of the task
that the computer can perform as well as the driver is automated, there will be
spare time for the driver. In a factory, this spare time could be used for extra work,
but in the case of a work vehicle, the spare time is more or less useless because drivers
do not usually have any other tasks to do. This problem can be solved by using
teleoperation, which allows the driver to use his spare time to control another
machine, or to do something else.
If a work vehicle can work autonomously for more than 50% of its work cycle, one
operator can control two or more machines and save manpower costs. From the
point of view of economics, the productivity of the automation investment can be
simplified to the calculation of costs (automation and teleoperation investments) and
savings (decreased manpower costs and savings in the driver infrastructure).
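The simplified cost and savings calculation can be sketched as follows. This is an idealized illustration: it assumes that the teleoperated phases of different machines never overlap and that switching between machines costs no time, and all cost figures in the example are hypothetical. The function names are illustrative.

```python
import math


def machines_per_operator(autonomous_fraction):
    """Idealized number of machines one operator can serve.

    autonomous_fraction: share of the work cycle driven autonomously (0 <= f < 1).
    """
    if not 0.0 <= autonomous_fraction < 1.0:
        raise ValueError("autonomous_fraction must be in [0, 1)")
    return int(math.floor(1.0 / (1.0 - autonomous_fraction)))


def annual_net_saving(machines, autonomous_fraction, operator_cost, automation_cost_per_machine):
    """Saved operator costs minus automation costs, per year."""
    operators = math.ceil(machines / machines_per_operator(autonomous_fraction))
    saved_operator_cost = (machines - operators) * operator_cost
    return saved_operator_cost - machines * automation_cost_per_machine


if __name__ == "__main__":
    print(machines_per_operator(0.5))    # half of the cycle is autonomous
    print(machines_per_operator(0.75))
    # Hypothetical yearly costs in an arbitrary currency.
    print(annual_net_saving(4, 0.5, 50000.0, 10000.0))
```

Savings in the driver infrastructure, which the text also mentions, are left out of this sketch and would be added to the savings side.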
The fastest development in work-vehicle automation has been in the field of mining.
The first industrial experiments and products in semiautonomous machines
have also come from the mining industry.
7.2 Related Work
In LKAB’s Kirunavaara mine in Sweden, [Erikson and Kitok, 1991] made the first
real-environment experiments with a semiautonomous LHD machine. The system
resembled the one presented in Paper V. The teleoperation was based on radio
control and video feedback from static cameras to monitor, while the automatic
driving was based on an underground signal cable and inductive sensor coils in the
vehicle. The loading was performed by teleoperation from a control room and the
hauling, dumping and driving back to the loading place were driven autonomously
by the vehicle.
The reaction of the test operators, who were all experienced LHD drivers, was
mainly positive due to the improvements in the working environment. The loading
and driving of an LHD machine by teleoperation did not cause any great problems for
the inexperienced operators. On the other hand, they noticed that it takes a
considerable amount of time for an automatic LHD system to reach a
satisfactory functionality level. The average production speed was 80-90% of that of
the manually driven LHD in the same mine. There was no mention of the operation of
several machines by one operator.
A full telepresence system with all trimmings is technically demanding. The system
easily becomes complex and expensive. In the case of heavy working vehicles, the
price is important, and so is the robustness of the system. Depending on the work and
the environment in which it is done, a simplified system is often sufficient, and in
some cases is even better than a more complex one. It is not very clear, however,
what the main factors affecting this are. In what follows, we try to illustrate some
related problems through a number of field tests.
The first part of the tests is done with a teleoperated test vehicle by simulating
different possible tasks in an unstructured environment. In the second part, the same
experiments are conducted in a structured environment with real vehicles doing real
8.2 Experiments in an unstructured environment (Papers IV and
8.2.1 Test Equipment
The study was started with general teleoperation experiments involving different
tasks where heavy working vehicles might be used. These tasks can be found in
forestry, earth-moving, construction etc. As the use of real machines was not
possible, the tasks were performed using a Honda all-terrain vehicle (ATV) called
Arska [Koskinen et al., 1993]. The teleoperation of the ATV, shown in Figure 23,
was implemented with a steering wheel and pedal combination that provided the
feeling of driving a normal car. The operator station is shown in Figure 3.
Communication between the control station and ATV was made with one pair of half
duplex radio modems. The telepresence equipment included a stereo-HMD, a head-
tracker, two monitors, two cameras, a laser pointer, a 2 DOF servo head (Figure 4),
two pairs of short range video links, a half duplex radio, and a pair of radio modems
for transmitting and receiving the head tracking data from tracker to servo head.
Figure 23: The test vehicle “Arska”
To study the effect of equipment level, the telepresence hardware was configured into
five different systems representing different enhancement levels in
vision and camera control:
1. Full telepresence (SYSTEM A)
stereovision, sound, 2DOF head tracking
2. Monovision telepresence (SYSTEM B)
monovision, sound, 2DOF head tracking
3. Monitor based telepresence (SYSTEM C)
image on screen, sound, 2DOF head tracking
4. Manual telepresence (SYSTEM D)
image on screen, sound, manual 2DOF camera control
5. Standard teleoperation (SYSTEM E)
image on screen, sound, fixed camera
Sound was left in each alternative because leaving it out was observed to deteriorate
the system radically.
Driving experiments were made in the university test field that consists of an uneven
ground surface mostly covered by hard sand, stones and vegetation. The experiments
included tasks that simulated real world driving tasks involving material handling
and transportation. The test tasks were the following:
1. “Corridor” driving
Driving on a defined route similar to a road, like ore transport in mine tunnels, etc.
The test was conducted by driving along a winding path that included narrow gates.
Crossing the path border and collisions with obstacles were counted as errors.
Figure 24: “Corridor” driving
2. Unknown terrain driving
Driving over an unknown area where there are a lot of obstacles and no specific
route, typical of forestry machines.
The test was carried out by driving into an unknown forest area where the operator had
to follow a natural path after perceiving it.
3. Loading tasks
The vehicle must take a load into its manipulator and move during the loading,
typical of different kinds of loaders.
The test was carried out by pushing boxes from one line to another with the aid of the
beak mounted on the front of the vehicle.
4. Maneuvering tasks
Maneuvering in close places, typical of forestry and loading machines.
The test was done on a “slalom” track, where the driver must dodge piles, stop on a
line and park in a given slot.
5. Fast driving
Driving with velocities of more than 5 m/s, typical of transporting tasks.
The tests were carried out by driving fast and stopping the vehicle in a given position
on an open field.
6. Off-road driving
Driving in areas where both obstacles and surface conditions can stop the vehicle.
The test was carried out by driving over uneven ground and crossing obstacles, like
ditches and stones.
Five men aged 26 - 40 were used as test operators. Two of them
were classified as experienced and the rest as amateurs. The following properties
were evaluated for each system:
• Ease of driving with continuous motion
• Ease of driving accurately
• Ease of navigation
• Perception of obstacles and unexpected objects in the environment
• Possible ergonomic drawbacks
The results of each test were evaluated by measuring overall execution time and the
number of errors during the test. Only the results obtained by the experienced
operators were counted. Verbal assessment concerning the properties of the system
was obtained from all operators.
When a particular test is repeated several times, the effect of learning can be clearly
seen. This effect is independent of the system used. In order to eliminate the effect,
only the best results obtained after a training period were taken into account. Another
point was the quality of vision in different system configurations. Due to some
problems with crosstalk between the video channels, the quality of stereovision was not
as high as it could have been. This influenced both execution times and verbal
As pointed out in Chapter 5.3, the effect of learning is very noticeable in tasks that
are repeatable in nature. This is illustrated in Table 2, where the first three execution
times of Test 1 (Corridor driving) by one of the experienced operators are given. The
system configuration is C, which was used when driving the test the very first times.
Effect of learning
1 2 3
Table 2: Execution times (min.) in Test 1 when an operator drives the test the first
As to the different systems, the operators reported soon after starting the experiments
that System D with manual camera control was much more difficult to use than the
others and ergonomically poor, so it was eliminated from further evaluation. A
comparison of the rest of the systems is given in Table 3. Here the fastest
execution times are presented for Tests 1, 3 and 4. It can be seen that the times are
somewhat shorter when using the helmet, but the simplest system, system E, also
has good values. The different systems can also be evaluated according to error
sensitivity. Table 4 illustrates error sensitivity, which is given for each system
as the relative number of errors calculated from the total number of errors registered
during the whole test period.
From this data, it can be clearly seen that systems A, B and C with a head-tracked
camera are less sensitive to errors than system E with a fixed camera. It may also be
concluded that stereovision somewhat reduces the error sensitivity.
Operators could accomplish Test 2 (unknown terrain driving) only when
using systems A, B and C. The biggest problem in this test was losing the path so
badly that it was not possible to find it again.
Table 3 (columns: Corridor, Maneuver, Loading): The shortest execution times of
Tests 1, 3 and 4 with different system configurations. REF is the reference execution,
which was driven manually. (Note: system D was used only in corridor driving)
Table 4 (columns: Corridor, Maneuver, Loading): Error sensitivity of different
systems. Note: relative number of errors in three test cases.
The turnable camera was necessary to overcome this problem; system E was not
suitable for this purpose.
It has to be stressed that the number of drives was too limited for proper statistical
analysis. More information was derived from the subjective evaluation, which was
made by interviewing the drivers. The following points were made during the
interviews:
1. It could not be said with certainty that stereovision helped in Tests 3 and 4,
which required accurate driving, although there were indications in this direction.
The situation could be operator dependent. It should also be noted that the HMD was
relatively old and that the video links were “different pairs”, i.e. the quality of
stereo pictures was not equal to mono and monitor pictures.
2. Mono-vision was considered better than stereovision because of the better
quality of picture. With pictures of equal quality, previous tests
[Drascic, 1991] showed that stereo was faster when performing difficult tasks
for the first time. After training, the difference between mono and stereo
decreased. Both mono- and stereovision caused simulator sickness in some
operators.
3. The use of monitor + head tracking was considered clumsy, especially if the
whole workspace of the camera had to be used. However, the image was
better than when using the HMD, and simulator sickness was not observed at
all.
4. The manual use of the camera while driving was not easy, and ergonomically
it was not feasible.
5. The fixed camera + monitor combination was judged as the best alternative
when the task and environment were familiar.
To summarize from the trials in an unstructured environment:
1. When operating in an unknown environment for the first time, the use of a
head-tracker and possibly servo cameras is justifiable.
2. HMDs can cause simulator sickness.
3. In most cases, after learning, the fixed camera + monitor is enough.
8.3 Experiments in a structured environment (Papers V and VI)
Previous tests produced generic information for teleoperation of basic work tasks,
but also left many questions that can only be answered in real tests with a real
machine. Tests with the same type of equipment were carried out with a full scale
LHD-machine. The difference was that this time the environment was an
underground construction site that formed clearly structured surroundings.
8.3.1 Test vehicle and equipment
The loader used weighed 40 tons and had a loading capacity of about 5 m³. The
vehicle was diesel powered with hydrostatic power transmission. The steering was
an articulated type, i.e. with a frame divided into two parts of approximately the
same size, which were connected by the steering joint. The bucket had two degrees
of freedom: lift and tilt.
The loader was equipped for full teleoperation. All the required driving actions could
be performed remotely from a remote control station.
The actual user interface in the remote control station was the control chair with a
steering wheel and pedals (Figure 25). The steering could be controlled either with
the steering wheel or with the joystick on the left handle of the chair. The throttle and
brake were each controlled with their own pedal. Engine start, gears, camera, etc.
were controlled with the buttons, switches and the other joystick on the right chair
handle.
The feedback data was shown on the monitor of the control PC. The video image
from the vehicle was shown on a separate monitor. Also, sound from the vehicle was
available. The control PC read the data from control devices (pedals, joysticks,
buttons, etc.) and sent it to the vehicle.
Figure 25: Control chair, steering wheel and pedals
The control data was transmitted with a pair of radio modems between the vehicle
and the control station. A leaky feeder cable system was used in order to cover the
whole test route of the construction site.
Two video channels and one sound channel were required for the tests. The leaky
feeder system used did not support video frequencies, so two analog video links
operating at 2.4 GHz were used. To ensure the connection throughout the whole test
route, three pairs of video receivers were located around the route. The appropriate
receiver was selected manually during a run.
8.3.2 Tele-existence equipment
The telepresence equipment was basically the same as that used in previous tests
[Halme et al., 97].
8.3.2.1 Head Mounted Display (HMD)
The old HMD used in previous tests was replaced with a new one. It had two 1.35”
active-matrix TFT-LCDs with VGA (640 x 480) resolution. The quality of the image
was evaluated as good, especially in terms of resolution, colors and sharpness. The
only negative comment concerned a clear distortion of the image, which was
probably caused by the HMD optics.
The position of the operator’s head was tracked with the same 6 DOF mechanical
head-tracker as in the previous tests. It provided fast and accurate position data, but
the mechanical connection to the operator limited his mobility and disturbed his
concentration.
The cameras were compact integrated lens video cameras with auto focus and 12 x
optical zoom. The field of view was 47º horizontal (wide) and 4º horizontal (tele). In
the tests, the wide mode was used all the time. Cameras were evaluated as very good.
8.3.2.2 Servo head
The servo head was also new. It was designed in the automation laboratory
specifically for teleoperation applications. The compact, robust head measured
300 x 300 x 300 mm. It provided turning angles of 180º in the vertical direction, and
a full 360º in the horizontal. The positions of the servo head could be sent through an
RS-232 or a CAN-bus. The performance of the head was equal to normal head
8.3.3 Test persons
A (age 40) was a professional loader driver. He had 20 years' experience operating
loaders and was also well experienced in loader teleoperation. He had driven several
thousand buckets with a system that was similar to the test system with the fixed
cameras.
B (age 43) was also a professional loader driver with several years of experience. He
also was well experienced with other types of work vehicles. He did not have any
teleoperation experience before these tests.
C (age 35) was a professional loader driver. He had no teleoperation experience.
D (age 30) was a research engineer in the automation laboratory (HUT). He had no
experience of loaders. He had teleoperated a laboratory test bed 'Arska' with all the
same telepresence configurations that were used in these tests. He also had
extensive experience of video and computer games and was an amateur pilot.
8.3.4 Tele-existence configuration
The telepresence system was configured to four different enhancement levels. These
levels were basically the same as in the previous experiments, except that the manual
control was omitted:
1. Fixed cameras and monitor
The image came from the fixed cameras pointing straightforward and backward.
Cameras were located on the centerline of the vehicle. The front camera was fixed to
the front part and the back camera to the back part of the vehicle. The image was
shown on a monitor ahead of the driver. The image was automatically switched
between the front and the back camera depending on the gear position. There was
also a possibility of taking a look in the opposite direction by pressing a button on
the handle of the operating chair.
2. Servo camera and monitor
The image came from the right camera of the servo head, which was located
on the top of the cabin about 1 m from the vehicle centerline. The operator controlled
the camera movements with the ADL head-tracker. The image was shown on a
monitor as in the previous configuration. The main pointing direction of the servo
head could be rotated 180° with a button.
3. Servo camera and mono HMD
Like the previous configuration, but with the monitor replaced by the HMD. The
image from the right camera was shown on both displays of the HMD.
4. Servo camera and stereo HMD
Like the previous configuration, but both cameras were used. The images from the
cameras were transferred to the two displays of the HMD to create a stereo image.
The sound feedback was considered to be so important that it was included in all the
configurations.
8.3.5 Test runs
The test runs were divided into two parts. In the first part, the drivers drove three runs
with all three steering configurations, using the first telepresence configuration
(fixed camera + monitor). The steering tests are not reported here. After that, they
chose the steering configuration they liked most. This procedure was criticized because
all the drivers then learned to drive with the same telepresence configuration. However,
there was no time for driving with all the configurations, so the test runs were
accepted despite this defect.
In the second part, they drove the remaining test runs with three telepresence
configurations while using the chosen steering configuration.
Problems in the leaky feeder communication forced the first two drivers, A and B,
to drive a shorter route. Test drivers C and D drove the full route. In both cases, the
communication did not work properly all the time. The vehicle control system
generated an emergency stop (e-stop) if two consecutive data packets were missed.
This caused several stops during the test runs. The runs were continued after the
stops, but this took a lot of time because the engine of the vehicle had to be
restarted after each e-stop.
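The e-stop rule described above can be illustrated with a small sketch. It assumes that packets are expected at regular intervals and that the stop stays latched until it is reset; the class name `PacketWatchdog` and its methods are hypothetical and are not taken from the vehicle control system.

```python
class PacketWatchdog:
    """Latch an emergency stop after too many consecutive missed packets."""

    def __init__(self, max_consecutive_missed=2):
        # Threshold from the text: two consecutive missed packets.
        self.max_consecutive_missed = max_consecutive_missed
        self.missed_in_a_row = 0
        self.e_stop = False

    def on_cycle(self, packet_received):
        """Call once per expected packet slot; returns True while e-stopped."""
        if packet_received:
            self.missed_in_a_row = 0
        else:
            self.missed_in_a_row += 1
            if self.missed_in_a_row >= self.max_consecutive_missed:
                self.e_stop = True  # assumed to stay latched until reset()
        return self.e_stop

    def reset(self):
        """Clear the latch, e.g. once the engine has been started again."""
        self.missed_in_a_row = 0
        self.e_stop = False


# Hypothetical reception pattern: a single dropout is tolerated,
# two dropouts in a row trigger the stop.
watchdog = PacketWatchdog()
pattern = [True, False, True, False, False, True]
states = [watchdog.on_cycle(received) for received in pattern]
```

In this sketch a single lost packet does not stop the vehicle, which is why isolated dropouts would pass unnoticed while short breaks in the radio link interrupt the run.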
8.3.6 Evaluation methods
Schloerb [Schloerb, 1995] presents an evaluation method for telepresence systems
where the objective evaluation is to conclude how well the defined task is performed,
while the subjective evaluation is based on the feeling of how good the presence is
from the operator’s point of view.
In our case, the evaluation was also divided into two parts: objective and subjective
evaluation. The objective evaluation was quite close to Schloerb’s definition; the
subjective evaluation was based on the comments of the drivers. These
comments concerned both the “goodness of the telepresence” and the performance in
In previous experiments, it was noticed that with the resources available it was
difficult to drive enough to get a sufficient quantity of data for proper statistical
analysis. In the objective evaluation, the performance was evaluated on the basis of
In the subjective evaluation, the comments of the drivers during driving were
recorded. Each driver was interviewed after each run, and when all the runs had been
completed.
In the objective evaluation, the following data was collected in order to evaluate the
performance of the driver:
1. Time of the run: The time of each run was measured, and the e-stop interruptions
were subtracted. The problem here was that the number of runs did not allow a
proper statistical analysis.
2. Errors: All the errors, which were mostly hits to the walls or instances of
emergency braking by the safetyman, were counted.
3. Logging the operator driving data: The steering and throttle movements made by
the operator were logged at 100 Hz. The data, shown in the time domain and
especially in the frequency domain, reveals very clearly how nervous the driving was.
4. Logging the vehicle positional data: The test vehicle also had navigation
equipment. The 2D position of the vehicle was calculated from the optical gyro and
velocity. In addition, speed, motor speed, angle of the middle joint, time and
angular speed were measured.
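The calculation of a 2D position from gyro and velocity data can be sketched as simple dead reckoning. The sketch assumes that the gyro delivers a yaw rate and that the velocity is the forward speed of the vehicle; the function name `dead_reckon`, the sample format and the Euler integration are illustrative assumptions, not the navigation software of the test vehicle.

```python
import math


def dead_reckon(samples, dt, x=0.0, y=0.0, heading=0.0):
    """Integrate (speed [m/s], yaw_rate [rad/s]) samples into a 2D track.

    Returns a list of (x, y, heading) tuples, starting with the initial pose.
    """
    track = [(x, y, heading)]
    for speed, yaw_rate in samples:
        heading += yaw_rate * dt              # gyro: integrate the turn rate
        x += speed * math.cos(heading) * dt   # advance along the new heading
        y += speed * math.sin(heading) * dt
        track.append((x, y, heading))
    return track


# Hypothetical input: 1 m/s straight ahead for one second, sampled at 100 Hz.
straight = dead_reckon([(1.0, 0.0)] * 100, dt=0.01)

# Hypothetical input: constant 1 rad/s turn at 1 m/s for about a quarter circle.
quarter_turn = dead_reckon([(1.0, 1.0)] * 1571, dt=0.001)
```

Errors in the gyro and speed signals accumulate in this kind of integration, so a track obtained this way drifts over long runs unless it is corrected by other means.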
After the first (steering) tests, the fixed cameras were replaced by the servo
camera(s). All the drivers tested all three configurations with the chosen steering
system. The result of those tests was a surprise. Unlike in the previous tests, the use
of the servo cameras did not help the driver at all. In fact, even with stereovision,
the results were clearly worse than with the fixed cameras. Both the number of
collisions with the walls and the driving times increased. In [Halme et al., 97] and
[Drascic, 91] it was shown that after the task was learned the performance difference
between the different enhancement degrees disappeared. Now, despite learning, the
performance in all cases involving the servo camera(s) was clearly worse.
In the evaluation, the logged operator data proved to be very informative.
Steering and throttle data showed clearly how fast and smoothly the driver could
drive. Figure 26 shows the operator data from the different drivers. It can be clearly
seen how drivers A and C drove faster and with fewer steering movements
(less nervously) than drivers B and D.
Figure 26: Raw throttle and steering data for all drivers (sticks, fixed cameras).
Panels: throttle (upper) and steering data of drivers A and B; throttle (upper) and
steering data of drivers C and D.
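One possible way to turn the visual impression of "nervous" steering into a number is to look at how much of the signal power lies at higher frequencies. The sketch below is only an illustration under stated assumptions: the 2 Hz cutoff, the function name `high_frequency_power_fraction` and the synthetic signals are hypothetical and were not part of the evaluation reported here. Only the 100 Hz logging rate comes from the text.

```python
import cmath
import math


def high_frequency_power_fraction(signal, sample_rate_hz, cutoff_hz):
    """Fraction of non-DC spectral power at or above cutoff_hz (naive DFT)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the constant offset
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):  # positive-frequency bins only
        coeff = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                    for i, c in enumerate(centered))
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate_hz / n >= cutoff_hz:
            high += power
    return high / total if total > 0 else 0.0


# Synthetic two-second steering logs at 100 Hz (illustration only, not measured data).
RATE_HZ = 100
times = [i / RATE_HZ for i in range(200)]
smooth = [math.sin(2 * math.pi * 0.5 * t) for t in times]
nervous = [s + 0.5 * math.sin(2 * math.pi * 5.0 * t) for s, t in zip(smooth, times)]

smooth_score = high_frequency_power_fraction(smooth, RATE_HZ, cutoff_hz=2.0)
nervous_score = high_frequency_power_fraction(nervous, RATE_HZ, cutoff_hz=2.0)
```

A driver who makes many small, fast corrections would score higher on such a measure than one who steers with few, slow movements. The direct DFT is quadratic in the number of samples, so a longer log would call for an FFT routine instead.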
8.4.1 Servo camera and monitor
It was found unergonomic to turn the cameras by means of the head and still look at
the fixed monitor. The driving times first increased and then started to approach
those obtained with fixed cameras; the times decreased when the drivers started to
keep their heads (and cameras) fixed.
Figure 27: Tele-existence aided driving. Driver (front) is using a servo camera and
normal monitor. Virtual passenger (back) is looking at the stereo image from servo
cameras with HMD.
The greatest change compared to the fixed camera configuration was the different
camera location. The fixed front camera was on the centerline on the front part of the
vehicle. The fixed back camera was on the back part. The servo cameras were
located at the left edge of the cabin (back part). This helped the loading, because the
bucket could be seen a little better when it was in the lowest position. However, with
the cameras located to the side of the centerline, it was difficult to center the vehicle
in the tunnel.
In the servo camera operation also, “information increased the pain”. When the
driver had the chance to see how near the corner was in a turn, he started to
correct the situation, even though the vehicle would have managed without his doing
so.
8.4.2 HMD with mono vision
Again errors and times increased from the fixed camera runs. In Figures 28 and 29,
the data from driver C with all four telepresence configurations is shown. It can be
seen that the throttle in particular is used much more carefully in the HMD + mono
configuration.
During driving, the drivers again started to keep the cameras as fixed as possible.
The times were still worse than with the fixed cameras. Driver C became nauseous, and
the others complained that they could not drive for long periods of time with the HMD.
It seems that an HMD, even a good quality one, is not suitable for use over a long
period of time.
8.4.3 HMD with stereo
Again errors and times increased from those of the fixed camera runs. Figures 28 and
29 show that the data for the runs with the HMD + stereo is slightly better than that
for the runs with the HMD + mono, but clearly worse than for those with the fixed
cameras. However, at this point, the drivers also made positive comments about the
servo vision, even though it strained their eyes. It created a real telepresence and
helped drivers in their distance estimation. This especially helped in the loading.
Figure 28: Throttle data; driver C, steering by using sticks. Fixed camera (top),
servo camera + monitor, HMD + mono, HMD + stereo.
Figure 29: Steering data; driver C, steering by using sticks. Fixed camera (top),
servo camera + monitor, HMD + mono, HMD + stereo.
An HMD always provides a feeling of existence. With real stereo, the feeling is
very strong. On a monitor, the movements of the vehicle are movements of the image.
With an HMD, the operator feels himself moving while he is sitting in a stationary
chair. This conflict of senses can cause simulator sickness (see Ch. 5.4).
The stereovision also causes a problem. When the mutual angle of cameras is fixed,
the operator has to correct the distortion with his eyes. This gives rise to eyestrain
after a while.
9. Visual flow in teleoperation
The “easy” driving in a tunnel with a fixed camera and normal monitor raised further
questions as to how the walls actually affect teleoperation. In [Halme et al., 1997],
it was noticed that on a normal route, where route markings are on the ground,
driving is easy until tight maneuvers are needed. In the bends, the route
disappears from the view of the camera if the camera cannot be turned
(this depends on the camera’s field of view (FoV)). Especially on an
unfamiliar route, this makes driving considerably more difficult. In a tunnel, this
situation does not arise because the walls are always visible in the bends regardless
of the camera FoV. But are there any other effects in tunnel teleoperation? Srinivasan
[Srinivasan et al., 2000] demonstrates how honeybees flying in a tunnel center
themselves by sensing the frequency differences between the textures of the walls
(optical flow). Although there is a great difference between the visual perception of
bees and humans, it was thought interesting to see whether the same phenomenon plays
a role in human driving. The test was imitated by replacing the honeybees with an
RC-car teleoperated by a human driver. The human eye is totally different from that of
a honeybee, and most probably the cognition of the two also differs. Nevertheless,
teleoperation in a tunnel remains comparable to the experimental tasks of Srinivasan.
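The centering idea can be made concrete with a small sketch. It assumes that the relevant cue is how fast each wall's texture appears to move in the image, which for a wall point directly abeam of the camera is the driving speed divided by the distance to the wall. The function names and the numbers are hypothetical, and the sketch is not a model of what the bees or the human drivers actually do.

```python
def wall_image_speed(vehicle_speed, wall_distance):
    """Angular speed (rad/s) of a wall point directly abeam of the camera."""
    return vehicle_speed / wall_distance


def flow_balance(vehicle_speed, dist_left, dist_right):
    """Normalized left/right flow difference in the range (-1, 1).

    Zero means the camera is centered. A positive value means the left wall
    is nearer (its image moves faster), which suggests steering to the right.
    """
    left = wall_image_speed(vehicle_speed, dist_left)
    right = wall_image_speed(vehicle_speed, dist_right)
    return (left - right) / (left + right)


# Hypothetical example: camera 0.15 m from the left wall of a 0.5 m wide tunnel.
offset_cue = flow_balance(vehicle_speed=1.0, dist_left=0.15, dist_right=0.35)
centered_cue = flow_balance(vehicle_speed=1.0, dist_left=0.25, dist_right=0.25)
```

Because the driving speed cancels out in the normalized difference, the cue depends only on the lateral position between the walls, which is what makes it usable for centering at any speed.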
9.1 Test Setup
The first idea was to use conveyor belt type walls where a “driver” could adjust the
camera between the walls. However, this was considered too artificial, and a full
teleoperation environment was built around an RC-car (Figure 30).
The “normal” RC-car controller with two sticks was considered too complicated for
people who are used to driving cars only; therefore the sticks were replaced with a
car type (and size) steering wheel and gas pedal.
The test tunnel was made from corrugated cardboard coated with white paper with
black stripes (Figure 8). The frequency of the stripes was different on each side of
the tunnel. The length of the tunnel was 20 m. The test drives were conducted in
tunnels 50 cm and 70 cm wide. (The width of the car was 20 cm.)
Figure 30: Small scale RC-car equipped with camera, videolink and headlight
9.2 Test drives
The test drives were carried out by seven persons, each driving the car back and forth
in eight tunnel configurations. The parameters of the test drives are set out in Table 5.
Tests were started in normal “daylight” conditions (Figure 31). However, it seemed that
the drivers could see many possible landmarks other than the stripes. To remedy this,
the “stripe effect” was strengthened by adding night (or real tunnel) driving to the test
program. In the light of the vehicle’s headlight, only the striped walls could be seen.