Classification: Restricted
WP Task No.: 3 3100 D302
Status: Draft    Rev.: 1
Date: 03/04/1997

Document Control Sheet
ESPRIT 20876 - ELSEWISE

Revision 1: WP3 - PDIT Support (draft report section)
Systems and technologies

Gerard Sauce, CSTB
Dee Chaudhari, BICC
Theo van Rijn, TNO
Alastair Watson, Leeds University

Distribution   _01        _02        _03
Date           28/1/97    14/3/97    3/4/97
TW             *          *          *
BOU            *          *          *
HBG            *          *          *
BICC           *          *
CAP            *          *          *
CSTB           *          *          *
ULeed          *          *          *
TNO            *          *          *
VTT            *          *
Table of contents
This work package aims to provide a critical evaluation of the key technology and methodology developments in PDIT which are likely to provide the tools to support and shape the LSE industry in the future. It concentrates on those technologies, methodologies and standards most likely to impact on working and thinking in LSE. A layered approach is adopted, considering first the supporting PDIT environment (Task 3000), then the systems and technologies (Task 3100), and finally the application software (Task 3200) (cf. Figure 1: Layered approach of the key technologies).
Figure 1 : Layered approach of the key technologies
Task 3100 deals with the intermediate layer between user applications and the supporting environment. The main characteristic of systems and technologies is their independence from the hardware on one side and from the users on the other. Software in this category has to be adapted to the technical domain, by specifying the technical knowledge and the particular environment of the user, in order to develop a specific application. Moreover, it is possible to find at least one software product implementing each of these technologies on every kind of platform.
At each stage, the objective is to provide an assessment of the existing technologies, the trends, and an evaluation of the relevance to LSE. The aim is to report what can be expected to exist by 2005. The methodology is to use existing sources, to review published material and to consult with the key players.
It is important to note that our objective of identifying key technologies does not include an analysis of the software market. Rather, we list and describe the main technologies in order to give the reader the general concepts and the key points needed to go further. The reader will find neither an assessment of specific software products, nor a testing protocol for choosing between them.
This report addresses the LSE industry, but some IT background is needed to understand it.
The scope of this task covers a wide range of systems and technologies, which we have grouped into four categories:
- Human Interface:
Human-Computer Interface considers anything which allows a user to interact with a computer. Examples are command line interpreters, windows, icons, menus, pointers or virtual reality. This does not include the various devices and means of communication between human and computer. We analyse Graphical User Interfaces, Virtual Reality, recognition technologies and voice synthesis.
- Data Management:
Data Management consists of controlling, protecting, and facilitating access to data in order to provide
information consumers with timely access to the data they need. Data modelling, object databases, distributed databases and access via the Internet are detailed.
- Knowledge management:
Within this chapter, the full sense of knowledge, including both objects of knowledge and the process of knowing, is considered. We analyse the three main aspects of knowledge management: representation, acquisition and usage.
- Virtual enterprise:
Even if this concept and the relevant technology are recent, several systems are already available to allow the creation of parts of a Virtual Enterprise. A Virtual Enterprise is defined as a temporary consortium of companies that come together quickly to explore fast-changing business opportunities. This definition is detailed through the communication level, the organisational level and the level of the virtual enterprise itself.
For each key technology, this report tries to answer the following four questions:
What exists today?
What are the barriers to LSE uptake?
What can we expect by 2005?
What is the relevance to LSE?
At the end of each chapter, a summary emphasises the main points, particularly the barriers and the relevance to LSE.
Two definitions can be associated with the abbreviation "HCI":
1 - Human-Computer Interaction: the study of how humans use computers and how to design computer systems which are easy, quick and productive for humans to use.
2 - Human-Computer Interface: anything which allows a user to interact with a computer. Examples are command line interpreters, windows, icons, menus, pointers or virtual reality.
In this work, the focus is put on the second definition of HCI, considering mainly the notion of interface. This problem area can be characterised by five key words: communication, language, interaction, process and ergonomics. The three main aspects below summarise these.
Human characteristics govern how people work and how they expect to interact with computers. To design the best interface, it is important to understand how humans process information, how they structure their actions, how they communicate and what their physical and psychological requirements are. Human information processing has been, and still is, the object of numerous research efforts which have led to various models: models of cognitive architecture, symbol-system models, connectionist models, engineering models, etc.
Language and communication are so natural to us that we rarely think about them. Human interaction using natural language depends not only on the words spoken but also on the tone of voice, which enables parties to interpret the semantics of the spoken message. In face-to-face interaction this is emphasised by physical body language, consciously or unconsciously shown by people. Developing a language between humans and computers underscores the complexity and difficulty of a language as a communication and interface medium.
Ergonomics. Anthropometric and physiological characteristics of people can be related and applied to workspace and environmental parameters. Some frequent problems people using computers face today are the
operation of the computer, using a pointing device, such as the mouse, and watching the screen. Labour laws
restrict the periods people are allowed to spend behind the computer, trying to prevent sight problems and
Repetitive Strain Injury (RSI).
Use and Context of Computers
The general social, work and business context may be important and has a profound impact on every part of
the interface and its success.
• The social organisation and the nature of work must be considered as a whole. This concerns points of view, models of humans, models of small groups and organisations, models of work, workflow, co-operative activity, offices, socio-technical systems, etc. Note an important point: human systems and technical systems mutually adapt to each other. Once a system has been written, most of the adaptation is done by the human. Indeed, he may be able to configure the software, but not much software will yet adapt (automatically or otherwise) to a user.
There are classes of application domains and particular application areas for which characteristic interfaces have been developed, for example: document-oriented interfaces (e.g., text-editing, document formatting, structure-oriented editors, illustrators, spreadsheets, hypertext), communications-oriented interfaces (e.g., electronic mail, computer conferencing, telephone and voice messaging), design environments (e.g., programming environments, CAD/CAM), etc.
Part of the purpose of design is to arrange a fit between the designed object and its use. Adjustments to fit can be made either at design time or at time of use, by changing either the system or, ultimately, the user, and changes can be made either by the users themselves or, sometimes, by the system.
Computer systems and architecture.
Computers have specialised components for interacting with humans. Some of these components are basically
transducers for moving information physically between human and machine (Input and Output Devices).
Others have to do with the control structure and representation of aspects of the interaction (basic software).
Devices must be constructed for mediating between humans and computers. They are mainly input devices
(keyboard, mouse, devices for the disabled, handwriting and gestures, speech input, eye tracking, exotic
devices such as EEG and other biological signals) and output devices (display, vector and raster devices,
frame buffers and image stores, canvases, event handling, performance characteristics, devices for the
disabled, sound and speech output, 3D displays, motion (e.g. flight simulators), exotic devices). This technical aspect is outside the scope of this chapter.
The software basic architecture and techniques for human-computer interaction cover four aspects:
a - Dialogue genre : The conceptual uses to which the technical means are put. Such concepts arise in any media discipline such as film and graphical design (workspace model, transition management, design, style).
b - Dialogue inputs : Types of input purposes (e.g., selection, discrete parameter specification, continuous
control) and input techniques: keyboards (e.g., commands, menus), mouse-based (e.g., picking, rubber-
banding), pen-based (e.g., character recognition, gesture), voice-based.
c - Dialogue outputs : Types of output purposes (e.g., convey precise information, summary information, illustrate processes, create visualisations of information) and output techniques (e.g., scrolling displays, windows, animation, sprites, fish-eye displays).
d - Dialogue interaction techniques : Dialogue type and techniques (e.g., commands, form filling, menu
selection, icons and direct manipulation, generic functions, natural language), navigation and orientation in
dialogues, error management, and multimedia and non-graphical dialogues: speech i/o, voice and video mail,
active documents, videodisk, CD ROM.
The Dialogue techniques are the subject of this section.
What exists today
We identified three groups within the existing technologies:
- Graphical User Interface,
- Virtual Reality,
- Other Interfaces
Graphical User Interface (GUI)
The style of graphical user interface invented at Xerox PARC, popularised by the Apple Macintosh and now
available in other varieties such as the X Window System, OSF/Motif, NeWS and RISC OS, are also known as
WIMP (Windows, Icons, Menus and Pointers or maybe Windows, Icons, Mouse, Pull-down menus).
A GUI uses pictures rather than just words to represent the input and output of a program. A program with a GUI runs under some windowing system (e.g. the X Window System, Microsoft Windows, Acorn RISC OS). The program displays icons, buttons, dialogue boxes, etc. in its window on the screen, and the user controls the program by moving a pointer on the screen (typically controlled by a mouse) and selecting certain objects by pressing buttons on the mouse while the pointer is pointing at them.
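To make these WIMP elements concrete, here is a minimal illustrative sketch (our own, not part of the reviewed material) using Python and its standard Tkinter toolkit: a window, a pull-down menu and a button selected with the pointer.

    import tkinter as tk

    root = tk.Tk()
    root.title("WIMP demo")            # the Window

    # A pull-down menu (the "M" in WIMP).
    menubar = tk.Menu(root)
    file_menu = tk.Menu(menubar, tearoff=0)
    file_menu.add_command(label="Quit", command=root.destroy)
    menubar.add_cascade(label="File", menu=file_menu)
    root.config(menu=menubar)

    # A button selected by pressing the mouse while the pointer is over it.
    tk.Button(root, text="Press me",
              command=lambda: print("button pressed")).pack(padx=40, pady=40)

    root.mainloop()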
Each type of GUI is available on several platforms. One can classify the main WIMP interfaces under three categories using a platform criterion:
- Workstation (mainly using UNIX as operating system),
- PC computer,
- Macintosh.
Furthermore, it is possible to find software packages running on one of the previous platforms which emulate the GUI of another platform.
GUI On workstation
X Windows:
Sponsoring body: initially developed by MIT's Project Athena, it became a de facto standard supported by the X Consortium.
X (X window) is a specification for device-independent windowing operations on bitmap display devices. X
uses a client-server protocol, the X protocol. The server is the computer or X terminal with the screen,
keyboard, mouse and server program and the clients are application programs.
X Windows is used on many Unix systems. OpenWindows is a server for Sun workstations handling X
window System protocol.
X has also been described as over-sized, over-featured, over-engineered and incredibly over-complicated.
Clients may run on the same computer as the server or on a different computer, communicating over the Internet via TCP/IP protocols. This is confusing because X clients often run on what people usually think of as their server (e.g. a file server), whereas in X it is the screen, keyboard, etc. which is being "served out" to the applications.
For instance, Solaris OpenStep for Sun workstations is implemented using the facilities of the Solaris Operating Environment, including the X11 windowing system.
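The client/server split can be illustrated with a short sketch; Python and the third-party python-xlib package are our assumptions, not references of this report. The client program below connects to the X server named in the DISPLAY variable and queries what that server "serves out".

    from Xlib import display

    d = display.Display()     # connect, as a client, to the server in $DISPLAY
    screen = d.screen()
    print("Connected to X server:", d.get_display_name())
    print("Screen:", screen.width_in_pixels, "x", screen.height_in_pixels, "pixels")
    d.close()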
Other GUI on Workstation
OSF Motif (Open Software Foundation - Motif). Sponsoring body: OSF (Open Software Foundation). Motif is the standard graphical user interface and window manager (responsible for moving and resizing windows and other practical functions). This is one of the choices of look and feel under the X Window system. Motif is based on IBM's Common User Access (CUA) specification, which permits users from the PC world to migrate to UNIX with relative ease.
Products implementation: Motif is used on a wide variety of platforms by a large population of Unix users.
Machine types range from PCs running Unix, to workstations and servers for major manufacturers including
Sun, Hewlett Packard, IBM, Digital Equipment Corporation, and Silicon Graphics.
In order to have one's software certified as Motif compliant, one must pay a fee to OSF. Motif runs under the X Window system. Developers who wish to distribute their OSF/Motif applications may do so as long as the license agreement is not violated. There are no non-commercial Motif toolkits available, and the Motif source from OSF is fairly expensive.
Open Look is a graphical interface and window manager from Sun and AT&T. This is one of the choices of look and feel under the X Window system. It determines the 'look and feel' of a system: the shape of windows, buttons and scroll-bars, how you resize things, how you edit files, etc. It was originally championed by Sun Microsystems before they agreed to support COSE (Common Open Software Environment).
OLIT, XView and TNT are toolkits for programmers to use in developing programs that conform to the OPEN LOOK interface:
- OLIT was AT&T's OPEN LOOK Intrinsics Toolkit for the X Window system;
- XView is Sun's toolkit for X11, written in C. XView is similar in programmer interface to SunView.
- The NeWS Toolkit (TNT) was an object-oriented programming system based on the PostScript
language and NeWS. TNT implements many of the OPEN LOOK interface components required to
build the user interface of an application. It's included in OpenWindows up to release 3.2, but is not
supported (and will not run) under OpenWindows 3.3 (based on X11R5).
WABI (Developer/Company: IBM)
A software package to emulate Microsoft Windows under the X Window System. Wabi 1.1 for AIX is actually ordered as a feature of AIXwindows(r), IBM's implementation of the advanced X Window system using the OSF/Motif(tm) interface standard. Since Wabi 1.1 for AIX is mapped to the AIXwindows interface on the AIX/6000 system, it provides all the functions of the industry-standard X Window system, including the ability to exploit Xstations at a lower cost-per-seat than fully configured workstations. What's more, X windowing can
improve application performance by offloading the graphics-intensive work from the server to the Xstations.
WABI is also widely available on SUN.
GUI On PC
MS-Windows, developed by Microsoft, was originally a windowing system running on top of DOS. It replaces the DOS prompt as a means of entering commands to the computer: instead of typing a command at the "C:" prompt on the screen, which entails typing the command correctly and remembering all the necessary parameters, a pointer on the screen is moved to an icon (small picture) that represents the command to be run. By pressing a button on a mouse, a window appears on the screen with the activated program running inside. Several windows can be opened at once, each with a different program running inside. The most important role of Windows has been to promulgate some interface standards. Although software designers have used the components of the Windows interface in different ways, in order to make their products unique, the same components are used in every Windows interface: windows, menus and commands, dialogue boxes, buttons and check boxes, icons.
The window system and user interface software released by Microsoft in 1987 was widely criticised for being too slow on the machines available.
The later releases (May 1994: Windows 3.1, then Windows 3.11) were more efficient, and Windows really became the standard GUI for the PC, used by developers.
The newer versions of MS-Windows (95 and NT) replace both MS-DOS and Windows 3 on workstations and are multi-tasking. A server version (Windows NT 3.5) is available today. These two versions are being replaced by Windows NT 4.
X Window Emulators:
Some software packages propose to emulate an X Window system, in order to turn the PC into an X terminal connected to an X server (X-Win32, Exceed 5, etc.). This allows the user to run UNIX applications from his PC.
GUI on Macintosh
Macintosh user interface
Originally developed at Xerox's Palo Alto Research Centre, this style of interface was commercially introduced on the Xerox Star computer in 1981; Apple later built a very similar version for the Macintosh.
The Macintosh user interface has become a very popular method for commanding the computer. This style of
user interface uses a graphical metaphor based on familiar office objects positioned on the two-dimensional
"desktop" workspace. Programs and data files are represented on screen by small pictures (icons) that look like
the actual objects. An object is selected by moving a mouse over the real desktop which correspondingly
moves the pointer on screen. When the pointer is over an icon on screen, the icon is selected by pressing a
button on the mouse. A hierarchical file system is provided that lets a user "drag" a document (a file) icon into
and out of a folder (directory) icon. Folders can also contain other folders and so on. To delete a document, its
icon is dragged into a trash can icon. The Macintosh always displays a row of menu titles at the top of the
screen. When a mouse button is pressed over a title, a pull-down menu appears below it. With the mouse held
down, the option within the menu is selected by pointing to it and then releasing the button.
For people who were not computer enthusiasts, managing files on the Macintosh was easier than using the MS-DOS or Unix command line interpreter, and this fact explains the great success of the Macintosh.
Unlike the IBM PC world, which, prior to Microsoft Windows, had no standard graphical user interface,
Macintosh developers almost always conformed to the Macintosh interface. As a result, users feel comfortable
with the interface of a new program from the start even if it takes a while to learn all the rest of it. They know
there will be a row of menu options at the top of the screen, and basic tasks are always performed in the same
way. Apple also kept technical jargon down to a minimum.
The latest version of the Macintosh user interface provides some of the MS-Windows functions, and thus the possibility to run software developed for MS-Windows. For instance, PowerPC Macintoshes have an easy way to instantly switch between the Macintosh operating system and the DOS or Windows 3.1 environment.
X Window Emulators:
Some software packages propose to emulate an X Window system, in order to turn the Macintosh into an X terminal connected to an X server. MacX 1.5 is an enhanced, easy-to-use software package for high-performance X Window System computing on Macintosh and other MacOS-based systems. MacX helps increase the productivity of Macintosh users in UNIX/VMS environments by enabling them to seamlessly run both network-based X applications and Macintosh applications on one Macintosh computer.
Virtual Reality
Virtual Reality is a set of computer technologies whose combination provides an interface to a computer-generated world, and in particular provides such a convincing interface that the user believes he is actually in a three-dimensional computer-generated world. This computer-generated world may be a model of a real-world object, such as a house; it might be an abstract world that does not exist in a real sense but is understood by humans, such as a chemical molecule or a representation of a set of data; or it might be a completely imaginary science-fiction world. A key feature is that the user believes that he is actually in this different world. This can only be achieved if the user's senses (sight, touch, smell, sound and taste, but primarily sight, touch and sound) give convincing evidence that he or she is in that other world. There must be no hint of the real world outside. Thus, when the person moves, reaches out, sniffs the air and so on, the effect must be convincing and consistent. A second key feature of Virtual Reality is that if the human moves his head, arms or legs, the shift of visual cues must be those he would expect in a real world. In other words, besides immersion, there must be navigation and interaction.
Three key words characterise current common Virtual Reality software:
- 3 dimensions,
- interactivity,
- real time.
The first condition for obtaining a reality base and a real immersion is to address sight. This sense takes on particular importance and refers to the nature of the reality being shown. One of the most important cues is the perception of depth. To achieve this illusion, it is necessary to have a 3D model of the world and to send slightly different images to the left and right eyes in order to mimic this sense. In this context, current displays are unsuitable tools, and if a user wants to have a very believable experience, he must use a specific headset. Recent research has shown that good quality 'surround sound' can be just as important to the sense of reality as visual cues.
VR allows the user to see how pieces inter-relate and how they interact with one another. Interactivity is considered the core element of VR. Rather than just passively observing it, users should be able to act on the environment, to control events which transform the world.
The speed of response of the system to movement and interaction is an essential aspect of how convincing the virtual world is. In order to obtain a natural speed, the system should perform a perfect synchronisation between the person moving his head, the visual cues following, and eventually the other senses.
Two fundamental aspects of VR can be identified: technology and software.
True interactivity in immersive VR requires adapted feedback, i.e. controlling the interaction not by a keyboard or mouse, but by physical feedback to the participant using special gloves, hydraulically controlled motion platforms, etc. The technology of VR includes headsets, data gloves, body suits and sound generation.
Software packages propose two types of tools:
- Tools for modelling the virtual world through a 3D representation (also sound for the most efficient ones), and for defining specific behaviours and interactions between objects. These tools, aimed at developers, are called world builders. A wide variety of world builders are available for every kind of platform.
- Tools for representing the world and allowing user immersion. They control the specific devices. They are named world browsers.
Nowadays, each package has its own model representation; no de facto standard emerges, except the Virtual Reality Modelling Language (VRML), which is a standard language for describing interactive 3D objects and worlds delivered across the Internet. For now, this language limits VR applications to one sense: sight. Indeed, user immersion is only performed in a 3D model of the world, which in turn opens access to the virtual world to users with ordinary computers and simple devices (keyboard and mouse). The next version of VRML will extend its functionality by providing sensors and allowing 3D animations.
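As an illustration, the short sketch below (our own, assuming Python; the scene content is invented) generates a minimal VRML 2.0 world file that any VRML 2.0 browser should be able to display.

    import textwrap

    # A single red box; "#VRML V2.0 utf8" must be the first line of the file.
    world = textwrap.dedent("""\
        #VRML V2.0 utf8
        Shape {
          appearance Appearance {
            material Material { diffuseColor 0.8 0.2 0.2 }
          }
          geometry Box { size 2 2 2 }
        }
        """)

    with open("world.wrl", "w") as f:
        f.write(world)
    print("Wrote world.wrl - open it in a VRML 2.0 browser")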
In practice only a small number of applications are available so far, mainly for testing further developments. The VR industry is immature, but in the medium term it will become a key component of the Human Interface.
Multimedia
Multimedia information is usually defined as combinations of different data types, like audio, voice, still and moving pictures, and drawings, as well as the traditional text and numerical data. Multimedia is a way to satisfy people's need for communication using computers. One aspect of multimedia is the amount of control exercised by the user, i.e. how the user interacts and "browses" through the information presented in a multimedia fashion, such as video disks or information kiosks. This is why we cannot really consider multimedia as a human-computer interface in itself. It involves several techniques used for HCI, especially graphic representation and audio/video effects, and adapted devices.
Document recognition technology
Beyond the specific devices, several systems rely on recognition technology for data entry and document management. Characters and graphical elements can now be recognised automatically. The main technologies are the following:
- Optical character recognition (OCR) is a process which transforms a page into ASCII text (see the sketch after this list). Scanning devices can capture a page of text in a few seconds with relatively little manual effort. Afterwards, OCR software can analyse and transform the image into character strings and text which can be edited, indexed, etc. Several methods of scanning text exist: analytical, morphological, statistical methods and "neural network" technology. An advanced OCR package also uses lexical context to improve accuracy. Success depends in large part on the quality of the original text.
- Optical mark reading: this variant of OCR is suitable when the user can describe specific forms to be searched.
- Intelligent Character Recognition (ICR): this technique is widely used for handwriting acquisition. The two main problems are the accuracy and the effective speed of ICR packages.
- Barcoding: a very efficient and suitable technology wherever data can be pre-coded, with a typical error rate of 0.05%.
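As an illustration of the OCR pipeline described above (page image in, editable text out), here is a minimal present-day sketch; Python, the Pillow library, the pytesseract wrapper around the Tesseract OCR engine, and the file name page.png are all our assumptions, not part of this report.

    from PIL import Image          # Pillow imaging library
    import pytesseract             # wrapper around the Tesseract OCR engine

    image = Image.open("page.png")               # a hypothetical scanned page
    text = pytesseract.image_to_string(image)    # image -> character string
    print(text)                                  # now editable, indexable text

Accuracy will, as noted above, depend largely on the quality of the scanned original.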
Voice synthesis and recognition
Several years of research have been conducted with the goal of achieving speech recognition by computers. Whereas today's systems are far from reaching the goal of real-time recognition of unconstrained human language, the technology has evolved to a point where it is useful in a number of applications. There are several classifications of voice recognition technologies, as follows:
- Speaker Independent vs. Speaker Dependent
Speaker-independent systems are designed to work for anyone without having to be trained to a specific person's voice. Speaker-dependent systems are trained to recognise a single person's voice. Speaker-independent systems work immediately for anyone, but usually have less accuracy and a smaller vocabulary than speaker-dependent systems. These systems are commonly used for commands such as "copy" or "paste" because the vocabulary is generally around 300 words. Speaker-dependent systems
require about one hour of training for a new user. They also require that the user makes corrections while dictating, to help the system keep learning. These systems have vocabularies of 30,000 to 60,000 words. There are also many specialised vocabularies available, mainly in the medical and legal areas. Vocabularies are often specific to a field such as radiology or emergency medicine. The accuracy of voice recognition can be up to 98% (low 90s is more realistic).
- Continuous Speech vs. Discrete Speech
Continuous speech is basically talking at a normal rate. Most of the voice recognition industry is currently researching continuous speech systems, but they are years away. There are optimistic predictions that they will arrive by the year 2000, but there are also critics doubting this. Discrete speech means pausing between each word. Voice recognition technology currently requires discrete speech. The pauses between words do not have to be long, but it definitely takes practice. Manufacturers claim up to 80 words per minute, but 60 wpm is probably more realistic. Speaking with pauses between words has the potential of being very distracting to the thought process.
- Natural Speech
Natural speech refers to the understanding of language. This is more of an ideal than a current goal. Voice recognition systems would then hypothetically function more like a human transcriptionist. This would allow saying things like "Replace those last two sentences with…" and knowing what to do with "um, ah". There are several helpful features in voice recognition systems. For example, some systems use the context to help select the correct word (night / knight). Some allow macros or templates that build a standard report, so that only the variances need to be dictated.
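To illustrate dictation with a speaker-independent recogniser of the kind classified above, the following minimal sketch assumes Python, the third-party SpeechRecognition package and a working microphone (via PyAudio); none of these are prescribed by this report.

    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:            # needs a microphone + PyAudio
        r.adjust_for_ambient_noise(source)     # recognisers are noise-sensitive
        print("Speak now...")
        audio = r.listen(source)

    try:
        # A speaker-independent recogniser: no per-user training required.
        print("You said:", r.recognize_google(audio))
    except sr.UnknownValueError:
        print("Speech was not understood")     # noise or out-of-phraseology input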
There are a number of speech synthesis systems on the market today. It is interesting to note that in artificial speech generation there is a trade-off between intelligibility and naturalness. Currently, the industry has placed the emphasis on naturalness, with the unfortunate consequence that even the high-end systems are sometimes hard to understand. Speech synthesis is generally regarded as a secondary and much less complex issue when compared to speech recognition and understanding.
The current trends in the industry are towards speaker-independent systems, software-based systems, and extensive use of post-processing. Of great interest is the emergence of spoken language systems which interpret spontaneous spoken input.
Speech recognition systems are being widely used by telephone companies, banks, and as dictation systems in many offices. These applications are highly constrained and often require the user to pause between each word (isolated speech recognition).
The main disadvantages of this kind of technology relate to:
- Accuracy: most speech understanding systems are capable of high accuracy given unlimited processing time. The challenge is to achieve high accuracy in real time.
- Ambient noise: speech recognisers are susceptible to ambient noise. This is especially true for speaker-independent systems which use speaker models developed in quiet laboratories. The models may not perform well in noisy environments.
- Out-of-phraseology speech: speech recognisers are not yet capable of understanding unconstrained human speech. Accordingly, applications are developed based on constrained vocabularies. The challenge is to detect out-of-phraseology speech and reject it before it is post-processed.
It is clear that speech is the most familiar way for humans to communicate; unfortunately, speech recognition systems are not yet capable of perfect recognition of speech in real time.
Barriers to LSE uptake
GUIs have taken on primordial importance in software development. Nowadays, old software using a simple alphanumerical interface (on VT100 terminals or the DOS operating system) is progressively being removed from companies. Apple first with the Macintosh, and more recently Microsoft with MS-Windows, have widely contributed to this new face of software packages. The restriction to a few de facto standards is really a good thing for users: it facilitates changing computers and reduces the adaptation period. However, current GUIs need more and more memory and performance, sometimes to the detriment of the application itself.
Considering Virtual Reality, this new technology is confronted with several problems currently limiting its uptake:
- Firstly, a technological problem, due essentially to poor performance: the objects of the virtual world do not move or react fast enough to provide natural scene movement, or need a supercomputer to obtain a reality base. The computations needed to render and display a 3D model are enormous, and consequently require an important investment.
- Secondly, the characteristics of the specific devices: the weight of headsets and gloves, communication between computer and devices through direct cable links, and the low resolution of the images are great disadvantages for realism.
- Thirdly, defining a virtual world model of high quality is a very long and expensive task. A great part of this work is currently "hand made". Some software packages propose automatic generation from CAD data, but the latter were not designed for this purpose and are consequently not really adapted. Furthermore, for a whole virtual world, the model size is usually enormous and leads to problems of memory and calculation time.
- Fourthly, the absence of a de facto standard limits investment in such technologies. Indeed, only big firms can invest in a software package which risks disappearing a few years later. If the package were standards compliant, the modelled worlds, data and results could be reused within another package.
Regarding other HCI technologies, LSE, like other industries, is waiting for more efficient and suitable tools. This is the case for voice and handwriting recognition.
Trends and expectations
Optimal exploitation of computers can currently be achieved only if staff adapt to machines. Human-Computer Interface (HCI) research aims to reduce this adaptation and produce a more 'user natural' environment. Five definitions of different areas of HCI research are the following:
- Interactional hardware and software: developing, studying and providing guidelines for various ways of interacting with computers, both at the software and hardware level.
- Matching models: studying how users interact with computers.
- Task level: determining how well systems meet users' needs.
- Design and development: studying designers and the design process.
- Organisational impact: studying the impact of new systems on organisations, groups and individuals.
Human-computer interfaces include speech technologies (speech synthesis and speech recognition), gesture recognition, natural language and virtual reality.
There is a new emerging generation of intelligent multimedia human-computer interfaces with the ability to interpret some forms of multimedia input and to generate co-ordinated multimedia output: from images to text, from text to images, co-ordinating gestures and language, and integrating multiple media in adaptive presentation systems. Over the past years, researchers have begun to explore how to translate visual information into natural language and, inversely, how to generate images from natural language text. This work has shown how a physically based semantics of motion verbs and locative prepositions can be seen as conveying spatial, kinematic and temporal constraints, thereby enabling a system to create an animated graphical simulation of events described by natural language utterances. There is an expanding range of exciting applications for these methods, such as advanced simulation, entertainment, animation and CAD systems.
The use of both gestures and verbal descriptions is of great importance for multimedia interfaces, because it simplifies and speeds up reference to objects in a visual context.
However, natural pointing behaviour is possibly ambiguous and vague, so that without a careful analysis of the discourse context of a gesture there is a high risk of reference failure.
Practically, for the next ten years, realistic expectations appear for the following technologies.
Two trends characterise the development of virtual reality software packages:
- Virtual reality in the pure sense, which includes total immersion addressing the three main senses (sight, sound and touch) using specific devices.
- Simplified Virtual Reality, based mainly on sight and sound, using an ordinary display. This trend will probably lead to very efficient tools. In fact it will add two fundamental functionalities to current graphical modellers: interactivity and pseudo-immersion in real time. It introduces the temporal dimension. Although these specifications are restrictive, the first available tools are very efficient and really interesting.
At this time, it is very difficult to identify an emerging de facto standard for Virtual Reality models. If VRML version 2 makes a real and positive contribution to this technology, a big step will have been taken in this direction. Moreover, VRML offers a new dimension to virtual reality: multi-user capability. VRML will evolve into a multi-user shared experience in the near future, allowing complete collaboration and communication in interactive 3D spaces. Any proposal must anticipate the needs of multi-user VRML in its design, considering the possibility that VRML browsers might eventually need to support synchronisation of changes to the world, locking, persistent distributed worlds, event rollback and dead reckoning.
In the past, we have been required to interact with machines in the language of those machines. With the advent of speech recognition and synthesis technology, humans can communicate with machines using constrained natural language, in an environment free of ambient noise. Much research is being conducted in the area of spoken
language understanding. Spoken language systems attempt to take the best possible result of a speech
recognition system and further process the result. A spoken language system is defined as a system
understanding spontaneous spoken input. Spontaneous speech is both acoustically and grammatically
challenging to understand.
Speech recognition applications have a tremendous opportunity for growth. Besides speaker-independent continuous speech, the future development of speech recognition technology will be dedicated to the following areas:
- Client/Server Speech Recognition: client/server software will allow speech applications to work over a wired or wireless network. This means that after users dictate from their workstations, they can press a "SEND" button and convey their recorded voice file to the server, where the speech engine will perform the recognition and return the results to the users.
- PC-Based Computer-Telephony Applications: dictation, call processing, and personal information manager (PIM) functions will be integrated into one system installed in a computer. For example, users will use voice to update their daily schedule and use a telephone voice menu to order products.
Because the prices of central processing units and digital signal processors are coming down, speech recognition systems are seen as affordable and feasible equipment for computers. The goal of speech recognition is to speak in a speaker-independent, continuous fashion to computers. However, the present situation for the most widely used speech recognition technology is "discrete": users need to pause between words so that the computer can distinguish the beginning and ending of each word. In the future, continuous speaker-independent speech recognition will become possible thanks to better processors and algorithms.
Optical Character Recognition (OCR) is already well established in commercial applications. The recognition of handwritten text is an important requirement for pen computing, but far from a trivial one. Many different initiatives have been taken to solve this problem; it is not yet known which is the best, and the results are still not perfect. It is important to stress two points:
- Handwriting recognition is no longer a new technology, but it did not gain public attention until recently.
- The ideal goal of designing a handwriting recognition method with 100% accuracy is illusory, because even human beings are not able to recognise every handwritten text without any doubt; e.g. most people find that they sometimes cannot even read their own notes. There will always be an obligation for the writer to write clearly. Recognition rates of 97% and higher would be acceptable to most users.
The requirements for user interfaces supporting handwriting recognition can easily be deduced from the pen-and-paper interface metaphor. First, there should be no constraints on what and where the user writes. It should be possible to use any special character commonly used in various languages; the constraint of using only ASCII characters is awkward, especially with the growing internationalisation that the world is facing these days. The ideal system is one that supports handwritten input of Unicode characters. The second requirement is that text
can be written along with non-textual input, e.g. graphics, gestures, etc. The recogniser must separate these
kinds of input.
Off-line recognition is done by a program that analyses a given text once it has been completely entered, hence there is no interaction with the user at this point. On-line recognition, on the other hand, takes place while the user is writing. The recogniser works on small bits of information (characters or words) at a time and the results of the recognition are presented immediately. The nature of on-line systems is that they must be able to respond in real time to a user's action, while off-line systems can take their time to evaluate their input: speed is only a measure of performance, not of quality.
Until recently, only the recognition of hand-printed writing had been studied, but with the growing widespread use of pen-based systems, cursive writing is becoming important. Most systems do not yet support the recognition of cursive (script) writing. Printed characters are much easier to recognise because they are effortlessly separated and the amount of variability is limited.
Pen-based systems are already available: Microsoft (Windows for Pen Computing), GO (PenPoint), Apple (Newton), General Magic (Magic Cap, Telescript). In the next decade, new results in handwriting recognition research will make these systems more and more attractive.
Relevance to LSE
Computers are now imperative to realise LSE projects with high quality for companies and users. This means improving industrial actors' productivity and reducing time to market through prototyping, digital mock-ups, etc. In this context, everything which facilitates the usage of computers (and consequently increases productivity) is interesting for LSE.
In order to detail the relevance of HCI to LSE, let us focus on two important stages of LSE projects:
- Design stage:
During the design phase, the first problem is the perception of the project, which consists of objects that have no real physical existence, i.e. virtual objects. We can consider at this stage two points of view: the designers' viewpoint, and that of the others (client, environment, ...). While designers have a relatively good understanding of the project, it is usually a very difficult task to convey this vision to the others. The assistance of Virtual Reality will represent a primordial advantage in terms of communication and client perception. There are also expectations for designers: VR will add a new dimension to their project vision, which is currently limited by display size. The notion of immersion will allow them better command and control of the project. VR will also enable a better relation to be made between the design stage and the construction stage by assessing produceability / constructability through visualisation of the product design, i.e. closing in on Design for Construction.
Obviously, designers are also interested in GUIs for their daily work, in order to spend more time working on the project than communicating with the computer.
- Construction stage:
Here, the two main components are related to the location of the actors: in the office or on site.
In the office, Virtual Reality is again interesting, in order to present a permanently updated view of the site, precisely to fill the gap with reality due to the distance or the difficulty of access to the project.
On the other side, actors on site, sometimes in difficult situations, need more ergonomic ways to communicate with the computer, and simply to use it to increase productivity and quality. All recognition techniques (voice, handwriting, gesture) will facilitate HCI in this context.
GUIs have taken on primordial importance in software development. The restriction to a few de facto standards is really a good thing for users: it facilitates changing computers and reduces the adaptation period. On the other hand, these de facto standards introduce a dependence on specific platforms.
The LSE industry, like other industrial domains, is always waiting for more ergonomic and new ways to interact with computers:
- As input, recognition technologies (voice and handwriting) promise efficient applications soon.
- Virtual Reality offers new possibilities which will completely change HCI, for input communication as well as for output.
LSE actors have to contribute to the development of Virtual Reality by experimenting with this technology in two directions:
- for whole-project perception, for project communication, but also for engineering design.
One of the biggest problems of IT is that the project model stored in the computer is more and more abstract, and more and more difficult to perceive as a whole. Virtual Reality appears as a means to extend the limits of current displays, in order, for instance, to facilitate the presentation of the project to the client or its integration within the environment. The extension to other senses and the capacity for interaction will be a complementary advantage, notably for engineers, allowing them new working possibilities.
- Virtual Reality will probably stir up HCI: for instance, the Virtual Enterprise needs new kinds of ergonomics, and one can easily imagine software which proposes a virtual environment with virtual actors, meetings, and so on.
The LSE industry has to explore this technology to be able to establish a set of specifications in order to influence future software tools.
First of all, it seems necessary to define the notion of Data Management. At the basic level, the term DATA characterises numbers, characters, images or other methods of recording, in a form which can be assessed by a human or (especially) input into a computer, stored and processed there, or transmitted on some digital channel. Data on its own has no meaning; only when interpreted by some kind of data processing system does it take on meaning and become information. For example, the number 5.5 is data, but if it is the length of a beam, then that is information. People or computers can find patterns in data to perceive information, and information can be used to enhance knowledge.
In the past, the Data Element (the most elementary unit of data identified and described in a dictionary or repository, which cannot be subdivided) evolved from a simple alphanumerical item to become more complex. Nowadays, a Data Element may represent facts, text, graphics, bit-mapped images, sound, or analog or digital live-video segments. Data is the raw material of a system, supplied by data producers and used by information consumers to create information.
The term management has evolved too. At the beginning, its main sense considered mainly the data point of view: Data Management consists of controlling, protecting, and facilitating access to data in order to provide information consumers with timely access to the data they need. The notion of Database refers to a collection of related data stored in one or more computerised files in a manner that can be accessed by users or computer programs via a database management system.
A Database Management System (DBMS) is thus an integrated set of computer programs that provide the capabilities needed to establish, modify, make available, maintain the integrity of and generate reports from a database. It has evolved through four generic forms:
1 Hierarchical DBMS (1960s) - records were organised in a pyramid-like structure, with each record linked to a parent.
2 Network DBMS (1970s) - records could have many parents, with embedded pointers indicating the physical location of all related records in a file.
3 Relational DBMS (1980s) - records are conceptually held in tables, similar in concept to a spreadsheet. Relationships between the data entities are kept separate from the data itself. Data manipulation creates new tables, called views.
4 Object DBMS (1990s) - Object-Oriented DBMS combine the capabilities of conventional DBMS and object-oriented programming languages. Data are considered as objects. Another class of ODBMS associates relational DBMS and object capabilities.
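The relational generation (form 3) can be illustrated with a small sketch using Python's built-in sqlite3 module (our own example, reusing the 5.5 m beam of the introduction): records held in tables, with data manipulation producing a view kept separate from the stored data itself.

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE beam (id INTEGER PRIMARY KEY, length_m REAL)")
    con.executemany("INSERT INTO beam (length_m) VALUES (?)",
                    [(5.5,), (3.2,), (7.1,)])

    # A view: a derived table, kept separate from the data itself.
    con.execute("CREATE VIEW long_beams AS SELECT * FROM beam WHERE length_m > 5")
    for row in con.execute("SELECT * FROM long_beams"):
        print(row)    # -> (1, 5.5) and (3, 7.1)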
In a parallel direction, the Data Modelling methods used to analyse data requirements and define the data structures needed to support the business functions and processes of a company have evolved a lot. These data requirements are recorded as a conceptual data model, a logical map that represents the inherent properties of the data independent of software, hardware or machine performance considerations. The model shows data elements grouped into records, as well as the associations between those records. Data modelling defines the relationships between data elements and structures, with associated data definitions.
Database schemas became more logical; this means that some methodologies allowed developers to elaborate schemas more and more independently from any specific DBMS.
These methodologies are an essential part of Data Management technology.
Over the last ten years, the main basic problems such as data access, security and sharing, which represent the fundamental functionalities of a DBMS, have been complemented by an essential aspect: distribution, not in a homogeneous context (hardware and software) but in a heterogeneous environment. The development of networks (Intranet and Internet) has created new kinds of possibilities, users and needs for DBMS.
Client/server systems and distributed systems using networks are the current challenge of Data Management systems.
Regarding the increasing complexity of data, new definitions of data management systems have been proposed, more end-user oriented. The objective is to give more sense to information in order to draw maximum benefit from the stored data. Indeed, a company holds a lot of information that is not really useful for any goal other than the one it was designed for. It is therefore necessary to organise the whole body of information manipulated by a company or a project.
Two main classes of such systems are now in rapid development. The first one is product oriented, and the second one business oriented.
- Product Data Management Systems (PDMS) or Engineering Data Management Systems (EDMS):
The fundamental role of these systems is to support the storage of technical documentation and the multiple processes related to the product's design, manufacturing, assembly, inspection, testing and maintenance (the whole life-cycle). In projects characterised by globally distributed design and complex production processes, the task of managing all product-related information is of first importance for project success. With the introduction of CAD/CAM/CAE systems, the amount of product information has increased tremendously, resulting in the need for management systems to control the creation and maintenance of technical documentation. This implies the use of well-defined configuration management processes and software tools capable of controlling and maintaining all the data and processes needed for the development and implementation of the whole product.
The product being a transversal object shared by several companies, P&EDMS involve heterogeneous environments and distributed data.
- Business oriented
- Material Requirements Planning Systems:
Implementation of the material requirements planning process has been ongoing for several years. MRP systems have evolved from pure Material Requirements Planning into integrated business support systems, which integrate information about the major business processes involved in manufacturing industries. Support exists for registration of prospects and customers, acquisition, proposals and sales orders, the associated financial procedures, inventory and distribution control, and global production and capacity planning, with a limited capability for planning, analysis and derivation of management information. The majority of these systems are registration systems by nature; advanced planning tools are lacking. Because of the registrative nature of the systems and their lack of flexibility in querying the available information, no easy management information can be produced which might be used to manage and control a business.
- Data Warehouse Management Tools:
An implementation of an informational database that allows users to tap into a company's vast store of operational data, to track and respond to business trends and to facilitate forecasting and planning efforts. It corresponds to a manager's viewpoint. From client management to stock management, the whole set of data of a company is distributed over several databases, managed by different DBMS. The objective of Data Warehousing is to arrange internal and external data in order to create pertinent information, easily accessible by managers to assist them in their decision-making. This aims at drawing maximum benefit from the huge mass of information available in the company. The two main characteristics of a data warehouse are, firstly, to consider the whole company and not only a part of it like a department, and secondly, to analyse information on a time scale. Managers want to analyse the evolution of information and the origins of change; this means maintaining a historical record in order to really have decision information to better appraise the trends. Data mining is a mechanism for retrieving data from the data warehouse (a small sketch follows below). In fact many companies would claim that they have been using this sort of tool for a number of years. However, when there is a significant amount of data, retrieval and analysis are complex and potentially slow. Data mining derives its name from the similarity between searching for valuable business information in a large database and mining a mountain for a vein of valuable ore. Both processes require either sifting through a large amount of material or intelligently probing it to find out where the value resides. Data mining tools allow professional analysts to make a structured examination of the data warehouse using proven statistical techniques. They can establish cross-correlations which would be difficult or impossible to establish using normal analysis and query tools.
There are a number of companies who claim to offer tools and services to implement Data Warehousing systems, e.g. IBM, Oracle, Software AG, OLAP databases, Business Objects, Brio
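As a small illustration of the cross-correlation idea mentioned above, the sketch below assumes Python and the pandas library; the warehouse columns and figures are invented, not taken from this report.

    import pandas as pd

    # Hypothetical monthly figures drawn from several operational databases.
    warehouse = pd.DataFrame({
        "orders":      [120, 135, 150, 160, 180, 200],
        "site_delays": [4, 3, 5, 2, 2, 1],
        "turnover_k":  [310, 340, 365, 400, 450, 490],
    })

    # Pairwise cross-correlations over the historical record.
    print(warehouse.corr())

Such a structured statistical pass over the historical record is the kind of examination a data mining tool industrialises.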
In spite of their different objectives, the problems involved in these two cases are relatively similar and can be summarised by several keywords:
- Meta model: this means integrating information from the different sources (marketing, production, etc., or design, manufacturing, etc.) at a conceptual level.
- Heterogeneous environment: this second aspect of integration takes place at the hardware level.
- Distributed databases: this characterises sharing of, and access to, information across multiple platforms.
- Historical records of data, in order to manage the evolution of the product or of company activities.
- Navigators, information filtering and analysis tools to exploit the mass of information. System objectives induce specific tools; for instance, graphical aspects are indispensable for EDMS, whereas financial analysis tools are specific to data warehouses.
What exists today
In terms of technology, four points seem very interesting: data modelling, OODBMS, distributed database management and access to database applications running over the Internet.
Data modelling aims at analysing and expressing in a formal way the data requirements needed to support the functions of an activity. These requirements are recorded as a conceptual data model with associated data definitions, including relationships. The three following points summarise the objectives of such a method:
1 To build data models : the first goal consists of defining a model which represent as faithfully as
possible the information needed to perform an activity. It is important to place this work at a conceptual
level to be independent from a specific DBMS.
2 To communicate models : in order to validate the data organisation, as many readers as possible
should analyse the proposed data model. A fundamental function of a method is therefore to allow
uninitiated people to get a global view of a data model, to perceive its main hypotheses and to
analyse its details very quickly.
3 To capitalise knowledge : a data model and its context (i.e. activities) represent an important part of the
specific knowledge of an application domain. Being placed at a conceptual level, this analysis should
increase the global knowledge of the domain and contribute to a better understanding of it, on
condition that the modelling method ensures the durability of the knowledge.
These methods represent the keystone of data management : the efficiency of the whole system depends directly
on the quality of the data models.
One can list three classes of methods :
1 Methods for relational DBMS : these methods clearly distinguish two aspects, functions and data, and,
for the latter, entities and relationships.
2 Object Oriented methodologies : this kind of method is characterised by an incremental and iterative
approach. Certain methods are pure object whereas others combine a functional approach with the
object approach.
3 STEP methodologies : the international standard ISO 10303 STEP prescribes a specific methodology
for analysing the activity of an application domain (IDEF0) and for building the Application Reference Model
(EXPRESS). This approach is completed by implementation specifications (SPF, SDAI).
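To make the notion of a conceptual data model concrete, the sketch below expresses a tiny invented model in Python, used here only as a neutral notation; a real model would typically be written in a method-specific language such as EXPRESS, and the entities chosen are assumptions for the example.

    from dataclasses import dataclass, field
    from typing import List

    # Invented fragment of a conceptual model : entities are classes and the
    # relationship "a Project is composed of Elements" is a typed attribute,
    # stated independently of any particular DBMS.

    @dataclass
    class Element:
        name: str
        material: str

    @dataclass
    class Project:
        title: str
        client: str
        elements: List[Element] = field(default_factory=list)

    bridge = Project("River crossing", "City council")
    bridge.elements.append(Element("Deck", "prestressed concrete"))
    print(bridge)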
All these modelling methods have the same drawback : in fact they are not really independent of the software
context. This is mainly a consequence of the developers' interests, developers being the actors most interested
in the result of the modelling. Indeed, capitalising and increasing the application domain
knowledge is not yet a primary objective of such an action.
Relational representation is still a relevant technology and proposes efficient solutions for
numerous data management problems. Object Oriented DBMS (OODBMS) is a more recent technology developed to take
maximum advantage of Object Oriented Languages. OODBMS are attractive because of their ability to
support applications involving complex data types, such as graphics, voice and text, which are not correctly
handled by relational DBMS. A number of reasonably mature OODBMS products exist, and are widely used in
certain specialised applications. Increasingly, these products are being enhanced with query and other facilities
similar to those provided by relational DBMS and are being applied in more general purpose applications.
Another class of DBMS, the Object-Relational DBMS (ORDBMS), constitutes the most recent development.
ORDBMS are based both on the results of research on extending relational DBMS and on the emergence of viable
OODBMS. Current individual products represent different mixtures of these capabilities.
The appearance of these two classes of product indicates a general convergence on the object concept as a key
mechanism for extending DBMS capabilities. Without going into further detail of OODBMS technology, it is important to
note that considerable standardisation activity affects the development of this technology, including object
extensions to SQL (the relational standard query language) being developed by the ANSI and ISO standards
committees, and proposed standards for OODBMS defined by the Object Database Management Group.
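As a hint of why object persistence suits complex data types, the following self-contained Python sketch stores a nested structure as a whole, using the standard shelve module purely as a stand-in for an OODBMS; the design object is invented, and this is an analogy, not a real OODBMS API.

    import shelve

    # Invented nested structure standing in for a complex design object.
    beam = {"name": "Main beam", "profile": {"type": "I", "height_mm": 600}}

    # An object store keeps the object graph whole; a relational mapping
    # would split it into tables and reassemble it with joins.
    with shelve.open("design_store") as db:
        db["beam"] = beam

    with shelve.open("design_store") as db:
        print(db["beam"]["profile"]["height_mm"])   # 600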
Distributed Data Base Management (DDBM)
A distributed database manager is responsible for providing transparent and simultaneous access to several
databases that may be located on dissimilar and remote computer systems. In a distributed environment,
multiple users physically dispersed in a network of autonomous computers share information. Regarding this
objective, the challenge is to provide this functionality (sharing of information) in a secure, reliable, efficient
and usable manner that is independent of the size and complexity of the distributed system.
An abundant literature is dedicated to DDBM, which can be summarised by the following functions :
- Schema integration : one can identify different levels of schema :
- A global schema, which represents an enterprise-wide view of data, is the basis for providing
transparent access to data located at different sites, perhaps in different formats.
- External schemata show the user's view of the data.
- Local conceptual schemata show the data model at each site.
- Local internal schemata represent the physical data organisation on each computer.
- Location transparency and distributed query processing : in order to provide transparency of data access
and manipulation, database queries do not have to indicate where the databases are located; they are
decomposed and routed automatically to the appropriate locations, and access the appropriate copy if several
copies of the data exist. This allows the databases to be located at different sites and to move around if
necessary.
- Concurrency control and failure handling, in order to synchronise updates of data for concurrent users
and to maintain data consistency under failure.
- Administration facilities enforce global security and provide audit trails.
A DDBM will allow an end user to :
- Create and store a new entity anywhere in the network;
- Access an entity without knowledge of its physical location;
- Delete an entity without having to worry about possible duplication in other databases;
- Update an entity without having to worry about updating other databases;
- Access an entity from an alternate computer.
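A minimal sketch of the location transparency idea behind this list : the client names an entity, and a directory layer decides which site actually serves the request. Everything here (site names, routing table, return value) is invented for illustration; a real DDBM would also handle replication, concurrency and failures.

    # Hypothetical routing layer : maps each entity class to the site holding it.
    SITES = {
        "clients": "db://paris/clients",
        "stocks":  "db://london/stocks",
    }

    def fetch(entity: str, key: str) -> str:
        # The caller never states where the data lives; the router decides.
        site = SITES[entity]                  # location lookup
        return f"value of {key} fetched from {site}"

    print(fetch("clients", "client-42"))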
The sharing of data using DDBM is already common and will become pervasive as these systems grow in scale
and importance. But the problem mainly consists in combining the object concept, distributed technology and
standardisation. The generalisation of this kind of system depends directly on the development of standards.
Three main candidate technologies address distributed object oriented techniques.
- Remote procedure call technologies such as DCE (Distributed Computing Environment) : this technique
provides low level distribution semantics and requires a high programming effort.
- Microsoft's distribution technology (also called CAIRO or Distributed COM) will only be available on
Microsoft systems, and is not yet really marketed.
- OMG distributed objects : the Object Management Group aims at promoting the theory and practice of
object technology for the development of distributed computing systems. OMG provides a forum and
framework for the definition of a general architecture for distributed systems called the Object
Management Architecture. This architecture is divided into five components. The Object
Request Broker (ORB), known as CORBA, is the communication heart of the standard : it provides an
infrastructure for object communication, independent of any specific platform, and a technical foundation
for distribution. It specifies a generic distributed programming interface (API) expressed in the Interface
Definition Language (IDL), as well as the architecture and interface of the Object Request Broker
software that provides the software backplane for object distribution. The other parts of the
specification concern Object Services (life-cycle, persistence, event notification, naming), Common
Facilities (printing, document management, database, electronic mail), Domain Interfaces designed to
perform particular tasks for users, and Application Objects.
The market for CORBA products evolves very fast, and several software products claim to be CORBA
compliant, but usually they do not implement all CORBA functionalities.
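A CORBA product cannot be reduced to a few lines, but the remote-object idea an ORB implements can be suggested with Python's standard xmlrpc library, used here purely as an analogy; the service name and method are invented and this is not CORBA itself.

    # Server side : expose an object whose methods can be invoked remotely.
    from xmlrpc.server import SimpleXMLRPCServer

    class Catalogue:
        def lookup(self, part_id: str) -> str:
            return f"description of {part_id}"

    server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
    server.register_instance(Catalogue())
    # server.serve_forever()   # uncomment to actually run the server

    # Client side : the proxy plays the role of an ORB stub, making the
    # remote object look like a local one.
    from xmlrpc.client import ServerProxy
    catalogue = ServerProxy("http://localhost:8000")
    # print(catalogue.lookup("P-123"))   # remote invocation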
Nowadays the technology allows true distributed data management, and efficient DDBM will soon be available.
Database applications running over Internet
The development of the Internet leads to new possibilities in terms of communication with databases : a
dynamic database interaction within Web pages. These data-driven pages are created through the use of
technologies such as CGI and Java to link a server database with a client browser. CGI (Common Gateway
Interface) scripts are executed on the server; client-side scripts are interpreted (not compiled) by the client,
whereas Java is compiled on the server before execution on the client.
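A minimal sketch of the CGI mechanism in Python : the script runs on the Web server for each request, queries a database and returns an HTML page to the browser. The database file, table and parameter name are invented for the example.

    #!/usr/bin/env python3
    # Runs on the server; the browser only ever sees the generated HTML.
    import os
    import sqlite3
    from urllib.parse import parse_qs

    params = parse_qs(os.environ.get("QUERY_STRING", ""))
    pattern = params.get("title", ["%"])[0]        # search term from the page

    conn = sqlite3.connect("projects.db")          # hypothetical database
    rows = conn.execute(
        "SELECT id, title FROM projects WHERE title LIKE ?", (pattern,))

    print("Content-Type: text/html")
    print()
    print("<html><body><ul>")
    for row_id, title in rows:
        print(f"<li>{row_id} : {title}</li>")
    print("</ul></body></html>")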
The problem of security is of the first importance over the Internet. The communication link can be made secure
through the Secure Sockets Layer (SSL). SSL is an industry-standard protocol that makes substantial use of
public-key technology. SSL provides three fundamental security services, all of which use public-key
techniques :
- Message privacy : message privacy is achieved through a combination of public-key and symmetric-key
encryption. All traffic between an SSL server and an SSL client is encrypted using a key and an encryption
algorithm negotiated during the SSL handshake.
- Message integrity : the message integrity service ensures that SSL session traffic does not change en
route to its final destination. If the Internet is to be a viable platform for electronic commerce,
we must ensure that vandals do not tamper with message contents as they travel between clients and
servers. SSL uses a combination of a shared secret and special mathematical functions called hash
functions to provide the message integrity service.
- Mutual authentication : mutual authentication is the process whereby the server convinces the client of
its identity and (optionally) the client convinces the server of its identity. These identities are encoded in
the form of public-key certificates, which are exchanged during the SSL handshake.
SSL is designed to make its security services, negotiated during the handshake, as transparent as possible to the end user.
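This transparency can be seen in a short Python sketch : the application opens an SSL-protected connection and the handshake negotiates keys and algorithms without further programming effort. The host and port are examples only.

    import socket
    import ssl

    # wrap_socket performs the SSL/TLS handshake : key exchange, algorithm
    # negotiation and (here, server-side) certificate authentication.
    context = ssl.create_default_context()
    with socket.create_connection(("example.com", 443)) as raw:
        with context.wrap_socket(raw, server_hostname="example.com") as tls:
            print(tls.version())   # negotiated protocol version
            print(tls.cipher())    # negotiated cipher suite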
This technology is already available and several products are marketed, based on different Web browsers,
specific DBMS and operating systems.
Barriers to LSE uptake
LSE already uses database management technology, but certainly not as much as it could. Although the
development of certain techniques such as distributed databases is quite recent, the main reason is probably
a knowledge problem. Indeed, the keystone of data management is the data model, and the latter is not
the responsibility of the computer science side but the concern of the LSE industry.
Consequently an important effort has to be made to formalise LSE activity and the LSE information flow.
However, data modelling methodologies seem not really adapted to end users. Three aspects of these methods
have to be improved :
- Validation : nowadays, no methodology or tool allows end users to appraise the quality of a data
model and to validate it. Usually this model appears to end users as a very strange presentation of their own
knowledge, and they are totally dependent on developers.
- Dynamic aspect : data evolution is a fundamental aspect of an LSE project, but data modelling
currently has some difficulty in addressing this aspect.
- Capitalisation : if the LSE industry wants to go forward, it is important to accumulate this kind of
knowledge, which is not theoretical but representative of the know-how and practical processes of the
company.
Another aspect of data modelling is the standardisation of data models in order to facilitate
communication between firms and even within a company. The international standard ISO 10303
STEP provides the international context in which such actions can be developed.
Finally, the main problem of data management is the preservation of data during the life-cycle of a product. In
many cases the life of a product, especially one in construction and civil engineering, easily spans more than
thirty or forty years, during which the data associated with the product must remain available and be kept
up-to-date. Not only must the data be available, it should also be possible to interpret the data correctly.
Until recently, drawings represented the design specification of a product. Although electronic CAD systems
are used to produce drawings, storage and archiving are still done on paper drawings. The paper drawing is
still the official version of a product design specification. It is the paper version of the drawing which must
remain available for the product's life-time, including the results of design changes, maintenance adaptations
and reworks. Today we are still able to read and interpret paper drawings made thirty years ago. We are even
able to read and interpret design drawings the Romans made for their public baths. Archiving paper drawings
relies on the preservation of the medium bearing the drawing, e.g. paper. Some decades ago, large paper
archives were compacted using micro-fiche. The retrieval and viewing of micro-fiche archives now
depends on some form of translation from a storage medium to a means of representation. Archiving is still
done on the original data, i.e. the printed paper, reduced in size. No interpretation step is
involved, and the compaction/de-compaction process is a purely physical one.
Only quite recently has the notion of product models arisen. A Product Model is a representation of a product in
electronic, computer-readable form, used by various computer applications, each interpreting the model
according to its own view and needs. To enable different programs to interpret the same product model, a
standard and interchange mechanisms are needed.
Storage of a Product Model in electronic form over long periods also needs standards to ensure the data will
mean the same in, say, twenty years. Another concern then arises : the medium on which to save the data for
archiving purposes. Today we are unable to read data which was produced by an office application of 8
years ago. Nor is it guaranteed that we can still read 5¼" computer disks made four years ago, because of
wear of the disk.
This problem has been identified in several areas of industry which have a high data density. One of the trends
in this area is the foundation of small to medium companies who make a living from archiving and retrieving
data for a purpose : data services companies. The idea is to enable the user to get hold of the correct product
data throughout the life-time of a product or service. The data services company, for a fee, takes care that
upgrades to new versions of standards, tools, storage applications (e.g. databases) and new media are carried
out. Such companies can only thrive on the basis of standards, used and maintained.
Trends and expectations
Current trends aim at adding value to database management systems : on the one hand, to specify more end-user
oriented DBMS; on the other hand, to develop intelligent database environments. The goal is to manage
higher-level information. The concepts of Data Warehouse and Engineering Data Management are at the beginning of
their development, and will offer much more powerful possibilities in the coming years.
The semantic aspect induced by this high level of data makes it possible to develop reasoning capabilities, in order to provide :
- declarative query languages, to ease interrogation,
- deductive systems; for instance, query association permits returning additional relevant information not
explicitly asked for by users,
- query optimisation, to reduce transaction times,
- integration of heterogeneous information sources that may include both structured and unstructured
data.
From a technical point of view, OODBMS and distributed data management will evolve in performance and in
security. The Object Database Management Group continues its standardisation activity regarding
object definition and query languages.
Numerous research projects deal with distribution. They aim at improving transaction times and security, and at
establishing better connections using Web browsers. New technologies are in progress, especially the persistent
distributed store, which provides a shared memory abstraction.
Relevance to LSE
As for every kind of activity, data management represents the foundation of the LSE industry. Whether
considered from a product viewpoint or from a firm viewpoint, LSE activity constitutes a heterogeneous
environment and manipulates information of a high semantic level, recorded on different supports.
The association of data management, distribution and new communication technologies is proving of the first
importance.
LSE is interested in DBMS with added value such as EDMS and the Data Warehouse. Engineering Data
Management systems constitute an interesting response in terms of project information management, whereas
the Data Warehouse will interest the firm as a whole.
The LSE industry is also heavily involved in standardisation activities, not directly in computer science standards,
but in establishing the product data models. We have shown that data models are the keystone of data
management. They contain an important part of LSE knowledge, still not yet formalised.
Nowadays, data management is based on sound technology and a lot of efficient software tools are available.
The LSE industry already uses this technology in different parts of its companies. A lot of progress is expected
from object representation, probably better adapted to project design. Consequently, the next evolution of
OODBMS is awaited with great interest.
At the present time, the exchange and distribution of information are too limited (mainly the exchange of
graphical documents). A higher semantic level within exchanges would allow the different actors of a project to
take more benefit from the new information technologies. Developing data models, the keystone of information
exchange, needs sound standards at two levels :
- Data modelling methodologies, especially to facilitate the understanding of models and their validation.
Communication is important for LSE actors, allowing them control of these models, but also
guaranteeing the continuity of the knowledge contained within the models. No current modelling
methodology offers an efficient means of model validation. This validation phase is based only on the
competence of the model designer; no technical, analytical or mathematical validation is performed.
- Product data models : the LSE industry has to take responsibility for the development of its models. Actions have
to be intensified to elaborate adapted models, notably in the context of the STEP standard.
The previous chapter analysed data management; the subject here is knowledge management. But
what is the difference between data and knowledge? Whereas data represent the basic pieces of information,
the term knowledge covers several definitions, depending on the user's viewpoint. It can be assimilated to the
process of knowing or to a superior level of information. In this work, we will not debate this definition and
will consider the whole sense of knowledge, including both the object of knowledge and the process of knowing.
Two terms can be associated with knowledge : management and engineering. The main
difference is that the manager establishes the direction (administrative, executive) the process should take,
whereas the engineer develops the means to accomplish this direction. Knowledge Management thus relates
to the knowledge needs of the firm : what decisions to make and what actions to enable. Knowledge
Engineering develops the technologies satisfying Knowledge Management needs.
In fact, we will develop here the knowledge technologies, dependent on the knowledge engineer (usually a
computer scientist), which allow managers to develop policies for enterprise-level knowledge ownership.
Understanding knowledge is the key to the Knowledge Management process : how we acquire knowledge,
how we represent it, how we store it, how we use it, and what the applications are. We therefore identify three
main parts in knowledge management :
- Knowledge representation
The representation or modelling process consists of defining a model (or several models) able to
represent knowledge as reliably as possible. The first result of the modelling process is to formalise the
knowledge for accumulation, communication and sharing. The second aim is to establish a
model which can be managed and used by computer programs.
- Knowledge acquisition
Even the smallest knowledge-based system needs a huge quantity of knowledge in order to be realistic. The
knowledge acquisition process is critical to the development of any knowledge management program. It
requires a complete understanding of the nature and behaviour of knowledge, and then the development
of specific procedures to acquire this knowledge. The objective of identifying and creating
dynamic knowledge bases is itself a knowledge process. It involves knowledge learning and
communication techniques, modelled on human (or child) behaviour.
- Knowledge usage
Accumulating knowledge is in itself essential for an enterprise : it represents the
memory of the firm. But this interest alone is minor; the challenge is to develop a daily exploitation
of the knowledge base in every kind of action : technical, management, financial, etc. The possibilities
include decision making, diagnosis, control, training, command and so on. The number of potential
applications is endless.
The next paragraph will present the existing technologies in accordance with these three aspects.
What exists today
The first step in the knowledge management process consists of choosing a knowledge representation.
Indeed, the natural language used by everybody is not adapted to developing a knowledge base.
But the problem is that nowadays no universal model of knowledge representation exists. The existing
formalisms depend on the application (diagnosis, decision making...) or on the knowledge domain (natural
language, technical applications, medical science, etc.). In the past, two trends developed : the first oriented
towards the knowing process (procedural knowledge representation), the second towards knowledge
description (declarative knowledge).
Nowadays the trend is to develop more universal knowledge representations.
Without going into details or considerations of pure computer science, it is possible to identify four current
approaches to knowledge representation :
- Production rules
- Logical form
- Conceptual graphs
- Object oriented representation
Production rules
A rule consists of two parts :
- An antecedent or situation part, usually represented by IF <condition>. This condition is expressed by
means of atomic formulae and logical statements formed using logical connectives such as AND, OR, NOT.
- A consequent part or action, represented by THEN <conclusion>. The consequent may include several
conclusions or actions.
Antecedents can be considered as patterns, and consequents as conclusions or actions to be taken. Each rule
represents an independent grain of knowledge : a rule cannot refer to another one, and contains
on its own all the conditions of its application. In addition, the rule base of a production system does not specify
an order of application of the rules.
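A minimal forward-chaining interpreter makes these properties concrete : each rule is self-contained, no application order is given, and a rule fires whenever its conditions hold. The rules and facts below are invented for the example.

    # Each rule : (set of condition facts, fact to conclude).
    RULES = [
        ({"load_high", "span_long"}, "use_prestressed_concrete"),
        ({"use_prestressed_concrete"}, "check_anchorage"),
    ]

    def forward_chain(facts: set) -> set:
        changed = True
        while changed:
            changed = False
            for conditions, conclusion in RULES:
                # A rule fires when all its conditions are known facts.
                if conditions <= facts and conclusion not in facts:
                    facts.add(conclusion)
                    changed = True
        return facts

    print(forward_chain({"load_high", "span_long"}))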
Various methods have been used to deal with uncertain or incomplete information in such a knowledge base.
- Certainty factors : this consists of assigning a number to each rule indicating the level of confidence in
this piece of knowledge. Usually these numbers, in the range -1 to 1, 0 to 1 or 0% to 100%, are combined
with factors of evidence associated with facts to deduce factors of evidence for the conclusions.
- Fuzzy logic : the theory of fuzzy sets can be used to represent information with blurry or grey
boundaries. In this theory, an element may belong to a set with a degree of membership between 0 and 1. A fuzzy
set contains elements with assigned degrees of membership. The main problem is to construct the membership
functions.
- Probabilities : Bayes' theorem is well known in the statistics and probability literature. It provides a
method for calculating the probability of an event or fact based on the knowledge of prior probabilities. The
difficulty of this method is to determine the probabilities of the observed facts independently of one another.
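The three approaches can be contrasted in a few lines of Python. The combination formula is the classical MYCIN-style rule for two positive certainty factors, and all the numbers and the fuzzy scale are invented for illustration.

    # Certainty factors : MYCIN-style combination of two positive factors.
    def combine_cf(cf1: float, cf2: float) -> float:
        return cf1 + cf2 * (1.0 - cf1)        # both assumed in [0, 1]

    print(combine_cf(0.6, 0.5))               # 0.8

    # Bayes' theorem : P(H|E) = P(E|H) * P(H) / P(E).
    def bayes(p_e_given_h: float, p_h: float, p_e: float) -> float:
        return p_e_given_h * p_h / p_e

    print(bayes(0.9, 0.1, 0.2))               # 0.45

    # Fuzzy membership : degree to which a span is "long" (invented scale).
    def is_long(span_m: float) -> float:
        return min(1.0, max(0.0, (span_m - 20.0) / 30.0))

    print(is_long(35.0))                      # 0.5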
Logical form
This representation is based on mathematical logic. The most basic logical system is propositional logic.
Each basic element of the system, or proposition, can be either true or false. Propositions can be
connected to each other using the connectives AND, OR, NOT, EQUIVALENT, IMPLIES, etc. First-order
predicate logic is composed of :
- atoms (symbols),
- functions and predicates (a function with one or more atomic arguments which produces a boolean
result),
- two substatements joined by a conjunction, disjunction or implication,
- negated substatements,
- statements with an existential or universal quantifier (in this case, atoms in the statements can be
replaced by variables of the quantifier).
For instance, "Bob has a bicycle" might be given the following logical form :
∃B : Bicycle(B) ∧ Has(Bob, B)
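Over a finite universe such a formula can be checked mechanically; the following Python fragment, with invented facts, evaluates the example.

    # Facts : the set of bicycles, and a "has" relation stored as pairs.
    bicycles = {"b1", "b2"}
    has = {("Bob", "b1"), ("Alice", "b2")}

    # Exists B : Bicycle(B) and Has(Bob, B)
    result = any(("Bob", b) in has for b in bicycles)
    print(result)   # True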
Logical form is a very descriptive declarative representation, with a well-founded method of deriving new knowledge
from a database. Its flexibility makes it a good choice when more than one module may add to or utilise a
common database. Unfortunately, this flexibility has limitations : to maintain consistency, learning must be
monotonic, which limits its effectiveness when domain theories are incomplete.
This representation has been extended in several directions to correct these weaknesses and to support new types
of application. These extensions of logic include, but are not limited to, closed world reasoning, defaults,
epistemic reasoning, queries, constraints, temporal and spatial reasoning, and procedural knowledge.
The integration of logical form with other formalisms, such as object-oriented languages and systems, type
systems, constraint-based programming, rule-based systems, etc., represents most of the current development of
this representation. Applications concern numerous areas such as natural language, planning, learning, databases,
software engineering, information management systems, etc.
Conceptual graphs
Conceptual graphs (CG) are a graph knowledge representation system invented by John Sowa which integrates
a concept hierarchy with the logic system of Peirce's existential graphs. Conceptual graphs are as general as
predicate logic and have a standard mapping to natural language.
Conceptual graphs are represented as finite, bipartite, connected graphs. The two kinds of nodes of the graphs
are concepts and conceptual relations which are connected by untyped arcs. Each conceptual graph forms a
proposition. The main characteristics of Conceptual graphs are the following :
- A concept is an instantiation of a concept type. A concept by itself is a conceptual graph, and conceptual
graphs may be nested within concepts. A concept can be written in graphical notation or in linear
notation.
- A pre-existing hierarchy of concept types is assumed to exist for each conceptual graph system. A
relation < is defined over the set of concept types to show which concept types are subsumed by others.
- Two or more concepts in the same conceptual graph must be connected using links to a relation.
- For any given conceptual graph system there is a predefined catalogue of conceptual relations which
defines how concepts may be linked with relations.
- Each concept has a referent field used to identify the concept specifically or to generalise the concept,
to perform quantification, to provide a query mechanism, etc.
- Canonical graphs represent real or possible situations in the world. Canonical graphs are formed
through perception or insight, or are derived from other canonical graphs using formation rules. Canonical
graphs are a restriction of what it is possible to model using conceptual graphs, since their intent is to
weed out absurdities or nonsense formulations.
- A canonical graph may be derived from other canonical graphs by the formation rules : copy, restrict, join
and simplify. The formation rules form one of two independent proof mechanisms.
- A maximal join is the join of two graphs followed by a sequence of restrictions, internal joins (joins on
other concepts), and simplifications until no further operation is possible. A maximal join acts as
unification for conceptual graphs.
- New concept types and relations may be defined using abstractions. An abstraction is a conceptual
graph with generic concepts and a formal parameter list.
- Schemata incorporate domain-specific knowledge about typical objects found in the real world.
Schemata define what is plausible about the world. Schemata are similar to type definitions except that
there may be only one type definition but many schemata.
- A prototype is a typical instance of an object. A prototype specializes concepts in one or more
schemata. The defaults are true of a typical case but may not be true of a specific instance. A prototype
can be derived from schemata by maximal joins and then assigning referents to some of the generic
concepts in the result.
Conceptual graphs are considered a good choice of knowledge representation for systems where knowledge
sharing is a strong criterion. They have a formal, theoretical basis, support reasoning as well as model-based
queries, and have a universe of discourse described by a conceptual catalogue. They have no inherent
mechanism for knowledge sharing or translation, but their flexible representation, their mapping to natural
language and, especially, the work on their database semantics are an encouragement to consider CG as a
candidate.
Conceptual graphs more or less subsume semantic networks, which consist of a collection of nodes
representing concepts, objects, events, etc., and links connecting the nodes and characterising their
interrelationships. Semantic networks also offer inheritance.
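The bipartite structure can be sketched in a few lines of Python : concept nodes and relation nodes, with arcs only between the two kinds. The graph encodes the earlier sentence "Bob has a bicycle" and is invented for illustration; real CG tools use richer notations.

    # Concept nodes carry a type and a referent field ('*' = generic).
    concepts = {
        "c1": ("Person", "Bob"),
        "c2": ("Bicycle", "*"),
    }

    # Relation nodes link concept nodes through arcs (here kept ordered).
    relations = {
        "r1": ("HAS", ["c1", "c2"]),   # [Person: Bob] -> (HAS) -> [Bicycle]
    }

    for rel_id, (rel_type, args) in relations.items():
        labels = [f"[{concepts[c][0]}: {concepts[c][1]}]" for c in args]
        print(f"{labels[0]} -> ({rel_type}) -> {labels[1]}")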