Classification : Restricted
WP Task No.
3 3100 D302
Status Draft/Issued/Revised Rev.
Date Release to CEC
CSTB - G. Sauce ELSEWISE
03/04/1997 01:58:00 PM -
20/05/2010 - 1
Document Control Sheet
ESPRIT 20876 - ELSEWISE
Revision Status Page Nos. Amendment Date By
1 WP3 : PDIT Support (draft Report Section)
Systems and technologies
Gerard Sauce CSTB
Dee Chaudhari BICC
Theo van Rijn TNO
Alastair Watson Leeds University
Revision   _01        _02        _03
Date       28/1/97    14/3/97    3/4/97
TW         *          *          *
BOU        *          *          *
HBG        *          *          *
BICC       *          *
CAP        *          *          *
CSTB       *          *          *
ULeed      *          *          *
TNO        *          *          *
VTT        *          *
This Work Package aims to provide a critical evaluation of the key technology and methodology
developments in PDIT which are likely to provide the tools to support and shape the LSE industry in the
future. It concentrates on those technologies, methodologies and standards most likely to impact on
working and thinking in LSE. A layered approach is adopted, considering first the supporting PDIT
environment (Task 3000), then the systems and technologies (Task 3100), and finally the application software
(Task 3200) (c.f. Figure 1 : Layered approach of the key technologies).
Figure 1 : Layered approach of the key technologies
Task 3100 deals with the intermediate layer between user applications and the supporting environment. The
main characteristic of systems and technologies is their independence from the hardware on one side and from the
users on the other. Software in this category has to be adapted to the technical domain, by specifying the
technical knowledge and the particular environment of the user, in order to develop a specific application.
Moreover, it is possible to find at least one software product implementing each of these technologies on every kind of
platform.
At each stage, the objective is to provide an assessment of the existing technologies, the trends, and an
evaluation of the relevance to LSE, the aim being to report what can be expected to exist by 2005. The
methodology is to use existing sources, to review published material and to consult with the key players.
It is important to note that our objective of identifying key technologies does not include an analysis of the
software market. Indeed, we list and detail the main technologies in order to give the reader the
general concepts and the key points needed to go further. The reader will find neither an assessment of specific
software packages nor a testing protocol for choosing between them.
This analysis report addresses the LSE industry, but requires some IT background to be understood.
The scope of this task covers a broad range of systems and technologies which we have grouped into four categories:
- Human Interface :
The Human-Computer Interface covers anything which allows a user to interact with a computer.
Examples are command line interpreters, windows, icons, menus, pointers and virtual reality. This does not
include the various devices and channels of communication between human and computer. We analyse
Graphical User Interfaces, Virtual Reality, recognition technologies and voice synthesis.
- Data Management:
Data Management consists of controlling, protecting, and facilitating access to data in order to provide
information consumers with timely access to the data they need. Data modelling, object databases,
distributed databases and access via the Internet are detailed.
- Knowledge Management:
Within this chapter, the whole sense of knowledge, including both the objects of knowledge and the process of
knowing, is considered. We analyse the three main aspects of knowledge management : representation,
acquisition and usage.
- Virtual Enterprise:
Even if this concept and the relevant technology are recent, several systems are already available to support the
creation of parts of a Virtual Enterprise. A Virtual Enterprise is defined as a temporary consortium of
companies that come together quickly to explore fast-changing business opportunities. This definition is
examined at the communication level, the organisational level and the level of the virtual enterprise itself.
For each key technology, this report tries to answer the following four questions:
What exists today?
What are the barriers to LSE uptake?
What can we expect by 2005?
What is the relevance for LSE?
At the end of each chapter, a summary emphasises the main points, particularly barriers and relevance to LSE.
Two definitions can be associated with the abbreviation "HCI" :
1 - Human-Computer Interaction : the study of how humans use computers, and how to design computer
systems which are easy, quick and productive for humans to use.
2 - Human-Computer Interface : anything which allows a user to interact with a computer. Examples are
command line interpreters, windows, icons, menus, pointers and virtual reality.
In this work the focus is on the second definition of HCI, considering mainly the notion of interface. This
problem area can be characterised by five key words : communication, language, interaction, process and
ergonomics. The three following aspects summarise them.
Human characteristics govern how people work and how they expect to interact with computers. To design
the best interface, it is important to understand how humans process information, how they structure their
actions, how they communicate, and what their physical and psychological requirements are. Human
information processing has been, and remains, the subject of numerous research efforts which have led to various models :
models of cognitive architecture, symbol-system models, connectionist models, engineering models, etc.
Language and communication are so natural to us that we rarely think about them. Human interaction using
natural language depends not only on the words spoken but also on the tone of voice, which enables the parties to
interpret the semantics of the spoken message. In face-to-face interaction this is reinforced by physical body
language, consciously or unconsciously shown by people. Developing a language between humans and
computers underscores the complexity and difficulty of language as a communication and interface medium.
Ergonomics. Anthropometric and physiological characteristics of people can be related and applied to
workspace and environmental parameters. Some frequent problems facing people using computers today are the
operation of the computer, using a pointing device such as the mouse, and watching the screen. Labour laws
restrict the periods people are allowed to spend at the computer, trying to prevent sight problems and
Repetitive Strain Injury (RSI).
Use and Context of Computers
The general social, work and business context may be important and has a profound impact on every part of
the interface and its success.
• The social organisation and the nature of work must be considered as a whole. This concerns points of view,
models of humans, models of small groups and organisations, models of work, workflow, co-operative
activity, office socio-technical systems, etc. One important point should be noted : human systems and
technical systems mutually adapt to each other. Once a system has been written, most of the adaptation
is done by the human; indeed, the user may be able to configure the software, but not much software will yet
adapt (automatically or otherwise) to a user.
There are classes of application domains and particular application areas where characteristic interfaces have
been developed, for example : document-oriented interfaces (e.g., text editing, document formatting, structure-
oriented editors, illustrators, spreadsheets, hypertext), communications-oriented interfaces (e.g., electronic mail,
computer conferencing, telephone and voice messaging), design environments (e.g., programming
environments, CAD/CAM), etc.
Part of the purpose of design is to arrange a fit between the designed object and its use. Adjustments to fit can
be made either at design time or at time of use, by changing the system (or, ultimately, the user), and the changes
can be made either by the users themselves or, sometimes, by the system.
Computer systems and architecture.
Computers have specialised components for interacting with humans. Some of these components are basically
transducers for moving information physically between human and machine (Input and Output Devices).
Others have to do with the control structure and representation of aspects of the interaction (basic software).
Devices must be constructed for mediating between humans and computers. They are mainly input devices
(keyboard, mouse, devices for the disabled, handwriting and gestures, speech input, eye tracking, exotic
devices such as EEG and other biological signals) and output devices (display, vector and raster devices,
frame buffers and image stores, canvases, event handling, performance characteristics, devices for the
disabled, sound and speech output, 3D displays, motion (e.g. flight simulators), and exotic devices). This technical
aspect is outside the scope of this chapter.
The basic software architecture and techniques for human-computer interaction cover four aspects :
a - Dialogue genre : The conceptual uses to which the technical means are put. Such concepts arise in any
media discipline such as film and graphic design (workspace model, transition management, design, style).
b - Dialogue inputs : Types of input purposes (e.g., selection, discrete parameter specification, continuous
control) and input techniques: keyboards (e.g., commands, menus), mouse-based (e.g., picking, rubber-
banding), pen-based (e.g., character recognition, gesture), voice-based.
c - Dialogue outputs : Types of output purposes (e.g., convey precise information, summary information,
illustrate processes, create visualisations of information) and output techniques (e.g., scrolling display,
windows, animation, sprites, fish-eye displays).
d - Dialogue interaction techniques : Dialogue type and techniques (e.g., commands, form filling, menu
selection, icons and direct manipulation, generic functions, natural language), navigation and orientation in
dialogues, error management, and multimedia and non-graphical dialogues: speech i/o, voice and video mail,
active documents, videodisk, CD ROM.
The Dialogue techniques are the subject of this section.
What exists today
We identified three groups within the existing technologies :
- Graphical User Interface,
- Virtual Reality,
- Other Interfaces
Graphical User Interface (GUI)
The style of graphical user interface invented at Xerox PARC, popularised by the Apple Macintosh and now
available in other varieties such as the X Window System, OSF/Motif, NeWS and RISC OS, is also known as
WIMP (Windows, Icons, Menus and Pointers; or perhaps Windows, Icons, Mouse, Pull-down menus).
A GUI uses pictures rather than just words to represent the input and output of a program. A program with a GUI
runs under some windowing system (e.g. the X Window System, Microsoft Windows, Acorn RISC OS). The
program displays icons, buttons, dialogue boxes etc. in its window on the screen; the user controls the
program by moving a pointer on the screen (typically controlled by a mouse) and selecting certain objects by
pressing buttons on the mouse while the pointer is pointing at them.
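The selection mechanism just described can be illustrated with a small sketch of the underlying hit-testing logic: the windowing system maps the pointer position at the moment of a button press to the topmost screen object containing it. All names below are hypothetical, not the API of any real toolkit.

```python
# Minimal sketch of WIMP-style hit-testing: the windowing system maps a
# pointer click to the topmost widget whose rectangle contains the point.

class Widget:
    def __init__(self, name, x, y, w, h):
        self.name = name
        self.x, self.y, self.w, self.h = x, y, w, h

    def contains(self, px, py):
        # True if the pointer position falls inside this widget's rectangle.
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h

def hit_test(widgets, px, py):
    # Widgets later in the list are drawn on top, so scan from the end
    # to honour stacking order; return None for a click on the desktop.
    for widget in reversed(widgets):
        if widget.contains(px, py):
            return widget.name
    return None

screen = [
    Widget("window", 10, 10, 200, 150),
    Widget("ok_button", 20, 120, 60, 20),   # drawn on top of the window
]

print(hit_test(screen, 30, 130))   # ok_button
print(hit_test(screen, 150, 50))   # window
print(hit_test(screen, 300, 300))  # None (desktop)
```

Real toolkits add event queues, capture and focus rules on top of this, but the pointer-to-object mapping is the common core.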
Each type of GUI is available on several platforms. One can classify the main WIMP interfaces into three categories
using a platform criterion :
- workstation (mainly using UNIX as operating system),
- PC,
- Macintosh.
Furthermore, it is possible to find software packages, running on one of the previous platforms, which
emulate the GUI of another platform.
GUI On workstation
X Windows :
Sponsoring body: initially developed by MIT's Project Athena; it became a de facto standard supported by the X Consortium.
X (the X Window System) is a specification for device-independent windowing operations on bitmap display devices. X
uses a client-server protocol, the X protocol. The server is the computer or X terminal with the screen,
keyboard, mouse and server program, and the clients are application programs.
X Windows is used on many Unix systems. OpenWindows is a server for Sun workstations handling the X
Window System protocol.
X has also been described as over-sized, over-featured, over-engineered and incredibly over-complicated.
Clients may run on the same computer as the server or on a different computer, communicating over the Internet
via TCP/IP protocols. This is confusing because X clients often run on what people usually think of as a
server (e.g. a file server), but in X it is the screen and keyboard etc. which are being "served out" to the clients.
For instance, Solaris OpenStep for Sun workstations is implemented using the facilities of the Solaris
Operating Environment, including the X11 windowing system.
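The client-server split described above can be sketched as a toy simulation (not the real X protocol): the server owns the screen, and any number of clients, possibly on other machines, send it drawing requests.

```python
# Toy simulation of the X client-server split: the server owns the
# display hardware; clients send it requests over a connection.

class DisplayServer:
    """Stands in for the machine with the screen, keyboard and mouse."""
    def __init__(self):
        self.screen = []  # what has been drawn so far

    def handle(self, request):
        # A real X server decodes binary protocol messages; here a
        # request is just a (client, operation, argument) tuple.
        client, op, arg = request
        self.screen.append(f"{client}:{op}({arg})")

class Client:
    """Stands in for an application program, possibly on another host."""
    def __init__(self, name, server):
        self.name = name
        self.server = server  # in reality: a TCP/IP connection

    def draw_text(self, text):
        self.server.handle((self.name, "DrawText", text))

server = DisplayServer()
xterm = Client("xterm", server)
clock = Client("xclock", server)
xterm.draw_text("hello")
clock.draw_text("12:00")
print(server.screen)  # both clients' output appears on the one display
```

The point of the sketch is the direction of service: the "server" is the display, and the applications are its clients, which is the reverse of the usual file-server intuition.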
Other GUI on Workstation
OSF/Motif (Sponsoring body : OSF, the Open Software Foundation) is a standard
graphical user interface and window manager (responsible for moving and resizing windows and
other practical functions). This is one of the choices of look and feel under the X Window System. Motif is
based on IBM's Common User Access (CUA) specification, which permits users from the PC world to migrate
to UNIX with relative ease.
Products implementation: Motif is used on a wide variety of platforms by a large population of Unix users.
Machine types range from PCs running Unix, to workstations and servers for major manufacturers including
Sun, Hewlett Packard, IBM, Digital Equipment Corporation, and Silicon Graphics.
In order to have one's software certified as Motif compliant, one must pay a fee to OSF. Motif runs under
the X Window System. Developers who wish to distribute their OSF/Motif applications may do so as long as
the license agreement is not violated. There are no non-commercial Motif toolkits available, and the Motif
source from OSF is fairly expensive.
Open Look is a graphical interface and window manager from Sun and AT&T. This is one of the choices of
look and feel under the X Window System. It determines the `look and feel' of a system: the shape of windows,
buttons and scroll-bars, how you resize things, how you edit files, etc. It was originally championed by Sun
Microsystems before they agreed to support COSE (Common Open Software Environment).
OLIT, XView and TNT are toolkits for programmers to use in developing programs that conform to the OPEN
LOOK specification:
- OLIT was AT&T's OPEN LOOK Intrinsics Toolkit for the X Window System;
- XView is Sun's toolkit for X11, written in C. XView is similar in programmer interface to SunView;
- The NeWS Toolkit (TNT) was an object-oriented programming system based on the PostScript
language and NeWS. TNT implements many of the OPEN LOOK interface components required to
build the user interface of an application. It is included in OpenWindows up to release 3.2, but is not
supported (and will not run) under OpenWindows 3.3 (based on X11R5).
WABI (Developer/Company: IBM)
A software package to emulate Microsoft Windows under the X Window System. Wabi 1.1 for AIX is currently
ordered as a feature of AIXwindows(r), IBM's implementation of the advanced X Window System using the
OSF/Motif(tm) interface standard. Since Wabi 1.1 for AIX is mapped to the AIXwindows interface on the
AIX/6000 system, it provides all the function of industry-standard X Window, including the ability to
exploit Xstations at a lower cost per seat than fully configured workstations. What's more, X windowing can
improve application performance by offloading the graphics-intensive work from the server to the Xstations.
WABI is also widely available on SUN.
GUI On PC
Developed by Microsoft, MS-Windows was a windowing system running on DOS. It
replaces the DOS prompt as a means of entering commands : instead of typing a command at
the "C:" prompt on the screen, which entails typing the command correctly and remembering all the necessary parameters, a
pointer on the screen is moved to an icon (small picture) that represents the command to be run. By pressing a
button on a mouse, a window appears on the screen with the activated program running inside. Several
windows can be open at once, each with a different program running inside. The most important role of
Windows has been to promulgate some interface standards. Although software designers have used the
components of the Windows interface in different ways, in order to make their products unique, the same
components are used in every Windows interface : windows, menus and commands, dialogue boxes, buttons
and check boxes, and icons.
The window system and user interface software released by Microsoft in 1987 was widely criticised for being
too slow on the machines available.
The later release (May 1994 : Windows 3.1, or rather Windows 3.11) was more
efficient and became a genuine standard GUI for the PC, used by developers.
The new versions of MS-Windows (95 and NT) replace both MS-DOS and Windows 3 on workstations, and are
multi-tasking. A server version (Windows NT 3.5) is available today. These two versions are superseded by
Windows NT 4.
X-Windows Emulator :
Some software packages propose to emulate an X Window system, in order to transform the PC into an X
terminal connected to an X server (X-Win32, Exceed 5, etc.). This allows the user to run UNIX applications
from his PC.
GUI on Macintosh
Macintosh user interface
Originally developed at Xerox's Palo Alto Research Centre and commercially introduced on the Xerox Star
computer in 1981, Apple later built a very similar version for the Macintosh.
The Macintosh user interface has become a very popular method for commanding the computer. This style of
user interface uses a graphical metaphor based on familiar office objects positioned on the two-dimensional
"desktop" workspace. Programs and data files are represented on screen by small pictures (icons) that look like
the actual objects. An object is selected by moving a mouse over the real desktop which correspondingly
moves the pointer on screen. When the pointer is over an icon on screen, the icon is selected by pressing a
button on the mouse. A hierarchical file system is provided that lets a user "drag" a document (a file) icon into
and out of a folder (directory) icon. Folders can also contain other folders and so on. To delete a document, its
icon is dragged into a trash can icon. The Macintosh always displays a row of menu titles at the top of the
screen. When a mouse button is pressed over a title, a pull-down menu appears below it. With the mouse held
down, the option within the menu is selected by pointing to it and then releasing the button.
For people who were not computer enthusiasts, managing files on the Macintosh was easier than using the MS-
DOS or Unix command line interpreter, and this fact explains the great success of the Macintosh.
Unlike the IBM PC world, which, prior to Microsoft Windows, had no standard graphical user interface,
Macintosh developers almost always conformed to the Macintosh interface. As a result, users feel comfortable
with the interface of a new program from the start even if it takes a while to learn all the rest of it. They know
there will be a row of menu options at the top of the screen, and basic tasks are always performed in the same
way. Apple also kept technical jargon down to a minimum.
The latest version of the Macintosh user interface provides some of the MS-Windows functions, and thus allows the
possibility of running software developed for MS-Windows. For instance, PowerPC Macintoshes have an easy
way to switch instantly between the Macintosh operating system and the DOS or Windows 3.1 environment.
X-Windows Emulator :
Some software packages propose to emulate an X Window system, in order to transform the Macintosh into an X
terminal connected to an X server. MacX 1.5 is an enhanced, easy-to-use software package for high-performance
X Window System computing on the Macintosh and other MacOS-based systems. MacX helps increase the
productivity of Macintosh users in UNIX/VMS environments by enabling them to run both
network-based X applications and Macintosh applications seamlessly on one Macintosh computer.
Virtual Reality is a set of computer technologies whose combination provides an interface to a
computer-generated world; in particular, the interface is so convincing that the user believes he is
actually in a three-dimensional computer-generated world. This computer-generated world may be a model of
a real-world object, such as a house; it might be an abstract world that does not exist in a real sense but is
understood by humans, such as a chemical molecule or a representation of a set of data; or it might be a
completely imaginary science-fiction world. A key feature is that the user believes that he is actually in this
different world. This can only be achieved if the user's senses - sight, touch, smell, sound and taste, but
primarily sight, touch and sound - give convincing evidence that he or she is in that other world. There must be
no hint of the real world outside. Thus, when the person moves, reaches out, sniffs the air and so on, the effect
must be convincing and consistent. A second key feature of Virtual Reality is that if the human moves his
head, arms or legs, the shift of visual cues must be those he would expect in a real world. In other words,
besides immersion, there must be navigation and interaction.
Three key words characterise the current common Virtual Reality software :
- 3 dimensions,
- interaction,
- real time.
The first condition for obtaining a basis of reality and a real immersion is to address the sight. This sense takes on
particular importance and refers to the nature of the reality being shown. One of the most important cues is the
perception of depth. To achieve this illusion, it is necessary to have a 3D model of the world and to send
slightly different images to the left and right eyes in order to mimic this sense. In this context, current displays
are unsuitable tools, and if a user wants a very believable experience, he must use a specific headset.
Recent research has shown that good-quality 'surround sound' can be just as important to the sense of reality
as visual cues.
VR allows the user to see how pieces inter-relate and how they interact with one another. Interactivity is
considered the core element of VR: rather than just passively observing the environment, users should be able to
act on it, controlling events which transform the world.
The speed of response of the system to movement and interaction is essential to how convincing the
virtual world is. In order to obtain a natural speed, the system should achieve perfect
synchronisation between the person moving his head, the visual cues following, and possibly the other senses.
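The synchronisation requirement can be made concrete as a per-frame time budget: at a given display rate, head tracking, model update and rendering must all complete within one frame period, or the visual cues lag behind the movement. A minimal sketch, with illustrative numbers:

```python
# Per-frame time budget for a VR display: at `fps` frames per second the
# system has 1/fps seconds to track the head, update the model and render.

def frame_budget_ms(fps):
    return 1000.0 / fps

def meets_budget(work_ms, fps):
    # True if tracking + rendering fit inside one frame period.
    return work_ms <= frame_budget_ms(fps)

print(round(frame_budget_ms(60), 2))   # 16.67 ms per frame at 60 Hz
print(meets_budget(12.0, 60))          # True: smooth motion
print(meets_budget(25.0, 60))          # False: cues lag behind the head
```

This is why rendering cost, discussed later among the barriers, translates directly into how convincing the virtual world feels.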
Two fundamental aspects of VR can be identified : technology and software.
True interactivity in immersive VR requires adapted feedback, i.e. controlling the interaction not by a
keyboard or mouse but by physical feedback to the participant, using special gloves, hydraulically controlled
motion platforms, etc. The technology of VR includes headsets, data gloves, body suits and sound generation.
Software packages propose two types of tools :
- Tools for modelling the virtual world through a 3D representation (with sound for the most capable
ones), and for defining specific behaviour and interaction between objects. These tools, for
developers, are called world builders. Many different world builders are available for every kind of
platform.
- Tools for representing the world and allowing user immersion. They control the specific
devices. They are named world browsers.
Nowadays each package has its own model representation; no de facto standard has emerged, except the Virtual
Reality Modelling Language (VRML), which is a standard language for describing interactive 3D objects and
worlds delivered across the Internet. At present this language limits VR applications to one sense, the
sight: user immersion is only performed in a 3D model of the world, which in turn opens virtual-world
access to users with ordinary computers and simple devices (keyboard and mouse). The next version of
VRML will extend its functionality by providing sensors and allowing 3D animations.
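Because VRML worlds are plain text, a rudimentary world builder can be as simple as a program that emits node descriptions. The sketch below generates a minimal VRML 2.0 world containing one coloured box; the node and field names (Shape, Appearance, Material, Box) follow the VRML 2.0 specification.

```python
# Emit a minimal VRML 2.0 world: one red box. VRML files are plain text,
# so a world builder can be as simple as string generation.

def vrml_box(size=(2, 1, 1), rgb=(1, 0, 0)):
    sx, sy, sz = size
    r, g, b = rgb
    return (
        "#VRML V2.0 utf8\n"
        "Shape {\n"
        f"  appearance Appearance {{ material Material {{ diffuseColor {r} {g} {b} }} }}\n"
        f"  geometry Box {{ size {sx} {sy} {sz} }}\n"
        "}\n"
    )

world = vrml_box()
print(world)
```

Saved with a `.wrl` extension, such a file can be loaded by any VRML 2.0 world browser.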
In practice only a small number of applications are available as yet, mainly for testing further developments.
The VR industry is immature, but in the medium term it will become a key component of the Human Interface.
Multimedia information is usually defined as a combination of different data types, such as audio, voice, still and
moving pictures, and drawings, as well as traditional text and numerical data. Multimedia is a way to
satisfy people's need for communication using computers. One aspect of multimedia is the amount of control
exercised by the user, i.e. how the user interacts and "browses" through the information presented in a
multimedia fashion, such as video-disks or information columns. This is why we cannot really consider
multimedia as a human-computer interface in itself: it involves several techniques used for HCI, especially graphic
representation and audio/video effects, and adapted devices.
Document recognition technology
Alongside specific devices, several systems rely on recognition technology for data entry and document
management. Characters and graphical elements can now be recognised automatically. The main technologies
are the following :
- Optical character recognition (OCR) is a process which transforms a page into ASCII code. Scanning
devices can capture a page of text in a few seconds with relatively little manual effort. Afterwards, OCR
software can analyse the image and transform it into strings of text which can be edited, indexed, etc.
Several methods of scanning text exist : analytical, morphological and statistical methods, and "neural
network" technology. An advanced OCR package also uses lexical context to improve accuracy.
Success depends in large part on the quality of the original text.
- Optical mark reading : this variant of OCR is suitable when the user can describe specific forms to search for.
- Intelligent Character Recognition (ICR) : this technique is widely used for handwriting acquisition. The two
main problems are the accuracy and the effective speed of ICR packages.
- Barcoding : a very efficient and suitable technology wherever data can be pre-coded, with a
typical error rate of 0.05 %.
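The lexical-context technique mentioned for advanced OCR packages can be sketched as follows: when a recognised word is not in the lexicon, substitute commonly confused glyph pairs (O/0, l/1, S/5) until a known word is found. The lexicon and the confusion table below are purely illustrative.

```python
# Sketch of lexical post-correction for OCR: repair a misread word by
# trying substitutions from a table of commonly confused glyphs.

CONFUSIONS = {"0": "O", "O": "0", "1": "l", "l": "1", "5": "S", "S": "5"}

def correct(word, lexicon):
    if word in lexicon:
        return word
    # Try substituting each confusable character in turn (one at a time).
    for i, ch in enumerate(word):
        if ch in CONFUSIONS:
            candidate = word[:i] + CONFUSIONS[ch] + word[i + 1:]
            if candidate in lexicon:
                return candidate
    return word  # give up: flag for manual review in a real system

lexicon = {"BOLT", "STEEL", "PLATE"}
print(correct("B0LT", lexicon))   # BOLT  ('0' misread for 'O')
print(correct("5TEEL", lexicon))  # STEEL ('5' misread for 'S')
print(correct("PLATE", lexicon))  # PLATE (already correct)
```

Production OCR engines combine this idea with per-character confidence scores and statistical language models rather than a fixed table.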
Voice synthesis and recognition
Several years of research have been conducted with the goal of achieving speech recognition by computers.
Whereas today's systems are far from reaching the goal of real-time recognition of unconstrained human
language, the technology has evolved to a point where it is useful in a number of applications. There are
several classifications of voice recognition technologies, as follows:
- Speaker Independent vs. Speaker Dependent
Speaker-independent systems are designed to work for anyone, without having to be trained to a specific
person's voice. Speaker-dependent systems are trained to recognise a single person's voice. Speaker-
independent systems work immediately for anyone, but usually have less accuracy and a smaller
vocabulary than speaker-dependent systems. They are commonly used for commands such as
"copy" or "paste", because the vocabulary is generally around 300 words. Speaker-dependent systems
require about one hour of training for a new user, and also require that the user makes corrections while
dictating to help the system keep learning. These systems have vocabularies of 30,000 to 60,000
words. There are also many specialised vocabularies available, mainly in the medical and legal areas;
vocabularies are often specific to a field such as radiology or emergency medicine. The accuracy of
voice recognition can be up to 98% (the low 90s is more realistic).
- Continuous Speech vs. Discrete Speech
Continuous speech is basically talking at a normal rate. Most of the voice recognition industry is
currently researching continuous-speech systems, but they are years away. There are optimistic
predictions of having them by the year 2000, but there are also critics who doubt this. Discrete speech means
pausing between each word. Voice recognition technology currently requires discrete speech. The
pauses between words do not have to be long, but it definitely takes practice. Manufacturers claim up to
80 words per minute, but 60 wpm is probably more realistic. Speaking with pauses between words has
the potential to be very distracting to the thought process.
- Natural Speech
Natural speech refers to the understanding of language. This is more of an ideal than a current goal. Voice
recognition systems with this capability would hypothetically function more like a human transcriptionist: they
would allow saying things like "Replace those last two sentences with…" and would know what to do with
"um, ah". There are several related features in current voice recognition systems. For example, some systems use the
context to help select the correct word (night / knight). Some allow macros or templates that build a
standard report so that only the variances need to be dictated.
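The context-based word choice mentioned above (night / knight) can be sketched by choosing, among words that sound alike, the one most often observed after the preceding word in some training text. The bigram counts below are invented for illustration.

```python
# Sketch of context-based homophone selection for speech recognition:
# among words that sound alike, pick the one most frequently observed
# after the preceding word. The counts below are illustrative only.

BIGRAM_COUNTS = {
    ("the", "night"): 50,
    ("the", "knight"): 8,
    ("brave", "knight"): 20,
    ("brave", "night"): 1,
}

def choose(previous_word, homophones):
    # Pick the homophone with the highest bigram count after previous_word.
    return max(homophones, key=lambda w: BIGRAM_COUNTS.get((previous_word, w), 0))

print(choose("the", ["night", "knight"]))    # night
print(choose("brave", ["night", "knight"]))  # knight
```

Real dictation systems generalise this to full statistical language models over much longer contexts, but the principle of scoring candidates by context is the same.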
There are a number of speech synthesis systems on the market today. It is interesting to note that in artificial
speech generation there is a tradeoff between intelligibility and naturalness. Currently, the industry has placed
emphasis on naturalness, with the unfortunate consequence that even the high end systems are sometimes hard
to understand. Speech synthesis is generally regarded as a secondary and much less complex issue when
compared to speech recognition and understanding.
The current trends in the industry are for speaker-independent systems, software based systems, and extensive
use of post-processing. Of great interest is the emergence of spoken language systems which interpret
Speech recognition systems are being widely used by telephone companies, banks, and as dictation systems in
many offices. These applications are highly constrained and often require the user to pause between each word
(isolated speech recognition).
The main disadvantages of this kind of technology are related to :
- The accuracy : Most speech understanding systems are capable of high accuracy given unlimited
processing time. The challenge is to achieve high accuracy in real-time.
- Speech recognisers are susceptible to ambient noise. This is especially true for speaker independent
systems which use speaker models that are developed in quiet laboratories. The models may not
perform well in noisy environments.
- Out-of-Phraseology Speech - Speech recognisers are not yet capable of understanding unconstrained
human speech. Accordingly, applications are developed based on constrained vocabularies. The
challenge is to detect out-of-phraseology speech and reject it before it is post-processed.
It is clear that speech is the most familiar way for humans to communicate; unfortunately, speech recognition
systems are not yet capable of perfect recognition of speech in real time.
Barriers to LSE uptake
GUI have taken a primordial importance for the software development. Nowadays, old softwares using a
simple alphanumerical interface (on VT100 terminal or DOS Operating system) are progressively removed
from companies. Apple first with Macintosh, and recently Microsoft with MS-Windows have widely
contributed to the new feature of the software packages. The restriction to few de facto standards is really a
good aspect for users, it facilitates the changing of computer and reduces the adaptation period. However,
actual GUI need more and more memory and performances, sometimes to the detriment of the pure
Considering virtual reality, this new technology is confronted with several problems that currently limit its development:
- Firstly, a technological problem, due essentially to poor performance: the objects of the virtual world do not move or react fast enough to provide natural scene movement, or a supercomputer is needed to obtain a realistic basis. The computations needed to render and display a 3D model are enormous, and consequently require a substantial investment.
- Secondly, the characteristics of the specific devices: the weight of headsets and gloves, communication between computer and devices through direct cable links, and the low resolution of the images are great disadvantages for realism.
- Thirdly, defining a high-quality virtual world model is very long and expensive work, a great part of which is currently done by hand. Some software packages offer automatic generation from CAD data, but the latter were not designed for this purpose and are consequently not really suited to it. Furthermore, for a whole virtual world the model size is usually enormous, which leads to problems of memory and computation time.
- Fourthly, the absence of a de facto standard limits investment in such technologies. In effect, only a big firm can invest in a software package which risks disappearing a few years later; were the package standard-compliant, modelled worlds, data and results could be reused within another one.
Regarding other HCI technologies, LSE, like other industries, is waiting for more efficient and suitable tools. This is the case for voice and handwriting recognition.
Trends and expectations
At present, optimal exploitation of computers can be achieved only if staff adapt to the machines. Human-Computer Interface (HCI) research aims to reduce this adaptation and produce a more 'user natural' environment. Five areas of HCI research can be defined as follows:
- Interactional hardware and software : Developing, studying and providing guidelines for various ways
of interacting with computers, both at the software and hardware level.
- Matching models : studying how users interact with computers.
- Task level : determining how well systems meet users' needs.
- Design and development : studying designers and the design process.
- Organisational impact : studying the impact of new systems on organisations, groups and individuals.
Human-computer interfaces include speech technologies (speech synthesis and speech recognition), gesture recognition, natural language, and virtual reality.
There is a new emerging generation of intelligent multimedia human-computer interfaces with the ability to interpret some forms of multimedia input and to generate co-ordinated multimedia output: from images to text, from text to images, co-ordinating gestures and language, and integrating multiple media in adaptive presentation systems. In recent years, researchers have begun to explore how to translate visual information into natural language and, inversely, how to generate images from natural language text. This work has shown how a physically based semantics of motion verbs and locative prepositions can be seen as conveying spatial, kinematic and temporal constraints, thereby enabling a system to create an animated graphical simulation of events described by natural language utterances. There is an expanding range of exciting applications for these methods, such as advanced simulation, entertainment, animation and CAD.
The use of both gestures and verbal descriptions is of great importance for multimedia interfaces, because it
simplifies and speeds up reference to objects in a visual context.
However, natural pointing behaviour can be ambiguous and vague, so that without a careful analysis of the discourse context of a gesture there is a high risk of reference failure.
In practical terms, for the next ten years, realistic expectations exist for the following technologies:
Two trends characterise the development of virtual reality software packages:
- Virtual reality in the pure sense, which includes total immersion addressing the three main senses (sight, sound and touch) using specific devices.
- Simplified virtual reality, based mainly on sight and sound and using an ordinary display. This trend will probably lead to very efficient tools. In effect it will add two fundamental functionalities to current graphical modellers: interactivity and pseudo-immersion in real time. It introduces the temporal dimension. Although these specifications are restrictive, the first available tools are really interesting.
At this time, it is very difficult to identify an emerging de facto standard for virtual reality models. If VRML version 2 makes a real and positive contribution to this technology, a big step will have been taken in this direction. Moreover, VRML offers a new dimension to virtual reality: multi-user capability. VRML will evolve into a multi-user shared experience in the near future, allowing complete collaboration and communication in interactive 3D spaces. Any proposal must anticipate the needs of multi-user VRML in its design, considering the possibility that VRML browsers might eventually need to support synchronisation of changes to the world, locking, persistent distributed worlds, event rollback and dead reckoning.
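Of the multi-user mechanisms just listed, dead reckoning is easy to sketch: each client broadcasts position and velocity at a low rate, and between updates peers extrapolate linearly. The function below is a minimal illustration of that idea, not any VRML browser's actual algorithm.

```python
# Minimal dead-reckoning sketch for a shared virtual world (assumption:
# linear extrapolation from the last broadcast state; real systems also
# blend in corrections when a fresh update arrives).

def dead_reckon(position, velocity, dt):
    """Extrapolate an entity's position dt seconds after its last update."""
    return tuple(p + v * dt for p, v in zip(position, velocity))

# Last known state: at (0, 0, 10), moving 2 m/s along the x axis.
estimate = dead_reckon((0.0, 0.0, 10.0), (2.0, 0.0, 0.0), 0.5)
# estimate == (1.0, 0.0, 10.0)
```

Extrapolating locally keeps the scene moving smoothly even when network updates arrive only a few times per second.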
In the past, we were required to interact with machines in the language of those machines. With the advent of speech recognition and synthesis technology, humans can communicate with machines using constrained natural language, provided the ambience is not too noisy. Much research is being conducted in the area of spoken language understanding. Spoken language systems attempt to take the best possible result of a speech recognition system and process it further. A spoken language system is defined as a system understanding spontaneous spoken input. Spontaneous speech is both acoustically and grammatically challenging to understand.
Speech recognition applications have a tremendous opportunity for growth. Beyond speaker-independent continuous speech, the future development of speech recognition technology will be dedicated to the following areas:
- Client/server speech recognition: client/server software will allow speech applications to work over a wired or wireless network. After users dictate at their workstation, they can press a "SEND" button and convey the recorded voice file to the server, where the speech engine performs the recognition and returns the resulting text to the user.
- PC-based computer-telephony applications: dictation, call processing, and personal information management (PIM) will be integrated into one system installed on a computer. For example, users will use voice to update their daily schedule and a telephone voice menu to order products.
Because the prices of central processing units and digital signal processors are coming down, speech recognition systems are seen as affordable and feasible equipment for computers. The goal of speech recognition is to speak to computers in a speaker-independent, continuous fashion. However, the most widely used speech recognition technology at present is "discrete": users need to pause between words so that the computer can distinguish the beginning and end of each word. In the future, continuous speaker-independent speech recognition will become possible thanks to better processors and algorithms.
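Why "discrete" recognition insists on pauses can be shown with a toy example: with silence between words, word boundaries can be found by a simple energy threshold. The frame energies below are invented for illustration; real systems use more robust endpoint detection.

```python
# Illustration of discrete-mode word segmentation: runs of frames whose
# energy exceeds a silence threshold are taken as words.

def segment_words(frame_energies, silence_threshold=0.1):
    """Return (start, end) frame-index pairs for runs above the threshold."""
    words, start = [], None
    for i, energy in enumerate(frame_energies):
        if energy > silence_threshold and start is None:
            start = i                      # a word begins
        elif energy <= silence_threshold and start is not None:
            words.append((start, i))       # a word ends at the silence
            start = None
    if start is not None:                  # speech ran to the end of input
        words.append((start, len(frame_energies)))
    return words

# Two bursts of speech separated by silence -> two words.
boundaries = segment_words([0.0, 0.8, 0.9, 0.0, 0.0, 0.7, 0.6, 0.0])
# boundaries == [(1, 3), (5, 7)]
```

Without the pauses, the two bursts would merge into one run and the trivial threshold scheme would fail; this is exactly the problem continuous recognition has to solve.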
Optical Character Recognition (OCR) is already well established in commercial applications. The recognition
of handwritten text is an important requirement for Pen Computing but far from a trivial one. Many different
initiatives have been taken to solve this problem; it is not yet known which is best, and the results are still not perfect. It is important to stress two points:
- Handwriting recognition is no longer a new technology, but it did not gain public attention until recently.
- The ideal goal of designing a handwriting recognition method with 100% accuracy is illusory, because even human beings are not able to recognise every handwritten text without doubt; for example, most people sometimes cannot even read their own notes. There will always be an obligation on the writer to write clearly. Recognition rates of 97% and higher would be acceptable to most users.
The requirements for user interfaces supporting handwriting recognition can easily be deduced from the pen-and-paper interface metaphor. First, there should be no constraints on what and where the user writes. It should be possible to use any special character commonly used in the various languages; the constraint to use only ASCII characters is awkward, especially with the growing internationalisation the world is facing these days. The ideal system is one that supports handwritten input of Unicode characters. The second requirement is that text
can be written along with non-textual input, e.g. graphics, gestures, etc. The recogniser must separate these
kinds of input.
Off-line recognition is done by a program that analyses a given text when it is completely entered, hence there
is no interaction with the user at this point. On-line recognition, on the other hand, takes place while the user is
writing. The recognizer works on small bits of information (characters or words) at a time and the results of
the recognition are presented immediately. The nature of on-line systems is that they must be able to respond
in real time to a user's action, while off-line systems can take their time to evaluate their input : speed is only a
measure of performance, not of quality.
Until recently, only the recognition of hand-printed writing had been studied, but with the growing widespread use of pen-based systems, cursive writing is becoming important. Most systems do not yet support the recognition of cursive (script) writing. Printed characters are much easier to recognise because they are effortlessly separated and the amount of variability is limited.
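The "effortless separation" of printed characters can be made concrete: in a binary raster, blank columns mark the gaps between characters. The tiny bitmap below is invented for illustration (1 = ink, 0 = blank); cursive writing defeats this scheme precisely because its characters share ink columns.

```python
# Sketch of printed-character segmentation by blank columns.

def split_characters(bitmap):
    """Split a row-major binary bitmap into per-character column ranges."""
    width = len(bitmap[0])
    inked = [any(row[c] for row in bitmap) for c in range(width)]
    ranges, start = [], None
    for c, has_ink in enumerate(inked):
        if has_ink and start is None:
            start = c                      # a character begins
        elif not has_ink and start is not None:
            ranges.append((start, c))      # a blank column ends it
            start = None
    if start is not None:
        ranges.append((start, width))
    return ranges

bitmap = [
    [1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
]
columns = split_characters(bitmap)
# Three characters: columns 0-1, column 3, column 6.
```

Each returned range would then be passed to a per-character classifier, which is the easy half of the recognition problem.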
Pen-based systems are already available: Microsoft (Windows for Pen Computing), GO (PenPoint), Apple (Newton), General Magic (Magic Cap, Telescript). In the next decade, new results in handwriting recognition research will make these systems more and more attractive.
Relevance to LSE
Computers are now essential to realise LSE projects to a high quality for companies and users. This means improving industrial actors' productivity and reducing time to market through prototyping, digital mock-ups, etc. In this context, everything that facilitates the use of computers (and consequently increases productivity) is of interest to LSE.
In order to detail the relevance of HCI to LSE, let us focus on two important stages of LSE projects:
- Design stage:
During the design phase, the first problem is the perception of the project, which consists of objects having no real physical existence, i.e. virtual objects. Two points of view can be considered at this stage: the designers' viewpoint and that of the others (client, environment, ...). While designers have a relatively good understanding of the project, it is usually very difficult to convey this vision to the others. The assistance of virtual reality will represent a primordial advantage in terms of communication and client perception. There are expectations for designers too: VR will add a new dimension to their project vision, which is currently limited by display size. The notion of immersion will allow them better command and control of the project. VR will also enable a better relation to be made between the design stage and the construction stage by assessing produceability/constructability through visualisation of the product design, i.e. closing in on Design for Construction.
Obviously, designers are also interested in GUIs for their daily work, in order to spend more time working on the project than communicating with the computer.
- Construction stage:
Here, the two main components relate to the location of the actors: in the office or on site.
In the office, virtual reality is interesting as a way to present a permanently updated view of the site, precisely to fill the gap with reality due to distance or difficulty of access to the project.
On the other side, actors on site, sometimes in difficult situations, need more ergonomic ways to communicate with a computer, and simply to use it to increase productivity and quality. All recognition techniques (voice, handwriting, gesture) will facilitate HCI in this context.
GUIs have become of primary importance in software development. The restriction to a few de facto standards is a really good aspect for users: it facilitates changing computers and reduces the adaptation period. On the other hand, these de facto standards introduce a dependence on specific platforms.
The LSE industry, like other industrial domains, is always waiting for more ergonomic and new ways to interact with computers:
- As input, recognition technologies (voice and handwriting) promise efficient applications soon.
- Virtual reality offers new possibilities which will completely change HCI, for input as well as for output.
LSE actors have to contribute to the development of virtual reality by experimenting with this technology in two directions:
- for whole-project perception, for project communication, but also for engineering design.
One of the biggest problems of IT is that the project model stored in the computer is more and more abstract, and more and more difficult to perceive as a whole. Virtual reality appears as a means of extending the limits of current displays, for instance to facilitate the presentation of the project to the client or its integration within its environment. The extension to other senses and the capacity for interaction will be a complementary advantage, notably for engineers, allowing them new possibilities of interaction.
- Virtual reality will probably stir up HCI: for instance, the virtual enterprise needs new kinds of ergonomics, and one can easily imagine software which proposes a virtual environment with virtual actors, meetings, and so on.
The LSE industry has to explore this technology in order to be able to establish a set of specifications and so influence future software tools.
First of all, it seems necessary to define the notion of data management. At the basic level, the term DATA characterises numbers, characters, images or other methods of recording, in a form which can be accessed by a human or (especially) input into a computer, stored and processed there, or transmitted on some digital channel. Data on its own has no meaning; only when interpreted by some kind of data processing system does it take on meaning and become information. For example, the number 5.5 is data, but if it is the length of a beam, then that is information. People or computers can find patterns in data to perceive information, and information can be used to enhance knowledge.
In the past, the data element, which is the most elementary unit of data identified and described in a dictionary or repository and which cannot be subdivided, evolved from a simple alphanumeric item to more complex items. Nowadays, a data element may represent facts, text, graphics, bit-mapped images, sound, or analogue or digital live-video segments. Data is the raw material of a system, supplied by data producers and used by information consumers to create information.
The term 'management' has evolved too. At the beginning, it mainly considered the data point of view: data management consists of controlling, protecting, and facilitating access to data in order to provide information consumers with timely access to the data they need. The notion of a database refers to a collection of related data, stored in one or more computerised files, that can be accessed by users or computer programs through a database management system.
A database management system (DBMS) is thus an integrated set of computer programs providing the capabilities needed to establish and modify a database, make it available, maintain its integrity and generate reports from it. It has evolved through four generic forms:
1 Hierarchical DBMS (1960s): records were organised in a pyramid-like structure, with each record linked to a parent.
2 Network DBMS (1970s): records could have many parents, with embedded pointers indicating the physical location of all related records in a file.
3 Relational DBMS (1980s): records were conceptually held in tables, similar in concept to a spreadsheet. Relationships between the data entities were kept separate from the data itself. Data manipulation created new tables, called views.
4 Object DBMS (1990s): object-oriented DBMS combine the capabilities of conventional DBMS and of object-oriented programming languages; data are considered as objects. Other classes of ODBMS associate relational DBMS and object capabilities.
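The relational generation can be illustrated in a few lines: records held in tables, relationships kept apart from the data, and manipulation producing a view. The tables and figures below are invented for illustration (and echo the beam example earlier in the section).

```python
# Minimal relational illustration using Python's built-in sqlite3 module.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE beam (id INTEGER PRIMARY KEY, length REAL)")
conn.execute("CREATE TABLE material (beam_id INTEGER, name TEXT)")
conn.execute("INSERT INTO beam VALUES (1, 5.5), (2, 7.0)")
conn.execute("INSERT INTO material VALUES (1, 'steel'), (2, 'concrete')")

# The relationship lives in the join condition, not inside the records;
# data manipulation yields a new (virtual) table, called a view.
conn.execute("""CREATE VIEW long_beams AS
                SELECT b.id, b.length, m.name
                FROM beam b JOIN material m ON m.beam_id = b.id
                WHERE b.length > 6.0""")
rows = conn.execute("SELECT * FROM long_beams").fetchall()
# rows == [(2, 7.0, 'concrete')]
```

Object DBMS differ in that the beam would be stored as an object carrying its own behaviour, rather than as a flat row joined to others at query time.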
In a parallel direction, data modelling methods, used to analyse data requirements and define the data structures needed to support the business functions and processes of a company, have evolved a lot. These data requirements are recorded as a conceptual data model, a logical map that represents the inherent properties of the data independently of software, hardware or machine performance considerations. The model shows data elements grouped into records, as well as the associations between those records. Data modelling defines the relationships between data elements and structures, with associated data definitions.
Database schemas have become more logical, meaning that some methodologies allow developers to elaborate schemas more and more independently of any specific DBMS.
These methodologies are an essential part of data management technology.
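A conceptual model of the kind described, independent of any specific DBMS, can be sketched as plain declarations of entities, attributes and associations. The entity names and attributes below are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical conceptual-model fragment: two entities and a one-to-many
# association, expressed without reference to any storage technology.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Beam:
    beam_id: int
    length_m: float                      # attribute with an explicit unit

@dataclass
class Storey:
    name: str
    beams: List[Beam] = field(default_factory=list)   # one-to-many association

ground = Storey("ground floor")
ground.beams.append(Beam(1, 5.5))
```

The same declarations could later be mapped onto relational tables, objects or a STEP/EXPRESS schema; that independence is precisely the point of working at the conceptual level.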
Over the last ten years, the basic problems such as data access, security and sharing, which represent the fundamental functionalities of a DBMS, have been complemented by an essential aspect: distribution, not in a homogeneous context (hardware and software) but in a heterogeneous environment. The development of networks (intranet and Internet) has created new kinds of possibilities, users and needs for DBMS.
Client/server systems and distributed systems using networks are the current challenge of data management.
Regarding the increasing complexity of data, new definitions of data management systems are being proposed, more end-user oriented. The objective is to give more sense to information in order to take maximum benefit from the stored data. In effect, a company holds a lot of information that is not really useful for any goal other than the one it was designed for. It is therefore necessary to organise the whole of the information manipulated by a company or a project.
Two main classes of such systems are now in rapid development: the first is product oriented, the second business oriented.
- Product Data Management Systems (PDMS) or Engineering Data Management Systems (EDMS):
The fundamental role of these systems is to support the storage of technical documentation and the multiple processes related to the product's design, manufacturing, assembly, inspection, testing and maintenance (the whole life-cycle). In projects characterised by globally distributed design and complex production processes, the task of managing all product-related information is of the first importance for project success. With the introduction of CAD/CAM/CAE systems, the amount of product information has increased tremendously, resulting in the need for management systems to control the creation and maintenance of technical documentation. This implies the use of well-defined configuration management processes and software tools capable of controlling and maintaining all the data and processes needed for the development and implementation of the whole product.
The product being a transversal project shared by several companies, P&EDMS involve heterogeneous environments and distributed data.
- Business oriented:
- Material Requirements Planning systems:
Implementation of the material requirements planning process has been ongoing for several years. MRP systems have evolved from pure material requirements planning into integrated business support systems, which integrate information about the major business processes involved in manufacturing industries. Support exists for registration of prospects and customers, acquisition, proposals and sales orders, the associated financial procedures, inventory and distribution control, and global production and capacity planning, with a limited capability for planning, analysis and derivation of management information. The majority of the systems are registration systems by nature; advanced planning tools are lacking. Because of the registrative nature of these systems and their lack of flexibility in querying the available information, no management information can easily be produced that might be used to manage and control a business.
- Data Warehouse management tools:
A data warehouse is an implementation of an informational database that allows users to tap into a company's vast store of operational data to track and respond to business trends and to facilitate forecasting and planning efforts. It corresponds to a manager's viewpoint. From client management to stock management, the whole set of data of a company is distributed over several databases, managed by different DBMS. The objective of data warehousing is to arrange internal and external data so as to create pertinent information that is easily accessible to managers to assist them in their decision-making, the aim being to extract maximum benefit from the huge mass of information available in the company. The two main characteristics of a data warehouse are, firstly, that it considers the whole company and not only a part of it such as a department, and secondly, that it analyses information over time. Managers want to analyse the evolution of information and the origins of change; this means maintaining a historical record in order to have real decision information and better appraise the trends.
Data mining is a mechanism for retrieving data from the data warehouse. In fact, many companies would claim that they have been using this sort of tool for a number of years. However, when there is a significant amount of data, retrieval and analysis are complex and potentially slow. Data mining derives its name from the parallel between searching for valuable business information in a large database and mining a mountain for a vein of valuable ore: both processes require either sifting through a large amount of material or finding out where the value resides. Data mining tools allow professional analysts to make a structured examination of the data warehouse using proven statistical techniques. They can establish cross-correlations which would be difficult or impossible to establish using normal analysis and query tools.
A number of companies claim to offer tools and services to implement data warehousing systems, e.g. IBM, Oracle, Software AG, OLAP databases, Business Objects, Brio.
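The cross-correlations mentioned above are, at bottom, proven statistical techniques applied over warehouse data. The sketch below computes a Pearson correlation coefficient between two series; both series are invented figures, purely for illustration.

```python
# Sketch of the statistical core of data mining: measuring how strongly
# two warehouse series move together.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented monthly figures: rainfall vs. delays in concreting work.
monthly_rainfall_mm = [10, 30, 55, 80, 60, 20]
concrete_delay_days = [1, 2, 4, 6, 5, 1]

# A coefficient near +1 suggests the two series move together.
r = pearson(monthly_rainfall_mm, concrete_delay_days)
```

A data mining tool automates exactly this kind of scan, but across thousands of candidate column pairs, which is why doing it with ordinary query tools is impractical.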
In spite of their different objectives, the problems involved in these two cases are much the same and can be summarised by several keywords:
- Meta-model: integrating information between the different sources (marketing, production, etc., or design, manufacturing, etc.) at a conceptual level.
- Heterogeneous environment: this second aspect of integration takes place at the hardware level.
- Distributed databases: sharing of and access to information across multiple platforms.
- Historical records of data, in order to manage the evolution of the product or of the company's activities.
- Navigation, information filtering and analysis tools to exploit the mass of information. The system objectives induce specific tools; for instance, graphical tools are indispensable for EDMS, whereas financial analysis tools are specific to data warehouses.
What exists today
In terms of technology, four points seem very interesting: data modelling, OODBMS, distributed database management, and access to database applications running over the Internet.
Data modelling aims at analysing and expressing in a formal way the data requirements needed to support the functions of an activity. These requirements are recorded as a conceptual data model with associated data definitions, including relationships. The following three points summarise the objectives of such a method:
1 To build data models: the first goal consists of defining a model which represents as faithfully as possible the information needed to perform an activity. It is important to place this work at a conceptual level, so as to be independent of any specific DBMS.
2 To communicate models: in order to validate the data organisation, as many readers as possible should analyse the proposed data model. A fundamental function of a method is therefore to allow uninitiated people to gain a global view of a data model, to perceive its main hypotheses and to analyse it in detail very quickly.
3 To capitalise knowledge: a data model and its context (i.e. activities) represent an important part of the specific knowledge of an application domain. Taking place at a conceptual level, this analysis should increase the global knowledge of the domain and contribute to a better understanding of it, on condition that the modelling method allows the knowledge to endure.
These methods represent the keystone of data management: the efficiency of the system depends directly on the quality of the data models.
Three classes of methods can be listed:
1 Methods for relational DBMS: these methods clearly consider two aspects, functions and data, and for the latter, entities and relationships.
2 Object-oriented methodologies: these methods are characterised by an incremental and iterative approach. Certain methods are purely object-based, whereas others combine a functional approach with objects.
3 STEP methodologies: the international standard ISO 10303 STEP prescribes a specific methodology for analysing the activity of an application domain (IDEF0) and for Application Reference Models (EXPRESS). This approach is completed by implementation specifications (SPF, SDAI).
All modelling methods have the same drawback: in fact they are not really independent of the software context. This is mainly a consequence of the developers' interest, as they are currently the actors most interested in the result of modelling. The aspect of capitalising and increasing application domain knowledge is not yet a primary objective of such an activity.
Relational representation remains an interesting current technology and offers efficient solutions for numerous data management problems. Object-oriented DBMS is a more recent technology, developed to take maximum advantage of object-oriented languages. OODBMS are attractive because of their ability to support applications involving complex data types, such as graphics, voice and text, which are not correctly handled by relational DBMS. A number of reasonably mature OODBMS products exist and are widely used in certain specialised applications. Increasingly, these products are being enhanced with query and other facilities similar to those provided by relational DBMS, and are being applied in more general-purpose applications.
Another class of DBMS, the object-relational DBMS (ORDBMS), constitutes the most recent development. ORDBMS are based both on the results of research into extending relational DBMS and on the emergence of viable OODBMS. Current individual products represent different mixtures of these capabilities.
The appearance of these two classes of product indicates a general convergence on the object concept as a key mechanism for extending DBMS capabilities. Without going into further detail of OODBMS technology, it is important to note that considerable standardisation activity affects the development of this technology, including object extensions to SQL (the relational standard query language) being developed by the ANSI and ISO standards committees, and proposed standards for OODBMS defined by the Object Database Management Group.
Distributed Database Management (DDBM)
A distributed database manager is responsible for providing transparent and simultaneous access to several databases located on possibly dissimilar and remote computer systems. In a distributed environment, multiple users physically dispersed across a network of autonomous computers share information. The challenge is to provide this functionality (sharing of information) in a secure, reliable, efficient and usable manner, independent of the size and complexity of the distributed system.
An abundant literature is dedicated to DDBM, which can be summarised by the following functions:
- Schema integration. Different levels of schema can be identified:
- a global schema, which represents an enterprise-wide view of the data and is the basis for providing transparent access to data located at different sites, perhaps in different formats;
- external schemata, which show the user's view of the data;
- local conceptual schemata, which show the data model at each site;
- local internal schemata, which represent the physical data organisation at each computer.
- Location transparency and distributed query processing: to provide transparency of data access and manipulation, database queries do not have to indicate where the data are located; they are decomposed and routed automatically to the appropriate locations, and access the appropriate copy if several copies of the data exist. This allows the databases to be located at different sites, and to move around if necessary.
- Concurrency control and failure handling, in order to synchronise updates for concurrent users and to maintain data consistency under failure.
- Administration facilities, which enforce global security and provide audit trails.
A DDBM will allow an end user to:
- create and store a new entity anywhere in the network;
- access an entity without knowledge of its physical location;
- delete an entity without having to worry about possible duplication in other databases;
- update an entity without having to worry about updating other databases;
- access an entity from an alternative computer.
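The location transparency promised in the list above can be sketched with a hypothetical registry: the user names an entity, and the DDBM decides which site holds it, so queries never mention locations. The site names, entity identifiers and records below are all invented for illustration.

```python
# Sketch of location transparency in a distributed database: routing is
# hidden behind a registry maintained by the DDBM, not by the user.

SITES = {
    "paris":  {"beam/1": {"length": 5.5}},
    "london": {"beam/2": {"length": 7.0}},
}
LOCATION = {"beam/1": "paris", "beam/2": "london"}   # maintained by the DDBM

def fetch(entity_id):
    """Access an entity without knowledge of its physical location."""
    site = LOCATION[entity_id]          # routing step, invisible to the caller
    return SITES[site][entity_id]

record = fetch("beam/2")                # the caller never names 'london'
```

Because only the registry knows the mapping, the DDBM is free to move an entity between sites by updating `LOCATION`, without breaking any user query.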
The sharing of data using DDBM is already common and will become pervasive as such systems grow in scale and importance. But the problem mainly consists in combining the object concept, distribution technology and standardisation. The generalisation of this kind of system depends directly on the development of standards. Three main candidate technologies address distributed object-oriented techniques:
- Remote procedure call technologies such as DCE (Distributed Computing Environment): this technique provides low-level distribution semantics and requires a high programming effort.
- Microsoft's distribution technology (also called Cairo or Distributed COM): this will only be available on Microsoft systems, and is not yet really marketed.
- OMG distributed objects: the Object Management Group aims at promoting the theory and practice of object technology for the development of distributed computing systems. OMG provides a forum and framework for the definition of a general architecture for distributed systems, called the Object Management Architecture, which is divided into five components. The Object Request Broker (ORB), known as CORBA, is the communication heart of the standard: it provides an infrastructure for object communication, independent of any specific platform, and a technical foundation for distribution. It specifies a generic distributed programming interface expressed in the Interface Definition Language (IDL), as well as the architecture and interface of the Object Request Broker software that provides the software backplane for object distribution. The other parts of the specification concern Object Services (life-cycle, persistence, event notification, naming), Common Facilities (printing, document management, database, electronic mail), Domain Interfaces designed to perform particular tasks for users, and Application Objects.
The market for CORBA products is evolving very fast, and several software packages claim to be CORBA compliant, but they usually do not implement all CORBA functionalities.
Nowadays, the technology allows true distributed data management, and efficient DDBMSs should soon be widely available.
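The ORB idea, a broker that resolves a name and forwards requests so that the client never depends on the servant's location or implementation, can be sketched in plain Python. This is a stand-in for the CORBA mechanism, not an actual CORBA binding; all class and method names are invented for illustration:

```python
class Orb:
    """Minimal stand-in for an Object Request Broker: it resolves a
    name to a servant and hands back a proxy, so the client never
    touches the implementation object directly."""
    def __init__(self):
        self._registry = {}

    def register(self, name, servant):
        self._registry[name] = servant

    def resolve(self, name):
        return _Proxy(self._registry[name])

class _Proxy:
    """Client-side stub: forwards every method call to the servant
    (in real CORBA, across the network via the ORB)."""
    def __init__(self, target):
        self._target = target

    def __getattr__(self, method):
        return getattr(self._target, method)

class TemperatureServant:
    """A servant whose interface would be declared in IDL in CORBA."""
    def celsius_to_fahrenheit(self, c):
        return c * 9 / 5 + 32
```

A client would write `orb.resolve("temp").celsius_to_fahrenheit(100)` and remain unaware of where the servant actually runs, which is the location transparency CORBA standardises.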
Database applications running over Internet
The development of the Internet has led to new possibilities in terms of communication with databases : dynamic
database interaction through Web pages. These data-driven pages are created through the use of
technologies such as CGI and Java to link a server database with a client browser. CGI (Common Gateway
Interface) scripts are executed on the server and their output is interpreted (not compiled) by the client, whereas Java is compiled on the server before execution on the client.
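The server-side pattern can be sketched with Python's built-in sqlite3 module : a script queries the database and emits the HTML sent to the browser, as a CGI program would. The table and data are invented for the illustration:

```python
import sqlite3

def render_projects_page():
    """Build an HTML page from a database query, the way a CGI script
    does on the server before returning the result to the browser."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE projects (name TEXT, budget REAL)")
    conn.executemany("INSERT INTO projects VALUES (?, ?)",
                     [("Bridge A", 1.2e6), ("Tunnel B", 3.4e6)])
    rows = conn.execute(
        "SELECT name, budget FROM projects ORDER BY name").fetchall()
    conn.close()
    # The page content is driven entirely by the query result.
    items = "".join(f"<li>{name}: {budget:,.0f}</li>"
                    for name, budget in rows)
    return f"<html><body><ul>{items}</ul></body></html>"
```

A real CGI deployment would read request parameters from the environment and write the page to standard output; the essential point is that the HTML is generated from live database content at request time.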
The problem of security is of the first importance over the Internet. The communication link can be made secure
through the Secure Sockets Layer (SSL). SSL is an industry-standard protocol that makes substantial use of
public-key technology. SSL provides three fundamental security services, all of which use public-key techniques :
- Message privacy. Message privacy is achieved through a combination of public-key and symmetric-key encryption.
All traffic between an SSL server and an SSL client is encrypted using a key and an encryption
algorithm negotiated during the SSL handshake.
- Message integrity. The message integrity service ensures that SSL session traffic does not change in
transit to its final destination. If the Internet is to be a viable platform for electronic commerce,
vandals must be prevented from tampering with message contents as they travel between clients and
servers. SSL uses a combination of a shared secret and special mathematical functions called hash
functions to provide the message integrity service.
- Mutual authentication. Mutual authentication is the process whereby the server convinces the client of
its identity and (optionally) the client convinces the server of its identity. These identities are coded in
the form of public-key certificates, and the certificates are exchanged during the SSL handshake.
SSL is designed to make its security services, negotiated during the handshake, as transparent as possible to the end user.
This technology is already available and several products are marketed, based on different Web browsers,
specific DBMS and operating systems.
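With today's tools the client side of such a secure link reduces to a few lines. The sketch below uses Python's standard ssl module, whose default context already enables the services described above : encrypted transport and certificate-based server authentication (SSL itself has since been superseded by TLS, which the same module implements):

```python
import ssl

# A client-side context with secure defaults: the peer must present
# a valid certificate (authentication) and the hostname must match.
context = ssl.create_default_context()

# These defaults cover server authentication; client authentication,
# the optional half of mutual authentication, would be added with
# context.load_cert_chain(...) on the client side.
assert context.verify_mode == ssl.CERT_REQUIRED
assert context.check_hostname is True
```

Wrapping an ordinary socket with `context.wrap_socket(sock, server_hostname=...)` then performs the handshake, after which all traffic is encrypted and integrity-protected transparently.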
Barriers to LSE uptake
LSE already uses database management technology, but certainly not to its full potential. Although the
development of certain techniques such as distributed databases is quite recent, the main reason is probably
a knowledge problem. Indeed, the keystone of data management is the data model, and the latter is not
the responsibility of the computer-science side but the concern of the LSE industry.
Consequently, an important effort has to be made to formalise LSE activity and the LSE information flow.
However, data modelling methodologies do not seem really adapted to end users. Three aspects of these methods
have to be improved :
- Validation : nowadays, no methodology or tool allows end users to appraise the quality of a data
model and to validate it. Usually this model appears to end users as a very strange presentation of their
own knowledge, and they are totally dependent on developers.
- Dynamic aspect : data evolution is a fundamental aspect of an LSE project, but current data modelling
has difficulty addressing it.
- Capitalisation : if the LSE industry wants to go forward, it is important to accumulate this kind of
knowledge, which is not theoretical but representative of know-how, practical processes and company
practice.
Another aspect of data modelling is the standardisation of data models, in order to facilitate
communication between firms and even within a company. The international standard ISO 10303
(STEP) provides the international context in which to develop such actions.
Finally, the main problem of data management is the preservation of data during the life-cycle of a product. In
many cases the life of a product, especially one in construction and civil engineering, easily spans more than
thirty or forty years, during which the data associated with the product must remain available and must be kept
up to date. Not only must the data be available, it should also be possible to interpret them correctly.
Until recently, drawings represented the design specification of a product. Although electronic CAD systems
are used to produce drawings, storage and archiving are still done on paper drawings. The paper drawing is still
the official version of a product design specification, and it is the paper version which should remain
available for the product lifetime, including the results of design changes, maintenance adaptations and reworks.
Today we are still able to read and interpret paper drawings made thirty years ago. We are even
able to read and interpret design drawings the Romans made for their public baths. Archiving paper drawings
is built upon the preservation of the medium bearing the drawing, e.g. paper. Some decades ago, large paper
archives were compacted using microfiche. The retrieval and viewing of microfiche archives now
depends on some form of translation from a storage medium to a means of representation. Archiving is still
done on the original data, i.e. the printed paper, which has merely been reduced in size. No interpretation step is
involved, and the compaction/de-compaction process is a purely physical one.
Only quite recently did the notion of product models arise. A Product Model is a representation of a product in
electronic, computer-readable form, which various computer applications interpret according to their own
views and needs. To enable different programs to interpret the same product model, standards
and interchange mechanisms are needed.
Storage of a Product Model in electronic form over long periods also needs standards to ensure the data will
mean the same in, say, twenty years. Another concern then arises : the medium on which to save the data for
archiving purposes. Today we are unable to read data produced by an office application eight
years ago. It is not even guaranteed that we can still read 5¼" computer disks made four years ago, because of
wear of the disk.
This problem has been identified in several areas of industry with a high data density. One of the trends
in this area is the foundation of small to medium companies who make a living from archiving and retrieving
data for a purpose : data services companies. The idea is to enable the user to get hold of the correct product
data throughout the lifetime of a product or service. The data services company, for a fee, takes care that
upgrades to new versions of standards, tools, storage applications (e.g. databases) and new media are carried
out. Such companies can only thrive on standards, used and kept.
Trends and expectations
Current trends aim at adding value to database management systems : on the one hand by specifying more
end-user-oriented DBMSs, on the other hand by developing intelligent database environments. The goal is to
manage information at a high semantic level. The concepts of Data Warehouse and Engineering Data Management
are at the beginning of their development, and will offer more powerful possibilities in the coming years.
The semantic aspect induced by a high level of data makes it possible to develop reasoning capabilities such as :
- declarative query languages, to ease interrogation,
- deductive systems : query association, for instance, returns additional relevant information not
explicitly asked for by users,
- query optimisation, to reduce transaction times,
- integration of heterogeneous information sources that may include both structured and unstructured data.
From a technical point of view, ODBMSs and distributed data management will evolve in performance and in
security. The Object Database Management Group continues its standardisation activity regarding
object definition and query languages.
Numerous research projects deal with distribution. They aim at improving transaction times and security, and at
establishing better connections using Web browsers. New technologies are in progress, especially persistent
distributed stores, which provide a shared-memory abstraction.
Relevance to LSE
As in every kind of activity, data management represents a foundation of the LSE industry. Whether
considered from a product viewpoint or a firm viewpoint, LSE activity constitutes a heterogeneous
environment and manipulates information of a high semantic level, recorded on different supports.
The association of data management, distribution and new communication technologies is therefore proving
of the first importance.
LSE is interested in DBMSs with added value such as EDMS and Data Warehouses. Engineering Data
Management systems constitute an interesting response in terms of project information management, whereas
Data Warehouses will interest the firm as a whole.
The LSE industry is also closely involved in standardisation activities, not directly in computer-science standards,
but in establishing product data models. We have shown that data models are the keystone of data
management; they contain an important part of LSE knowledge, still not yet formalised.
Nowadays, data management is based on sound technology and many efficient software tools are available.
The LSE industry already uses this technology in different parts of companies. Much progress is expected
from object representation, probably better adapted to project design. Consequently, great interest attaches
to the next evolution of OODBMSs.
At the present time, exchange and distribution of information are too limited (mainly graphical
document exchange). A higher semantic level within exchanges would allow the different actors of a project to
take more benefit from the new information technologies. Developing data models, the keystone of information
exchange, needs sound standards at two levels :
- Data modelling methodologies, especially to facilitate the understanding of models and their validation.
This is important for LSE actors, allowing them control over these models, but also to
guarantee the continuity of the knowledge contained within the models. No current modelling
methodology offers an efficient means of model validation : this phase rests only on the
competence of the model designer, and no technical, analytical or mathematical validation is performed.
- Product data models : the LSE industry has to be responsible for the development of its models. Actions have
to be intensified to elaborate adapted models, notably in the context of the STEP standard.
Knowledge management
The previous chapter analysed data management; the subject here is knowledge management. But what is the
difference between data and knowledge? Whereas data represent the basic pieces of information,
the term knowledge covers several definitions, depending on the user's viewpoint. It can be assimilated to the
process of knowing or to a superior level of information. In this work, we will not debate this definition and
will consider knowledge in its whole sense, including both the objects of knowledge and the process of knowing.
Two terms are often associated with knowledge : management and engineering. The main
difference is that the manager establishes the direction (administrative, executive) the process should take,
whereas the engineer develops the means to accomplish this direction. Knowledge Management thus relates
to the knowledge needs of the firm : what decisions to make and what actions to enable. Knowledge
Engineering develops technologies satisfying Knowledge Management needs.
In fact, we will discuss here knowledge technologies, which depend on the knowledge engineer (usually a
computer scientist) and which allow managers to develop policies for enterprise-level knowledge ownership.
Understanding knowledge is key to the knowledge management process : how we acquire knowledge,
how we represent it, how we store it, how we use it and what the applications are. We therefore identify three
main parts in knowledge management :
- Knowledge representation
The representation or modelling process consists of defining a model (or several models) able to
represent knowledge as reliably as possible. The first aim of the modelling process is to formalise
knowledge for accumulation, communication and sharing. The second aim is to establish a
model which can be managed and used by computer programs.
- Knowledge acquisition
Even the smallest knowledge-based system needs a large quantity of knowledge in order to be realistic. The
knowledge acquisition process is therefore critical to the development of any knowledge management programme.
It requires a complete understanding of the nature and behaviour of knowledge, and then the development of
specific procedures to acquire it. The objective of identifying and creating dynamic knowledge bases is itself
a knowledge process : it involves knowledge learning and communication techniques, modelled on human
(or child) behaviour.
- Knowledge usage
Accumulating knowledge is in itself essential for an enterprise : it represents the
memory of the firm. But this interest alone is minor; the real challenge is to develop a daily exploitation
of the knowledge base in every kind of action : technical, management, financial, etc. The possibilities
include decision making, diagnosis, control, training, command and so on. The number of potential
applications is endless.
The next paragraphs present the existing technologies in accordance with these three aspects.
What exists today
The first step in the knowledge management process consists of choosing a knowledge representation.
Indeed, the natural language used by everybody is not adapted to developing a knowledge base.
The problem is that no universal model of knowledge representation exists today. The existing
formalisms depend on the application (diagnosis, decision making...) or on the knowledge domain (natural
language, technical applications, medical science, etc.). In the past, two trends developed : the first
oriented towards the knowing process (procedural knowledge representation), the second towards knowledge
description (declarative knowledge).
Nowadays the trend is towards more universal knowledge representations.
Without going into details or purely computer-science considerations, it is possible to identify four current
approaches to knowledge representation :
- Production rules
- Logical form
- Conceptual graphs
- Object oriented representation
Production rules
A rule consists of two parts :
- An antecedent or situation part, usually represented by IF <condition>. This condition is expressed by
means of atomic formulas and logical statements formed using logical connectives such as AND, OR
and NOT.
- A consequent or action part, represented by THEN <conclusion>. The consequent may include several
conclusions or actions.
Antecedents can be considered as patterns and consequents as conclusions or actions to be taken. Each rule
represents an independent grain of knowledge : a rule cannot refer to another one, and contains
on its own all the conditions of its application. In addition, the rule base of a production system does not specify
an order of application of the rules.
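A minimal forward-chaining interpreter illustrates these properties : each rule is self-contained, and no application order is specified in the rule base. The rules about structural elements below are invented examples:

```python
def forward_chain(rules, facts):
    """Apply IF/THEN rules until no rule adds a new fact.
    Rules are (antecedents, consequent) pairs; iteration order is
    irrelevant, matching the text: the base imposes no rule order."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            # Fire the rule only if all its conditions hold and it
            # would contribute a new fact.
            if set(antecedents) <= facts and consequent not in facts:
                facts.add(consequent)
                changed = True
    return facts

# Each rule carries all its own conditions and refers to no other rule.
rules = [
    ({"concrete", "reinforced"}, "load_bearing"),
    ({"load_bearing", "cracked"}, "needs_inspection"),
]
```

Running the engine on the facts `{"concrete", "reinforced", "cracked"}` derives `load_bearing` and then `needs_inspection`, regardless of the order in which the rules are listed.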
Various methods have been used to deal with uncertain or incomplete information in such a knowledge base.
- Certainty factors consist of assigning a number to each rule indicating the level of confidence in
this piece of knowledge. Usually these numbers, in the range -1 to 1, 0 to 1 or 0% to 100%, are combined
with the factors of evidence associated with the facts to deduce a factor of evidence for the conclusion.
- Fuzzy logic : the theory of fuzzy sets can be used to represent information with blurry or grey
boundaries. In this theory, an element may belong to a set with a degree of membership between 0 and 1. A fuzzy
set contains elements with assigned degrees of membership. The main problem is to construct the membership
functions.
- Probabilities : Bayes' theorem is well known in the statistics and probability literature. It provides a
method for calculating the probability of an event or fact based on knowledge of prior probabilities. The
difficulty of this method is to determine, independently of one another, the probabilities of the observed facts.
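As an illustration of the certainty-factor approach, the MYCIN-style rule for combining two factors that bear on the same conclusion can be written in a few lines. This is a sketch of one classical combination scheme, not the only one in use:

```python
def combine_cf(cf1, cf2):
    """MYCIN-style combination of two certainty factors in [-1, 1]
    supporting (or opposing) the same conclusion."""
    if cf1 >= 0 and cf2 >= 0:
        # Two supporting pieces of evidence reinforce each other.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Two opposing pieces of evidence reinforce the disbelief.
        return cf1 + cf2 * (1 + cf1)
    # Mixed evidence: the weaker factor attenuates the stronger one.
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))
```

For example, two rules each asserting a conclusion with confidence 0.5 combine to 0.75, reflecting that independent supporting evidence increases, but never exceeds, full certainty.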
Logical form
This representation is based on mathematical logic. The most basic logical system is propositional logic.
Each basic element of the system, or proposition, can be either true or false. Propositions can be
connected to each other using the connectives AND, OR, NOT, EQUIVALENT, IMPLIES, etc. First-order
predicate logic is built from :
- atoms (symbols),
- functions and predicates (a predicate is a function with one or more atomic arguments which produces a
boolean value),
- two substatements joined by a conjunction, disjunction or implication,
- a negated substatement,
- a statement with an existential or universal quantifier (in this case, atoms in the statement can be
replaced by variables bound by the quantifier).
For instance, "Bob has a bicycle" might be given the following logical form :
∃ B , Bicycle (B) ∧ Has (Bob, B)
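Over a finite model, such a formula can be checked mechanically. The following sketch evaluates the existential statement on an invented toy universe; the individuals and predicates are purely illustrative:

```python
# A tiny model: a universe of individuals, a unary predicate and a
# binary predicate, each given by extension.
individuals = ["b1", "b2", "car1"]
bicycle = {"b1", "b2"}                  # Bicycle(x) holds for these
has = {("Bob", "b1"), ("Ann", "car1")}  # Has(owner, thing) pairs

def bob_has_a_bicycle():
    """Evaluate  ∃B . Bicycle(B) ∧ Has(Bob, B)  over the finite model:
    the existential quantifier becomes a search over individuals."""
    return any(b in bicycle and ("Bob", b) in has for b in individuals)
```

The quantifier is reduced to `any(...)` over the universe, and the conjunction to Python's `and`; deriving new knowledge in a logic system amounts to establishing such statements from the stored facts.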
This is a very descriptive declarative representation with a well-founded method of deriving new knowledge
from a database. Its flexibility makes it a good choice when more than one module may add to or utilise a
common database. Unfortunately, this flexibility has limitations : to maintain consistency, learning must be
monotonic, which limits its effectiveness when domain theories are incomplete.
This representation has been extended in several directions to correct these weaknesses and to support new types
of application. These extensions of logic include, but are not limited to, closed-world reasoning, defaults,
epistemic reasoning, queries, constraints, temporal and spatial reasoning and procedural knowledge.
The integration of logical form with other formalisms, such as object-oriented languages and systems, type
systems, constraint-based programming, rule-based systems, etc., represents most of the current development of
the field. Applications concern numerous areas such as natural language, planning, learning, databases, software
engineering, information management systems, etc.
Conceptual graphs
Conceptual graphs (CG) are a graph-based knowledge representation system invented by John Sowa which
integrates a concept hierarchy with the logic system of Peirce's existential graphs. Conceptual graphs are as
general as predicate logic and have a standard mapping to natural language.
Conceptual graphs are represented as finite, bipartite, connected graphs. The two kinds of nodes of the graphs,
concepts and conceptual relations, are connected by untyped arcs. Each conceptual graph forms a
proposition. The main characteristics of conceptual graphs are the following :
- A concept is an instantiation of a concept type. A concept by itself is a conceptual graph, and conceptual
graphs may be nested within concepts. A concept can be written in graphical notation or in linear
notation.
- A pre-existent hierarchy of concept types is assumed to exist for each conceptual graph system. A
relation < is defined over the set of concept types to show which types are subsumed by others.
- Two or more concepts in the same conceptual graph must be connected using links to a relation.
- For any given conceptual graph system there is a predefined catalogue of conceptual relations which
defines how concepts may be linked with relations.
- Each concept has a referent field used to identify the concept specifically or to generalise the concept,
to perform quantification, to provide a query mechanism, etc.
- Canonical graphs represent real or possible situations in the world. Canonical graphs are formed
through perception, insight, or are derived from other canonical graphs using formation rules. Canonical
graphs are a restriction of what is possible to model using conceptual graphs since their intent is to
weed out absurdities or nonsense formulations.
- A canonical graph may be derived from other canonical graphs by formation rules : copy, restrict, join
and simplify. The formation rules form one of two independent proof mechanisms.
- A maximal join is the join of two graphs followed by a sequence of restrictions, internal joins (joins on
other concepts), and simplifications until no further operation is possible. A maximal join acts as
unification for conceptual graphs.
- New concept types and relations may be defined using abstractions. An abstraction is a conceptual
graph with generic concepts and a formal parameter list.
- Schemata incorporate domain-specific knowledge about typical objects found in the real world.
Schemata define what is plausible about the world. Schemata are similar to type definitions except that
there may be only one type definition but many schemata.
- A prototype is a typical instance of an object. A prototype specializes concepts in one or more
schemata. The defaults are true of a typical case but may not be true of a specific instance. A prototype
can be derived from schemata by maximal joins and then assigning referents to some of the generic
concepts in the result.
Conceptual graphs are considered a good choice of knowledge representation for systems where knowledge
sharing is a strong criterion. They have a formal, theoretical basis, support reasoning as well as model-based
queries, and have a universe of discourse described by a conceptual catalogue. They have no inherent
mechanism for knowledge sharing or translation, but their flexible representation, their mapping to natural
language and, especially, the work on their database semantics encourage considering CG as a candidate.
Conceptual graphs more or less include semantic networks, which consist of a collection of nodes
representing concepts, objects, events, etc., and links connecting the nodes and characterising their
interrelationships. Semantic networks also offer inheritance.
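A minimal data structure for the bipartite concept/relation graph might look as follows. The class and labels are illustrative only; a real CG system would add the type hierarchy, referent fields and formation rules described above:

```python
class ConceptualGraph:
    """Bipartite graph: concept nodes and relation nodes joined by
    untyped arcs, e.g.  [Cat: Felix] -> (On) -> [Mat]."""
    def __init__(self):
        self.concepts = []
        self.relations = []   # (label, source concept, target concept)

    def concept(self, type_, referent=None):
        # A concept is an instantiation of a concept type; the
        # optional referent identifies a specific individual.
        node = (type_, referent)
        self.concepts.append(node)
        return node

    def relate(self, label, source, target):
        # Relations link two or more concepts; arcs are untyped.
        self.relations.append((label, source, target))

# Linear-notation example: [Cat: Felix] -> (On) -> [Mat]
g = ConceptualGraph()
cat = g.concept("Cat", "Felix")
mat = g.concept("Mat")
g.relate("On", cat, mat)
```

Each such graph asserts one proposition ("Felix the cat is on a mat"); nesting, quantification and the canonical formation rules would be layered on top of this skeleton.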
What tools are available? There have been many experimental implementations of conceptual structures. The
conceptual structures community is cooperating in the development of an industrial-strength, freely available,
portable conceptual structures workbench in the Peirce Project.
Object oriented representation.
There are many variants of object-oriented representation (OOR), each with its own definition and
characteristics. Even the answer to the fundamental question "what is an object?" depends on the viewpoint
of the author. The following only describes the main aspects of OOR :
- An object is an entity (a thing, an individual, a concept or an idea) that could be identified. An object
has two parts : a descriptive part and a procedural part.
- The descriptive part represents the state of the object, and is composed of attributes. Each attribute
is specified by a type, a domain, a default value and other characteristics which describe this
attribute.
- The procedural part represents the behaviour of the object. Usually it consists of methods, each
composed of a descriptor (the name of the method) and the associated function (the procedure
which performs the behaviour). It can also include reflexes, which represent a specific behaviour
attached to one attribute. A reflex is automatically triggered when an action affects this attribute :
read, reset, modify, initialise...
- Message : objects usually communicate using messages. A message is composed of at least three items :
the addressee (the interrogated object), the descriptor, which corresponds to the name of a method, and
the arguments. Sometimes the sender is also mentioned, which allows the object to adapt its response
to the sender.
- Encapsulation : when the internal structure of an object is inaccessible, the object is considered a
black box. In this case, the only way to communicate is through messages. The notion of
encapsulation puts the object's interface with its environment before its internal structure, and
facilitates the reuse and evolution of objects.
- Instantiation : most OORs are class-based, although alternatives such as prototypes exist. The notion
of class comes from the concept of a set : every element of a set has the same characteristics and
behaviours, and only the state (the values of the attributes) of each element changes. A class
describes the shell of its elements, the descriptive part. Every element of a class is an instance,
identified by an "is-a" relation with its class. Regarding behaviour, every element refers, usually
dynamically, to its class to know the understandable messages and the associated functions. A
class is an abstraction that emphasises relevant characteristics and ignores or suppresses other
characteristics. Each object is an instance of a class.
- Specialisation : inheritance is a relationship between classes where one class is the parent
(base/superclass/ancestor) of another. Inheritance provides programming by extension (as
opposed to programming by reinvention) and can be used as an "is-a-kind-of" relationship.
Inheritance applies to both the descriptive and the procedural parts. It provides a natural
classification of kinds of objects and allows the commonality of objects to be explicitly exploited
in modelling and constructing object systems. Multiple inheritance occurs when a class
inherits from more than one parent.
- Reference : the value of an attribute can be more than a simple type and may refer to another entity,
defining a kind of relationship.
- Aggregation : a complex object may be composed of several other objects which are "part-of" it.
This kind of relationship introduces a dependency between the parts and the whole : if the complex
object disappears, its parts disappear too.
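Several of these notions (encapsulation, messages, instantiation, specialisation and aggregation) can be seen together in a few lines of Python. The building-related classes are invented for the illustration:

```python
class Component:
    """Descriptive part (attributes) plus procedural part (methods)."""
    def __init__(self, name, mass):
        self._name = name     # encapsulated state, reached via messages
        self._mass = mass

    def mass(self):
        # Sending the "mass" message to the object returns its state.
        return self._mass

class Beam(Component):
    """Specialisation: a Beam 'is-a-kind-of' Component and inherits
    both the descriptive and the procedural parts."""
    def __init__(self, name, mass, span):
        super().__init__(name, mass)
        self.span = span      # extension, not reinvention

class Structure:
    """Aggregation: a structure is composed of parts ('part-of');
    the parts' lifetime is tied to the whole."""
    def __init__(self, parts):
        self.parts = list(parts)

    def total_mass(self):
        # The whole queries its parts through their common interface.
        return sum(p.mass() for p in self.parts)
```

Creating `Structure([Beam("b1", 200, 12.0), Component("plate", 50)])` gives a composite whose `total_mass()` is 250; the structure never inspects the parts' internal attributes, only sends them messages.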
Since the first object-oriented language (in 1967, Simula was the first to provide objects, classes, inheritance
and dynamic typing), numerous variants have been developed. At the present time, it is important to distinguish
between object-oriented programming languages (OOL) and OOR. OOLs like C++, Classic-Ada, Java, Object
Pascal, etc., aim at developing software based on an object framework, whereas OOR is a knowledge
representation formalism.
Learning may have several senses in computer science, at least three :
- Human training : the aim is to increase the knowledge of a person using Computer Based Training
Software. (This viewpoint is out of scope)
- Computer training : here, the aim is to create, extend or modify a computer knowledge base. In the first
case a human uses specific interactive and ergonomic software to perform this task. A second
possibility consists of computer self-training : specific software allows the computer to develop, correct or
update its knowledge base. Usually, practical approaches combine these two possibilities.
- Discovery of new knowledge using the computer : this is a more ambitious objective, consisting of going
deeper into a specific knowledge domain or exploring a new one, in order to identify and
validate new facts, hypotheses or theories, both for humans and for computers.
Learning involves different processes such as the acquisition of new knowledge, the organisation of the
acquired knowledge into effective representations, the development of perceptual, motor and cognitive skills,
and the discovery of new facts, hypotheses or theories about the world through exploration, experimentation,
induction, deduction or abduction.
Learning structures and processes are essential components of adaptive, flexible, robust and creative
intelligent systems. Knowledge representation mechanisms play a central role in problem solving and learning.
Indeed, learning can be thought of as the process of transforming observations or experience into knowledge
stored in a form suitable for use whenever needed. Theoretical and empirical evidence emerging from
investigations of learning within a number of research paradigms, using a variety of mathematical and
computational tools, strongly suggests the need for systems that can successfully exploit a panoply of
representations and learning techniques. It is unlikely that a single knowledge representation scheme or a
single reasoning or knowledge transformation mechanism would serve all of a system's needs effectively in a
wide variety of situations.
Figure 2 : Different types of reasoning for acquisition knowledge
The main technologies developed in the last decade try to extract and formalise knowledge from various data
sources. The data may be examples, concepts or behaviours. The type of reasoning can be deductive, inductive
or analogical. This taxonomy is more theoretical than practical because learning software often combines
several of these characteristics, which makes its classification difficult.
Figure 2 : Different types of reasoning for acquisition knowledge presents a simplified classification of the
different acquisition possibilities.
Analytic learning
Analytic learning in its pure form uses deductive inference to transform knowledge into a form that is useful
for the efficient performance of tasks in a given environment. A commonly used analytical learning strategy is
explanation-based learning using examples. For instance, given a solution to a particular problem, an
explanation-based learning mechanism generates an explanation using background knowledge (general rules of
integration, for example) showing how the solution deductively follows from what the system knows. The
resulting explanation is then used to reformulate the given solution into a form that makes it possible to
recognise and solve similar problems efficiently in the future. This increases the efficiency of the system, for
example by reducing the amount of search effort needed to come up with a solution, because the necessary
knowledge is now in a more usable or operational form.
Other related forms of analytic learning include the deductive derivation of abstractions or generalisations
from facts provided by the environment and the background knowledge. This consists of interpreting a specific
source (a regulation, for example) to deduce practical procedures.
Induction is the primary inference mechanism used in synthetic learning. Unlike deduction which can lead to
no fundamentally new knowledge (because all inferences logically follow from the assumed background
knowledge and the given facts), inductive inference allows creation of new knowledge. A typical example of
an inductive learning task involves the learning of an unknown concept given a set of examples and/or a
counter-examples. This problem of concept learning essentially reduces to a search for a concept description
that agrees with or approximates the unknown concept on the set of examples. This can be done without
explicit knowledge about the specific domain of study. In this case the learning process involves very generic
mechanisms. The inductive process may use also meta-knowledge depending on domain specificities and on
the existing knowledge background. Inductive learning applies to several types of knowledge representation,
such as rules, conceptual graphs and objects.
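The concept-learning search described above can be sketched in a few lines. The following is a minimal illustration in the style of a most-specific-generalisation (Find-S) strategy; the attribute names and example data are invented for the illustration and do not come from the text.

```python
# Inductive concept learning sketch: the hypothesis is the most specific
# conjunction of attribute values that covers all positive examples.
def induce_concept(examples):
    """examples: list of (attributes_dict, is_positive) pairs."""
    hypothesis = None
    for attrs, positive in examples:
        if not positive:
            continue  # this simple scheme ignores counter-examples
        if hypothesis is None:
            hypothesis = dict(attrs)  # start from the first positive case
        else:
            # generalise: keep only attribute values shared by all positives,
            # replacing conflicting values by the wildcard '?'
            for k, v in attrs.items():
                if hypothesis.get(k) != v:
                    hypothesis[k] = "?"
    return hypothesis

def covers(hypothesis, attrs):
    """True if the case agrees with the hypothesis on every non-wildcard attribute."""
    return all(v == "?" or attrs.get(k) == v for k, v in hypothesis.items())

examples = [
    ({"cover": "feathers", "temp": "warm", "flies": "yes"}, True),
    ({"cover": "feathers", "temp": "warm", "flies": "no"}, True),
    ({"cover": "fur",      "temp": "warm", "flies": "no"}, False),
]
h = induce_concept(examples)
# h: {'cover': 'feathers', 'temp': 'warm', 'flies': '?'}
```

Note that, as the text says, this very generic mechanism uses no domain knowledge at all; real inductive learners also exploit meta-knowledge and background knowledge to prune the search.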
Learning by analogy
Relatively little is known about the formation of analogical mappings or analogical inference which appear to
be an integral part of human reasoning. Analogy refers to a mapping between two entities (objects, events,
problems, conceptual graphs, networks, behaviours, etc.) that consists of detecting differences and likenesses,
known as a similarity relationship. In one case the search concerns a causality relation; in the other, the
comparison bears on a proximity relation.
For instance, the template matching approach to pattern recognition is reminiscent of exemplar-based or case-
based reasoning in artificial intelligence. Thus it is possible to adapt much of the work in exemplar-based
learning to structured templates. Learning in this case reduces to the task of acquiring and modifying templates
necessary for specific tasks (e.g., pattern recognition).
Some other learning technologies depend directly on the knowledge representation. For instance, learning in a
connectionist network system can be performed by modifying a parameter (a weight) in the network or changing the
network topology.
Usually, the learning process needs more than one technology. For instance, in a deductive mechanism, if the
background knowledge is incomplete, inconsistent or imprecise, explanation generation, the key component of
explanation-based learning, cannot proceed without postulating changes in background knowledge. An
interesting possibility that is worth exploring is to treat the knowledge base as though it were non-monotonic
and the derived explanations as though they were tentative and use a variety of non-deductive learning
strategies for hypothesising candidate revisions of the background knowledge. Other possibilities include non-
deductive generalisation of explanations as hypotheses to be validated by further experimentation with the
environment. Methods developed in the context of syntactic pattern recognition such as those based on
distance measures for structured templates or grammar inference as well as connectionist learning models offer
additional sources of ideas (e.g., adaptation of learned explanations to new situations as hypotheses to be
tested) for extending the power of current deductive learning systems.
Knowledge acquisition requires verifying the coherency and integrity of the knowledge base. Syntactic
control is easily performed, but at present the only way of controlling the semantics of the base consists of
using examples in order to detect incoherences and anomalies.
Traditional artificial intelligence and syntactic pattern recognition provide a number of powerful tools. The
problem is to qualify and compare the market offer. A broad range of performance measures (e.g., speed,
accuracy, robustness, efficiency) are conceivable, but each kind of technology depends on a specific
knowledge representation and on the semantics of the knowledge domain.
Nowadays, knowledge acquisition technologies are used in various industrial domains, such as defect
detection in helicopter propellers, and in financial domains such as bank loan management. Numerous comparisons
have shown that an expert system based on knowledge obtained directly from an expert is systematically less
efficient than one built with learning technologies. Furthermore, the knowledge base of the latter is
more concise for equivalent results.
Resource notes : Honavar, Vasant. (1994). (ISU CS-TR 93-22) "Toward Learning Systems That Use Multiple
Strategies and Representations." In: "Artificial Intelligence and Neural Networks: Steps Toward Principled
Integration". Honavar, V. and Uhr, L. (Ed.), pp. 615-644. New York: Academic Press, 1994.
This aspect of knowledge management consists of exploiting the bases for a precise objective. It aims at
demonstrating, detecting, deciding, analysing, proving, verifying, etc., in other terms at assisting users in their tasks.
This mainly involves the reasoning capacities of Knowledge Based Systems (KBS). Numerous possibilities of
reasoning are explored in Artificial Intelligence, in the image of human reasoning capacities. AI identifies
mainly three mechanisms : formal reasoning (logical reasoning is the most common approach), analogical
reasoning (including case-based reasoning) and reasoning by abstraction, generalisation and specialisation
(such as classification). The following paragraphs detail the different tool categories. The objective isn't to be
exhaustive but to describe the main families of tools, giving the keystone to get more information regarding
the technology itself and particularly the existing tools.
Expert system (ES)
Computer programs using AI techniques to assist people in solving difficult problems involving knowledge,
heuristics and decision making are called expert systems. This first definition was limited to one kind of
knowledge representation and reasoning : symbol processing systems. Nowadays, expert systems mainly
concern knowledge base involving explicitly rule driven symbol manipulation using production rules and logic
programming. An expert system tends to mimic the decision-making and reasoning process of human experts.
It can provide advice, answer questions and justify its conclusions. An expert system may be highly
interactive, and its output can be qualitative rather than quantitative.
Basic ES principle is a separation between knowledge base of the domain and reasoning mechanism.
The knowledge base is divided into two main parts :
- The generic knowledge of the domain, containing many independent production rules. Usually, these
rules describe heuristics (called surface knowledge) but also, more rarely, well established theories and
principles (called deep knowledge).
- The base of facts describing the studied case or the current problem. Its knowledge representation may
be one or several of the following : logical propositions, objects, conceptual graphs, etc.
The reasoning mechanism, also called the inference mechanism, aims at producing new knowledge from the base
of facts by applying the rules. The inference mechanism determines the order in which rules should be invoked
and resolves conflicts when several rules are satisfied. Different strategies of reasoning, depending on
the objectives of the ES (e.g. proving a hypothesis, proposing a solution, etc.), may be applied : forward,
backward and mixed chaining. The fundamental characteristic of such a mechanism is its independence from
the knowledge domain. It depends mainly on the type of logic used for rule representation : propositional
or first order logic.
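The separation between the rule base and the inference mechanism can be illustrated with a minimal forward-chaining sketch: the engine below is fully domain-independent, and the example rules (about structural design checks) are invented purely for illustration.

```python
# Minimal forward-chaining inference mechanism: fires every rule whose
# premises are all satisfied, until no new fact can be produced.
def forward_chain(rules, facts):
    """rules: list of (premises, conclusion) pairs; facts: set of known facts."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)   # produce new knowledge
                changed = True
    return facts

# Illustrative rule base (surface knowledge expressed as production rules).
rules = [
    ({"span > 40m", "material = steel"}, "check wind load"),
    ({"check wind load"}, "dynamic analysis required"),
]
derived = forward_chain(rules, {"span > 40m", "material = steel"})
# derived now also contains "check wind load" and "dynamic analysis required"
```

Backward chaining would instead start from a hypothesis to prove and search for rules whose conclusion matches it; the rule base itself is unchanged, only the strategy differs.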
ES are suitable for problems involving deduction and not so much for problems involving induction or analogy.
ES prototypes have largely been developed in various industrial domains, but very few of them are really
efficient. This is mainly due to an initial objective that is usually too ambitious, involving too large a
domain of expertise or experts who are too rare. Nowadays, the main conditions for a successful application of ES are :
- A well defined task, limited in size. This task is presently performed by a specialist rather than an expert.
- Identified end users : ES are not suitable for experts, but are usually developed to aid a non-specialist
during the absence of the expert.
- Identified experts who are able to present and explain their knowledge and their reasoning.
- To collect several cases with their solution.
- To anticipate the integration of the ES in the activities and the information system of the end-user.
Nowadays, the market does not offer ready-made ES, but tools for developing ES.
Two kinds of tools are available :
- Expert System Generators, which are toolkits containing all the components for building an ES.
They usually offer several possibilities of knowledge representation, various reasoning strategies and
different interfaces (human and machine). Many generators are available for every platform and at every
price level.
- Shells, which are generators limited or adapted to a specific class of problems. These classes may be identified
by similar reasoning or by a specific technical domain.
These generators make it possible to concentrate the effort on knowledge identification and acquisition. They allow
very quick prototyping and development of ES.
Various methods have been used to deal with uncertain or incomplete information in the knowledge base.
Some of these methods are briefly described in the section on logical form. Nowadays, none of them is
universally accepted.
Multiple expert systems are the latest evolution of this kind of tool. In order to solve more complicated tasks
involving several experts, the proposed solution consists of developing cooperation between individual ES. The
ES are thus able to communicate during reasoning, to request information from one another and to detect conflicts
between different views. The main problems are synchronisation of reasoning, conflict resolution and synthesis
of the solution. Generic mechanisms such as blackboards are not really satisfactory, and the resolution of the problem
needs specific domain knowledge.
Case Based Reasoning
Case-based reasoning (CBR) relies on the idea that similar problems have similar solutions. Facing a new
problem, a case-based system retrieves similar cases stored in a case base and adapts them to fit the problem at
hand. CBR is usually viewed as a process of remembering and recalling one or a small set of actual instances
or generic cases. The decision making is based on comparisons between the new situation and the old
experience; thus, the quality of reasoning depends on the quality and quantity of cases in the case base. With
increased number of unique cases (instances or generic cases), problem-solving capabilities of CBRS improve
while the effectiveness of the system (at least in terms of efficiency) may decrease. On the one hand, if the
cases are not unique enough, i.e., the cases are highly correlated, then the system will be limited in the
diversity of solutions it can generate. On the other hand, if the case base is small, then the range of possible solutions is limited.
Regarding case representation, there is still a lack of consensus as to exactly what information should be
represented in a case. A case is an arbitrary set of features, used for describing a particular concept.
Considering its form, a case can be a simple concept or a connected set of subcases; it can be a specific
instance or a generalisation. Many different case representations (Objects, conceptual graphs for instance)
have been proposed so far. There are two main reasons for this:
1. the motivation for CBR came from different directions, namely a desire to model human performance
of knowledge-intensive tasks, and an attempt to use existing database technology to store and access
cases;
2. different application areas imposed different functional requirements on the case representation.
The major processes in CBR are as follows :
- Case storage or case base construction, representing all important features of the episode (actual,
hypothetical or generic) in a case base. An important issue, related to representation, is case base
construction. It is not a trivial task to decide how to represent a case in a particular domain and which
cases are worth storing (because of the utility problem). Moreover, the representation can be extremely complex,
which makes it difficult to acquire a sufficient number of cases for a particular application. These
problems are the main reasons why many CBR systems have relatively small case bases. The process of
case base construction can be simplified as follows:
- A case base is created automatically from an existing database.
- A special intelligent system is used as an assistance to case acquisition. The system may be based on
the use of previous cases as a model and guide for expressing new cases.
The basic assumption of the case-based reasoning system is that all relevant cases will be retrieved
efficiently. In order to guarantee efficient retrieval, case storage requires specific organisation. Several
general approaches have been proposed for this problem, namely:
1. Classification of all cases into a hierarchy, grouping relevant cases together (in CBR literature this
process is usually referred to as the indexing problem);
2. Implementation of a CBR system on a parallel computer without using case classification;
3. Classification is supported by parallel implementation;
4. Neither classification nor parallel implementation is used. However, this approach is only usable for
small case bases.
- Case retrieval : Retrieval of cases from memory can be characterised as remembering of relevant past
experience. Retrieval consists of the following sub-tasks.
- Recall previous cases : the goal is to retrieve cases relevant to the current problem, e.g., cases describing similar problems;
- Select the best subset : only the most promising case (or cases) retrieved in the initial step are
selected. For some applications only one case is required; for others, a small set of relevant cases is
important (for adaptation and/or justification).
Retrieval strategies depend on the knowledge representation and the organisation of the case base. The
three main types of algorithm are associative (retrieval classifies any or all features independently of all
the other features), hierarchical (organised into a general-to-specific concept structure) or mixed.
The selection of the best cases needs an assessment of their similarity with the current problem. Case retrieval
and similarity are inter-related. Stored cases and problem solving situations are represented in terms of
surface and abstract features. The procedure to assess similarity relies on domain knowledge in addition to
case features. Domain knowledge enables inferences of featural equivalence and evaluation of the
importance of unmatched features in assessing similarity.
- Case adaptation : There are only a few examples where the old solution can be reused without any
change. More often, one must modify either a solution or strategy to get a solution for the current
problem. The process of modifying the old example to fit a new problem is called adaptation. One can
differentiate adaptation strategies based on diversity of CBR systems (planning, explanation, diagnosis,
etc.) and designate the knowledge required for its application. The adaptation algorithms can be
categorised into structural adaptation applying the adaptation rules directly to the solution stored in a
case and the derivation adaptation using rules that generated the original solution to generate the new
solution by their re-application. The following adaptation methods have been identified:
1. Null adaptation is the direct application of the retrieved solution to the new situation.
2. Parametrised solutions, a structural adaptation, is probably best understood. It is based on the
comparison of the retrieved and input problem descriptions along the specified parameters. (This
method was also called prototype recognition)
3. Abstraction and re-specialisation is a general structural adaptation technique that abstracts part
of the retrieved solution and re-specialises it later.
4. Critic-based adaptation is a structural adaptation based on using critics to debug almost correct
solutions.
5. Re-instantiation is a derivational adaptation method, since it operates on the plan that was used to
generate a particular solution.
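The retrieve-and-adapt cycle described above can be sketched minimally as follows: surface similarity over weighted features drives retrieval, followed by a null adaptation (method 1, direct reuse). The feature names, weights and cases are illustrative assumptions, not taken from the text.

```python
# Minimal CBR sketch: retrieval by weighted feature matching, null adaptation.
def similarity(case_features, problem, weights):
    """Weighted fraction of matching features (surface similarity only;
    a real system would also use domain knowledge, as noted in the text)."""
    total = sum(weights.values())
    score = sum(w for f, w in weights.items()
                if case_features.get(f) == problem.get(f))
    return score / total

def retrieve(case_base, problem, weights, k=1):
    """Recall all cases, then select the k most similar ones."""
    ranked = sorted(case_base,
                    key=lambda c: similarity(c["features"], problem, weights),
                    reverse=True)
    return ranked[:k]

# Illustrative case base (a hypothetical foundation-design domain).
case_base = [
    {"features": {"soil": "clay", "loads": "heavy"}, "solution": "piled foundation"},
    {"features": {"soil": "rock", "loads": "light"}, "solution": "strip footing"},
]
weights = {"soil": 2.0, "loads": 1.0}   # soil type judged more important
problem = {"soil": "clay", "loads": "light"}
best = retrieve(case_base, problem, weights, k=1)[0]
proposed = best["solution"]   # null adaptation: reuse the solution directly
```

Parametrised or structural adaptation would replace the last line with rules that modify `proposed` according to the unmatched features (here, the lighter loads).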
Application of Case-Based Systems concerns numerous and various areas :
- Case-based design : problems are defined as a set of constraints; the problem solver is required to
provide a concrete artefact that solves the constraint problem.
- Case based planning : Planning is the process of coming up with a sequence of steps or a schedule for
achieving a particular state of the world.
- Case-based diagnosis : a solver is given a set of symptoms and is asked to explain them.
- Case-based explanation : explanation was introduced into CBR terminology in connection with plan
or solution failure.
Only recently, some attention has been paid to evaluation of CBR systems. System evaluation can include not
only performance measure issues but also verification and validation. Verification is performed to ensure
consistency, completeness, and correctness of the system. Validation ensures correctness of the final system
with respect to the user needs and requirements.
Resource notes : Representation and Management Issues for Case-Based Reasoning Systems- Igor Jurisica -
Department of Computer Science -University of Toronto - Toronto, Ontario M5S 1A4, Canada - 15 Sept 1993
Neural Networks are universal approximators : they can approximate any non-linear input-output mapping. This
striking property opens up various fields of application such as automatic classification (for pattern recognition
and expertise gathering through examples), industrial process modelling, non-linear system control, etc.
Connectionist networks or artificial neural networks are massively parallel, shallowly serial, highly
interconnected networks of relatively simple computing elements or neurones. Much of the attraction of
connectionist networks is due to their massive parallelism that is amenable to co-operative and competitive
computation, potential for limited noise and fault-tolerance, discovery of simple, mathematically elegant
learning algorithms for certain classes of such networks, and to those interested in cognitive and brain-
modelling, their similarity with networks of neurones in the brain.
Each neurone computes a relatively simple function of its inputs and transmits outputs to other neurones to
which it is connected via its output links. A variety of neurone functions are used in practice. The most
commonly used are the linear, the threshold, and the sigmoid. Each neurone has associated with it a set of
parameters which are modifiable through learning. The most commonly used parameters are the so-called
weights attached to its input links.
The representational and computational power of such networks depends on the functions computed by the
individual neurones as well as the architecture of the network (e.g., the number of neurones and how the
neurones are connected).
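The neurone functions just listed can be sketched directly; the hand-chosen weights below are an illustrative assumption (in practice they would be produced by a learning algorithm such as weight adjustment).

```python
# A single neurone: weighted sum of inputs passed through a linear,
# threshold or sigmoid function. The weights are the learnable parameters.
import math

def neurone(inputs, weights, bias, kind="sigmoid"):
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    if kind == "linear":
        return s
    if kind == "threshold":
        return 1.0 if s >= 0 else 0.0
    return 1.0 / (1.0 + math.exp(-s))   # sigmoid: smooth output in (0, 1)

# A two-input threshold neurone behaving like a logical AND:
# it fires only when both inputs are active (1 + 1 - 1.5 >= 0).
out = neurone([1, 1], [1.0, 1.0], bias=-1.5, kind="threshold")   # 1.0
```

A network is simply many such neurones whose outputs feed other neurones' inputs; learning then consists of adjusting all the weights and biases together.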
Such networks find application in data compression, feature extraction, pattern classification and function
approximation.
The application of neural networks depends directly on their learning capacities, owing to their very specific
knowledge representation, which can take a variety of forms. A number of researchers have investigated techniques for initialising a
network with knowledge available in the form of propositional rules. Very little systematic study has been
done on the representation and use (especially in learning) of structured and conceptual graphs in connectionist
networks.
At the present time Neural Network applications concern very specific fields of industry such as non-linear
process control or pattern recognition.
Classification is an important issue in pattern recognition and knowledge representation. Data classification
can be performed in two modes: supervised and unsupervised. The task of supervised classification is
classifying new objects (or cases) into predefined classes while unsupervised classification aims to determine
homogeneous groups of objects in the data.
- The supervised problem is based on the assumption that the data are made up of distinct classes or
groups known a priori. Each case is described by a set of features or attributes (body-cover, body-temp,
fertilisation-mode, ...). The general task is to develop decision rules in order to determine the class of
any case from its attribute values. These classification rules are first inferred (or computed) from a set
of available cases whose classes are known (the training set), and are then used for class prediction
of unknown cases. Supervised classification has become a major active research field during the last
decade. Many new kinds of algorithms have been developed, including techniques from traditional Data
Analysis, Pattern Recognition, Artificial Intelligence (Machine Learning) and Artificial Neural
Networks. They differ principally in the kind of decision rules they produce and the kind of strategy
they develop to realise the discriminating task. Hence, some algorithms focus their search on explicit
decision boundaries that best discriminate among the different classes. Some other approaches focus on
reference objects and use a resemblance measure to classify unknown instances.
- In the case of unsupervised classification (or clustering), the allocation of data to distinct classes is not
known a priori. The general task is now to discover homogenous groups or categories in the data, based
on the attribute values of the cases concerned. The elements in each cluster must be as similar as
possible to each other and dissimilar to those in the other clusters. Unsupervised classification is a
useful tool in exploratory data analysis which allows pattern discovery in the data. Hence, clustering
results can be used for hypothesis generation and testing during scientific inquiry. The applications of
unsupervised classification methods are extensive such as in problem solving and planning,
engineering, natural language processing, information retrieval, etc.
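The two modes above can be sketched on numeric feature vectors: a nearest-reference-object rule for the supervised task, and a basic k-means loop for the unsupervised one. The data values are invented for the illustration.

```python
# Supervised vs unsupervised classification, minimal sketches.
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Supervised: predict the class of a new case from a training set of cases
# with known classes, using the nearest reference object (1-NN rule).
def classify(training_set, case):
    _, label = min(training_set, key=lambda t: dist(t[0], case))
    return label

# Unsupervised: discover homogeneous groups in unlabelled data (k-means).
def cluster(points, centres, rounds=10):
    for _ in range(rounds):
        groups = [[] for _ in centres]
        for p in points:
            i = min(range(len(centres)), key=lambda i: dist(centres[i], p))
            groups[i].append(p)      # assign each point to its nearest centre
        centres = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centres)]   # move centres to means
    return centres, groups

training = [((1.0, 1.0), "A"), ((5.0, 5.0), "B")]
label = classify(training, (1.2, 0.8))          # "A"
centres, groups = cluster([(0, 0), (0, 1), (9, 9), (10, 10)], [(0, 0), (10, 10)])
```

The 1-NN rule is an example of the "reference objects with a resemblance measure" family mentioned above, as opposed to algorithms that learn explicit decision boundaries.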
Classification processes concern all types of knowledge representation, such as objects, conceptual graphs,
neural networks or rules. This is a keystone of knowledge management, and more particularly of base
acquisition and organisation. It also finds several applications in other reasoning processes such as CBR or data
completion. Several tools are available, depending on the knowledge representation, the objectives of the
classification and the technical field.
Constraints Satisfaction Problems
Intuitively, a constraint satisfaction problem (CSP) is a search problem that involves finding one (or several)
objects that satisfy a given specification. A CSP is basically a set of variables with finite domains which are
constrained by certain conditions. A CSP may have a very great number of possible variable assignments that
need to be tested to find an optimal solution. Basically one encounters an exponential explosion of the
search space volume when increasing the number of variables. That is why it is interesting to develop heuristic
search algorithms to solve such problems within an acceptable time without searching the complete volume.
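The basic mechanism can be sketched as depth-first search with backtracking, which abandons a partial assignment as soon as it violates a constraint instead of enumerating the whole space. The variables, domains and constraints below are an illustrative toy problem.

```python
# Minimal CSP solver: variables with finite domains, binary constraints,
# depth-first search with backtracking.
def solve(variables, domains, constraints, assignment=None):
    """constraints: list of (var1, var2, predicate) triples."""
    assignment = assignment or {}
    if len(assignment) == len(variables):
        return assignment                      # every variable assigned
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        # check only the constraints whose two variables are already assigned
        if all(pred(assignment[a], assignment[b])
               for a, b, pred in constraints
               if a in assignment and b in assignment):
            result = solve(variables, domains, constraints, assignment)
            if result:
                return result
        del assignment[var]                    # backtrack
    return None

# Three variables that must all take different values from small domains.
variables = ["x", "y", "z"]
domains = {"x": [1, 2], "y": [1, 2], "z": [1, 2, 3]}
constraints = [("x", "y", lambda a, b: a != b),
               ("y", "z", lambda a, b: a != b),
               ("x", "z", lambda a, b: a != b)]
solution = solve(variables, domains, constraints)   # {'x': 1, 'y': 2, 'z': 3}
```

Heuristics (variable ordering, constraint propagation) graft onto this same skeleton to cut the search further.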
Constraint logic programming provides an efficient problem-solving environment for solving constraint-based
configuration tasks, often characterised as variants of dynamic constraint satisfaction problems. Several
efficient tools are available and specific technical applications have been successful.
Another interesting problem deals with constraint management. This problem does not consist of solving a
constraint problem, but of ensuring that the solution satisfies the constraints. This is usually the case in design and
in various decision support techniques. It involves several aspects :
- Constraint representation,
- Constraint recognition,
- Constraint Propagation,
- Constraint satisfaction.
Numerous domains of AI, such as the theory of constraint hierarchies, hierarchical constraint logic
programming, and constraint solvers, contribute to work on this problem.
The interest of constraint management lies in its integration with the other knowledge representations, such
as objects and conceptual graphs. Constraints have been used in a variety of languages and systems, particularly
user interface toolkits, in planning and scheduling, and in simulation.
Genetic Algorithms (GA) are adaptive methods which may be used to solve search and optimisation problems.
They are based on the genetic processes of biological organisms. Over many generations, natural populations
evolve according to the principles of natural selection and "survival of the fittest", first clearly stated by
Charles Darwin in The Origin of Species. By mimicking this process, genetic algorithms are able to "evolve"
solutions to real world problems, if they have been suitably encoded. For example, GA can be used to design
bridge structures, for maximum strength/weight ratio, or to determine the least wasteful layout for cutting
shapes from cloth. They can also be used for online process control, such as in a chemical plant, or load
balancing on a multi-processor computer system.
GA are not the only algorithms based on an analogy with nature. Neural networks are based on the behaviour
of neurones in the brain. They can be used for a variety of classification tasks, such as pattern recognition,
machine learning, image processing and expert systems.
Before a GA can be run, a suitable representation (or coding) for the problem must be devised. It also requires
a fitness function, which assigns a figure of merit to each coded solution. During the run, parents must be
selected for reproduction, and recombined to generate offspring.
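The ingredients just listed (a coding, a fitness function, parent selection and recombination) can be sketched as one loop. The problem below, maximising the number of 1-bits in a string, is a deliberately trivial illustration; the population size, mutation rate and tournament selection are assumptions, not prescriptions.

```python
# Minimal genetic algorithm sketch: bit-string coding, tournament selection,
# one-point crossover and occasional mutation.
import random

def run_ga(fitness, length=16, pop_size=20, generations=60, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # tournament: keep the fitter of two random individuals
            a, b = rng.choice(pop), rng.choice(pop)
            return a if fitness(a) >= fitness(b) else b
        new_pop = []
        while len(new_pop) < pop_size:
            p1, p2 = select(), select()
            cut = rng.randrange(1, length)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.05:                 # occasional mutation
                i = rng.randrange(length)
                child[i] = 1 - child[i]
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

best = run_ga(fitness=sum)   # tends towards an all-ones bit string
```

For a real design problem (e.g. the bridge structure mentioned above) only the coding and the fitness function change; the evolutionary loop itself stays the same.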
Barriers to LSE uptake
Knowledge management is a very recent concept based on emergent technologies. The first
developments and applications really involving knowledge as a well identified concept concerned specific and
limited technical fields. In addition, they were not all successful. The technologies used (generally coming from
Artificial Intelligence) were often immature and not really efficient, mainly due to the poor capacities (speed
and memory) of computers.
The main impediments to LSE uptake are both social and technical.
- Social approach : companies are only beginning to understand the advantages of capitalising knowledge for
their activities. They then have to develop specific competencies for their knowledge management, in
terms of policy, specialists and resources. In order to develop a knowledge based system, companies have
to be more involved than in a classical software project, because it directly concerns their own know-how and
specificities. Furthermore, firms must have more faith in the technology (i.e. Artificial Intelligence tools).
- Technical approach : nowadays, a total lack of standards and of a taxonomy of possibilities makes a
comparison or a choice between the different tools very difficult, even impossible, for a non-specialist.
This is the case both for knowledge representation and for reasoning capacities. As long as no standard
(even a de facto standard) is available, companies will hesitate to invest in technologies which risk
rapid obsolescence.
The second main problem is related to knowledge base construction. An efficient KBS needs a high
quality knowledge base. Knowledge engineering is a difficult and tedious enterprise because experts
are often unable to translate the mental processes that they use in solving problems in their domains of
expertise into a sufficiently detailed, perfectly defined set of rules or procedures. Before extracting
knowledge, it would be necessary to train experts for this very specific task. Furthermore, it is
indispensable to develop knowledge acquisition and learning, which are the keystone of the future of
knowledge management.
Trends and expectations
All the available technologies are still in a development phase, aiming to increase their reliability and
performance. Every day, new features and application fields are explored. At the present time the challenge
in terms of knowledge management involves two objectives :
- Capitalising knowledge : this means that existing knowledge has to be the base of new knowledge
which will enrich the knowledge fund of the company.
- Sharing of knowledge : several systems using various kinds of reasoning, perhaps built on different knowledge
representations, in fact manipulate the same knowledge. Furthermore, these systems will probably
need to exchange knowledge.
Then, four major trends emerge at this time :
- Representation knowledge level
- Sharing of knowledge
- Knowledge acquisition
- Distributed knowledge through the WEB
Representation knowledge level
Every type of representation has its own domain of application. The aim is not to propose a universal mode
of knowledge representation, but to have a better definition of the existing ones. Too many different dialects co-exist
for the same knowledge representation. Some actions therefore aim at defining common languages for each
representation; for instance, object oriented knowledge principles are now sufficiently well defined to be
standardised.
Efforts to formalise knowledge representation are developed under the concept of ontology. Some
research proposes ontologies as yet another knowledge representation, but their real interest is in their capacity
for abstraction. Ontologies are thus placed at a meta level which allows the conceptualisation of knowledge.
It is too soon to judge the results of this huge research effort, but it corresponds to a real need for the
capitalisation, reuse and sharing (see next paragraphs) of knowledge.
Tom Gruber (Knowledge Systems Laboratory, Stanford University) defines the term ontology as a specification
of a conceptualisation, used for making ontological commitments. For pragmatic reasons, one chooses to write
an ontology as a set of formal vocabulary in a way that is consistent with respect to the theory specified by the
ontology. To define common ontologies means defining vocabularies of representational terms with agreed-
upon definitions, in the form of human readable text and machine-enforceable, declarative constraints on their
well formed use. Definitions may include restrictions on domains and ranges, placement in subsumption
hierarchies, class-wide facts inherited by instances, and other axioms needed for describing a domain or performing a
task. Each ontology embodies a set of ontological commitments in a form that enables one to build knowledge
bases and tools based on those same commitments. The aim is to build libraries of shared, reusable knowledge.
If the specification of a standard declarative language is like a grammar of English, ontologies are reference
works akin to dictionaries. Libraries could contain "off-the-shelf" knowledge-based tools that perform well-
defined tasks such as varieties of simulation, diagnosis, etc. Ontologies specify the terms by which a tool user
writes the "domain knowledge" for the tool, such as the equation models that drive a simulation or the
components and failure mode descriptions used by the diagnostic engine. A knowledge library could also
contain reusable fragments of domain knowledge, such as component models (e.g., of transistors, gears,
valves) that can be composed to produce device models (e.g., of amplifiers, servo mechanisms, and hydraulic
systems). Ontologies define various ways of modelling electrical, mechanical, and fluid flow mechanisms that
make such reusable component libraries possible.
In practical terms, several prototypes are built on ontologies principles, using various implementation choices.
Resource notes : "The Role of Common Ontology in Achieving Sharable, Reusable Knowledge Bases"
Thomas R. Gruber - Knowledge Systems Laboratory Stanford University 701 Welch Road, Building C Palo
Alto, CA 94304 - 31 January 1991
Sharing of knowledge
This problematic is very close to the preceding one, but involves different practical solutions, which are
realistic in the short term. Nowadays, the four main impediments to sharing and reuse of information are the
following :
- Heterogeneous representations
- Dialects within language families
- Lack of communication conventions
- Model mismatches at the knowledge level
Sharing of knowledge involves knowledge base organisation, structured in several levels, from a practical level
of knowledge, directly exploited in reasoning, up to an abstraction level used for reuse and sharing. Sharing
knowledge also needs well defined procedures to perform the exchange of knowledge between several
systems.
Numerous research projects develop several approaches such as :
- Knowledge Interchange Format (KIF) : KIF is a neutral specification language for the structures and
relationships of a typical knowledge representation. It encompasses first-order predicate calculus and
frame objects but it is not of itself a reasoning system. In computing terms, it is closest to a mechanism
for type definition along with type restrictions in order to specify semantic integrity constraints that a
compiler or translator can test and verify.
- KQML : Knowledge Query and Manipulation Language is in fact a protocol by which queries can be
exchanged between agents.
- An approach to interoperation based on programs called agents. Agents use the Agent Communication Language (ACL) to supply machine-processable documentation to the system programs (called facilitators), which coordinate the activities of the agents. The facilitators assume the burden of interoperation, and the application programmers are relieved from this responsibility.
- KQL : Knowledge Query Language, which claims to be analogous to SQL (Structured Query Language for relational DBMS) for knowledge bases.
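As a rough illustration of the KQML idea mentioned above, the sketch below serialises a performative with keyword parameters and KIF-style content. It is a simplified rendering for illustration only, not a conforming KQML implementation; the agent names and the query content are invented:

```python
# Illustrative sketch: the general shape of a KQML message, a performative
# followed by keyword parameters, with the content often expressed in KIF
# (first-order logic in s-expression syntax). Simplified, not conforming.

def kqml_message(performative, **params):
    """Serialise a performative and its parameters as an s-expression."""
    fields = " ".join(f":{key} {value}" for key, value in params.items())
    return f"({performative} {fields})"

# A hypothetical agent asks a facilitator who supplies a given component.
msg = kqml_message(
    "ask-one",
    sender="agent-A",
    receiver="facilitator",
    language="KIF",
    content="(supplier valve-27 ?who)",
)
print(msg)
```

The point of the protocol is that both sides agree only on the performative vocabulary and the content language; the reasoning systems behind each agent can remain entirely different.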
Nowadays it is difficult to detect an emergent solution, but this trend concentrates great efforts and will certainly produce exploitable results soon.
Knowledge management also places great hopes in knowledge acquisition, the keystone of this technology, particularly along two axes :
- Abduction involves the generation of candidate hypotheses for observed or otherwise given facts in
terms of background knowledge and additional assumptions or hypotheses. The utility of such
hypotheses is primarily in terms of guiding the search for useful theories of particular domains of
interest. Abduction when used with analytic explanation-based learning can help extend background
knowledge through the proposal and evaluation of candidate explanations. Unlike induction and
deduction, the use of abduction in machine learning has not been widely explored, yet abduction
appears to play a central role in learning and discovery.
- Learning in evolutionary systems : since most task learning can be formulated as essentially search
problems, it is natural to apply evolutionary algorithms in machine learning. Classifier systems, Genetic
algorithms, evolutionary programming are all inspired by processes that appear to be used in biological
evolution and the working of the immune system. These evolutionary methods offer relatively efficient
mechanisms for exploring large spaces in the absence of other sources of information to direct search.
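The kind of evolutionary search described above can be sketched with a minimal genetic algorithm on the classic OneMax toy problem (maximise the number of 1-bits in a string). This is a generic illustration, not tied to any particular system discussed in the report:

```python
import random

# Minimal genetic algorithm on the OneMax toy problem: evolutionary
# search explores a large space guided only by a fitness signal.

def evolve(bits=20, pop_size=30, generations=60, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(bits)] for _ in range(pop_size)]
    fitness = sum  # fitness = number of 1-bits in the individual

    for _ in range(generations):
        def pick():
            # Tournament selection: keep the fitter of two random individuals.
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b

        next_pop = []
        for _ in range(pop_size):
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, bits)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.05:               # occasional mutation
                i = rng.randrange(bits)
                child[i] ^= 1
            next_pop.append(child)
        pop = next_pop

    return max(fitness(ind) for ind in pop)

print(evolve())  # best fitness found after evolution
```

Selection, crossover and mutation are the only operators; no gradient or domain model is needed, which is exactly why these methods suit large search spaces lacking other sources of guidance.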
Distributed knowledge through the WEB
Knowledge modelling involves the management of many knowledge sources often geographically distributed.
The World Wide Web is a distributed hypermedia system available internationally through the Internet. It
provides general-purpose client-server technology which supports interaction through documents with
embedded graphic user interfaces. Knowledge modelling tools now operate through the web to support knowledge acquisition, representation and inference through semantic networks and repertory grids. This illustrates how web technology provides a new knowledge medium in which artificial intelligence methodologies and systems can be integrated with hypermedia systems to support the knowledge processes of professional communities world wide.
The development of knowledge-based systems involves knowledge acquisition from a diversity of sources
often geographically distributed. The sources include books, papers, manuals, videos of expert performance,
transcripts of protocols and interviews, and human and computer interaction with experts. Expert time is
usually a scarce resource and experts are often only accessible at different sites, particularly in international
projects. Knowledge acquisition methodologies and tools have developed to take account of these issues by
using hypermedia systems to manage a large volume of heterogeneous data; interactive graphic interfaces to
present knowledge models in a form understandable to experts; rapid prototyping systems to test the models in
operation; and model comparison systems to draw attention to anomalous variations between experts.
The combination of web and knowledge-based systems technologies will make artificial intelligence systems widely accessible internationally and will allow innovative knowledge-based multimedia and groupware systems to be developed.
Relevance to LSE
Some companies are beginning to feel that the knowledge of their employees and their past activities is their most valuable asset. They may be right, but few firms have actually begun to actively manage these knowledge assets on a broad scale. A few small developments have been realised to solve specific, limited problems, but pragmatic discussion of how knowledge can be managed and used more effectively on a daily basis should be addressed at either a strategic or a technological level.
LSE knowledge takes place in three different environments :
- Common LSE environment : that is mainly regulation, rules, general technical knowledge sharable
between all the LSE firms.
- Intra enterprise : This is the specific technical, organisational knowledge of each firm, representing its
originalities and its strength.
- At the level of a specific project : a temporary enterprise, representing the global knowledge used to carry the project through to a successful conclusion.
It is indisputable that the future of LSE companies depends on the development of real knowledge management. But before using KM technologies, they have to ask themselves some questions and establish a KM policy.
Other experiences have led to the establishment of some principles that are a good starting point ("Some Principles of Knowledge Management" by Thomas H. Davenport, PhD) :
Ten Principles of Knowledge Management
1 Knowledge management is expensive (but so is stupidity!)
2 Effective management of knowledge requires hybrid solutions of people & technology
3 Knowledge management is highly political
4 Knowledge management requires knowledge managers
5 Knowledge management benefits more from maps than models, more from markets than from hierarchies
6 Sharing & using knowledge are often unnatural acts
7 Knowledge management means improving knowledge work processes
8 Knowledge access is only the beginning
9 Knowledge management never ends
10 Knowledge management requires a knowledge contract
It is very early days for knowledge management, and even the previous principles and rules will engender considerable disagreement. The positive aspect is that almost anything a firm does in managing knowledge will be a step forward.
Knowledge management is a very recent concept based on emergent technologies. At the present time, a great variety of knowledge reasoning techniques developed in the context of Artificial Intelligence is available, but two main barriers freeze their usage :
- Technological barrier : an efficient KBS needs a high-quality knowledge base. This means having :
- efficient knowledge acquisition methodologies and tools. This point is the major objective of research teams in knowledge management for the coming years.
- the possibility of sharing and reusing knowledge. This second aspect is of major importance for a firm which wants to maintain, as far as possible, only one knowledge base but use this knowledge within different systems. This implies universal (or, less ambitiously, common) knowledge representations.
- Social barrier : companies are only beginning to understand the advantages of capitalising knowledge for their activities. They then have to develop specific competencies for their knowledge management, in terms of policy, specialists, and resources.
As long as effective technologies are not available, Knowledge Management will be problematic. However, since knowledge is the keystone of every industrial activity, it has to be managed : the LSE industry must prepare now for the availability of new tools by analysing the social impact on the firm and beginning the study of its specific knowledge.
Virtual Enterprise
Facing an increasingly competitive environment where flexibility, adaptability to change, product quality and value are the obligatory route to success, companies have to renew their working habits. As a consequence, the concept of the Virtual Enterprise (VE) is emerging through the development of partnerships and collaboration agreements and the outsourcing of product component design, leading to delocalised manufacturing processes in which each partner (manufacturer, suppliers, technical specialists, etc.) focuses on his core domain of competence for the shared profitability of the industrial projects.
A definition of a virtual enterprise may be the following :
"Extending the Concurrent Engineering strategy, a Virtual Enterprise is a temporary consortium of companies
that come together quickly to explore fast-changing business opportunities. Within the Virtual Enterprise,
companies share costs, skills and profitability. They also access global markets as a unique entity. Finally, the
Virtual Enterprise is gracefully dismantled when its business purpose is fulfilled."
The keystones of a VE are communication and organisation.
- Organisation : a VE provides a foundation for specific business solutions that integrate people, processes and strategies in an enterprise model, unconstrained by time, place or form. The challenge of the VE is to model new organisations which can be quickly adapted to new environments and evolve continuously.
- Communication : the necessary IT infrastructure that wires the Virtual Enterprise partners together, as the main medium of information communication among the consortium, has to be mentioned as part of the definition, since it stands as the required backbone of the Virtual Enterprise. Information Management certainly provides the right answer to this problem. As a whole, Information Management means the ability to store, and conveniently retrieve, any piece of information which is needed within the frame of a given project. The main problems can be summarised as follows : how to organise information so that retrieval is easy while remaining independent of any application ? What use can be made of the existing Information Standards ? How to design and implement an Information Management Infrastructure ? How to fill it in once implemented ? How to communicate information from the Information Management Infrastructure towards the various users' working places ? These questions stand as key challenges towards business competitiveness and success.
What exists today
One can identify three technological levels in building virtual enterprise software :
- A level of Communication, which is composed of a physical support and standards of communication. The notion of VE needs an efficient permanent network that allows a VE to be created immediately between various partners all over the world. Nowadays, the Internet, in spite of its limitations, represents such a physical support. The communication protocols used would be standard, easy to implement and flexible.
- Organisational level : It concerns the functional model of the VE, the workflow between the actors, all
the organisational aspects that allow the management of the VE.
- Virtual Enterprise Environment : integrating human-computer interfaces, this level aims at creating and parametrizing the VE easily and quickly. The visualisation of the VE allows the user to interactively explore the complex world of an industrial enterprise.
The first two levels rely on the notion of "plug and play". This notion takes place at different levels :
- Hardware plug and play corresponds to the problem of device connection with computers.
- Object plug and play is at the level of standards of communication. It addresses the different object managers and databases which have to be assembled to exchange objects as necessary without excessive dependency on the user's knowledge.
- Semantic plug and play means that different applications can exchange specific types of objects, each with specific roles in each of several applications, so that applications interact by providing services to one another.
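The idea of semantic plug and play, applications advertising typed services to a facilitator that routes requests by service type rather than by hard-wired point-to-point links, can be sketched as follows. All names here are hypothetical:

```python
# Illustrative sketch of "semantic plug and play": applications publish the
# typed services they provide to a facilitator, and other applications
# request a service by type, never by the address of a particular tool.

class Facilitator:
    def __init__(self):
        self.registry = {}   # service type -> provider callable

    def advertise(self, service_type, provider):
        """An application plugs in by declaring a service it provides."""
        self.registry[service_type] = provider

    def request(self, service_type, payload):
        """Another application asks by service type, not by address."""
        if service_type not in self.registry:
            raise LookupError(f"no provider for {service_type!r}")
        return self.registry[service_type](payload)

hub = Facilitator()
# A hypothetical costing tool advertises a service over a shared object type.
hub.advertise("cost-estimate", lambda item: item["quantity"] * item["unit_cost"])

# A design tool can now obtain costs without knowing which tool answers.
print(hub.request("cost-estimate", {"quantity": 40, "unit_cost": 2.5}))
```

Replacing the costing tool then only requires advertising a new provider for the same service type; the requesting applications are unchanged, which is the "play" half of plug and play.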
One major requirement for the formation of a VE is a transparent interface between the units. An ideal
interface would allow the units to establish communication links with minimum effort. The communication
protocols of such an interface would be standard, easy to implement and flexible. The units would be able to
access product development resources through an existing data communication network.
In that respect, advanced IT, in particular existing or emerging Information Standards, has to be acknowledged for the desired Information Management Framework to guarantee interoperability among various software components. The deliverable D301 "the supporting environment" proposes an overview of these different standards. These standards address the physical way of communicating and the format of the contents of exchanges :
- EDIFACT the International Standard for Electronic Data Interchange,
- CORBA, the OMG specifications tackling multi-platform interoperability,
- STEP, the International Standard for product data exchange,
- SGML, the International Standard for document representation,
- and of course the Web technologies.
Together, these appear to be the reliable components of software architectures bridging the gap between multiple and delocalised proprietary software systems and thus satisfying industrial needs.
This level relies on the semantic plug and play level. Explicit models of the services required and provided by each tool are mapped to the models of the enterprise : models packaged with the tools are made available to the modelling repository, which already has access to the models of the current enterprise. These models define the services to be exchanged between the tool and other tools within the enterprise, including data, processing and knowledge services (i.e. the workflow). Modelling language standards make it possible to provide models to VE software. They could do so through any of several standard forms :
- CDIF (CASE Data Interchange Format) from EIA (Electronic Industries Association)
- EXPRESS from STEP (STandard for the Exchange of Product model data )
- IDEF (Icam DEFinition) based upon Softech’s Structured Analysis and Design Technique (SADT)
Unfortunately, these standards are relatively immature or are not fully implemented in the modelling tools available on the market. Moreover, the standards are overlapping and non-interchangeable. A vendor who selected a CDIF-based tool could not supply models to an EXPRESS-based one, or vice versa. But vendors should not have to know or care whether their customers model in CDIF or EXPRESS or IDEF, any more (now that STEP is in place) than CAD tool vendors need to know which product data managers are used by their customers.
Standards organisations should strive to develop process-driven modelling techniques. Every STEP application
protocol (AP) is driven by a consensus business process, as expressed in an Application Activity Model
(AAM). The process is defined generically so that it can be tailored to the details of any enterprise, but the
process provides a context for deciding what kinds of data services must be exchanged through a STEP
standard. An AAM, however, is not really exploitable for VE development.
At present the integration of a tool into an environment is not accomplished through model mapping, although
reverse engineering a model from a tool might be used as a step in that process. Even if tools came out of the
box with complete models, there are neither technology nor techniques nor standards available to use those
models to support the integration process. Instead tool integration is achieved at the implementation level by
writing translators or APIs that physically translate a client view into a server view. Modelling tools that might
provide cost or impact analysis are not generally used to accomplish such a mapping. Some modelling
languages, such as EXPRESS and CDIF, have capabilities that could be used to relate the objects in one model
to those in another, but those capabilities were not designed for the application of modelling to semantic plug
and play environments. However, ISO TC184/SC4 has done some work in this area. STEP is investigating the
development of a dialect of EXPRESS (currently referred to as EXPRESS-X) to be used as a mapping
language between EXPRESS models.
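The model-mapping idea behind EXPRESS-X can be sketched as a declarative mapping applied to instances, here in Python with invented schemas and attribute names (EXPRESS-X itself is a schema-mapping language, not Python):

```python
# Illustrative sketch of model mapping in the spirit of EXPRESS-X: a
# declarative mapping relates entities of a source schema to entities of
# a target schema, so instances can be translated without hand-written
# point-to-point converters. Schema and attribute names are hypothetical.

# Mapping: target attribute -> rule computed from the source instance.
wall_to_partition = {
    "id":        lambda src: src["wall_id"],
    "height_mm": lambda src: src["height_m"] * 1000,   # unit conversion
    "material":  lambda src: src["layers"][0],         # take the outer layer
}

def translate(instance, mapping):
    """Apply a declarative mapping to one source instance."""
    return {attr: rule(instance) for attr, rule in mapping.items()}

source = {"wall_id": "W-12", "height_m": 2.5, "layers": ["brick", "plaster"]}
print(translate(source, wall_to_partition))
```

The key design point is that the mapping is data, not code scattered through translators: adding a new target schema means writing one more mapping table, and impact analysis can inspect the table directly.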
Virtual Enterprise Environment
Today, this notion addresses mainly the class of software called groupware. Definitions of groupware vary.
The sense of groupware has evolved as the term used to describe a class of computer technologies that enable
information sharing, coordination and collaboration between groups of people. Groupware has its initial roots
in electronic mail systems, as these were the first systems to provide computer-to-computer, electronic
communication. As the need to work together as groups became more prevalent and as the workgroup
technology environment matured, email alone began to fall short. In addition to the ability to universally share
collected information, it was necessary to interactively collaborate on subject matter, and share that
"collaborative memory" with others as they became involved later in the process. Also, distributed workgroups
became more prevalent, resulting in the formation of vast "information islands". Groupware technology, as it
is now evolving, is addressing these previous issues. It has given the ability to store information and
universally share this information with others. It has enabled automated workflow and provided an
environment that moves us away from paper. It has also allowed us to "extend" our participation to virtually
every corner of the world. More importantly, it has done so with consideration to workers regardless of
physical location. Whether it be in a home, car, plane, mobile office, hotel room, office, or the beach, we all
can be active participants in the information sharing workspace.
In the past few years, software companies have taken electronic communications to the next step by
introducing mechanisms to collect, retain and distribute electronic information (e.g. text, graphics, voice,
sound and video) in a collaborative environment. The flagship product of Lotus Corporation, Lotus Notes, has
literally "set the standard" for workgroup computing and raised the awareness of groupware to the level it is today. Other
vendors offering groupware products include Oracle Corporation, Digital, Team Software Inc., Action
Technology Inc., ForeFront Group Inc., ICL Inc. and others. Recently, Collabra Software introduced a new
product to the market, Collabra Share. This product builds upon existing e-mail systems to provide support for
forums and electronic discussions. This collaboration capability is one of the most widely used features of the
Lotus Notes product. Other major players to soon enter the groupware market are: Microsoft with Microsoft
Exchange and Novell with their Wordperfect, FileNet, ICL and Xerox alliance. Other categories of products,
converging on the market at the same time, expand the groupware offerings. These product categories include : advanced messaging capabilities, electronic document management, workflow, group decision support, calendaring/scheduling, electronic/video conferencing, shared whiteboards, etc. This, coupled with the notion of the "information highway" and the "cyberspace of the Internet", has escalated the idea of information sharing not just within companies, but with all participants in society at large. Hence the explosive growth in the communications and computer industries, as the infrastructures and mechanisms to support connectivity and information exchange "anywhere, anytime" evolve.
The present evolution of the notion of groupware offers only an incomplete response to the VE architecture. Nowadays, some new software packages address the capacity to define and quickly build a VE in order to rapidly develop new business opportunities. For instance, Andersen Consulting’s Center for Strategic Technology
Research (CSTaR ®) developed the Prairie prototype: through innovative technologies such as knowledge
management, teamware, software agents, video conferencing, high-bandwidth networking, and megapixel
displays, Prairie envisions a business environment where people, process, technology, and strategy are fully integrated.
The materialisation or the visualisation of the VE is also an important aspect needed for understanding its
temporary and moving configuration. No real product satisfies this objective.
Barriers to LSE uptake
The LSE industry has been rather successful in implementing, here and there, sophisticated software solutions; unfortunately, these too often look very much like islands of automation and have never been integrated into enterprise-wide systems.
No system takes charge of the complete range of needs companies face in their daily management and operating jobs. The self-contained Business Management Framework (BMF) that would support the multiple
facets of designing, manufacturing, maintaining and decommissioning an industrial product, whether these
product life-cycle stages are accomplished by the company itself or result from a collaborative effort between
the company and some partners, doesn’t exist. Even though the vertical integration of these processes within a
single enterprise, aiming at a true collaborative and concurrent engineering practice is a goal that has not yet
been fully realised, the exigencies arising from today's requirements have forced industry to look beyond this
goal toward the horizontal integration in the Virtual Enterprise.
The next generation of BMF should result from the integration or interoperability of these systems. However,
most of them, partly for competitive reasons, partly for reasons of functionality, are associated with the use of
proprietary (i.e. non-standard) information representation formats. Information, even if it already exists on a digital storage unit, often needs to be retyped from one format to another. Another point is the fact that electronically stored information may not be presented in an understandable way and may be difficult to retrieve. The incapacity to support the distribution and communication of information on a wide basis is also a major impediment.
The three keystones of the development of VE are the following :
- An existing, efficient, world-wide physical support. The Internet represents the basis of VE communication, but it must evolve in terms of transfer speed, reliability and security.
- Standardisation, indispensable to allow "plug and play" between all companies, at the level of communication (i.e. supporting environment) and semantics. This level of semantics refers to knowledge and data management.
- Generic models (also sometimes called "meta-models") for VE organisation. It is necessary to develop models which allow every VE configuration, describing the actors, workflow, etc.
Trends and expectations
Targeting VE development tools, several prospective initiatives have been launched at the European level under the auspices of the Esprit IV research programme (e.g. the RISESTEP and VEGA projects) or at various national levels (the NIIIP project in the US, the CORENET project in Singapore).
The RISESTEP project targets STEP distributed technologies as an approach to bridge the gap between multiple and delocalised PDM and CAD systems in the aeronautic and automobile industries. Beyond proprietary platforms, distributed architectures based on the STEP and CORBA standards appear to be the way towards software architectures satisfying industrial needs. Building upon dedicated end-user scenarios, RISESTEP aims to demonstrate the feasibility and relevance of such approaches at an industrial level.
Targeting the Architecture Engineering and Construction domain and LSE domain, the VEGA project also
aims at establishing an information infrastructure which will support the technical activities and the business
operations of the upcoming Virtual Enterprises in this sector. Groupware tools and distributed architectures, developed in close compliance with ongoing standardisation activities, stand as the overall approach to satisfy the broad expectation shared by the end-users and expressed as a consensus about the need to support such virtual enterprises.
Other projects are interested in the VE environment. The central issue of the project Virtual Enterprise (Heinz Nixdorf Institut - Universität-GH Paderborn) is the functional and shape-oriented representation of a typical industrial enterprise. Based on the main business processes, the Virtual Enterprise displays the internal relations and information flows between the different departments of an enterprise (e.g. business planning, research & development, operations scheduling, manufacturing, sales etc.). The visualisation of the Virtual Enterprise is realised by a virtual environment allowing the user to interactively explore the complex world of an industrial enterprise.
Standardisation will soon produce an efficient framework for VE architecture development. Formal languages for workflow and distribution modelling are the subject of great standardisation efforts in the context of international projects (e.g. STEP and Esprit IV).
Relevance to LSE
Large scale projects require the involvement of many entities (client, architect, design engineers, specialists from different technical disciplines, technical controllers, construction companies) sitting at various locations, with different views of and needs on the project. Furthermore, this temporary team, specially created for one project, is usually reorganised for another one. Moreover, its organisation evolves during the life cycle of the operation, from inception to demolition. On the information side, numerous documents of
diverse nature are involved in the construction process. Some of them such as regulations define the legal
context of a project. Others like masterplan drawings, technical specification documents or bills of quantities
are generated by the engineering activities and often have a contractual importance. Drawings are the most straightforward medium to convey most of the information needed by the construction companies. They include a lot of information that can hardly be put into words. Textual documents are complementary to drawings. They traditionally support the engineering aspects of the project description.
Thus, the functioning of the LSE industry already corresponds to the notion of the virtual enterprise. Furthermore, the complexity of engineering projects has increased markedly over the past two decades in terms of scale, tightening constraints (quality, duration, cost, regulations), number of actors involved, and volume of information required and produced.
New work models and advanced technologies make it possible to support this organisation, which corresponds to a business entity defined not by real estate but by business opportunities. Because the VE is not limited by the constraints of physical reality (i.e. the location and form of resources, the time required to access or modify them), it will operate more cheaply, respond faster, and change more easily. Virtualising the operations of an enterprise will
soon become a requirement for conducting business. The true value of virtualisation, however, will depend on
the ability to align and integrate the enterprise's people, processes and technologies with its business strategies.
By freeing these elements from the constraints of time, space and form imposed by the physical world, the
enterprise of the future will make quantum leaps in performance, realise the full value of its people and its
knowledge capital, and discover new opportunities for doing business.
Flexibility and adaptability are certainly the main benefits of the Virtual Enterprise. They are associated with the straightforward way such consortia are formed and dismantled : upon (market) requests. The reactivity (i.e. quickness of creation, adaptation and evolution) of the Virtual Enterprise also represents an advantage in terms of business and adaptation to the market. More
explicitly, creating a Virtual Enterprise is becoming the natural way to quickly address business opportunities
in a fast changing environment.
Furthermore, the quality of the project will increase with the reliability of information exchange and transfer. Indeed, numerous dysfunctions leading to loss of quality, defects and additional costs are nowadays mainly due to losses of information during exchanges between partners.
The concept of the Virtual Enterprise represents the main expectation of the LSE industry. It corresponds to the product point of view. Available software offers only a partial response to the virtual enterprise. However, enough technology is available, notably at the communication level, to develop representative experiments (cf. RISESTEP, VEGA). But, as for Data Management, this technological level needs LSE knowledge to be implemented :
- Product Data model
- LSE Workflow model
Moreover, these results should be standardised to ensure communication between all possible partners. The STEP standard offers a response to the first point; the second must be considered in future actions.
In addition, ergonomic software should be developed, giving a certain reality to the virtual enterprise. Virtual reality will provide new ways of representing the complexity of this association.
A last question relates to which network should be used : the Internet has the advantage of world-wide coverage, but problems of security and transfer speed have to be solved.
A large panel of systems and technologies is available at the present time, and the LSE industry already uses a great part of them. Much progress is expected, notably relating to :
- Human interface : recognition techniques, virtual reality
- Data management : data modelling methodologies
- Knowledge management : sharing and reuse
- Virtual enterprise : the whole software
At the present time, the LSE industry has to make a great effort to adapt the existing technologies to its own needs and specificities. The main part of LSE knowledge is not yet identified and formalised. Systems and technologies offer enormous possibilities but are only an empty container; LSE actors are responsible for the material needed to fill these containers.
This means they have to develop standards in the domains of workflow models and product data models. They must now begin to collect and organise their knowledge.