Classification: Restricted
Deliverable Reference: WP Task No. 3 / 3100 / D302
Status: Draft/Issued/Revised
Rev.: working
File: 310003doc3983.doc
Date: 03/04/1997 01:58:00 PM
Release to: CEC
CSTB - G. Sauce, ELSEWISE
Document Control Sheet / Amendment Record, ESPRIT 20876 - ELSEWISE
Revision 1: WP3: PDIT Support (draft Report Section), Systems and technologies
Author: Gerard Sauce (CSTB)
Contributors: Dee Chaudhari (BICC), Theo van Rijn (TNO), Alastair Watson (Leeds University)
Distribution record for revisions _01 (28/1/97), _02 (14/3/97), _03 (3/4/97): TW (all three), BOU (all three), HBG (all three), SKA (one), BICC (two), CAP (all three), CSTB (all three), DSBC (one), ECI (one), ULeed (all three), RAM (one), RAKLI (one), TNO (all three), TR (one), CEC (one), VTT (two)
Table of contents
Introduction

This Work Package aims to provide a critical evaluation of the key technology and methodology developments in PDIT which are likely to provide the tools to support and shape the LSE industry in the future. It concentrates on those technologies, methodologies and standards most likely to influence working and thinking in LSE. A layered approach is adopted, considering first the supporting PDIT environment (Task 3000), then the systems and technologies (Task 3100), and finally the application software (Task 3200) (cf. Figure 1).

[Figure 1: Layered approach of the key technologies: hardware at the bottom, then the supporting environment (Task 3000), then systems and technologies (Task 3100), then application software (Task 3200), with users at the top]

This task, 3100, deals with the intermediate layer between user applications and the supporting environment. The main characteristic of systems and technologies is their independence from hardware on one side and from users on the other. Software in this category has to be adapted to the technical domain, by specifying the technical knowledge and the particular environment of the user, in order to develop a specific application. Moreover, it is possible to find at least one software product implementing each of these technologies on every kind of platform. At each stage, the objective is to provide an assessment of the existing technologies, the trends, and an evaluation of the relevance to LSE. The aim is to report what can be expected to exist by 2005. The methodology is to use existing sources, to review published material and to consult the key players. It is important to note that our objective of identifying key technologies does not include an analysis of the software market.
In effect, we list and detail the main technologies in order to give the reader the general concepts and the key points needed to go further. The reader will find neither an assessment of specific software packages, nor a testing protocol for choosing between them. This analysis report addresses the LSE industry, but some IT background is needed to understand it. The scope of this task covers a broad range of systems and technologies, which we have grouped into four categories:
- Human Interface: Human-Computer Interface considers anything which allows a user to interact with a computer. Examples are command line interpreters, windows, icons, menus, pointers or virtual reality. This does not include the different devices and means of communication between human and computer. We analyse Graphical User Interfaces, Virtual Reality, and technologies for recognition and voice synthesis.
- Data Management: Data Management consists of controlling, protecting, and facilitating access to data in order to provide information consumers with timely access to the data they need. Data modelling, object databases, distributed databases and access via the Internet are detailed.
- Knowledge Management: Within this chapter, the whole sense of knowledge, including both the object of knowledge and the process of knowing, is considered. We analyse the three main aspects of knowledge management: representation, acquisition and usage.
- Virtual Enterprise: Even if this concept and the relevant technology are recent, several systems are available to allow the creation of parts of a Virtual Enterprise. A Virtual Enterprise is defined as a temporary consortium of companies that come together quickly to explore fast-changing business opportunities. This definition is detailed through the communication level, the organisational level and the virtual enterprise environment.
For each key technology, this report tries to answer the four following questions: What exists today? What are the barriers to LSE uptake? What can we expect for 2005? What is the relevance for LSE? At the end of each chapter, a summary emphasises the main points, particularly barriers and relevance to the LSE industry.
Human Interface

Introduction

Two definitions can be associated with the abbreviation "HCI":
1 - Human-Computer Interaction: the study of how humans use computers and how to design computer systems which are easy, quick and productive for humans to use.
2 - Human-Computer Interface: anything which allows a user to interact with a computer. Examples are command line interpreters, windows, icons, menus, pointers or virtual reality.
In this work, the focus is put on the second definition of HCI, considering mainly the notion of interface. This problem area can be characterised by five key words: communication, language, interaction, process and ergonomics. The following three main aspects summarise it.

Human characteristics

Human characteristics govern how people work and how they expect to interact with computers. To design the best interface, it is important to understand how humans process information, how they structure their actions, how they communicate, and what their physical and psychological requirements are. Human information processing has been, and remains, the object of numerous research efforts which have led to various models: models of cognitive architecture, symbol-system models, connectionist models, engineering models, etc. Language and communication are so natural to us that we rarely think about them. Human interaction using natural language depends not only on the words spoken but also on tone of voice, which enables the parties to interpret the semantics of the spoken message. In face-to-face interaction this is reinforced by physical body language, consciously or unconsciously shown by people. Developing a language between humans and computers underscores the complexity and difficulty of a language as a communication and interface medium.
Ergonomics: anthropometric and physiological characteristics of people can be related and applied to workspace and environmental parameters.
Some frequent problems people using computers face today concern operating the computer, using a pointing device such as the mouse, and watching the screen. Labour laws restrict the periods people are allowed to spend at the computer, in an attempt to prevent sight problems and Repetitive Strain Injury (RSI).

Use and Context of Computers

The general social, work and business context may be important and has a profound impact on every part of the interface and its success.
• The social organisation and the nature of work must be considered as a whole. This concerns points of view, models of humans, models of small groups and organisations, models of work, workflow, co-operative activity, offices, socio-technical systems, etc. One important notion deserves a remark: human systems and technical systems mutually adapt to each other. Once a system has been written, most of the adaptation is done by the human; in effect, he may be able to configure the software, but little software will yet adapt (automatically or not) to a user.
• There are classes of application domains and particular application areas where characteristic interfaces have been developed, for example: document-oriented interfaces (e.g., text editing, document formatting, structure-oriented editors, illustrators, spreadsheets, hypertext), communications-oriented interfaces (e.g., electronic mail, computer conferencing, telephone and voice messaging), design environments (e.g., programming environments, CAD/CAM), etc.
Part of the purpose of design is to arrange a fit between the designed object and its use. Adjustments to fit can be made either at design time or at time of use, by changing the system (or, as a last resort, the user), and changes can be made either by the users themselves or, sometimes, by the system.
Computer systems and architecture

Computers have specialised components for interacting with humans. Some of these components are basically transducers for moving information physically between human and machine (input and output devices). Others have to do with the control structure and representation of aspects of the interaction (basic software). Devices must be constructed for mediating between humans and computers. They are mainly input devices (keyboard, mouse, devices for the disabled, handwriting and gestures, speech input, eye tracking, exotic devices such as EEG and other biological signals) and output devices (displays, vector and raster devices, frame buffers and image stores, canvases, event handling, performance characteristics, devices for the disabled, sound and speech output, 3D displays, motion e.g. flight simulators, exotic devices). This technical aspect is out of the scope of this chapter. The basic software architecture and techniques for human-computer interaction cover four aspects:
a - Dialogue genre: the conceptual uses to which the technical means are put. Such concepts arise in any media discipline such as film and graphic design (workspace model, transition management, design, style and aesthetics).
b - Dialogue inputs: types of input purposes (e.g., selection, discrete parameter specification, continuous control) and input techniques: keyboards (e.g., commands, menus), mouse-based (e.g., picking, rubber-banding), pen-based (e.g., character recognition, gesture), voice-based.
c - Dialogue outputs: types of output purposes (e.g., convey precise information, summary information, illustrate processes, create visualisations of information) and output techniques (e.g., scrolling displays, windows, animation, sprites, fish-eye displays).
d - Dialogue interaction techniques: dialogue types and techniques (e.g., commands, form filling, menu selection, icons and direct manipulation, generic functions, natural language), navigation and orientation in dialogues, error management, and multimedia and non-graphical dialogues: speech I/O, voice and video mail, active documents, videodisc, CD-ROM.
The dialogue techniques are the subject of this section.

What exists today

We identified three groups within the existing technologies:
- Graphical User Interface,
- Virtual Reality,
- Other Interfaces.

Graphical User Interface (GUI)

The style of graphical user interface invented at Xerox PARC, popularised by the Apple Macintosh and now available in other varieties such as the X Window System, OSF/Motif, NeWS and RISC OS, is also known as WIMP (Windows, Icons, Menus and Pointers, or perhaps Windows, Icons, Mouse, Pull-down menus). A GUI uses pictures rather than just words to represent the input and output of a program. A program with a GUI runs under some windowing system (e.g., the X Window System, Microsoft Windows, Acorn RISC OS). The program displays icons, buttons, dialogue boxes, etc. in its window on the screen; the user controls the program by moving a pointer on the screen (typically controlled by a mouse) and selecting certain objects by pressing buttons on the mouse while the pointer is pointing at them. Each type of GUI is available on several platforms. One can classify the main WIMP interfaces under three categories using a platform criterion:
- Workstation (mainly using UNIX as operating system),
- PC computer,
- Macintosh.
Furthermore, it is possible to find software packages running on one of the previous platforms which emulate the GUI of another platform.
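The WIMP model described above rests on a simple mechanism: the windowing system queues input events (pointer moves, button presses) and delivers each one to the topmost window under the pointer. A minimal sketch in Python (all class and function names are hypothetical; no real toolkit is assumed):

```python
# Minimal sketch of WIMP-style event dispatch: the windowing system
# routes each pointer event to the topmost window containing it.

class Window:
    def __init__(self, name, x, y, w, h):
        self.name, self.x, self.y, self.w, self.h = name, x, y, w, h
        self.clicks = 0

    def contains(self, px, py):
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h

    def on_click(self):
        self.clicks += 1

def dispatch(windows, event):
    """Deliver a ('click', x, y) event to the topmost window under the pointer."""
    kind, px, py = event
    for win in reversed(windows):      # last in the list = topmost in stacking order
        if win.contains(px, py):
            if kind == 'click':
                win.on_click()
            return win.name
    return None                        # the click landed on the desktop

windows = [Window('editor', 0, 0, 200, 150), Window('dialog', 50, 50, 100, 80)]
print(dispatch(windows, ('click', 60, 60)))   # 'dialog' (it is on top)
print(dispatch(windows, ('click', 10, 10)))   # 'editor' (only window there)
```

The point of the sketch is that the application never polls the mouse itself: it merely registers windows and reacts to the events the system routes to them, which is what makes the WIMP components consistent across programs.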
GUI on workstation

X Windows. Sponsoring body: initially developed by MIT's Project Athena, it became a de facto standard supported by the X Consortium. X (the X Window System) is a specification for device-independent windowing operations on bitmap display devices. X uses a client-server protocol, the X protocol. The server is the computer or X terminal with the screen, keyboard, mouse and server program, and the clients are application programs. X Windows is used on many Unix systems. OpenWindows is a server for Sun workstations handling the X Window System protocol. X has also been described as over-sized, over-featured, over-engineered and incredibly over-complicated. Clients may run on the same computer as the server or on a different computer, communicating over the Internet via TCP/IP protocols. This is confusing, because X clients often run on what people usually think of as their server (e.g. a file server), but in X it is the screen and keyboard etc. which are being "served out" to the application. For instance, Solaris OpenStep for Sun workstations is implemented using the facilities of the Solaris Operating Environment, including the X11 windowing system.

Other GUIs on workstation

OSF/Motif (Open Software Foundation Motif). Sponsoring body: OSF (Open Software Foundation). Motif is the standard graphical user interface and window manager (responsible for moving and resizing windows and other practical functions). This is one of the choices of look and feel under the X Window System. Motif is based on IBM's Common User Access (CUA) specification, which permits users from the PC world to migrate to UNIX with relative ease. Product implementations: Motif is used on a wide variety of platforms by a large population of Unix users. Machine types range from PCs running Unix to workstations and servers from major manufacturers including Sun, Hewlett Packard, IBM, Digital Equipment Corporation, and Silicon Graphics.
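The client-server split described above for X can be illustrated with a toy exchange over a local socket. This is emphatically not the real X protocol (X uses a compact binary wire format; the request and event names here are invented for illustration), but it shows the shape of the interaction: the client sends requests, the display server owns the screen state and sends back events.

```python
# Toy illustration of the X-style client-server split (NOT the real X
# protocol; message names are invented). The "server" owns the display;
# the "client" is the application sending requests over a socket.

import socket, threading, json

def display_server(conn):
    """Display server: processes requests, sends back events."""
    windows = {}
    while True:
        data = conn.recv(4096)
        if not data:
            break
        req = json.loads(data.decode())
        if req['op'] == 'CreateWindow':
            windows[req['id']] = req                      # server tracks window state
            reply = {'event': 'MapNotify', 'id': req['id']}
            conn.sendall(json.dumps(reply).encode())
        elif req['op'] == 'Quit':
            break

client_sock, server_sock = socket.socketpair()
t = threading.Thread(target=display_server, args=(server_sock,))
t.start()

# Client side: ask the server to create a window, then wait for the event.
client_sock.sendall(json.dumps({'op': 'CreateWindow', 'id': 1, 'w': 640, 'h': 480}).encode())
event = json.loads(client_sock.recv(4096).decode())
print(event)                     # {'event': 'MapNotify', 'id': 1}
client_sock.sendall(json.dumps({'op': 'Quit'}).encode())
t.join()
```

The inversion the text mentions is visible here: the process holding the screen runs the *server* loop, while the application, wherever it runs, is merely a *client* of that display.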
In order to have one's software certified as Motif compliant, one must pay a fee to OSF. Motif runs under the X Window System. Developers who wish to distribute their OSF/Motif applications may do so as long as the licence agreement is not violated. There are no non-commercial Motif toolkits available, and the Motif source from OSF is fairly expensive.
Open Look is a graphical interface and window manager from Sun and AT&T. This is one of the choices of look and feel under the X Window System. It determines the `look and feel' of a system: the shape of windows, buttons and scroll-bars, how you resize things, how you edit files, etc. It was originally championed by Sun Microsystems before they agreed to support COSE (Common Open Software Environment). OLIT, XView and TNT are toolkits for programmers to use in developing programs that conform to the OPEN LOOK specifications:
- OLIT was AT&T's OPEN LOOK Intrinsics Toolkit for the X Window System;
- XView is Sun's toolkit for X11, written in C. XView is similar in programmer interface to SunView;
- The NeWS Toolkit (TNT) was an object-oriented programming system based on the PostScript language and NeWS. TNT implements many of the OPEN LOOK interface components required to build the user interface of an application. It is included in OpenWindows up to release 3.2, but is not supported (and will not run) under OpenWindows 3.3 (based on X11R5).

PC emulator

WABI (developer/company: IBM). A software package to emulate Microsoft Windows under the X Window System. Wabi 1.1 for AIX is ordered as a feature of AIXwindows(r), IBM's implementation of the advanced X Window System using the OSF/Motif(tm) interface standard. Since Wabi 1.1 for AIX is mapped to the AIXwindows interface on the AIX/6000 system, it provides all the function of the industry-standard X Window System, including the ability to exploit Xstations at a lower cost per seat than fully configured workstations. What's more, X windowing can
improve application performance by offloading the graphics-intensive work from the server to the Xstations. WABI is also widely available on Sun.

GUI on PC

MS-Windows. Developed by Microsoft, MS-Windows was a windowing system running on a DOS system. It replaces the DOS prompt as a means of entering commands to the computer: instead of typing a command at the "C:" prompt, which entails typing the command correctly and remembering all the necessary parameters, a pointer on the screen is moved to an icon (small picture) that represents the command to be run. By pressing a button on a mouse, a window appears on the screen with the activated program running inside. Several windows can be opened at once, each with a different program running inside. The most important role of Windows has been to promulgate some interface standards. Although software designers have used the components of the Windows interface in different ways, in order to make their products unique, the same components are used in every Windows interface: windows, menus and commands, dialogue boxes, buttons and check boxes, icons. The window system and user interface software released by Microsoft in 1987 was widely criticised for being too slow on the machines available. The later release (May 1994: Windows 3.1, or rather Windows 3.11) was more efficient and became a real de facto standard GUI for the PC, used by developers. The newer versions of MS-Windows (95 and NT) replace both MS-DOS and Windows 3 on workstations, and are multi-tasking. A server version (Windows NT 3.5) is available today. These two versions are being replaced by Windows NT 4.
X-Windows emulators: some software packages emulate an X Windows system, in order to turn the PC into an X terminal connected to an X server (X-Win32, Exceed 5, etc.). This allows the user to run UNIX applications from his PC.
GUI on Macintosh

Macintosh user interface. Originally developed at Xerox's Palo Alto Research Centre and commercially introduced on the Xerox Star computer in 1981; Apple later built a very similar version for the Macintosh. The Macintosh user interface has become a very popular method for commanding the computer. This style of user interface uses a graphical metaphor based on familiar office objects positioned on the two-dimensional "desktop" workspace. Programs and data files are represented on screen by small pictures (icons) that look like the actual objects. An object is selected by moving a mouse over the real desktop, which correspondingly moves the pointer on screen. When the pointer is over an icon on screen, the icon is selected by pressing a button on the mouse. A hierarchical file system is provided that lets a user "drag" a document (file) icon into and out of a folder (directory) icon. Folders can also contain other folders, and so on. To delete a document, its icon is dragged into a trash can icon. The Macintosh always displays a row of menu titles at the top of the screen. When a mouse button is pressed over a title, a pull-down menu appears below it. With the mouse button held down, an option within the menu is selected by pointing to it and then releasing the button. For people who were not computer enthusiasts, managing files on the Macintosh was easier than using the MS-DOS or Unix command line interpreter, and this fact explains the great success of the Macintosh. Unlike the IBM PC world, which prior to Microsoft Windows had no standard graphical user interface, Macintosh developers almost always conformed to the Macintosh interface. As a result, users feel comfortable with the interface of a new program from the start, even if it takes a while to learn all the rest of it. They know there will be a row of menu options at the top of the screen, and basic tasks are always performed in the same way. Apple also kept technical jargon down to a minimum.
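The desktop metaphor just described maps directly onto a tree of folders: "dragging" an icon onto a folder is simply a re-parenting operation on that tree, and dragging onto the trash is the same operation with a special target. A small sketch (the structure and names are hypothetical, not any real Finder API):

```python
# Sketch of the desktop metaphor: folders form a tree, and "dragging"
# a document icon onto a folder icon is a re-parenting operation.

class Item:
    def __init__(self, name, is_folder=False):
        self.name = name
        self.is_folder = is_folder
        self.children = []           # only meaningful when is_folder is True

def drag_onto(item, source, target):
    """Drop `item` (taken from folder `source`, or None) onto `target`."""
    if not target.is_folder:
        return False                 # can only drop things into folders
    if source is not None and item in source.children:
        source.children.remove(item)
    target.children.append(item)
    return True

desktop = Item('Desktop', is_folder=True)
reports = Item('Reports', is_folder=True)
trash   = Item('Trash', is_folder=True)
doc     = Item('report.doc')

drag_onto(reports, None, desktop)    # put the folder on the desktop
drag_onto(doc, None, reports)        # file a document into the folder
print([c.name for c in reports.children])   # ['report.doc']
drag_onto(doc, reports, trash)       # deleting = dragging onto the trash can
print([c.name for c in trash.children])     # ['report.doc']
```

The success of the metaphor comes precisely from this one-to-one mapping: every visible gesture corresponds to one simple operation on the hierarchical file system.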
MS-Windows emulator:
The latest version of the Macintosh user interface provides some of the MS-Windows functions, and thus makes it possible to run software developed for MS-Windows. For instance, the PowerPC Macintosh has an easy way to instantly switch between the Macintosh operating system and the DOS or Windows 3.1 environment.
X-Windows emulators: some software packages emulate an X Windows system, in order to turn the Macintosh into an X terminal connected to an X server. MacX 1.5 is an enhanced version of easy-to-use software for high-performance X Window System computing on Macintosh and other MacOS-based systems. MacX helps increase the productivity of Macintosh users in UNIX/VMS environments by enabling them to seamlessly run both network-based X applications and Macintosh applications on one Macintosh computer.

Virtual Reality

Virtual Reality is a set of computer technologies whose combination provides an interface to a computer-generated world, and in particular provides such a convincing interface that the user believes he is actually in a three-dimensional computer-generated world. This computer-generated world may be a model of a real-world object, such as a house; it might be an abstract world that does not exist in a real sense but is understood by humans, such as a chemical molecule or a representation of a set of data; or it might be a completely imaginary science-fiction world. A key feature is that the user believes he is actually in this different world. This can only be achieved if the user's senses - sight, touch, smell, sound and taste, but primarily sight, touch and sound - give convincing evidence that he or she is in that other world. There must be no hint of the real world outside. Thus, when the person moves, reaches out, sniffs the air and so on, the effect must be convincing and consistent.
A second key feature of Virtual Reality is that if the human moves his head, arms or legs, the shift of visual cues must be the one he would expect in the real world. In other words, besides immersion, there must be navigation and interaction. Three key words characterise current common Virtual Reality software:
- 3 dimensions,
- interaction,
- real time.
3 dimensions. The first condition for obtaining a sense of reality and a real immersion is to address the sight. This sense takes on particular importance and refers to the nature of the reality being shown. One of the most important cues is the perception of depth. To achieve this illusion, it is necessary to have a 3D model of the world and to send slightly different images to the left and right eyes in order to mimic this sense. In this context, conventional displays are unsuitable tools, and if a user wants to have a very believable experience, he must use a specific headset. Recent research has shown that good quality 'surround sound' can be just as important to the sense of reality as visual cues.
Interaction. VR allows the user to see how pieces inter-relate and how they interact with one another. Interactivity is considered the core element of VR. Rather than just passively observing the environment, users should be able to act on it, to control events which transform the world.
Real time. The speed of response of the system to movement and interaction is an essential aspect of how convincing the virtual world is. In order to obtain a natural speed, the system should perform a perfect synchronisation between the person moving his head, the visual cues following, and possibly the reactions of the other senses. Two fundamental aspects of VR can be identified: technology and software. True interactivity in immersive VR requires adapted feedback, i.e.
controlling the interaction not with a keyboard or mouse, but through physical feedback to the participant using special gloves, hydraulically controlled motion platforms, etc. The technology of VR includes headsets, data gloves, body suits and sound generation.
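The depth cue discussed above comes from sending each eye a slightly different image: two virtual cameras, separated by the interpupillary distance, project the same point to slightly different horizontal screen positions, and that disparity is what the brain reads as depth. A sketch under a simple pinhole-camera model (the values are illustrative only):

```python
# Stereo disparity sketch: two virtual cameras separated by the
# interpupillary distance (ipd) render the same point; the horizontal
# shift between the two images (parallax) conveys depth.

def screen_x(point_x, eye_x, depth, focal=1.0):
    """Perspective-project a point for a camera at horizontal position eye_x."""
    return focal * (point_x - eye_x) / depth

def disparity(point_x, depth, ipd=0.065, focal=1.0):
    """Horizontal disparity between left- and right-eye projections."""
    left  = screen_x(point_x, -ipd / 2, depth, focal)
    right = screen_x(point_x, +ipd / 2, depth, focal)
    return left - right              # equals focal * ipd / depth

print(round(disparity(0.0, 1.0), 4))    # 0.065  -> near object, large shift
print(round(disparity(0.0, 10.0), 4))   # 0.0065 -> far object, small shift
```

Disparity falls off as 1/depth, which is why nearby objects give a strong stereo impression while distant ones look almost flat; it is also why the headset must recompute both views within the frame budget whenever the head moves.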
Software packages propose two types of tools:
- Tools for modelling the virtual world through a 3D representation (also sound for the most efficient ones), and for defining specific behaviour and interactions between objects. These tools, for developers, are called world builders. Many different world builders are available for every kind of platform.
- Tools for representing the world and allowing user immersion. They control the specific devices. They are called world browsers.
Nowadays, each package has its own model representation; no de facto standard has emerged, except the Virtual Reality Modelling Language (VRML), which is a standard language for describing interactive 3-D objects and worlds delivered across the Internet. At present, this language limits the VR application to one sense: the sight. In effect, user immersion is performed only into a 3D model of the world, which then allows access to the virtual world for users with ordinary computers and simple devices (keyboard and mouse). The next version of VRML will extend its functionalities by providing sensors and allowing 3D animations. In practice, only a small number of applications are available so far, mainly for testing further developments. The VR industry is immature, but in the medium term it will become a key component of the Human Interface.

Other Interfaces

Multimedia. Multimedia information is usually defined as combinations of different data types, like audio, voice, still and moving pictures, and drawings, as well as the traditional text and numerical data. Multimedia is a way to satisfy people's need for communication using computers. One aspect of multimedia is the amount of control exercised by the user, i.e. how the user interacts and "browses" through the information presented in a multimedia fashion, such as video discs or information columns. This is why we cannot really consider multimedia as a human-computer interface in itself.
It involves several techniques used for HCI, especially graphic representation and audio/video effects, and adapted devices.

Document recognition technology. Close to specific devices, several systems rely on recognition technology for data entry and document management. Characters and graphical elements can now be recognised automatically. The main technologies are the following:
- Optical character recognition (OCR) is a process which transforms a page into ASCII code. Scanning devices can capture a page of text in a few seconds with relatively little manual effort. Afterwards, OCR software can analyse the image and transform it into strings and text which can be edited, indexed, etc. Several methods of text recognition exist: analytical, morphological and statistical methods, and "neural network" technology. An advanced OCR package also uses lexical context to improve accuracy. Success depends to a large extent on the quality of the original text.
- Optical mark reading: this variant of OCR is suitable when the user can describe specific forms to search for.
- Intelligent character recognition (ICR): this technique is widely used for handwriting acquisition. The two main problems are the accuracy and the effective speed of ICR packages.
- Barcoding: a very efficient and suitable technology wherever data can be pre-coded, with a typical error rate of 0.05%.

Voice synthesis and recognition. Several years of research have been conducted with the goal of obtaining speech recognition by computers. Whereas today's systems are far from reaching the goal of real-time recognition of unconstrained human language, the technology has evolved to a point where it is useful in a number of applications. Voice recognition technologies can be classified as follows:
- Speaker independent vs. speaker dependent. Speaker independent systems are designed to work for anyone, without having to be trained to a specific person's voice.
Speaker dependent systems are trained to recognise a single person's voice. Speaker independent systems work immediately for anyone, but usually have less accuracy and a smaller vocabulary than speaker dependent systems. These systems are commonly used for commands such as "copy" or "paste", because the vocabulary is generally around 300 words. Speaker dependent systems
require about one hour of training for a new user. They also require that the user make corrections while dictating, to help the system keep learning. These systems have vocabularies of 30,000 to 60,000 words. There are also many specialised vocabularies available, mainly in the medical and legal areas. Vocabularies are often specific to a field such as radiology or emergency medicine. The accuracy of voice recognition can be up to 98% (the low 90s is more realistic).
- Continuous speech vs. discrete speech. Continuous speech is basically talking at a normal rate. Most of the voice recognition industry is currently researching continuous speech systems, but they are years away. There are optimistic predictions of having them by the year 2000, but there are also critics doubting this. Discrete speech means pausing between each word. Voice recognition technology currently requires discrete speech. The pauses between words do not have to be long, but it definitely takes practice. Manufacturers claim up to 80 words per minute, but 60 wpm is probably more realistic. Speaking with pauses between words has the potential of being very distracting to the thought process.
- Natural speech. Natural speech refers to understanding of language. This is more of an ideal than a current goal. Voice recognition systems would then hypothetically function more like a human transcriptionist. This would allow saying things like "Replace those last two sentences with…" and knowing what to do with "um" and "ah".
There are several features in voice recognition systems. For example, some systems use the context to help select the correct word (night/knight). Some allow macros or templates that build standard reports, so that only the variances need to be dictated. There are a number of speech synthesis systems on the market today. It is interesting to note that in artificial speech generation there is a trade-off between intelligibility and naturalness.
Currently, the industry has placed the emphasis on naturalness, with the unfortunate consequence that even the high-end systems are sometimes hard to understand. Speech synthesis is generally regarded as a secondary and much less complex issue when compared to speech recognition and understanding. The current trends in the industry are towards speaker-independent systems, software-based systems, and extensive use of post-processing. Of great interest is the emergence of spoken language systems which interpret spontaneous speech. Speech recognition systems are being widely used by telephone companies, banks, and as dictation systems in many offices. These applications are highly constrained and often require the user to pause between each word (isolated speech recognition). The main disadvantages of this kind of technology are related to:
- Accuracy: most speech understanding systems are capable of high accuracy given unlimited processing time. The challenge is to achieve high accuracy in real time.
- Ambient noise: speech recognisers are susceptible to ambient noise. This is especially true for speaker independent systems, which use speaker models developed in quiet laboratories. The models may not perform well in noisy environments.
- Out-of-phraseology speech: speech recognisers are not yet capable of understanding unconstrained human speech. Accordingly, applications are developed based on constrained vocabularies. The challenge is to detect out-of-phraseology speech and reject it before it is post-processed.
It is clear that speech is the most familiar way for humans to communicate; unfortunately, speech recognition systems are not yet capable of perfect recognition of speech in real time.

Barriers to LSE uptake

GUIs have taken on primary importance in software development. Nowadays, old software using a simple alphanumerical interface (on VT100 terminals or the DOS operating system) is progressively being removed from companies.
Apple first with the Macintosh, and more recently Microsoft with MS-Windows, have contributed widely to the new look and feel of software packages. The restriction to a few de facto standards is genuinely good for users: it facilitates changing computers and reduces the adaptation period. However, current GUIs need ever more memory and performance, sometimes to the detriment of the pure engineering application.

Virtual reality, as a new technology, is confronted with several problems currently limiting its usage:
- Firstly, a technological problem due essentially to poor performance: the objects of the virtual world do not move or react fast enough to provide natural scene movement, or need a supercomputer to achieve a realistic basis. The computations needed to render and display a 3D model are enormous, and consequently require an important investment.
- Secondly, the characteristics of the specific devices (the weight of headsets and gloves, communication between computer and devices through direct cable links, the low resolution of the images) are great disadvantages for realism.
- Thirdly, defining a virtual world model of high quality is very long and expensive work. A great part of this work is currently done by hand. Some software packages propose automatic generation from CAD data, but the latter was not designed for this purpose and is consequently not really suitable. Furthermore, for a whole virtual world the model size is usually enormous and leads to problems of memory and computation time.
- Fourthly, the absence of a de facto standard limits investment in such technologies. In effect, only a big firm can invest in a software package which risks disappearing a few years later. Were such packages standards-compliant, modelled worlds, data and results could be reused within another package.

Regarding other HCI technologies, LSE, like other industries, is waiting for more efficient and suitable tools. This is the case for voice and handwriting recognition.

Trends and expectations

Optimal exploitation of computers can currently be achieved only if staff adapt to machines. Human Computer Interface (HCI) research aims to reduce this adaptation and produce a more 'user natural' environment. Five areas of HCI research can be defined as follows:
- Interactional hardware and software: developing, studying and providing guidelines for various ways of interacting with computers, both at the software and the hardware level.
- Matching models: studying how users interact with computers.
- Task level: determining how well systems meet users' needs.
- Design and development: studying designers and the design process.
- Organisational impact: studying the impact of new systems on organisations, groups and individuals.

Human computer interfaces include speech technologies (speech synthesis and speech recognition), gesture recognition, natural language and virtual reality. A new generation of intelligent multimedia human-computer interfaces is emerging, with the ability to interpret some forms of multimedia input and to generate co-ordinated multimedia output: from images to text, from text to images, co-ordinating gestures and language, and integrating multiple media in adaptive presentation systems. Over the past years, researchers have begun to explore how to translate visual information into natural language and, inversely, how to generate images from natural language text. This work has shown how a physically based semantics of motion verbs and locative prepositions can be seen as conveying spatial, kinematic and temporal constraints, thereby enabling a system to create an animated graphical simulation of events described by natural language utterances. There is an expanding range of exciting applications for these methods, such as advanced simulation, entertainment, animation and CAD systems. The use of both gestures and verbal descriptions is of great importance for multimedia interfaces, because it simplifies and speeds up reference to objects in a visual context.
However, natural pointing behaviour is potentially ambiguous and vague, so that without a careful analysis of the discourse context of a gesture there is a high risk of reference failure.

In practice, for the next ten years, realistic expectations centre on the following technologies:

Virtual reality

Two trends characterise the development of virtual reality software packages:
- Virtual reality in the pure sense, which includes total immersion addressing the three main senses (sight, sound and touch) using specific devices.
- Simplified virtual reality based mainly on sight and sound, using an ordinary display. This trend will probably lead to very efficient tools. In effect it will add to current graphical modellers two fundamental functionalities, interactivity and pseudo-immersion in real time, thereby introducing the temporal dimension. Although these specifications are restrictive, the first efficient tools available are really interesting.

At this time, it is very difficult to identify an emerging de facto standard for virtual reality models. If VRML version 2 makes a real and positive contribution to this technology, a big step will have been taken in this direction. Moreover, VRML offers a new dimension to virtual reality: multi-user capability. VRML will evolve into a multi-user shared experience in the near future, allowing complete collaboration and communication in interactive 3D spaces. Any proposal must anticipate the needs of multi-user VRML in its design, considering the possibility that VRML browsers might eventually need to support synchronisation of changes to the world, locking, persistent distributed worlds, event rollback and dead reckoning.

Voice recognition

In the past, we have been required to interact with machines in the language of those machines. With the advent of speech recognition and synthesis technology, humans can communicate with machines using constrained natural language in a quiet environment. Much research is being conducted in the area of spoken language understanding. Spoken language systems attempt to take the best possible result of a speech recognition system and process it further. A spoken language system is defined as a system understanding spontaneous spoken input. Spontaneous speech is both acoustically and grammatically challenging to understand. Speech recognition applications have a tremendous opportunity for growth.
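To make the isolated-word (discrete) recognition discussed above concrete, the following sketch shows one classic technique, template matching with dynamic time warping (DTW). Everything here is illustrative: the one-dimensional "feature" values and the vocabulary are invented, whereas a real recogniser would compare sequences of cepstral coefficients extracted from audio.

```python
# Illustrative sketch of isolated-word recognition by template matching.
# Dynamic time warping (DTW) aligns two feature sequences of different
# lengths; the word whose stored template yields the smallest alignment
# cost is chosen as the recognised word.

def dtw_distance(a, b):
    """Cost of the best monotonic alignment between sequences a and b."""
    INF = float("inf")
    n, m = len(a), len(b)
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

def recognise(utterance, templates):
    """Return the vocabulary word whose template is nearest in DTW cost."""
    return min(templates, key=lambda w: dtw_distance(utterance, templates[w]))

# Invented templates standing in for recorded training utterances.
templates = {"yes": [1.0, 3.0, 2.0], "no": [4.0, 4.0, 1.0]}
print(recognise([1.0, 1.1, 3.1, 2.0], templates))  # a slowly spoken "yes"
```

The pause required between words in discrete systems is exactly what lets each utterance be segmented and matched independently like this; continuous speech removes that segmentation and is why it is so much harder.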
Besides speaker-independent continuous speech, the future development of speech recognition technology will be dedicated to the following applications:
- Client/server speech recognition: client/server software will allow speech applications to work over a wired or wireless network. This means that after dictating at their workstation, users can press a "SEND" button and convey their recorded voice file to the server, where the speech engine will perform the recognition and return the results to them.
- PC-based computer-telephony applications: dictation, call processing and a personal information manager (PIM) will be integrated into one system installed on a computer. For example, users will use voice to update their daily schedule and a telephone voice-menu to order products.

Because the prices of central processing units and digital signal processors are coming down, speech recognition systems are seen as affordable and feasible equipment for computers. The goal of speech recognition is to speak to computers in a speaker-independent, continuous fashion. However, the most widely used speech-recognition technology at present is "discrete": users need to pause between words so that the computer can distinguish the beginning and end of each word. In the future, continuous speaker-independent speech recognition will come from better processors and algorithmic breakthroughs.

Handwriting recognition

Optical Character Recognition (OCR) is already well established in commercial applications. The recognition of handwritten text is an important requirement for pen computing, but far from a trivial one. Many different initiatives have been taken to solve this problem; it is not yet known which is the best, and the results are still not perfect. It is important to stress two points:
- Handwriting recognition is no longer a new technology, but it did not gain public attention until recently.
- The ideal goal of designing a handwriting recognition method with 100% accuracy is illusory, because even human beings are not able to recognise every handwritten text without any doubt; for example, most people sometimes cannot even read their own notes. There will always be an obligation for the writer to write clearly. Recognition rates of 97% and higher would be acceptable to most users.

The requirements for user interfaces supporting handwriting recognition can easily be deduced from the pen-and-paper interface metaphor. First, there should be no constraints on what and where the user writes. It should be possible to use any special character commonly used in the various languages; the constraint of using only ASCII characters is awkward, especially with the growing internationalisation that the world is facing these days. The ideal system is one that supports handwritten input of Unicode characters. The second requirement is that text
can be written along with non-textual input, e.g. graphics, gestures, etc. The recogniser must separate these kinds of input.

Off-line recognition is done by a program that analyses a given text once it has been completely entered, hence there is no interaction with the user at this point. On-line recognition, on the other hand, takes place while the user is writing. The recogniser works on small pieces of information (characters or words) at a time and the results of the recognition are presented immediately. The nature of on-line systems is that they must be able to respond in real time to a user's action, while off-line systems can take their time to evaluate their input: speed is only a measure of performance, not of quality. Until recently, only the recognition of handprinted writing had been studied, but with the growing widespread use of pen-based systems, cursive writing is becoming important. Most systems do not yet support the recognition of cursive (script) writing. Printed characters are much easier to recognise because they are effortlessly separated and the amount of variability is limited. Pen-based systems are already available: Microsoft (Windows for Pen Computing), GO (PenPoint), Apple (Newton), General Magic (Magic Cap, Telescript). In the next decade, new results in handwriting recognition research will make these systems more and more attractive.

Relevance to LSE

Computers are now imperative for realising LSE projects with high quality for both company and users. This means improving industrial actors' productivity and reducing time to market through prototyping, digital mock-ups, etc. In this context, everything which facilitates the usage of computers (and consequently increases productivity) is of interest to LSE.
In order to detail the relevance of HCI to LSE, let us focus on two important stages of LSE projects:
- Design stage: during the design phase, the first problem is the perception of the project, which consists of objects with no real physical existence, i.e. virtual objects. We can consider two points of view at this stage: the designers' viewpoint, and that of the others (client, environment, ...). While designers have a relatively good understanding of the project, it is usually very difficult to convey this vision to the others. The assistance of virtual reality will represent a primordial advantage in terms of communication and client perception. There are expectations for designers too: VR will add a new dimension to their project vision, which is currently limited by display size. The notion of immersion will give them better command and control of the project. VR will also enable a better relation to be made between the design stage and the construction stage by assessing produceability / constructability through visualisation of the product design, i.e. closing in on Design for Construction. Obviously, designers are also interested in GUIs for their daily work, in order to spend more time working on the project than communicating with the computer.
- Construction stage: here, the two main components relate to the location of the actors: in the office or on site. In the office, virtual reality is again interesting as a way of presenting a permanently updated view of the site, precisely to bridge the gap with reality caused by distance or difficulty of access to the project. On the other side, actors on site, sometimes in difficult situations, need more ergonomic ways to communicate with a computer, and simply to use it to increase productivity and quality. All the recognition techniques (voice, handwriting, gesture) will facilitate HCI in this context.

Summary

GUIs have taken on primordial importance in software development.
The restriction to a few de facto standards is genuinely good for users: it facilitates changing computers and reduces the adaptation period. On the other hand, these de facto standards introduce a dependence on specific platforms. The LSE industry, like other industrial domains, is still waiting for more ergonomic and new ways to interact with computers.
- As input, recognition technologies (voice and handwriting) promise efficient applications soon.
- Virtual reality offers new possibilities which will completely change HCI, for input communication as well as for output.

LSE actors have to contribute to the development of virtual reality by experimenting with this technology in two directions:
- For whole-project perception, for project communication but also for engineering design. One of the biggest problems of IT is that the project model stored in the computer is more and more abstract, and more and more difficult to perceive as a whole. Virtual reality appears as a means to extend the limits of current displays in order, for instance, to facilitate the presentation of the project to the client or its integration within the environment. The extension to other senses and the capacity for interaction will be a complementary advantage, notably for engineers, allowing them new possibilities for simulation.
- Virtual reality will probably shake up HCI: for instance, the virtual enterprise needs new kinds of ergonomics, and one can easily imagine software which proposes a virtual environment with virtual actors, meetings, and so on. The LSE industry has to explore this technology to be able to establish a set of specifications in order to influence future software tools.
Data Management

Introduction

First of all, it seems necessary to define the notion of data management. At the basic level, the term DATA characterises numbers, characters, images or other methods of recording, in a form which can be assessed by a human or (especially) input into a computer, stored and processed there, or transmitted on some digital channel. Data on its own has no meaning; only when interpreted by some kind of data processing system does it take on meaning and become information. For example, the number 5.5 is data, but if it is the length of a beam, then that is information. People or computers can find patterns in data to perceive information, and information can be used to enhance knowledge. In the past, the data element, the most elementary unit of data identified and described in a dictionary or repository and which cannot be subdivided, evolved from a simple alphanumerical item to more complex items. Nowadays, data elements represent facts, text, graphics, bit-mapped images, sound, and analog or digital live-video segments. Data is the raw material of a system, supplied by data producers and used by information consumers to create information.

The term management has evolved too. At the beginning, the main sense considered mainly the data point of view: data management consists of controlling, protecting, and facilitating access to data in order to provide information consumers with timely access to the data they need. The notion of a database refers to a collection of related data stored in one or more computerised files in a manner that can be accessed by users or computer programs via a database management system. A database management system (DBMS) is then an integrated set of computer programs that provides the capabilities needed to establish, modify, make available, maintain the integrity of and generate reports from a database.
It has evolved through four generic forms:
1 Hierarchical DBMS (1960s): records were organised in a pyramid-like structure, with each record linked to a parent.
2 Network DBMS (1970s): records could have many parents, with embedded pointers indicating the physical location of all related records in a file.
3 Relational DBMS (1980s): records were conceptually held in tables, similar in concept to a spreadsheet. Relationships between the data entities were kept separate from the data itself. Data manipulation created new tables, called views.
4 Object DBMS (1990s): object-oriented DBMS combine the capabilities of conventional DBMS and object-oriented programming languages; data are considered as objects. Other classes of ODBMS associate relational DBMS with object capabilities.

In a parallel direction, the data modelling methods used to analyse data requirements and define the data structures needed to support the business functions and processes of a company have evolved a lot. These data requirements are recorded as a conceptual data model, a logical map that represents the inherent properties of the data independent of software, hardware or machine performance considerations. The model shows data elements grouped into records, as well as the associations around those records. Data modelling defines the relationships between data elements and structures, with associated data definitions. Database schemata became more logical, meaning that some methodologies allowed developers to elaborate schemata more and more independently of any specific DBMS. These methodologies are an essential part of data management technology.

Over the last ten years, the basic problems such as data access, security and sharing, which represent the fundamental functionalities of a DBMS, have been complemented by an essential aspect: distribution, not in a homogeneous context (hardware and software) but in a heterogeneous environment.
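The relational generation described above can be sketched in a few lines. The example below uses the sqlite3 module shipped with Python purely as an illustration; the table names, the beam whose 5.5 m length echoes the data/information example earlier, and the view are all invented. Records live in tables, the relationship (beam to project) is kept separate from the data itself as a foreign key, and data manipulation produces a derived table, the view.

```python
# Minimal sketch of the relational model: tables, a relationship held
# separately from the data, and a view produced by data manipulation.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE beam (
    id INTEGER PRIMARY KEY,
    project_id INTEGER REFERENCES project(id),  -- the relationship
    length_m REAL)""")
con.execute("INSERT INTO project VALUES (1, 'Bridge A')")
con.executemany("INSERT INTO beam VALUES (?, ?, ?)",
                [(1, 1, 5.5), (2, 1, 7.2)])

# A view is itself a relation, created from existing tables.
con.execute("""CREATE VIEW long_beams AS
    SELECT p.name, b.length_m FROM beam b
    JOIN project p ON b.project_id = p.id WHERE b.length_m > 6""")
print(con.execute("SELECT * FROM long_beams").fetchall())  # [('Bridge A', 7.2)]
```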
The development of networks (intranet and Internet) has created new kinds of possibilities, users and needs for DBMS. Client/server systems and distributed systems using networks are the current challenge of data management technology.
Regarding the increasing complexity of data, new definitions of data management systems are being proposed, more end-user oriented. The objective is to give more sense to information in order to take maximum benefit from the stored data. In effect, a company holds a lot of information that is not really useful for any goal other than the one it was designed for. It is therefore necessary to organise the whole body of information manipulated by a company or a project. Two main classes of such systems are now in rapid development: the first is product oriented, the second business oriented.
- Product Data Management Systems (PDMS) or Engineering Data Management Systems (EDMS): the fundamental role of these systems is to support the storage of technical documentation and the multiple processes related to the product's design, manufacturing, assembly, inspection, testing and maintenance (the whole life-cycle). In projects characterised by globally distributed design and complex production processes, the task of managing all product-related information is of the first importance for project success. With the introduction of CAD/CAM/CAE systems, the amount of product information has increased tremendously, resulting in the need for management systems to control the creation and maintenance of technical documentation. This implies the use of well-defined configuration management processes and software tools capable of controlling and maintaining all the data and processes needed for the development and implementation of the whole product. The product being a transversal project shared by several companies, P&EDMS involve heterogeneous environments and distributed data.
- Business oriented - Material Requirements Planning systems: implementation of the material requirements planning process has been ongoing for several years.
MRP systems have evolved from pure material requirements planning into integrated business support systems, which integrate information about the major business processes involved in manufacturing industries. Support exists for registration of prospects and customers, acquisition, proposals and sales orders, the associated financial procedures, inventory and distribution control, and global production and capacity planning, with a limited capability for planning, analysis and derivation of management information. The majority of these systems are registration systems by nature; advanced planning tools are lacking. Because of the registrative nature of the systems and their lack of flexibility in querying the available information, no management information can easily be produced that might be used to manage and control a business.
- Data warehouse management tools: an implementation of an informational database that allows users to tap into a company's vast store of operational data to track and respond to business trends and to facilitate forecasting and planning efforts. It corresponds to a manager's viewpoint. From client management to stock management, the whole set of data of a company is distributed across several databases, managed by different DBMS. The objective of data warehousing is to arrange internal and external data so as to create pertinent information, easily accessible by managers to assist them in their decision-making. The aim is to extract maximum benefit from the huge mass of information available in the company. The two main characteristics of a data warehouse are, firstly, that it considers the whole company and not only a part of it such as a department, and secondly, that it analyses information on a time scale. Managers want to analyse the evolution of information and the origins of change; this means maintaining a historical record in order to have real decision information and better appraise the trends.
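The historical-record idea behind the warehouse can be sketched very simply: periodic snapshots of an operational figure are kept with their dates, so that evolution, not just the current state, can be queried. The periods, the order figures and the field names below are invented for illustration only.

```python
# Sketch of a data warehouse "historical record": dated snapshots of an
# operational figure, from which a trend can be derived for managers.
snapshots = [
    ("1995-Q4", 120.0),   # (period, order intake, invented units)
    ("1996-Q1", 132.0),
    ("1996-Q2", 150.0),
]

def quarter_on_quarter_change(rows):
    """Relative change between each snapshot and the previous one."""
    return [(rows[i][0], (rows[i][1] - rows[i - 1][1]) / rows[i - 1][1])
            for i in range(1, len(rows))]

for period, change in quarter_on_quarter_change(snapshots):
    print(period, f"{change:+.1%}")
```

An operational database would typically hold only the latest figure; it is precisely the retained history that turns the data into decision information.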
Data mining is a mechanism for retrieving data from the data warehouse. Many companies would claim that they have been using this sort of tool for a number of years. However, when there is a significant amount of data, retrieval and analysis are complex and potentially slow. Data mining derives its name from the parallel between searching for valuable business information in a large database and mining a mountain for a vein of valuable ore: both processes require either sifting through a large amount of material or finding out where the value resides. Data mining tools allow professional analysts to make a structured examination of the data warehouse using proven statistical techniques. They can establish cross-correlations which would be difficult or impossible to establish using normal analysis and query tools. A number of companies claim to offer tools and services to implement data warehousing systems, e.g. IBM, Oracle, Software AG, OLAP databases, Business Objects, Brio Technology, etc.
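One of the proven statistical techniques mentioned above is correlation analysis. The sketch below computes Pearson's correlation coefficient between two columns of an invented warehouse extract; a value near +1 suggests the two quantities move together and may deserve a closer look by the analyst. Both series and their names are assumptions made for the example.

```python
# Data mining sketch: a cross-correlation that would be tedious to spot
# with ordinary query tools.  Pearson's r between two numeric columns.
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

site_deliveries = [12.0, 15.0, 9.0, 20.0, 14.0]   # invented weekly figures
crane_hours     = [30.0, 37.0, 22.0, 51.0, 35.0]
print(round(pearson(site_deliveries, crane_hours), 3))
```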
In spite of their different objectives, the problems involved in these two cases are much the same and can be summarised by several keywords:
- Meta model: integrating information from the different sources (marketing, production, etc., or design, manufacturing, etc.) at a conceptual level.
- Heterogeneous environment: this second aspect of integration takes place at the hardware level.
- Distributed databases: sharing of, and access to, information across multiple platforms.
- Historical records of data, in order to manage the evolution of the product or the company's activities.
- Navigators, information filtering and analysis tools to exploit the mass of information.

System objectives induce specific tools; for instance, graphical features are indispensable for EDMS, whereas financial analysis tools are specific to the data warehouse.

What exists today

In terms of technology, four points seem very interesting: data modelling, OODBMS, distributed database management, and access to database applications running over the Internet.

Data Modelling

Data modelling aims at analysing and expressing in a formal way the data requirements needed to support the functions of an activity. These requirements are recorded as a conceptual data model with associated data definitions, including relationships. The three following points summarise the objectives of such a method:
1 To build data models: the first goal consists of defining a model which represents as faithfully as possible the information needed to perform an activity. It is important to place this work at a conceptual level so as to be independent of any specific DBMS.
2 To communicate models: in order to validate the data organisation, as many readers as possible should analyse the proposed data model. A fundamental function of a method is thus to allow uninitiated people to get a global view of a data model, to perceive the main hypotheses and to analyse the details very quickly.
3 To capitalise knowledge: a data model and its context (i.e. activities) represent an important part of the specific knowledge of an application domain. Being placed at a conceptual level, this analysis should increase the global knowledge and contribute to a better understanding of the domain, on condition that the modelling method ensures the durability of the knowledge.

These methods represent the keystone of data management: the efficiency of a system depends directly on the quality of its data models. One can list three classes of methods:
1 Methods for relational DBMS: these methods clearly consider two aspects, functions and data, and for the latter, entities and relationships.
2 Object-oriented methodologies: these methods are characterised by an incremental and iterative approach. Certain methods are pure object, whereas others use a functional approach associated with objects.
3 STEP methodologies: the international standard ISO 10303 (STEP) prescribes a specific methodology for analysing the activity of an application domain (IDEF0) and for the Application Reference Model (EXPRESS). This approach is completed with implementation specifications (SPF, SDAI).

All these modelling methods have the same drawback: in fact they are not really independent of the software context. This is mainly a consequence of developers' interests: developers are currently the actors most interested in the results of modelling, and capitalising and increasing application domain knowledge is not yet a primary objective of such an activity.
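A conceptual data model in the entity-relationship spirit described above can be sketched without committing to any DBMS at all. The example below expresses two invented entities and a one-to-many relationship as plain Python dataclasses; the names (Project, Beam) and attributes are assumptions for illustration, not part of any standard model.

```python
# Miniature conceptual model: entities, attributes and a relationship,
# expressed independently of any particular DBMS.
from dataclasses import dataclass, field

@dataclass
class Beam:
    # 5.5 on its own is data; as the length of a beam it is information.
    length_m: float

@dataclass
class Project:
    name: str
    beams: list = field(default_factory=list)  # one-to-many relationship

bridge = Project("Bridge A")
bridge.beams.append(Beam(5.5))
print(len(bridge.beams), bridge.beams[0].length_m)  # 1 5.5
```

Mapping such a conceptual model onto a relational schema, an object database or an EXPRESS Application Reference Model is then a separate, later decision, which is exactly the independence the methodologies above aim for.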
OODBMS

Relational representation is still a relevant technology and proposes efficient solutions for numerous data management problems. Object-oriented DBMS is a more recent technology, developed to take maximum advantage of object-oriented languages. OODBMS are attractive because of their ability to support applications involving complex data types, such as graphics, voice and text, which are not correctly handled by relational DBMS. A number of reasonably mature OODBMS products exist, and they are widely used in certain specialised applications. Increasingly, these products are being enhanced with query and other facilities similar to those provided by relational DBMS and are being applied in more general-purpose applications. Another class of DBMS, the object-relational DBMS (ORDBMS), constitutes the most recent development. ORDBMS are based both on the results of research on extending relational DBMS and on the emergence of viable OODBMS. Current individual products represent different mixtures of these capabilities. The appearance of these two classes of product indicates a general convergence on the object concept as a key mechanism for extending DBMS capabilities. Without going further into OODBMS technology, it is important to note that considerable standardisation activity affects its development, including the object extensions to SQL (the relational standard query language) being developed by the ANSI and ISO standards committees, and the proposed standards for OODBMS defined by the Object Database Management Group.

Distributed Data Base Management (DDBM)

A distributed database manager is responsible for providing transparent and simultaneous access to several databases that are located on possibly dissimilar and remote computer systems. In a distributed environment, multiple users physically dispersed in a network of autonomous computers share information.
Given this objective, the challenge is to provide this functionality (sharing of information) in a secure, reliable, efficient and usable manner that is independent of the size and complexity of the distributed system. An abundant literature is dedicated to DDBM, which can be summarised by the following functions:
- Schema integration: one can identify different levels of schema:
  - A global schema, which represents an enterprise-wide view of data, is the basis for providing transparent access to data located at different sites, perhaps in different formats.
  - External schemata show the user's view of the data.
  - Local conceptual schemata show the data model at each site.
  - Local internal schemata represent the physical data organisation at each computer.
- Location transparency and distributed query processing: to provide transparency of data access and manipulation, database queries do not have to indicate where the databases are located; they are decomposed and routed automatically to the appropriate locations, accessing the appropriate copy if several copies of the data exist. This allows the databases to be located at different sites and to move around if needed.
- Concurrency control and failure handling, in order to synchronise updates of data for concurrent users and to maintain data consistency under failure.
- Administration facilities, which enforce global security and provide audit trails.

A DDBM will allow an end user to:
- create and store a new entity anywhere in the network;
- access an entity without knowledge of its physical location;
- delete an entity without having to worry about possible duplication in other databases;
- update an entity without having to worry about updating other databases;
- access an entity from an alternate computer.

The sharing of data using DDBM is already common and will become pervasive as these systems grow in scale and importance.
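The location-transparency function listed above can be shown in miniature. In the sketch below, site names, keys and records are all invented, and plain dictionaries stand in for remote databases: the manager keeps a catalogue (a crude global schema) mapping each entity to the site that holds it, so the caller accesses an entity without ever naming a physical location.

```python
# Location transparency in miniature: queries are routed to whichever
# "site" holds the entity; the caller never names a location.
class DistributedManager:
    def __init__(self):
        self.sites = {}    # site name -> {key: record}, stand-ins for DBs
        self.catalog = {}  # global view: key -> site name

    def create(self, site, key, record):
        """Create and store a new entity anywhere in the network."""
        self.sites.setdefault(site, {})[key] = record
        self.catalog[key] = site  # remember where it lives

    def access(self, key):
        """Access an entity without knowledge of its physical location."""
        return self.sites[self.catalog[key]][key]

ddbm = DistributedManager()
ddbm.create("paris", "beam-1", {"length_m": 5.5})
ddbm.create("leeds", "beam-2", {"length_m": 7.2})
print(ddbm.access("beam-2"))  # the caller never mentions "leeds"
```

A real DDBM adds everything this sketch omits: query decomposition, replicated copies, concurrency control and failure handling.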
But the problem mainly consists in combining the object concept, distributed technology and standards. The generalisation of this kind of system depends directly on the development of standards. Three main candidate technologies address distributed object-oriented techniques.
- Remote procedure call technologies such as DCE (Distributed Computing Environment): this technique provides low-level distribution semantics and requires a high programming effort.
- Microsoft's distribution technology (also called CAIRO or Distributed COM) will only be available on Microsoft systems, and is not yet really marketed.
- OMG distributed objects: the Object Management Group aims at promoting the theory and practice of object technology for the development of distributed computing systems. OMG provides a forum and framework for the definition of a general architecture for distributed systems called the Object Management Architecture, which is divided into five components. The Object Request Broker (ORB), known as CORBA, is the communication heart of the standard: it provides an infrastructure for object communication independent of any specific platform and a technical foundation for distribution, and it specifies a generic distributed programming interface (API) called the Interface Definition Language (IDL). It also specifies the architecture and interface of the Object Request Broker software that provides the software backplane for object distribution. The other parts of the specifications concern Object Services (life-cycle, persistence, event notification, naming), Common Facilities (printing, document management, database, electronic mail), Domain Interfaces designed to perform particular tasks for users, and Application Objects.

The market for CORBA products is evolving very fast, and several software packages claim to be CORBA compliant, but they usually do not implement all CORBA functionalities. The technology now allows true distributed data management, and efficient DDBM will soon be marketed.

Database applications running over the Internet

The development of the Internet has led to new possibilities in terms of communication with databases: dynamic database interaction with Web pages.
These data-driven pages are created through the use of technologies such as CGI and Java to link a server database with a client browser. CGI (Common Gateway Interface) provides a simple mechanism for web servers to run server-side programs or scripts. JavaScript is a compact, object-based scripting language for developing client and server Internet applications. The JavaScript language resembles Java, but without Java's static typing and strong type checking. JavaScript supports most of Java's expression syntax and basic control-flow constructs. One of the main differences is that JavaScript is interpreted (not compiled) by the client, whereas Java is compiled on the server before execution on the client.
The problem of security is of the first importance over the Internet. The communication link can be made secure through the Secure Sockets Layer (SSL). SSL is an industry-standard protocol that makes substantial use of public-key technology. SSL provides three fundamental security services, all of which use public-key techniques:
- Message privacy: achieved through a combination of public-key and symmetric-key encryption. All traffic between an SSL server and an SSL client is encrypted using a key and an encryption algorithm negotiated during the SSL handshake.
- Message integrity: this service ensures that SSL session traffic does not change in transit to its final destination. If the Internet is going to be a viable platform for electronic commerce, we must ensure that vandals do not tamper with message contents as they travel between clients and servers. SSL uses a combination of a shared secret and special mathematical functions called hash functions to provide the message integrity service.
- Mutual authentication: the process whereby the server convinces the client of its identity and (optionally) the client convinces the server of its identity.
These identities are coded in the form of public-key certificates, which are exchanged during the SSL handshake. SSL is designed to make its security services as transparent as possible to the end user. This technology is already available and several products are marketed, based on different Web browsers, specific DBMSs and operating systems.
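The data-driven page mechanism described above can be sketched in a few lines. The following is a minimal, hypothetical illustration (not a product cited by the report) of a server-side script in the CGI style: a query is run against a database and the result is rendered as an HTML response; the table name and sample data are invented.

```python
import sqlite3
from html import escape

def render_rows(conn, sql):
    """Run a read-only query and render the result as an HTML table."""
    rows = conn.execute(sql).fetchall()
    cells = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<table>{cells}</table>"

def handle_request(conn, sql):
    """Emulate a CGI response: an HTTP header block followed by the page body."""
    body = "<html><body>" + render_rows(conn, sql) + "</body></html>"
    return "Content-Type: text/html\r\n\r\n" + body

# Hypothetical project database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (name TEXT, revision TEXT)")
conn.execute("INSERT INTO documents VALUES ('3100_D302', 'working')")

page = handle_request(conn, "SELECT name, revision FROM documents")
```

A real deployment would of course add parameter escaping on the query side and run behind a web server, but the principle — server-side program, database query, generated HTML — is the one the text describes.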
Barriers to LSE uptake
LSE already uses database management technology, but certainly not as much as possible. Although the development of certain techniques such as distributed databases is quite recent, the main reason is probably a knowledge problem. Indeed, the keystone of data management is the data model. The latter is not under the responsibility of the computer science side but is the concern of the LSE industry. Consequently an important effort has to be made to formalise the LSE activity and the LSE information flow. However, data modelling methodologies do not seem really adapted to end users. Three aspects of these methods have to be improved:
- Validation: nowadays, no methodology or tool allows end users to appraise the quality of a data model and to validate it. Usually this model appears to end users as a very strange presentation of their knowledge, and they are totally dependent on developers.
- Dynamic aspect: data evolution is a fundamental aspect of an LSE project, but data modelling currently has some difficulty addressing this aspect.
- Capitalisation: if the LSE industry wants to go forward, it is important to accumulate this kind of knowledge, which is not theoretical but representative of know-how, practical processes and company knowledge.
Another aspect of data modelling is the standardisation of data models in order to facilitate communication between various firms and even within the company. The international standard ISO 10303 (STEP) represents the international context in which to develop such actions.
Finally, the main problem of data management is the preservation of data during the life-cycle of a product. In many cases the life of a product, especially one in construction and civil engineering, easily spans more than thirty or forty years, during which the data associated with the product must remain available and must be kept up-to-date.
Not only must the data be available, it should also be possible to correctly interpret the data. Until recently, drawings represented the design specification of a product. Although electronic CAD systems are used to produce drawings, storage and archiving are still done on paper drawings. The paper drawing is still the official version of a product design specification. It is the paper version of the drawing which should be available for the product lifetime, including the results of design changes, maintenance adaptations and reworks. Today we are still able to read and interpret drawings on paper which were made thirty years ago. We are even able to read and interpret design drawings the Romans made for their public baths. Archiving paper drawings is built upon the preservation of the medium bearing the drawing, e.g. paper. Some decades ago, large paper archives were compacted using microfiche. The retrieval and viewing of microfiche archives now depends on some form of translation from the storage medium to a means of representation. Archiving is still done on the original data, i.e. the printed paper, which has been decreased in size. No interpretation step is involved and the compaction/de-compaction process is a purely physical one.
Only quite recently did the notion of product models arise. A product model is a representation of a product in electronic, computer-readable form which is used by various computer applications according to each application's view of and needs from the model. To enable different programs to interpret the same product model, a standard and interchange mechanisms are needed. Storage of a product model in electronic form over long periods also needs standards to ensure the data will mean the same in, say, twenty years. Another concern then arises: the medium on which to save the data for archiving purposes. Today we are unable to read data which was produced by an office application of 8 years ago.
It is also not guaranteed that we can still read 5 1/4" computer disks made four years ago, because of wear of the disk. This problem has been identified in several areas of industry which have a high data density. One of the trends in this area is the foundation of small to medium companies who make a living from archiving and retrieving data: data services companies. The idea is to enable the user to get hold of the correct product data throughout the lifetime of a product or service. The data services company, for a fee, takes care that upgrades to new versions of standards, tools, storage applications (e.g. databases) and new media are carried out. Such companies can only thrive based on standards, used and kept.
Trends and expectations
Current trends aim at adding value to database management systems: on one hand to specify more end-user-oriented DBMSs, on the other hand to develop intelligent database environments. The goal is to manage higher levels of information. The concepts of Data Warehouse and Engineering Data Management are at the beginning of their development, and will provide more powerful possibilities in the coming years. The semantic aspect induced by a high level of data makes it possible to develop reasoning capabilities in order to propose:
- a declarative query language, to ease interrogation;
- deductive systems; for instance, query association permits additional relevant information to be returned which was not explicitly asked for by users;
- query optimisation, to reduce transaction times;
- integration of heterogeneous information sources that may include both structured and unstructured data.
From a technical point of view, ODBMSs and distributed data management will evolve in performance and in security. The Object Database Management Group continues its standardisation activity regarding object definition and query languages. Numerous research projects deal with distribution. They aim at improving transaction times and security, and at establishing better connections using Web browsers. New technologies are in progress, especially the persistent distributed store, which provides a shared-memory abstraction.

Relevance to LSE
As in every kind of activity, data management represents a foundation of the LSE industry. Whether considered from a product viewpoint or a firm viewpoint, LSE activity constitutes a heterogeneous environment and manipulates information of a high semantic level which is recorded on different supports. The association of data management, distribution and new communication technologies is proving of the first importance. LSE is interested in DBMSs with added value such as EDMS and Data Warehouse.
An Engineering Data Management system constitutes an interesting response in terms of project information management, whereas a Data Warehouse will interest the firm as such. The LSE industry is also heavily involved in standards activities, not directly for the computer science standards, but for establishing the product data models. We showed that data models are the keystone of data management. They contain an important part of LSE knowledge, still not yet formalised.

Summary
Nowadays, data management is based on sound technology and many efficient software tools are available. The LSE industry already uses this technology in different parts of companies. Much progress is expected from object representation, probably better adapted to project design. Consequently great interest attaches to the next evolution of OODBMSs. At the present time, exchange and distribution of information are too limited (mainly graphical document exchange). A higher semantic level within the exchange would allow the different actors of a project to take more benefit from the new information technologies. Developing data models, the keystone of information exchange, needs sound standards at two levels:
- Data modelling methodologies, especially to facilitate the understanding of models and their validation. Communication is important for LSE actors, allowing them control of these models, but also to guarantee the continuity of the knowledge contained within models. No current modelling methodology offers an efficient way of model validation. This phase of validation is based only on the competencies of the model designer, and no technical, analytical or mathematical validation is performed.
- Product data models: the LSE industry has to be responsible for the development of its models. Actions have to be intensified to elaborate adapted models, notably in the context of the STEP standard.
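To make the validation gap concrete: the text notes that no current methodology offers technical or analytical validation of a data model. As a minimal sketch of what even a first, automated level of checking could look like, the following (entirely hypothetical schema format and entity names, not STEP/EXPRESS) verifies product data instances against a small declared model:

```python
# Hypothetical, minimal data-model validator: checks that each instance
# carries exactly the attributes its entity declares, with the right types.
MODEL = {
    "Beam": {"length_m": float, "material": str},
    "Column": {"height_m": float, "material": str},
}

def validate(entity, instance, model=MODEL):
    """Return a list of problems; an empty list means the instance conforms."""
    problems = []
    schema = model.get(entity)
    if schema is None:
        return [f"unknown entity '{entity}'"]
    for attr, typ in schema.items():
        if attr not in instance:
            problems.append(f"missing attribute '{attr}'")
        elif not isinstance(instance[attr], typ):
            problems.append(f"attribute '{attr}' is not {typ.__name__}")
    for attr in instance:
        if attr not in schema:
            problems.append(f"unexpected attribute '{attr}'")
    return problems

ok = validate("Beam", {"length_m": 12.5, "material": "steel"})
bad = validate("Beam", {"length_m": "12.5"})
```

Such syntactic conformance checking is of course only the lowest rung; the semantic validation the report calls for (does the model faithfully represent LSE knowledge?) remains an open methodological problem.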
Knowledge management
Introduction
The previous chapter analysed data management; the subject here is knowledge management. But what is the difference between data and knowledge? Whereas data represent the basic pieces of information, the term knowledge covers several definitions, depending on the user's viewpoint. It can be assimilated to the process of knowing, or to a superior level of information. In this work, we won't debate this definition and will consider the whole sense of knowledge, including both the object of knowledge and the process of knowing. Two terms can be associated with knowledge: management and engineering. The main difference is that the manager establishes the direction (administrative, executive) the process should take, whereas the engineer develops the means to accomplish this direction. Thus knowledge management relates to the knowledge needs of the firm: what decisions to make and what actions to enable. Knowledge engineering develops technologies satisfying knowledge management needs. In fact, here we will discuss knowledge technologies, dependent on the knowledge engineer (usually a computer scientist), which allow the manager to develop policies for enterprise-level knowledge ownership. Understanding knowledge is key to the knowledge management process: how we acquire knowledge, how we represent it, how we store it, how we use it, and what the applications are. Therefore we identify three main parts in knowledge management:
- Knowledge representation
The representation or modelling process consists of defining a model (or several models) able to represent knowledge as reliably as possible. The first result of the modelling process is to formalise the knowledge for accumulation, communication and sharing. The second aim is to establish a model which can be managed and used by computer programs.
- Knowledge acquisition
Even the smallest knowledge-based system needs a huge quantity of knowledge in order to be realistic. The knowledge acquisition process is critical to the development of any knowledge management programme. It requires a complete understanding of the nature and behaviour of knowledge, and specific procedures then have to be developed to acquire this knowledge. The objective of identifying and creating dynamic knowledge bases is itself a knowledge process. It involves knowledge-learning communication techniques, modelled on human (or child) behaviour.
- Knowledge usage
Accumulating its knowledge is in itself essential for an enterprise: it represents the memory of the firm. But this interest alone is minor; the challenge is to develop daily exploitation of the knowledge base in every kind of action: technical, management, financial, etc. The possibilities include decision making, diagnosis, control, training, command and so on. The number of potential applications is endless. The next paragraphs present the existing technologies according to these three aspects.

What exists today
Knowledge representation
The first step in the knowledge management process consists of choosing a knowledge representation. Indeed, the natural language usually used by everybody is not adapted to developing a knowledge base. But the problem is that nowadays no universal model of knowledge representation exists. The existing formalisms depend on the application (diagnosis, decision making...) or on the knowledge domain (natural language, technical applications, medical science, etc.). In the past, two trends developed: the first oriented towards the knowing process (procedural knowledge representation), the second towards knowledge description (declarative knowledge). Nowadays the trend is to develop more universal knowledge representations.
Without going into details or considerations of pure computer science, it is possible to identify four currents in knowledge representation.
- Production rules
- Logical form
- Conceptual graphs
- Object-oriented representation

Production rules
A rule consists of two parts:
- An antecedent or situation part, usually represented by IF <condition>. This condition is expressed by means of atomic formulae and logical statements formed using logical connectives such as AND, OR, NOT.
- A consequent part or action, represented by THEN <conclusion>. The consequent may include several actions.
Antecedents can be considered as patterns and consequents as conclusions or actions to be taken. Each rule represents an independent grain of knowledge: a rule cannot refer to another one, and contains on its own all the conditions of its application. In addition, the rule base of a production system does not specify an order of application of the rules.
Various methods have been used to deal with uncertain or incomplete information in such a knowledge base:
- Certainty factors consist of assigning a number to each rule indicating the level of confidence in this piece of knowledge. These numbers, in the range -1 to 1, 0 to 1 or 0% to 100%, are combined with the factors of evidence associated with facts to deduce factors of evidence for the conclusion.
- Fuzzy logic: the theory of fuzzy sets can be used to represent information with blurry or grey boundaries. In this theory, an element may belong to a set with a degree of membership between 0 and 1. A fuzzy set contains elements with assigned degrees of membership. The main problem is constructing the membership function.
- Probabilities: Bayes' theorem is well known in the statistics and probability literature. It provides a method for calculating the probability of an event or fact based on knowledge of prior probabilities. The difficulty of this method is determining the probabilities of observed facts independently of one another.

Logical form
This representation is based on mathematical logic.
The most basic logical system is propositional logic. Each basic element of the system, or proposition, can be either true or false. Propositions can be connected to each other using the connectives AND, OR, NOT, EQUIVALENT, IMPLIES, etc. First-order predicate logic is composed of:
- atoms (symbols),
- functions and predicates (a function with one or more atomic arguments which produces a boolean value),
- two substatements joined by a conjunction, disjunction or implication,
- negated substatements,
- statements with an existential or universal quantifier (in this case, atoms in the statements can be replaced by variables of the quantifier).
For instance, "Bob has a bicycle" might be given the following logical form:
∃ B , Bicycle (B) ∧ Has (Bob, B)
This is a very descriptive declarative representation with a well-founded method of deriving new knowledge from a database. Its flexibility makes it a good choice when more than one module may add to or utilise a common database. Unfortunately, this flexibility has limitations. To maintain consistency, learning must be monotonic, which limits its effectiveness when domain theories are incomplete. This representation has been extended in several directions to correct these weaknesses and to enable new types of application. These extensions of logic include, but are not limited to, closed-world reasoning, defaults, epistemic reasoning, queries, constraints, temporal and spatial reasoning, and procedural knowledge.
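The "Bob has a bicycle" formula above can be evaluated mechanically once a fact base is given. The following sketch (the fact encoding is invented for illustration) shows the existential quantifier as a search over candidate bindings for B:

```python
# Minimal evaluation of the formula from the text over an explicit fact base.
# Facts are tuples: ("Bicycle", x) or ("Has", owner, x).
facts = {
    ("Bicycle", "b1"),
    ("Has", "Bob", "b1"),
}

def has_a_bicycle(owner, facts):
    """Evaluate: exists B . Bicycle(B) AND Has(owner, B)."""
    # Candidate bindings for B: every constant asserted to be a bicycle.
    bicycles = [f[1] for f in facts if f[0] == "Bicycle"]
    return any(("Has", owner, b) in facts for b in bicycles)

result_bob = has_a_bicycle("Bob", facts)   # a binding B = b1 satisfies both conjuncts
result_eve = has_a_bicycle("Eve", facts)   # no binding works
```

This also illustrates the monotonicity point: adding facts can only make more queries succeed, never retract an earlier conclusion.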
The integration of logical form with other formalisms, such as object-oriented languages and systems, type systems, constraint-based programming, rule-based systems, etc., represents most of the current development of these concepts. Applications concern numerous areas such as natural language, planning, learning, databases, software engineering, information management systems, etc.

Conceptual graphs
Conceptual graphs (CG) are a graph knowledge representation system invented by John Sowa which integrates a concept hierarchy with the logic system of Peirce's existential graphs. Conceptual graphs are as general as predicate logic and have a standard mapping to natural language. Conceptual graphs are represented as finite, bipartite, connected graphs. The two kinds of nodes of the graphs are concepts and conceptual relations, which are connected by untyped arcs. Each conceptual graph forms a proposition. The main characteristics of conceptual graphs are the following:
- A concept is an instantiation of a concept type. A concept by itself is a conceptual graph, and conceptual graphs may be nested within concepts. A concept can be written in graphical notation or in linear notation.
- A pre-existing hierarchy of concept types is assumed to exist for each conceptual graph system. A relation < is defined over the set of concepts to show which concept types are subsumed in others.
- Two or more concepts in the same conceptual graph must be connected using links to a relation.
- For any given conceptual graph system there is a predefined catalogue of conceptual relations which defines how concepts may be linked with relations.
- Each concept has a referent field used to identify the concept specifically or to generalise the concept, to perform quantification, to provide a query mechanism, etc.
- Canonical graphs represent real or possible situations in the world.
Canonical graphs are formed through perception or insight, or are derived from other canonical graphs using formation rules. Canonical graphs are a restriction of what it is possible to model using conceptual graphs, since their intent is to weed out absurdities or nonsense formulations.
- A canonical graph may be derived from other canonical graphs by the formation rules: copy, restrict, join and simplify. The formation rules form one of two independent proof mechanisms.
- A maximal join is the join of two graphs followed by a sequence of restrictions, internal joins (joins on other concepts) and simplifications until no further operation is possible. A maximal join acts as unification for conceptual graphs.
- New concept types and relations may be defined using abstractions. An abstraction is a conceptual graph with generic concepts and a formal parameter list.
- Schemata incorporate domain-specific knowledge about typical objects found in the real world. Schemata define what is plausible about the world. Schemata are similar to type definitions, except that there may be only one type definition but many schemata.
- A prototype is a typical instance of an object. A prototype specialises concepts in one or more schemata. The defaults are true of a typical case but may not be true of a specific instance. A prototype can be derived from schemata by maximal joins and then assigning referents to some of the generic concepts in the result.
Conceptual graphs are considered a good choice of knowledge representation for systems where knowledge sharing is a strong criterion. They have a formal, theoretical basis, support reasoning as well as model-based queries, and have a universe of discourse that is described by a conceptual catalogue.
They have no inherent mechanism for knowledge sharing or for translation, but the flexible representation, the mapping to natural language and, especially, the work on their database semantics are an encouragement to consider CG as a candidate representation. Conceptual graphs more or less include semantic networks, which consist of a collection of nodes representing concepts, objects, events, etc., and links connecting the nodes and characterising their interrelationships. Semantic networks also offer inheritance possibilities.
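The bipartite structure and the type hierarchy described above can be encoded directly. The following sketch is hypothetical (it follows no published CG tool); it represents the linear-notation graph [Person: Bob] -> (Has) -> [Bicycle: b1] and answers a small query using the < subsumption relation:

```python
# A tiny encoding of a conceptual graph: concept nodes carry a concept type
# and a referent; relation nodes carry a relation type and ordered arcs.
concepts = {
    "c1": ("Person", "Bob"),     # (concept type, referent)
    "c2": ("Bicycle", "b1"),
}
relations = {
    "r1": ("Has", ["c1", "c2"]),  # relation type, arcs to concept nodes
}

type_hierarchy = {"Bicycle": "Vehicle", "Person": "Entity"}  # child -> parent

def is_a(ctype, ancestor, hierarchy=type_hierarchy):
    """Walk the concept-type hierarchy (the '<' subsumption of the text)."""
    while ctype is not None:
        if ctype == ancestor:
            return True
        ctype = hierarchy.get(ctype)
    return False

# Model-based query: who 'Has' something subsumed under Vehicle?
matches = [
    concepts[args[0]][1]
    for rtype, args in relations.values()
    if rtype == "Has" and is_a(concepts[args[1]][0], "Vehicle")
]
```

The query succeeds because Bicycle < Vehicle in the hierarchy, which is exactly the generalisation mechanism the referent field and type hierarchy are meant to support.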
What tools are available?
There have been many experimental implementations of conceptual structures. The conceptual structures community is cooperating on the development of an industrial-strength, freely available, portable conceptual structures workbench in the Peirce Project.

Object-oriented representation
There are many variants of object-oriented representation (OOR), each with its own definition and characteristics. Even the response to the fundamental question "what is an object?" depends on the viewpoint of the author. The following only describes the main aspects of OOR:
- An object is an entity (a thing, an individual, a concept or an idea) that can be identified. An object has two parts: a descriptive part and a procedural part.
- The descriptive part represents the state of the object, and is composed of attributes. Each attribute is specified by a type, a domain, a default value and other characteristics which describe this attribute.
- The procedural part represents the behaviour of the object. Usually it consists of methods, which are composed of a descriptor (the name of the method) and the associated function (the procedure which performs the behaviour). It can also be a reflex, which represents a specific behaviour attached to one attribute. A reflex is automatically triggered when an action affects this attribute: read, reset, modify, initialise...
- Message: objects usually communicate using messages. A message is composed of at least three items: the addressee (the interrogated object), the descriptor (which corresponds to the name of a method) and the arguments. Sometimes the sender is mentioned, which allows the object to adapt its response to the sender.
- Encapsulation: when the internal structure of an object is inaccessible, the object is considered a black box. In this case, the only way of communicating consists of using messages.
The notion of encapsulation leads to putting the object's interface with the environment before its internal structure, and facilitates the reuse and the evolution of objects.
- Instantiation: most OORs are class-based, though alternatives such as prototypes exist. The notion of class comes from the concept of set. Every element of a set has the same characteristics and behaviours; only the state (the values of the attributes) of an element changes. A class describes the shell of its elements: the descriptive part. Every element of this class is an instance, identified by an "is-a" relation with its class. Regarding behaviour, every element, usually dynamically, refers to its class to know the understandable messages and the associated functions. A class is an abstraction that emphasises relevant characteristics and ignores or suppresses others. Each object is an instance of a class.
- Specialisation: inheritance is a relationship between classes where one class is the parent (base/superclass/ancestor/etc.) of another. Inheritance provides programming by extension (as opposed to programming by reinvention) and can be used as an "is-a-kind-of" relationship. Inheritance is effective for both the descriptive part and the procedural part. It provides a natural classification for kinds of objects and allows the commonality of objects to be explicitly taken advantage of in modelling and constructing object systems. Multiple inheritance occurs when a class inherits from more than one parent.
- Reference: the value of an attribute can be more than a simple type and may refer to another entity. It defines a kind of relationship.
- Aggregation: a complex object may be composed of several other objects which are "part-of" it. This kind of relationship introduces a dependency between the parts of the complex object: if the latter disappears, its parts are destroyed too.
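The notions above map directly onto any object-oriented language. The following compact sketch (class names invented, in Python) shows a class with a descriptive part and a procedural part, specialisation, aggregation, and message sending by descriptor name:

```python
class Component:
    def __init__(self, name):
        self.name = name                 # descriptive part: an attribute

    def describe(self):                  # procedural part: a method
        return f"component {self.name}"

class Beam(Component):                   # specialisation: Beam is-a-kind-of Component
    def describe(self):                  # the method is redefined by the subclass
        return f"beam {self.name}"

class Assembly(Component):
    def __init__(self, name, parts):
        super().__init__(name)
        self.parts = parts               # aggregation: parts are "part-of" the whole

    def describe(self):
        inner = ", ".join(p.describe() for p in self.parts)
        return f"assembly {self.name} [{inner}]"

def send(obj, descriptor, *args):
    """Message passing: addressee + descriptor + arguments."""
    return getattr(obj, descriptor)(*args)

frame = Assembly("frame", [Beam("b1"), Component("plate")])
message_result = send(frame, "describe")
```

Note how the aggregation dependency holds here too: when `frame` is discarded, its parts list goes with it, the "part-of" behaviour described in the text.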
Since the first object-oriented language (in 1967, Simula was the first to provide objects, classes, inheritance and dynamic typing), numerous variants have been developed. At the present time, it is important to distinguish between object-oriented programming languages (OOL) and OOR. OOLs like C++, Classic-Ada, Java, Object Pascal, etc., aim at developing software based on an object framework, whereas OOR is a knowledge representation technology.

Knowledge acquisition
Learning may have several senses in computer science, three at least:
- Human training: the aim is to increase the knowledge of a person using computer-based training software. (This viewpoint is out of scope.)
- Computer training: here, the aim is to create, extend or modify a computer knowledge base. In the first case a human uses specific interactive and ergonomic software to perform this task. A second possibility consists of computer self-training: specific software allows the computer to develop, correct or update its knowledge base. Usually, practical approaches combine these two possibilities.
- Discovery of new knowledge, using the computer: this is a more ambitious objective consisting of going deeper into a specific knowledge domain, or exploring a new knowledge domain, in order to identify and validate new facts, hypotheses or theories, both for humans and computers.
Learning involves different processes, such as the acquisition of new knowledge, the organisation of the acquired knowledge into effective representations, the development of perceptual, motor and cognitive skills, and the discovery of new facts, hypotheses or theories about the world through exploration, experimentation, induction, deduction or abduction. Learning structures and processes are essential components of adaptive, flexible, robust and creative intelligent systems. Knowledge representation mechanisms play a central role in problem solving and learning. Indeed, learning can be thought of as the process of transforming observations or experience into knowledge stored in a form suitable for use whenever needed. Theoretical and empirical evidence emerging from investigations of learning within a number of research paradigms, using a variety of mathematical and computational tools, strongly suggests the need for systems that can successfully exploit a panoply of representations and learning techniques.
It is unlikely that a single knowledge representation scheme or a single reasoning or knowledge transformation mechanism would serve all of a system's needs effectively in a complex environment.

Symbolic learning
[Figure 2 : Different types of reasoning for acquisition knowledge]
The main technologies developed over the last decade try to extract and formalise knowledge from various data sources. Data may be examples, concepts or behaviours. The type of reasoning may be deductive, inductive or analogical. This taxonomy is more theoretical than practical because learning software often combines several of these characteristics, which makes its classification very difficult. Figure 2 represents a simplified classification of the different types of acquisition possibilities.
Deductive learning
Analytic learning in its pure form uses deductive inference to transform knowledge into a form that is useful for efficient performance of tasks in a given environment. A commonly used analytic learning strategy is explanation-based learning from examples. For instance, given a solution to a particular problem, an explanation-based learning mechanism generates an explanation using background knowledge (for instance, general rules of integration) showing how the solution deductively follows from what the system knows. The resulting explanation is then used to reformulate the given solution into a form that makes it possible to recognise and solve similar problems efficiently in the future. This increases the efficiency of the system, for example by reducing the amount of search effort needed to come up with a solution, because the necessary knowledge is now in a more usable or operational form. Other related forms of analytic learning include the deductive derivation of abstractions or generalisations using facts provided by the environment and the background knowledge: it consists of interpreting a specific function (a regulation, for example) to deduce practical procedures.

Inductive learning
Induction is the primary inference mechanism used in synthetic learning. Unlike deduction, which can lead to no fundamentally new knowledge (because all inferences logically follow from the assumed background knowledge and the given facts), inductive inference allows the creation of new knowledge. A typical example of an inductive learning task involves learning an unknown concept given a set of examples and/or counter-examples. This problem of concept learning essentially reduces to a search for a concept description that agrees with, or approximates, the unknown concept on the set of examples. This can be done without explicit knowledge about the specific domain of study.
In this case the learning process involves very generic mechanisms. The inductive process may also use meta-knowledge, depending on domain specificities and on the existing background knowledge. Inductive learning concerns several types of knowledge representation, such as rules, conceptual graphs and objects.

Learning by analogy

Relatively little is known about the formation of analogical mappings or analogical inference, which appear to be an integral part of human reasoning. Analogy refers to a mapping between two entities (objects, events, problems, conceptual graphs, networks, behaviours, etc.) that consists of detecting differences and likenesses, known as a similarity relationship. In one case the search concerns a causality relation; in the other, the comparison concerns a proximity relation. For instance, the template matching approach to pattern recognition is reminiscent of exemplar-based or case-based reasoning in artificial intelligence. Thus it is possible to adapt much of the work in exemplar-based learning to structured templates. Learning in this case reduces to the task of acquiring and modifying the templates necessary for specific tasks (e.g., pattern recognition).

Some other learning technologies depend directly on the knowledge representation. For instance, learning in a connectionist network system can be performed by modifying a parameter (weight) in the network or by changing the network architecture. Usually, the learning process needs more than one technology. For instance, in a deductive mechanism, if the background knowledge is incomplete, inconsistent or imprecise, explanation generation, the key component of explanation-based learning, cannot proceed without postulating changes in the background knowledge.
An interesting possibility that is worth exploring is to treat the knowledge base as though it were non-monotonic and the derived explanations as though they were tentative, and to use a variety of non-deductive learning strategies for hypothesising candidate revisions of the background knowledge. Other possibilities include non-deductive generalisation of explanations as hypotheses to be validated by further experimentation with the environment. Methods developed in the context of syntactic pattern recognition, such as those based on distance measures for structured templates or grammar inference, as well as connectionist learning models, offer additional sources of ideas (e.g., adaptation of learned explanations to new situations as hypotheses to be tested) for extending the power of current deductive learning systems.

Knowledge acquisition requires verifying the coherency and integrity of the knowledge base. Syntactic control is easily performed, but nowadays the only way of controlling the semantics of the base consists of using examples in order to detect incoherences and anomalies.
Traditional artificial intelligence and syntactic pattern recognition provide a number of powerful tools. The problem is to qualify and compare the market offer. A broad range of performance measures (e.g., speed, accuracy, robustness, efficiency) is conceivable, but each kind of technology depends on a specific knowledge representation and on the semantics of the knowledge domain. Nowadays, knowledge acquisition technologies are used in various industrial domains, such as defect detection in helicopter propellers, and in financial domains, such as bank loan management. Numerous comparisons have shown that an expert system based on knowledge obtained directly from an expert is systematically less efficient than one built with learning technologies. Furthermore, the knowledge base of the latter is more concise for equivalent results.

Resource notes : Honavar, Vasant. (1994). (ISU CS-TR 93-22) "Toward Learning Systems That Use Multiple Strategies and Representations." In: "Artificial Intelligence and Neural Networks: Steps Toward Principled Integration". Honavar, V. and Uhr, L. (Ed.), pp. 615-644. New York: Academic Press, 1994.

Knowledge usage

This aspect of knowledge management consists of exploiting the bases for a precise objective. It aims at demonstrating, detecting, deciding, analysing, proving, verifying, etc.; in other terms, at assisting users in their various activities. This mainly involves the reasoning capacities of Knowledge Based Systems (KBS). Numerous possibilities of reasoning are explored in artificial intelligence, in the image of human reasoning capacities. AI identifies mainly three mechanisms: formal reasoning (logical reasoning is the most common approach), analogical reasoning (including case-based reasoning) and reasoning by abstraction, generalisation and specialisation (such as classification). The following paragraphs detail the different tool categories.
The objective is not to be exhaustive but to describe the main families of tools, giving the keys to obtain more information on each technology and, in particular, on the existing tools.

Expert system (ES)

Computer programs using AI techniques to assist people in solving difficult problems involving knowledge, heuristics and decision making are called expert systems. This first definition was limited to one kind of knowledge representation and reasoning: symbol processing systems. Nowadays, expert systems mainly concern knowledge bases involving explicitly rule-driven symbol manipulation using production rules and logic programming. An expert system tends to mimic the decision-making and reasoning process of human experts. It can provide advice, answer questions and justify its conclusions. An expert system may be highly interactive, and its output can be qualitative rather than quantitative.

The basic ES principle is a separation between the knowledge base of the domain and the reasoning mechanism. The knowledge base is divided into two main parts :
- The generic knowledge of the domain, containing many independent production rules. Usually, these rules describe heuristics (called surface knowledge) but also, more rarely, well-established theories and principles (called deep knowledge).
- The base of facts describing the studied case or the current problem. Its knowledge representation may be one or several of the following: logical propositions, objects, conceptual graphs, etc.

The reasoning mechanism, also called the inference mechanism, aims at producing new knowledge from the base of facts by applying the rules. The inference mechanism determines the order in which rules should be invoked and resolves conflicts when several rules are satisfied. Different strategies of reasoning, depending on the objectives of the ES (e.g. proving a hypothesis, proposing a solution, etc.), may be applied: forward, backward and mixed chaining.
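The separation just described between fact base, production rules and inference mechanism can be sketched as a minimal forward-chaining loop. The rules and facts below are invented for illustration.

```python
# A minimal forward-chaining sketch: a base of facts, independent
# production rules (conditions -> conclusion), and an inference
# mechanism that fires rules until no new facts can be derived.

rules = [
    ({"span > 40m", "steel frame"}, "wind analysis required"),
    ({"wind analysis required"}, "consult structural specialist"),
]

def forward_chain(facts, rules):
    """Apply rules to the fact base until a fixed point is reached."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)   # new knowledge produced
                changed = True
    return facts

derived = forward_chain({"span > 40m", "steel frame"}, rules)
print(sorted(derived))
```

The loop itself knows nothing about the domain; all expertise sits in the rule base, which is the independence property the text emphasises.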
The fundamental characteristic of such a mechanism is its independence from the knowledge domain. It depends mainly on the type of logic used for rule representation: propositional or first-order logic. ES are suitable for problems involving deduction and not so much for problems involving induction or analogy.

ES prototypes have been widely developed in various industrial domains, but very few of them are really efficient. This is mainly due to an initial objective that is usually too ambitious, involving too large a domain of expertise or experts who are too rare. Nowadays, the main conditions for a successful application of ES are :
- A well-defined task, limited in size. This task is presently performed by a specialist rather than an expert.
- Identified end users : ES are not suitable for experts, but are usually developed to aid a non-specialist in the absence of the expert.
- Identified experts who are able to propose and explain their knowledge and their reasoning.
- Several collected cases together with their solutions.
- Anticipation of the integration of the ES into the activities and the information system of the end user.

Nowadays, the market does not propose ES but tools for developing ES. Two kinds of tools are available :
- Expert system generators, which consist of toolkits containing all the components for building an ES. They usually offer several possibilities of knowledge representation, various reasoning strategies and different interfaces (human and machine). Many generators are available for every platform at every price.
- Shells, which are generators limited or adapted to a specific class of problems. These classes may be identified by similar reasoning or by a specific technical domain. These generators allow the effort to be concentrated on knowledge identification and acquisition. They allow very quick prototyping and development of ES.

Various methods have been used to deal with uncertain or incomplete information in the knowledge base. Some of these methods are briefly described in section Logical form. Nowadays, none of them is really satisfactory.

Multiple expert systems are the latest evolution of this kind of tool. In order to solve more complicated tasks involving several experts, the proposed solution consists of developing cooperation between particular ES. The ES are thus able to communicate during reasoning, to request information from one another and to detect conflicts between different views. The main problems are synchronisation of reasoning, conflict resolution and synthesis of the solution. Generic mechanisms such as blackboards are not really satisfactory, and the resolution of the problem needs specific domain knowledge.
Case Based Reasoning

Case-based reasoning (CBR) relies on the idea that similar problems have similar solutions. Facing a new problem, a case-based system retrieves similar cases stored in a case base and adapts them to fit the problem at hand. CBR is usually viewed as a process of remembering and recalling one or a small set of actual instances or generic cases. The decision making is based on comparisons between the new situation and old experience; thus, the quality of reasoning depends on the quality and quantity of cases in the case base. With an increased number of unique cases (instances or generic cases), the problem-solving capabilities of a CBR system improve, while the effectiveness of the system (at least in terms of efficiency) may decrease. On the one hand, if the cases are not unique enough, i.e., the cases are highly correlated, then the system will be limited in the diversity of solutions it can generate. On the other hand, if the case base is small, then the possible solutions are limited.

Regarding the case representation, there is still a lack of consensus as to exactly what information should be represented in a case. A case is an arbitrary set of features used for describing a particular concept. Considering its form, a case can be a simple concept or a connected set of subcases; it can be a specific instance or a generalisation. Many different case representations (objects or conceptual graphs, for instance) have been proposed so far. There are two main reasons for this:
1. the motivation for CBR came from different directions, namely a desire to model human performance of knowledge-intensive tasks, and an attempt to use existing database technology to store and access knowledge;
2. different application areas imposed different functional requirements on the case representation.
The major processes in CBR are as follows :
- Case storage or case base construction, representing all important features of the episode (actual, hypothetical or generic) in a case base. An important issue, related to representation, is case base construction. It is not a trivial task to decide how to represent a case in a particular domain and which cases are worth storing (because of the utility problem). Moreover, the representation can be extremely complex, which makes it difficult to acquire a sufficient number of cases for a particular application. These problems are the main reasons why many CBR systems have relatively small case bases. The process of case base construction can be simplified as follows:
- A case base is created automatically from an existing database.
- A special intelligent system is used as an assistant for case acquisition. The system may be based on the use of previous cases as a model and guide for expressing new cases.

The basic assumption of a case-based reasoning system is that all relevant cases will be retrieved efficiently. In order to guarantee efficient retrieval, case storage requires a specific organisation. Several general approaches have been proposed for this problem, namely:
1. Classification of all cases into a hierarchy, grouping relevant cases together (in the CBR literature this process is usually referred to as the indexing problem);
2. Implementation of a CBR system on a parallel computer without using case classification;
3. Classification supported by parallel implementation;
4. Neither classification nor parallel implementation; however, this approach is only usable for small case bases.

- Case retrieval : Retrieval of cases from memory can be characterised as remembering relevant past experience. Retrieval consists of the following sub-tasks:
- Recall previous cases : the goal is to retrieve cases relevant to the current problem, e.g., retrieve similar cases.
- Select the best subset : only the most promising case (or cases) retrieved in the initial step are selected. For some applications only one case is required; for others, a small set of relevant cases is important (for adaptation and/or justification).

Retrieval strategies depend on the knowledge representation and the organisation of the case base. The three main types of algorithm are associative (retrieval treats any or all features independently of all the other features), hierarchical (cases are organised into a general-to-specific concept structure) or mixed. The selection of the best cases requires assessing their similarity with the current problem. Case retrieval and similarity are inter-related.
Stored cases and problem-solving situations are represented in terms of surface and abstract features. Procedures to assess similarity rely on domain knowledge in addition to case features. Domain knowledge enables inferences of featural equivalence and evaluation of the importance of unmatched features in assessing similarity.

- Case adaptation : There are only a few examples where the old solution can be reused without any change. More often, one must modify either a solution or a strategy to get a solution for the current problem. The process of modifying the old example to fit a new problem is called adaptation. One can differentiate adaptation strategies based on the diversity of CBR systems (planning, explanation, diagnosis, etc.) and designate the knowledge required for their application. The adaptation algorithms can be categorised into structural adaptation, applying the adaptation rules directly to the solution stored in a case, and derivational adaptation, using the rules that generated the original solution to generate the new solution by their re-application. The following adaptation methods have been identified:
1. Null adaptation is the direct application of the retrieved solution to the new situation.
2. Parametrised solutions, a structural adaptation, is probably the best understood. It is based on the comparison of the retrieved and input problem descriptions along specified parameters. (This method has also been called prototype recognition.)
3. Abstraction and re-specialisation is a general structural adaptation technique that abstracts a piece of the retrieved solution and re-specialises it later.
4. Critic-based adaptation is a structural adaptation based on using critics to debug almost-correct solutions.
5. Re-instantiation is a derivational adaptation method, since it operates on the plan that was used to generate a particular solution.
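The retrieval and null-adaptation steps described above can be sketched with flat attribute-value cases and a weighted similarity measure. All case data and weights below are invented; in a real system, as noted in the text, the weights and featural equivalences would come from domain knowledge.

```python
# A sketch of case retrieval: cases are flat attribute-value records,
# similarity is a weighted fraction of matching features, and the best
# case is reused directly (null adaptation). All data are invented.

def similarity(case, problem, weights):
    """Weighted fraction of features on which case and problem agree."""
    matched = sum(w for f, w in weights.items() if case.get(f) == problem.get(f))
    return matched / sum(weights.values())

def retrieve(case_base, problem, weights):
    """Recall the stored case most similar to the current problem."""
    return max(case_base, key=lambda c: similarity(c["features"], problem, weights))

case_base = [
    {"features": {"soil": "clay", "height": "low"},  "solution": "raft foundation"},
    {"features": {"soil": "rock", "height": "high"}, "solution": "pad foundation"},
]
weights = {"soil": 2.0, "height": 1.0}   # soil matters more (arbitrary choice)

best = retrieve(case_base, {"soil": "clay", "height": "high"}, weights)
print(best["solution"])  # raft foundation (reused as-is: null adaptation)
```
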
Applications of case-based systems concern numerous and various areas :
- Case-based design : problems are defined as a set of constraints; the problem solver is required to provide a concrete artefact that solves the constraint problem.
- Case-based planning : planning is the process of coming up with a sequence of steps or a schedule for achieving a particular state of the world.
- Case-based diagnosis : a solver is given a set of symptoms and is asked to explain them.
- Case-based explanation : explanation was introduced into CBR terminology in connection with plan or solution failure.

Only recently has some attention been paid to the evaluation of CBR systems. System evaluation can include not only performance measures but also verification and validation. Verification is performed to ensure consistency, completeness and correctness of the system. Validation ensures correctness of the final system with respect to the user needs and requirements.

Resource notes : Representation and Management Issues for Case-Based Reasoning Systems - Igor Jurisica - Department of Computer Science - University of Toronto - Toronto, Ontario M5S 1A4, Canada - 15 Sept 1993

Neural networks

Neural networks are universal approximators: they can approximate any non-linear input-output mapping. This striking property opens up various fields of application such as automatic classification (for pattern recognition and expertise gathering through examples), industrial process modelling, non-linear system control, etc. Connectionist networks or artificial neural networks are massively parallel, shallowly serial, highly interconnected networks of relatively simple computing elements or neurones. Much of the attraction of connectionist networks is due to their massive parallelism, which is amenable to co-operative and competitive computation, their potential for limited noise and fault tolerance, the discovery of simple, mathematically elegant learning algorithms for certain classes of such networks, and, for those interested in cognitive and brain modelling, their similarity with networks of neurones in the brain. Each neurone computes a relatively simple function of its inputs and transmits outputs to the other neurones to which it is connected via its output links. A variety of neurone functions are used in practice.
The most commonly used are the linear, the threshold and the sigmoid. Each neurone has associated with it a set of parameters which are modifiable through learning. The most commonly used parameters are the so-called weights. The representational and computational power of such networks depends on the functions computed by the individual neurones as well as on the architecture of the network (e.g., the number of neurones and how the neurones are connected). Such networks find application in data compression, feature extraction, pattern classification and function approximation.

The application of neural networks depends directly on their learning capacities, due to their very specific representation. A number of researchers have investigated techniques for initialising a network with knowledge available in the form of propositional rules. Very little systematic study has been done on the representation and use (especially in learning) of structured and conceptual graphs in connectionist networks. At the present time, neural network applications concern very specific fields of industry such as non-linear process control or pattern recognition.

Classification

Classification is an important issue in pattern recognition and knowledge representation. Data classification can be performed in two modes: supervised and unsupervised. The task of supervised classification is to classify new objects (or cases) into predefined classes, while unsupervised classification aims to determine homogeneous groups of objects in the data.
- The supervised problem is based on the assumption that the data are made up of distinct classes or groups known a priori. Each case is described by a set of features or attributes (body-cover, body-temp, fertilisation-mode, ...). The general task is to develop decision rules in order to determine the class of any case from its attribute values.
These classification rules are first inferred (or computed) from a set of available cases whose classes are known (the training set), and are then used for class prediction of unknown cases. Supervised classification has become a major active research field during the last decade. Many new kinds of algorithms have been developed, including techniques from traditional data analysis, pattern recognition, artificial intelligence (machine learning) and artificial neural networks. They differ principally in the kind of decision rules they produce and the kind of strategy they develop to carry out the discrimination task. Hence, some algorithms focus their search on explicit decision boundaries that best discriminate among the different classes. Other approaches focus on reference objects and use a resemblance measure to classify unknown instances.
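The resemblance-based strategy just mentioned can be sketched as a one-nearest-neighbour rule over numeric attribute vectors: an unknown case receives the class of its closest training example. The training data below are invented for illustration.

```python
# A sketch of supervised classification by reference objects:
# a 1-nearest-neighbour rule using squared Euclidean distance.

def distance(a, b):
    """Squared Euclidean distance between two attribute vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(training_set, case):
    """Predict the class of `case` from its closest known example."""
    attrs, label = min(training_set, key=lambda t: distance(t[0], case))
    return label

# Invented training set: (attribute vector, known class)
training = [((1.0, 1.0), "class A"), ((1.2, 0.9), "class A"),
            ((5.0, 4.8), "class B")]
print(classify(training, (4.6, 5.1)))  # class B
```
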
- In the case of unsupervised classification (or clustering), the allocation of data to distinct classes is not known a priori. The general task is now to discover homogeneous groups or categories in the data, based on the attribute values of the cases concerned. The elements in each cluster must be as similar as possible to each other and dissimilar to those in the other clusters. Unsupervised classification is a useful tool in exploratory data analysis which allows pattern discovery in the data. Hence, clustering results can be used for hypothesis generation and testing during scientific inquiry. The applications of unsupervised classification methods are extensive, for example in problem solving and planning, engineering, natural language processing, information retrieval, etc.

Classification processes concern all types of knowledge representation: objects, conceptual graphs, neural networks or rules. This is a keystone of knowledge management, particularly for base acquisition and organisation. It also finds several applications in other reasoning processes such as CBR or data completion. Several tools are available, depending on the knowledge representation, the objectives of the classification and the technical field.

Constraint Satisfaction Problems

Intuitively, a constraint satisfaction problem (CSP) is a search problem that involves finding one (or several) object(s) satisfying a given specification. A CSP is basically a set of variables with finite domains which are constrained by certain conditions. A CSP may have a very great number of possible variable assignments that need to be tested to find an optimal solution. Basically, one encounters an exponential explosion of the search space volume when increasing the number of variables. That is why it is interesting to develop heuristic search algorithms that solve the problems within an acceptable time without searching the complete volume.
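The CSP formulation above can be sketched with a minimal backtracking solver: variables with finite domains, constraints checked as a depth-first search extends a partial assignment, so that inconsistent branches are pruned instead of enumerating the complete volume. The toy scheduling problem is invented.

```python
# A minimal backtracking CSP solver: depth-first search over partial
# assignments, abandoning a branch as soon as a constraint fails.

def solve(variables, domains, constraints, assignment=None):
    """Return one assignment satisfying all constraints, or None."""
    assignment = assignment or {}
    if len(assignment) == len(variables):
        return dict(assignment)
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if all(check(assignment) for check in constraints):
            result = solve(variables, domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]        # backtrack
    return None

# Toy problem: three tasks on two time slots, adjacent tasks must differ
variables = ["t1", "t2", "t3"]
domains = {v: [1, 2] for v in variables}
constraints = [
    lambda a: a.get("t1") is None or a.get("t2") is None or a["t1"] != a["t2"],
    lambda a: a.get("t2") is None or a.get("t3") is None or a["t2"] != a["t3"],
]
print(solve(variables, domains, constraints))  # {'t1': 1, 't2': 2, 't3': 1}
```
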
Constraint logic programming provides an efficient problem-solving environment for solving constraint-based configuration tasks, often characterised as variants of dynamic constraint satisfaction problems. Several efficient tools are available and specific technical applications have been successful.

Another interesting problem deals with constraint management. This problem does not consist of solving a constraint problem, but of ensuring that the solution satisfies the constraints. This is usually the case in design and in various decision support techniques. It involves several aspects :
- Constraint representation,
- Constraint recognition,
- Constraint propagation,
- Constraint satisfaction.

Numerous domains of AI, such as the theory of constraint hierarchies, hierarchical constraint logic programming and constraint solvers, contribute to work on this problem. The interest of constraint management lies in its integration with the other knowledge representations, such as objects and conceptual graphs. Constraints have been used in a variety of languages and systems, particularly user interface toolkits, in planning and scheduling, and in simulation.

Genetic algorithms

Genetic algorithms (GA) are adaptive methods which may be used to solve search and optimisation problems. They are based on the genetic processes of biological organisms. Over many generations, natural populations evolve according to the principles of natural selection and "survival of the fittest", first clearly stated by Charles Darwin in The Origin of Species. By mimicking this process, genetic algorithms are able to "evolve" solutions to real-world problems, if they have been suitably encoded. For example, GA can be used to design bridge structures for maximum strength/weight ratio, or to determine the least wasteful layout for cutting shapes from cloth. They can also be used for online process control, such as in a chemical plant, or load balancing on a multi-processor computer system.
GA are not the only algorithms based on an analogy with nature. Neural networks are based on the behaviour of neurones in the brain. They can be used for a variety of classification tasks, such as pattern recognition, machine learning, image processing and expert systems. Before a GA can be run, a suitable representation (or coding) for the problem must be devised. It also requires a fitness function, which assigns a figure of merit to each coded solution. During the run, parents must be selected for reproduction and recombined to generate offspring.
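The cycle just described (coding, fitness, selection, recombination) can be sketched on a toy bit-string problem; the population size, rates and fitness function are arbitrary choices made for illustration.

```python
# A compact GA sketch: bit-string coding, a toy fitness function,
# truncation selection, one-point crossover and pointwise mutation.
import random

random.seed(1)
LENGTH = 12

def fitness(bits):
    return sum(bits)                      # toy objective: all-ones string

def crossover(a, b):
    point = random.randrange(1, LENGTH)   # one-point recombination
    return a[:point] + b[point:]

def mutate(bits, rate=0.05):
    return [1 - b if random.random() < rate else b for b in bits]

population = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(20)]
for generation in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]             # survival of the fittest
    offspring = [mutate(crossover(random.choice(parents), random.choice(parents)))
                 for _ in range(10)]
    population = parents + offspring

best = max(population, key=fitness)
print(fitness(best))
```

Because the fittest half of each generation is kept unchanged, the best fitness never decreases; mutation and crossover supply the variation that lets it climb.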
Barriers to LSE uptake

Knowledge management is a very recent concept based on emergent technologies. The first developments and applications really involving knowledge as a well-identified concept concern specific and limited technical fields. In addition, not all of them have been successful. The technologies used (generally coming from artificial intelligence) were often immature and not really efficient, mainly due to the poor capacities (speed and memory) of computers. The main impediments to LSE uptake are both social and technical.

- Social approach : companies are only beginning to understand the advantages of capitalising knowledge for their activities. They then have to develop specific competencies for their knowledge management, in terms of policy, specialists and resources. In order to develop a knowledge-based system, companies have to be more involved than for classical software, because it directly concerns their own know-how and specificities. Furthermore, firms must have more faith in the technology (i.e. artificial intelligence tools).

- Technical approach : nowadays, a total lack of standards and of a taxonomy of possibilities makes a comparison or a choice between the different tools very difficult, even impossible, for a non-specialist. This is the case for knowledge representation and for reasoning capacities. As long as no standard (even a de facto standard) is available, companies will hesitate to invest in technologies which risk disappearing. The second main problem is related to knowledge base construction. An efficient KBS needs a high-quality knowledge base. Knowledge engineering is a difficult and tedious enterprise because experts are often unable to translate the mental processes that they use in solving problems in their domains of expertise into a sufficiently detailed, perfectly defined set of rules or procedures. Before extracting knowledge, it would be necessary to train experts for this very specific task.
Furthermore, it is indispensable to develop knowledge acquisition and learning, which are the keystone of the future of knowledge management.

Trends and expectations

All available technologies are still in a development phase, in order to increase their reliability and performance. Every day, new features and application fields are explored. At the present time, the challenge in terms of knowledge management involves two objectives :
- Capitalising knowledge : this means that existing knowledge has to be the base of new knowledge, which will enrich the knowledge fund of the company.
- Sharing of knowledge : several systems using various kinds of reasoning, perhaps built on different knowledge representations, in fact manipulate the same knowledge. Furthermore, these systems will probably interoperate.

Four major trends emerge at this time :
- Knowledge representation level
- Sharing of knowledge
- Knowledge acquisition
- Distributed knowledge through the Web

Knowledge representation level

Every type of representation has its own domain of application. The aim is not to propose a universal mode of knowledge representation, but to have a better definition of the existing ones. Too many different dialects co-exist for the same knowledge representation. Hence, some actions aim at defining common languages for each representation; for instance, object-oriented knowledge principles are now sufficiently well defined to be standardised. Efforts towards the formalisation of knowledge representation are developed under the concept of ontology. Some research proposes ontologies as another knowledge representation, but their real interest lies in their capacity for abstraction. Ontologies are thus placed at a meta level which allows the conceptualisation of knowledge. It is too soon to judge the results of this huge research effort, but it corresponds to a real need for capitalising, reusing and sharing (see next paragraphs) knowledge.
Tom Gruber (Knowledge Systems Laboratory, Stanford University) defines the term ontology as a specification of a conceptualisation, used for making ontological commitments. For pragmatic reasons, one chooses to write an ontology as a set of formal vocabulary in a way that is consistent with respect to the theory specified by the ontology. To define common ontologies means defining vocabularies of representational terms with agreed-upon definitions, in the form of human-readable text and machine-enforceable, declarative constraints on their well-formed use. Definitions may include restrictions on domains and ranges, placement in subsumption hierarchies, class-wide facts inherited by instances, and other axioms for describing a domain or performing a task. Each ontology embodies a set of ontological commitments in a form that enables one to build knowledge bases and tools based on those same commitments.

The aim is to build libraries of shared, reusable knowledge. If the specification of a standard declarative language is like a grammar of English, ontologies are reference works akin to dictionaries. Libraries could contain "off-the-shelf" knowledge-based tools that perform well-defined tasks such as varieties of simulation, diagnosis, etc. Ontologies specify the terms by which a tool user writes the "domain knowledge" for the tool, such as the equation models that drive a simulation or the components and failure mode descriptions used by the diagnostic engine. A knowledge library could also contain reusable fragments of domain knowledge, such as component models (e.g., of transistors, gears, valves) that can be composed to produce device models (e.g., of amplifiers, servo mechanisms and hydraulic systems). Ontologies define various ways of modelling electrical, mechanical and fluid flow mechanisms that make such reusable component libraries possible.
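In this spirit, one can picture an ontology, purely as an illustration and not as Gruber's formalism, as a shared vocabulary whose declarative constraints (required slots, subsumption, inheritance) a tool can enforce on knowledge-base entries. The vocabulary and constraints below are invented.

```python
# A toy sketch of the ontology idea: terms with required slots,
# placed in a subsumption hierarchy ("is-a") from which constraints
# are inherited by instances. All terms and slots are invented.

ontology = {
    "component": {"required": {"name", "connected-to"}},
    "valve":     {"is-a": "component", "required": {"flow-rate"}},
}

def required_slots(term):
    """Collect required slots, inherited through the subsumption hierarchy."""
    slots = set()
    while term is not None:
        entry = ontology[term]
        slots |= entry.get("required", set())
        term = entry.get("is-a")
    return slots

def well_formed(term, instance):
    """An instance commits to the ontology if all required slots are filled."""
    return required_slots(term) <= set(instance)

print(well_formed("valve", {"name": "v1", "connected-to": "pump", "flow-rate": 2.0}))  # True
print(well_formed("valve", {"name": "v2"}))  # False
```
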
In practical terms, several prototypes have been built on ontology principles, using various implementation choices.

Resource notes : "The Role of Common Ontology in Achieving Sharable, Reusable Knowledge Bases" Thomas R. Gruber - Knowledge Systems Laboratory, Stanford University, 701 Welch Road, Building C, Palo Alto, CA 94304 - 31 January 1991

Sharing of knowledge

This issue is very close to the preceding one, but involves different practical solutions, which are realistic in the short term. Nowadays, the four main impediments to sharing and reuse of information are the following :
- Heterogeneous representations
- Dialects within language families
- Lack of communication conventions
- Model mismatches at the knowledge level

Sharing of knowledge involves knowledge base organisation, structured in several levels, from a practical level of knowledge directly exploited in reasoning to an abstraction level used for reuse and sharing. Sharing knowledge also needs well-defined procedures to perform the exchange of knowledge between several applications. Numerous research projects develop several approaches, such as :
- Knowledge Interchange Format (KIF) : KIF is a neutral specification language for the structures and relationships of a typical knowledge representation. It encompasses first-order predicate calculus and frame objects, but it is not of itself a reasoning system. In computing terms, it is closest to a mechanism for type definition along with type restrictions, in order to specify semantic integrity constraints that a compiler or translator can test and verify.
- KQML : the Knowledge Query and Manipulation Language is in fact a protocol by which queries can be exchanged between agents.
- An approach to interoperation based on programs called agents. Agents use the Agent Communication Language (ACL) to supply machine-processable documentation to the system programs (called facilitators), which coordinate the activities of the agents.
The facilitators assume the burden of interoperation, and the application programmers are relieved of this responsibility. - KQL: the Knowledge Query Language, which claims to be analogous to SQL (the standard query language for relational DBMS) for KM knowledge bases. Nowadays, it is difficult to detect an emergent solution, but this trend concentrates great efforts and will certainly produce exploitable results soon.
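The agent/facilitator pattern described above can be sketched as follows. This is a hypothetical illustration of KQML-style messaging, not a real KQML implementation: the performative names follow KQML usage ("ask-one", "tell", "sorry"), but the classes, message fields and the "catalogue" agent are invented for the example.

```python
# Sketch of KQML-style interoperation: agents exchange performative
# messages and a facilitator routes them, relieving application
# programmers of the burden of point-to-point integration.
# Class and field names are illustrative, not a real KQML library.

class Agent:
    def __init__(self, knowledge):
        self.knowledge = knowledge  # a simple fact store

    def handle(self, msg):
        if msg["performative"] == "ask-one":
            value = self.knowledge.get(msg["content"])
            return {"performative": "tell", "content": value}
        return {"performative": "sorry", "content": None}

class Facilitator:
    def __init__(self):
        self.agents = {}

    def register(self, name, agent):
        self.agents[name] = agent

    def route(self, msg):
        # The facilitator, not the sender, locates the receiver.
        return self.agents[msg["receiver"]].handle(msg)

fac = Facilitator()
fac.register("catalogue", Agent({"valve-42-price": 120}))

reply = fac.route({"performative": "ask-one",
                   "receiver": "catalogue",
                   "content": "valve-42-price"})
assert reply == {"performative": "tell", "content": 120}
```

The design point is that the query protocol is fixed while the agents' internal representations remain free, which is exactly the separation KQML aims at.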
Knowledge acquisition Knowledge management also has great hopes in knowledge acquisition, the keystone of this technology, particularly along two axes: - Abduction involves the generation of candidate hypotheses for observed or otherwise given facts in terms of background knowledge and additional assumptions or hypotheses. The utility of such hypotheses is primarily in guiding the search for useful theories of particular domains of interest. Abduction, when used with analytic explanation-based learning, can help extend background knowledge through the proposal and evaluation of candidate explanations. Unlike induction and deduction, the use of abduction in machine learning has not been widely explored, yet abduction appears to play a central role in learning and discovery. - Learning in evolutionary systems: since most learning tasks can be formulated as essentially search problems, it is natural to apply evolutionary algorithms in machine learning. Classifier systems, genetic algorithms and evolutionary programming are all inspired by processes that appear to be at work in biological evolution and the immune system. These evolutionary methods offer relatively efficient mechanisms for exploring large spaces in the absence of other sources of information to direct the search. Distributed knowledge through the WEB Knowledge modelling involves the management of many knowledge sources, often geographically distributed. The World Wide Web is a distributed hypermedia system available internationally through the Internet. It provides general-purpose client-server technology which supports interaction through documents with embedded graphic user interfaces. Knowledge modelling tools operate through the web to support knowledge acquisition, representation and inference through semantic networks and repertory grids. 
This illustrates how web technology provides a new knowledge medium in which artificial intelligence methodologies and systems can be integrated with hypermedia systems to support the knowledge processes of professional communities world wide. The development of knowledge-based systems involves knowledge acquisition from a diversity of sources, often geographically distributed. The sources include books, papers, manuals, videos of expert performance, transcripts of protocols and interviews, and human and computer interaction with experts. Expert time is usually a scarce resource and experts are often only accessible at different sites, particularly in international projects. Knowledge acquisition methodologies and tools have developed to take account of these issues by using hypermedia systems to manage a large volume of heterogeneous data; interactive graphic interfaces to present knowledge models in a form understandable to experts; rapid prototyping systems to test the models in operation; and model comparison systems to draw attention to anomalous variations between experts. The combination of web and knowledge-based systems technologies will make artificial intelligence systems widely accessible internationally and allow innovative knowledge-based multimedia and groupware systems to be developed. Relevance to LSE Some companies are beginning to feel that the knowledge of their employees and of their past activities is their most valuable asset. They may be right, but few firms have actually begun to actively manage these knowledge assets on a broad scale. Some small developments have been realised to solve specific limited problems, but pragmatic discussion of how knowledge can be managed and used more effectively on a daily basis should be addressed at either a strategic or a technological level. 
LSE knowledge takes place in three different environments: - Common LSE environment: mainly regulation, rules and general technical knowledge sharable between all the LSE firms. - Intra-enterprise: the specific technical and organisational knowledge of each firm, representing its originality and its strength. - At the level of a specific project: a temporary enterprise, representing the global knowledge used to carry the project through to a successful conclusion. It is indisputable that the future of LSE companies depends on the development of real knowledge management. But before using KM technologies they have to ask themselves some questions and establish a KM policy.
Other experiences have led to the establishment of some principles that are a good starting point ("Some Principles of Knowledge Management" by Thomas H. Davenport, PhD): Ten Principles of Knowledge Management 1 Knowledge management is expensive (but so is stupidity!) 2 Effective management of knowledge requires hybrid solutions of people & technology 3 Knowledge management is highly political 4 Knowledge management requires knowledge managers 5 Knowledge management benefits more from maps than models, more from markets than from hierarchies 6 Sharing & using knowledge are often unnatural acts 7 Knowledge management means improving knowledge work processes 8 Knowledge access is only the beginning 9 Knowledge management never ends 10 Knowledge management requires a knowledge contract These are very early days for knowledge management, and even the previous principles and rules will engender considerable disagreement. The positive aspect is that almost anything a firm does in managing knowledge will be a step forward. Summary Knowledge management is a very recent concept based on emergent technologies. At the present time a great variety of knowledge reasoning, developed in the context of Artificial Intelligence, is available, but two main barriers freeze its usage: - Technological barrier: an efficient KBS needs a high-quality Knowledge Base. This means having: - efficient knowledge acquisition methodologies and tools. This point is the major objective of research teams in knowledge management for the coming years. - the possibility of sharing and reuse of knowledge. This second aspect is of major importance for a firm which wants to maintain, as far as possible, only one Knowledge Base, but to use this knowledge within different systems. This implies a universal (or, less ambitiously, common) knowledge representation. - Social barrier: companies are only beginning to understand the advantages of capitalising knowledge for their activities. 
They then have to develop specific competencies for their knowledge management, in terms of policy, specialists and resources. As long as effective technologies are not available, Knowledge Management will be problematic. However, since knowledge is the keystone of every industrial activity and has to be managed, the LSE industry must prepare now for the availability of new tools by analysing their social impact on the firm and beginning the study of its specific knowledge.
Virtual enterprises Introduction Facing an increasingly competitive environment where flexibility, adaptability to change, product quality and value are the obligatory route to success, companies have to renew their working habits. As a consequence, the concept of the Virtual Enterprise (VE) is emerging through the development of partnerships and collaboration agreements, and the outsourcing of product component design, where production leads to delocalised manufacturing processes in which each partner (manufacturer, suppliers, technical specialists, etc.) focuses on his core domain of competence for the shared profitability of the industrial projects. The central issue of the VE is the functional and shape-oriented representation of a typical industrial enterprise. Based on the main business processes, the VE displays the internal relations and information flows between the different departments of the enterprise (e.g. business planning, research & development, operations scheduling, manufacturing, sales, etc.). A definition of a virtual enterprise may be the following: "Extending the Concurrent Engineering strategy, a Virtual Enterprise is a temporary consortium of companies that come together quickly to explore fast-changing business opportunities. Within the Virtual Enterprise, companies share costs, skills and profitability. They also access global markets as a unique entity. Finally, the Virtual Enterprise is gracefully dismantled when its business purpose is fulfilled." The keystones of a VE are communication and organisation. - Organisation: a VE provides a foundation for a specific business solution that integrates people, processes and strategies in an enterprise model, unconstrained by time, place or form. The challenge of the VE is to model new organisations, to adapt quickly to new environments and to evolve continuously. 
- Communication: the necessary IT infrastructure that wires the Virtual Enterprise partners together, as the main medium of information communication among the consortium, has to be mentioned as part of the definition, as it stands for the required backbone of the Virtual Enterprise. Information Management certainly conveys the right answer to this problem. As a whole, Information Management means the ability to store and conveniently retrieve any piece of information which is needed within the frame of a given project. The main problems can be summarised as follows: how to organise information so that retrieval is easy while remaining independent of any application? What use can be made of the existing Information Standards? How to design and implement an Information Management Infrastructure? How to fill it in once implemented? How to communicate information from the Information Management Infrastructure towards the various users' working places? These questions stand for key challenges towards business competitiveness and success. What exists today One can identify three technological levels in building virtual enterprise software: - A level of communication, which is composed of a physical support and standards of communication. The notion of VE needs an efficient permanent network that allows a VE to be created immediately between various partners all over the world. Nowadays, the Internet, in spite of its limitations, represents such a physical support. The communication protocols used should be standard, easy to implement and flexible. - An organisational level: this concerns the functional model of the VE, the workflow between the actors, and all the organisational aspects that allow the management of the VE. - The Virtual Enterprise Environment, integrating human computer interfaces, which aims at creating and parametrising the VE easily and quickly. 
The visualisation of the VE allows the user to explore interactively the complex world of an industrial enterprise. The first two levels rely on the notion of "plug and play". This notion takes place at different levels: - Hardware plug and play corresponds to the problem of device connection with computers. - Object plug and play is at the level of standards of communication. It addresses the different object managers and databases which have to be assembled to exchange objects as necessary without excessive dependency on the user's knowledge.
- Semantic plug and play means that different applications can exchange specific types of objects, each with specific roles in each of several applications. Applications thus interact by providing services to one another. Communication Level One major requirement for the formation of a VE is a transparent interface between the units. An ideal interface would allow the units to establish communication links with minimum effort. The communication protocols of such an interface would be standard, easy to implement and flexible. The units would be able to access product development resources through an existing data communication network. In that respect, advanced IT, in particular existing or emerging Information Standards, has to be acknowledged for the desired Information Management Framework to guarantee interoperability among various software components. The deliverable D301 "The supporting environment" proposes an overview of these different standards. These standards address the physical way of communicating and the format of the contents of communication: - EDIFACT, the International Standard for Electronic Data Interchange, - CORBA, the OMG specifications tackling multi-platform interoperability, - STEP, the International Standard for product data exchange, - SGML, the International Standard for document representation, - and of course the WEB technologies appear to be the reliable components of software architectures bridging the gap between multiple and delocalised proprietary software systems and thus satisfying industrial needs. Organisational Level This level relies on the semantic plug and play level. Explicit models of the services required and provided by each tool are mapped to the models of the enterprise: models packaged with the tools are made available to the modelling repository, which already has access to the models of the current enterprise. 
These models define the services to be exchanged between the tool and other tools within the enterprise, including data, processing and knowledge services (i.e. the workflow). Modelling language standards allow models to be provided to VE software. They could do so through any of several standard forms: - CDIF (CASE Data Interchange Format) from the EIA (Electronic Industries Association) - EXPRESS from STEP (STandard for the Exchange of Product model data) - IDEF (Icam DEFinition), based upon Softech's Structured Analysis and Design Technique (SADT) Unfortunately, these standards are relatively immature or are not fully implemented in the modelling tools available on the market. Moreover, the standards are overlapping and non-interchangeable. A vendor who selected a CDIF-based tool could not supply models to an EXPRESS-based one, or vice versa. But vendors should not have to know or care whether their customers model in CDIF or EXPRESS or IDEF, any more (now that STEP is in place) than CAD tool vendors need to know which product data managers are used by their customers. Standards organisations should strive to develop process-driven modelling techniques. Every STEP application protocol (AP) is driven by a consensus business process, as expressed in an Application Activity Model (AAM). The process is defined generically so that it can be tailored to the details of any enterprise, but it provides a context for deciding what kinds of data services must be exchanged through a STEP standard. An AAM is not really exploitable for VE development. At present the integration of a tool into an environment is not accomplished through model mapping, although reverse engineering a model from a tool might be used as a step in that process. Even if tools came out of the box with complete models, there are neither technologies nor techniques nor standards available to use those models to support the integration process. 
Instead, tool integration is achieved at the implementation level by writing translators or APIs that physically translate a client view into a server view. Modelling tools that might provide cost or impact analysis are not generally used to accomplish such a mapping. Some modelling languages, such as EXPRESS and CDIF, have capabilities that could be used to relate the objects in one model to those in another, but those capabilities were not designed for the application of modelling to semantic plug and play environments. However, ISO TC184/SC4 has done some work in this area. STEP is investigating the development of a dialect of EXPRESS (currently referred to as EXPRESS-X) to be used as a mapping language between EXPRESS models.
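The translator approach described above, mapping a client view onto a server view field by field, can be sketched as follows. This is an illustrative toy, not an EXPRESS-X or CDIF implementation; the attribute names and the mapping table are invented for the example.

```python
# Sketch of implementation-level tool integration: a hand-written
# translator that physically maps a client view of an object onto
# a server view. Attribute names are illustrative only.

CLIENT_TO_SERVER = {
    # client attribute -> server attribute
    "part_name": "designation",
    "unit_cost": "price",
}

def translate(client_obj, mapping=CLIENT_TO_SERVER):
    """Translate a client view into a server view, field by field."""
    return {server: client_obj[client] for client, server in mapping.items()}

server_view = translate({"part_name": "pump", "unit_cost": 350})
assert server_view == {"designation": "pump", "price": 350}
```

The weakness is visible even in the sketch: every pair of tools needs its own mapping table, maintained by hand, which is exactly what a standard mapping language such as EXPRESS-X is meant to replace.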
Virtual Enterprise Environment Today, this notion mainly addresses the class of software called groupware. Definitions of groupware vary. The sense of groupware has evolved as the term used to describe a class of computer technologies that enable information sharing, coordination and collaboration between groups of people. Groupware has its initial roots in electronic mail systems, as these were the first systems to provide computer-to-computer electronic communication. As the need to work together as groups became more prevalent and as the workgroup technology environment matured, email alone began to fall short. In addition to the ability to universally share collected information, it was necessary to interactively collaborate on subject matter, and to share that "collaborative memory" with others as they became involved later in the process. Also, distributed workgroups became more prevalent, resulting in the formation of vast "information islands". Groupware technology, as it is now evolving, is addressing these issues. It has given the ability to store information and universally share this information with others. It has enabled automated workflow and provided an environment that moves us away from paper. It has also allowed us to "extend" our participation to virtually every corner of the world. More importantly, it has done so with consideration for workers regardless of physical location. Whether in a home, car, plane, mobile office, hotel room, office, or on the beach, we can all be active participants in the information sharing workspace. In the past few years, software companies have taken electronic communications to the next step by introducing mechanisms to collect, retain and distribute electronic information (e.g. text, graphics, voice, sound and video) in a collaborative environment. 
The flagship product of Lotus Corporation, Lotus Notes, has literally "set the standard" for workgroup computing and raised the awareness of groupware to the level it is today. Other vendors offering groupware products include Oracle Corporation, Digital, Team Software Inc., Action Technology Inc., ForeFront Group Inc., ICL Inc. and others. Recently, Collabra Software introduced a new product to the market, Collabra Share. This product builds upon existing e-mail systems to provide support for forums and electronic discussions. This collaboration capability is one of the most widely used features of the Lotus Notes product. Other major players soon to enter the groupware market are Microsoft, with Microsoft Exchange, and Novell, with their WordPerfect, FileNet, ICL and Xerox alliance. Other categories of products, converging on the market at the same time, expand the groupware offerings. These product categories include: advanced messaging capabilities, electronic document management, workflow, group decision & support, calendaring/scheduling, electronic/video conferencing, shared whiteboards, etc. This, coupled with the notion of the "information highway" and the "Cyberspace of the Internet", has escalated the idea of information sharing not just within companies, but with all participants in society at large. Hence the explosive growth in the communications and computer industries, as the infrastructures and mechanisms to support connectivity and information exchange "anywhere, anytime" evolve. The present evolution of the notion of groupware offers only an incomplete response to the VE architecture. Nowadays, some new software products address the capacity to define and build a VE quickly in order to rapidly develop new business opportunities. 
For instance, Andersen Consulting's Center for Strategic Technology Research (CSTaR®) developed the Prairie prototype: through innovative technologies such as knowledge management, teamware, software agents, video conferencing, high-bandwidth networking, and megapixel displays, Prairie envisions a business environment where people, process, technology, and strategy are perfectly aligned. The materialisation, or visualisation, of the VE is also an important aspect needed for understanding its temporary and moving configuration. No real product satisfies this objective. Barriers to LSE uptake The LSE industry has been rather successful in implementing, from place to place, sophisticated software solutions, which too often unfortunately look very much like islands of automation; these have never been integrated into enterprise-wide systems. No system takes charge of the complete range of needs companies face in their daily management and operating work. The self-contained Business Management Framework (BMF) that would support the multiple facets of designing, manufacturing, maintaining and decommissioning an industrial product, whether these product life-cycle stages are accomplished by the company itself or result from a collaborative effort between the company and some partners, does not exist. Even though the vertical integration of these processes within a single enterprise, aiming at a true collaborative and concurrent engineering practice, is a goal that has not yet been fully realised, the exigencies arising from today's requirements have forced industry to look beyond this goal toward horizontal integration in the Virtual Enterprise. The next generation of BMF should result from the integration or interoperability of these systems. However, most of them, partly for competitive reasons, partly for reasons of functionality, are associated with the use of
proprietary (i.e. non-standard) information representation formats. Information, even if it already exists on a numeric storage unit, often needs to be retyped from one format to another. Another point is the fact that electronically stored information may not be presented in an understandable way and may be difficult to retrieve. The incapacity to support the distribution and communication of information on a wide basis is also a recurrent problem. The three keystones of the development of the VE are the following: - An existing world-wide efficient physical support. The Internet represents the basis of VE communication, but it must evolve in terms of speed of transfer, reliability and security. - Standardisation, which is indispensable to allow "plug and play" possibilities between all companies, at the level of communication (i.e. the supporting environment) and of semantics. This level of semantics refers to knowledge and data management. - Generic models (also sometimes called "meta-models") for VE organisation. It is necessary to develop models which allow every VE configuration, describing the actors, workflow, etc. Trends and expectations Targeting VE development tools, several prospective initiatives have been launched at the European level under the auspices of the Esprit IV research programme (e.g. the RISESTEP and VEGA projects) or at various national levels (the NIIIP project in the US, the CORENET project in Singapore). The RISESTEP project targets STEP distributed technologies as an approach to bridge the gap between multiple and delocalised PDM and CAD systems in the aeronautic and automobile industries. Beyond proprietary platforms, distributed architectures based on the STEP and CORBA standards appear to be the way towards software architectures satisfying industrial needs. Building upon dedicated end-user scenarios, RISESTEP aims to demonstrate the feasibility and relevance of such approaches at an industrial level. 
Targeting the Architecture Engineering and Construction domain and the LSE domain, the VEGA project also aims at establishing an information infrastructure which will support the technical activities and the business operations of the upcoming Virtual Enterprises in this sector. Groupware tools and distributed architectures, developed in good compliance with on-going standardisation activities, stand for the overall approach to satisfying the broad expectation shared by the end-users and expressed as a consensus about the need to support distributed working. Other projects are interested in the VE environment. The central issue of the project Virtual Enterprise (Heinz Nixdorf Institut, Universität-GH Paderborn) is the functional and shape-oriented representation of a typical industrial enterprise. Based on the main business processes, the Virtual Enterprise displays the internal relations and information flows between the different departments of an enterprise (e.g. business planning, research & development, operations scheduling, manufacturing, sales, etc.). The visualisation of the Virtual Enterprise is realised by a virtual environment allowing the user to explore interactively the complex world of an industrial enterprise. Standardisation will soon produce an efficient framework for VE architecture development. Formal languages for workflow and distribution modelling are the subject of great standardisation efforts in the context of international projects (e.g. STEP and Esprit IV). Relevance to LSE Large scale projects require the involvement of many entities (client, architect, design engineers, specialists from different technical disciplines, technical controllers, construction companies) sitting at various locations, with different views of and needs on the project. Furthermore, this temporary team, specially created for one project, is usually reorganised for another one. 
Even more, its organisation evolves during the life cycle of the operation, from inception to demolition. On the information side, numerous documents of diverse nature are involved in the construction process. Some of them, such as regulations, define the legal context of a project. Others, like masterplan drawings, technical specification documents or bills of quantities, are generated by the engineering activities and often have a contractual importance. Drawings are the straightforward medium to convey most of the information needed by the construction companies. They include a lot of information that can hardly be put into words. Textual documents are complementary to drawings. They traditionally support the engineering aspects of the project description. Thus, the functioning of the LSE industry already corresponds to the notion of the virtual enterprise. Furthermore, the complexity of engineering projects has increased markedly over the past two decades regarding their
scale, strengthening constraints (quality, duration, cost, regulations), number of actors involved, and volume of information required and produced. New work models and advanced technologies allow this organisation to be supported, corresponding to a business entity defined not by real estate but by business opportunities. Because the VE is not limited by the constraints of physical reality (i.e. the location and form of resources, the time required to access or modify them), it will operate more cheaply, respond faster, and change more easily. Virtualising the operations of an enterprise will soon become a requirement for conducting business. The true value of virtualisation, however, will depend on the ability to align and integrate the enterprise's people, processes and technologies with its business strategies. By freeing these elements from the constraints of time, space and form imposed by the physical world, the enterprise of the future will make quantum leaps in performance, realise the full value of its people and its knowledge capital, and discover new opportunities for doing business. Flexibility and adaptability are certainly the main benefits of the Virtual Enterprise. They are associated with the straightforward way such consortia are formed and dismantled: upon (market) requests. The capacity for reactivity (i.e. quickness of creation, adaptation and evolution) of the Virtual Enterprise also represents an advantage in terms of business and adaptation to the market. More explicitly, creating a Virtual Enterprise is becoming the natural way to quickly address business opportunities in a fast-changing environment. Furthermore, the quality of the project will increase with the reliability of information exchange and transfer. Indeed, nowadays, numerous dysfunctions leading to falls in quality, defects and additional costs are mainly due to losses of information during exchanges between partners. 
Summary The concept of the Virtual Enterprise represents the main expectation of the LSE industry. It corresponds to the product point of view. Available software offers a partial response to the virtual enterprise. However, enough technology is available, notably at the communication level, to develop representative experiments (cf. RISESTEP, VEGA). But, as for Data Management, this computer technological level needs LSE knowledge to be implemented: - a Product Data model - an LSE Workflow model Moreover, these results should be standardised to ensure communication between all possible partners. The STEP standard offers a response to the first point; the second must be considered in future actions. In addition, ergonomic software should be developed, giving a certain reality to the virtual enterprise. Virtual reality will provide new ways of representing the complexity of this association. A last question relates to which network should be used: the Internet has the advantage of world-wide coverage, but problems of security and speed have to be solved.
Conclusion A large panel of systems and technologies is available at the present time, and the LSE industry uses a great part of them. Much progress is expected, notably relating to: - Human Interface - Recognition techniques - Virtual reality - Data Management - Data modelling methodologies - Knowledge Management - Acquisition - Sharing and reuse - Virtual reality - Whole software At the present time, the LSE industry has to make a great effort to adapt the existing technologies to its own needs and specificities. The main part of the knowledge of LSE is not identified and formalised. Systems and technologies offer enormous possibilities but are only an empty container. LSE actors are responsible for the material needed to fill these containers. This means they have to develop standards in the domains of workflow models and product data models. They must now begin to collect and organise their knowledge.