Classification: Restricted
Deliverable Reference: WP 3, Task No. 3100, D302
Status: Draft / Issued / Revised
Rev.: working (310003doc3983.doc)
Date of Release to CEC: 03/04/1997
CSTB - G. Sauce, ELSEWISE - 20/05/2010
Document Control Sheet - Amendment Record
ESPRIT 20876 - ELSEWISE
WP3: PDIT Support (draft Report Section) - Systems and technologies
Authors: Gerard Sauce (CSTB)
Contributors: Dee Chaudhari (BICC), Theo van Rijn (TNO), Alastair Watson (Leeds University)
Distribution: revisions _01 (28/1/97), _02 (14/3/97) and _03 (3/4/97), circulated to the project partners (TW, BOU, HBG, SKA, BICC, CAP, CSTB, DSBC, ECI, ULeed, RAM, RAKLI, TNO, TR, CEC, VTT).
Table of contents
Introduction
This work package aims to provide a critical evaluation of the key technology and methodology developments in PDIT which are likely to provide the tools to support and shape the LSE industry in the future. It concentrates on those technologies, methodologies and standards most likely to have an impact on working and thinking in LSE. A layered approach is adopted, considering first the supporting PDIT environment (Task 3000), then the systems and technologies (Task 3100), and finally the application software (Task 3200) (cf. Figure 1).
[Figure 1: Layered approach of the key technologies - three layers: Supporting Environment (Task 3000), Systems and Technologies (Task 3100), and Application Software (Task 3200), ranging from independence of hardware and users to dependence on both.]
Task 3100 deals with the intermediate layer between user applications and the supporting environment. The main characteristic of systems and technologies is their independence from hardware on one side and from users on the other. Software in this category has to be adapted to the technical domain, by specifying the technical knowledge and the particular environment of the user, in order to develop a specific application. Moreover, it is possible to find at least one software product implementing each of these technologies on every kind of platform. At each stage, the objective is to provide an assessment of the existing technologies, the trends, and an evaluation of their relevance to LSE, the aim being to report what can be expected to exist by 2005. The methodology is to use existing sources, to review published material and to consult the key players. It is important to note that our objective of identifying key technologies does not include an analysis of the software market.
In effect, we list and describe the main technologies in order to give the reader the general concepts and the key points needed to go further. The reader will therefore find neither an assessment of specific software products nor a testing protocol for choosing between them. This analysis addresses the LSE industry, but some IT background is needed to understand it. The scope of this task covers a broad range of systems and technologies, which we have grouped into four categories:
- Human Interface: the Human-Computer Interface covers anything which allows a user to interact with a computer. Examples are command line interpreters, windows, icons, menus, pointers and virtual reality. This does not include the various devices and channels of communication between human and computer. We analyse Graphical User Interfaces, virtual reality, recognition technologies and voice synthesis.
- Data Management: Data Management consists of controlling, protecting, and facilitating access to data in order to provide information consumers with timely access to the data they need. Data modelling, object databases, distributed databases and access via the Internet are detailed.
- Knowledge Management: within this chapter, the whole sense of knowledge, covering both the object of knowledge and the process of knowing, is considered. We analyse the three main aspects of knowledge management: representation, acquisition and usage.
- Virtual Enterprise: even though this concept and the relevant technology are recent, several systems are already available that support the creation of parts of a Virtual Enterprise. A Virtual Enterprise is defined as a temporary consortium of companies that come together quickly to explore fast-changing business opportunities. This definition is detailed through the communication level, the organisational level and the virtual enterprise environment.
For each key technology, this report tries to answer the following four questions: What exists today? What are the barriers to LSE uptake? What can we expect by 2005? What is the relevance to LSE? At the end of each chapter, a summary emphasises the main points, particularly the barriers and the relevance to the LSE industry.
Human Interface
Introduction
Two definitions can be associated with the abbreviation "HCI":
1 - Human-Computer Interaction: the study of how humans use computers and how to design computer systems which are easy, quick and productive for humans to use.
2 - Human-Computer Interface: anything which allows a user to interact with a computer. Examples are command line interpreters, windows, icons, menus, pointers and virtual reality.
In this work the focus is on the second definition of HCI, considering mainly the notion of interface. This problem area can be characterised by five key words: communication, language, interaction, process and ergonomics. The following three aspects summarise it.
Human characteristics. Human characteristics govern how people work and how they expect to interact with computers. To design the best interface, it is important to understand how humans process information, how they structure their actions, how they communicate, and what their physical and psychological requirements are. Human information processing has been, and remains, the object of much research, which has led to various models: models of cognitive architecture, symbol-system models, connectionist models, engineering models, etc. Language and communication are so natural to us that we rarely think about them. Human interaction using natural language depends not only on the words spoken but also on tone of voice, which enables the parties to interpret the semantics of the spoken message. In face-to-face interaction this is reinforced by physical body language, shown consciously or unconsciously by people. Developing a language between humans and computers underscores the complexity and difficulty of language as a communication and interface medium.
Ergonomics. Anthropomorphic and physiological characteristics of people can be related and applied to workspace and environmental parameters.
Some frequent problems facing people who use computers today are the operation of the computer, the use of a pointing device such as the mouse, and watching the screen. Labour laws restrict the periods people are allowed to spend at the computer, in an attempt to prevent sight problems and Repetitive Strain Injury (RSI).
Use and Context of Computers
The general social, work and business context may be important and has a profound impact on every part of the interface and its success.
• The social organisation and the nature of work must be considered as a whole. This concerns points of view, models of humans, models of small groups and organisations, models of work, workflow, co-operative activity, offices, socio-technical systems, etc. One important notion is worth noting: human systems and technical systems mutually adapt to each other. Once a system has been written, however, most of the adaptation is done by the human: a user may be able to configure the software, but little software will yet adapt (automatically or otherwise) to its user.
• There are classes of application domains and particular application areas for which characteristic interfaces have been developed, for example: document-oriented interfaces (e.g., text editing, document formatting, structure-oriented editors, illustrators, spreadsheets, hypertext), communications-oriented interfaces (e.g., electronic mail, computer conferencing, telephone and voice messaging), design environments (e.g., programming environments, CAD/CAM), etc. Part of the purpose of design is to arrange a fit between the designed object and its use. Adjustments to fit can be made either at design time or at time of use, by changing the system (or, in the end, the user), and changes can be made either by the users themselves or, sometimes, by the system.
Computer systems and architecture. Computers have specialised components for interacting with humans. Some of these components are basically transducers for moving information physically between human and machine (input and output devices). Others have to do with the control structure and representation of aspects of the interaction (basic software). Devices must be constructed to mediate between humans and computers. They are mainly input devices (keyboard, mouse, devices for the disabled, handwriting and gestures, speech input, eye tracking, exotic devices such as EEG and other biological signals) and output devices (displays, vector and raster devices, frame buffers and image stores, canvases, event handling, performance characteristics, devices for the disabled, sound and speech output, 3D displays, motion such as flight simulators, exotic devices). This technical aspect is outside the scope of this chapter. The basic software architecture and techniques for human-computer interaction cover four aspects:
a - Dialogue genre: the conceptual uses to which the technical means are put. Such concepts arise in any media discipline, such as film and graphic design (workspace model, transition management, design, style and aesthetics).
b - Dialogue inputs: types of input purposes (e.g., selection, discrete parameter specification, continuous control) and input techniques: keyboard-based (e.g., commands, menus), mouse-based (e.g., picking, rubber-banding), pen-based (e.g., character recognition, gesture), voice-based.
c - Dialogue outputs: types of output purposes (e.g., conveying precise information, summary information, illustrating processes, creating visualisations of information) and output techniques (e.g., scrolling displays, windows, animation, sprites, fish-eye displays).
d - Dialogue interaction techniques: dialogue types and techniques (e.g., commands, form filling, menu selection, icons and direct manipulation, generic functions, natural language), navigation and orientation in dialogues, error management, and multimedia and non-graphical dialogues: speech I/O, voice and video mail, active documents, videodisc, CD-ROM.
Dialogue techniques are the subject of this section.
What exists today
We identified three groups within the existing technologies:
- Graphical User Interface,
- Virtual Reality,
- Other Interfaces.
Graphical User Interface (GUI)
The style of graphical user interface invented at Xerox PARC, popularised by the Apple Macintosh and now available in other varieties such as the X Window System, OSF/Motif, NeWS and RISC OS, is also known as WIMP (Windows, Icons, Menus and Pointers, or perhaps Windows, Icons, Mouse, Pull-down menus). A GUI uses pictures rather than just words to represent the input and output of a program. A program with a GUI runs under a windowing system (e.g., the X Window System, Microsoft Windows, Acorn RISC OS). The program displays icons, buttons, dialogue boxes, etc. in its window on the screen; the user controls the program by moving a pointer on the screen (typically controlled by a mouse) and selecting objects by pressing buttons on the mouse while the pointer is pointing at them. Each type of GUI is available on several platforms. The main WIMP systems can be classified into three categories by platform:
- workstations (mainly using UNIX as the operating system),
- PC computers,
- Macintosh.
Furthermore, it is possible to find software packages running on one of the above platforms which emulate the GUI of another platform.
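The WIMP interaction model described above, in which pointer events are routed to the icons, buttons and windows under the pointer, can be sketched as a minimal event-dispatch loop. This is an illustrative sketch only; the class and function names are hypothetical and do not come from any real toolkit.

```python
# Illustrative sketch of WIMP event dispatch (hypothetical names, no real
# toolkit API): pointer events are routed to the screen object under them.

class Widget:
    """A rectangular screen object (icon, button, window...)."""
    def __init__(self, name, x, y, w, h, on_click):
        self.name, self.rect, self.on_click = name, (x, y, w, h), on_click

    def contains(self, px, py):
        x, y, w, h = self.rect
        return x <= px < x + w and y <= py < y + h

def dispatch(widgets, events):
    """Route each pointer event to the topmost widget under the pointer."""
    log = []
    for (px, py) in events:
        for widget in reversed(widgets):        # topmost widget wins
            if widget.contains(px, py):
                log.append(widget.on_click())
                break
    return log

icon = Widget("icon", 0, 0, 32, 32, lambda: "open document")
button = Widget("ok", 100, 100, 60, 20, lambda: "confirm dialogue")
print(dispatch([icon, button], [(10, 10), (120, 110)]))
# -> ['open document', 'confirm dialogue']
```

Real windowing systems add event queues, keyboard focus and window layering, but the hit-test-and-dispatch cycle above is the common core.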
GUI on workstations
X Windows. Sponsoring body: initially developed by MIT's Project Athena, it became a de facto standard supported by the X Consortium. X (the X Window System) is a specification for device-independent windowing operations on bitmap display devices. X uses a client-server protocol, the X protocol. The server is the computer or X terminal with the screen, keyboard, mouse and server program, and the clients are application programs. X Windows is used on many Unix systems. OpenWindows is a server for Sun workstations handling the X Window System protocol. X has also been described as over-sized, over-featured, over-engineered and incredibly over-complicated. Clients may run on the same computer as the server or on a different computer, communicating over the network via TCP/IP protocols. This is confusing, because X clients often run on what people usually think of as their server (e.g. a file server), whereas in X it is the screen and keyboard etc. which are being "served out" to the applications. For instance, Solaris OpenStep for Sun workstations is implemented using the facilities of the Solaris Operating Environment, including the X11 windowing system.
Other GUIs on workstations
OSF/Motif (Open Software Foundation - Motif). Sponsoring body: OSF. Motif is the Open Software Foundation's standard graphical user interface and window manager (responsible for moving and resizing windows and other practical functions). This is one of the choices of look and feel under the X Window System. Motif is based on IBM's Common User Access (CUA) specification, which permits users from the PC world to migrate to UNIX with relative ease. Product implementations: Motif is used on a wide variety of platforms by a large population of Unix users. Machine types range from PCs running Unix to workstations and servers from major manufacturers including Sun, Hewlett-Packard, IBM, Digital Equipment Corporation, and Silicon Graphics.
In order to have one's software certified as Motif compliant, one must pay a fee to OSF. Motif runs under the X Window System. Developers who wish to distribute their OSF/Motif applications may do so as long as the licence agreement is not violated. There are no non-commercial Motif toolkits available, and the Motif source from OSF is fairly expensive.
Open Look is a graphical interface and window manager from Sun and AT&T. This is one of the choices of look and feel under the X Window System. It determines the `look and feel' of a system: the shape of windows, buttons and scroll-bars, how you resize things, how you edit files, etc. It was originally championed by Sun Microsystems before they agreed to support COSE (Common Open Software Environment). OLIT, XView and TNT are toolkits for programmers to use in developing programs that conform to the OPEN LOOK specifications:
- OLIT was AT&T's OPEN LOOK Intrinsics Toolkit for the X Window System;
- XView is Sun's toolkit for X11, written in C. XView is similar in programmer interface to SunView;
- The NeWS Toolkit (TNT) was an object-oriented programming system based on the PostScript language and NeWS. TNT implements many of the OPEN LOOK interface components required to build the user interface of an application. It is included in OpenWindows up to release 3.2, but is not supported (and will not run) under OpenWindows 3.3 (based on X11R5).
PC emulators
WABI (developer/company: IBM). A software package that emulates Microsoft Windows under the X Window System. Wabi 1.1 for AIX is ordered as a feature of AIXwindows(r), IBM's implementation of the advanced X Window System using the OSF/Motif(tm) interface standard. Since Wabi 1.1 for AIX is mapped to the AIXwindows interface on the AIX/6000 system, it provides all the functions of the industry-standard X Window System, including the ability to exploit Xstations at a lower cost per seat than fully configured workstations. What is more, X windowing can
improve application performance by offloading the graphics-intensive work from the server to the Xstations. WABI is also widely available on Sun.
GUI on PC
MS-Windows. Developed by Microsoft, MS-Windows was a windowing system running on a DOS system. It replaces the DOS prompt as a means of entering commands to the computer: instead of typing a command at the "C:" prompt on the screen, which entails typing the command correctly and remembering all the necessary parameters, a pointer on the screen is moved to an icon (a small picture) that represents the command to be run. By pressing a button on a mouse, a window appears on the screen with the activated program running inside. Several windows can be open at once, each with a different program running inside. The most important role of Windows has been to promulgate some interface standards. Although software designers have used the components of the Windows interface in different ways, in order to make their products unique, the same components are used in every Windows interface: windows, menus and commands, dialogue boxes, buttons and check boxes, icons. The window system and user interface software released by Microsoft in 1987 was widely criticised for being too slow on the machines available. The later release of the primary feature (May 1994: Windows 3.1, or rather Windows 3.11) was more efficient and really became a standard GUI for the PC, used by developers. The newer versions of MS-Windows (95 and NT) replace both MS-DOS and Windows 3 on workstations and are multi-tasking. A server version (Windows NT 3.5) is available today. These two versions are superseded by Windows NT 4.
X-Windows emulators: some software packages emulate an X Windows system, in order to turn the PC into an X terminal connected to an X server (X-Win32, Exceed 5, etc.). This allows the user to run UNIX applications from a PC.
GUI on Macintosh
Macintosh user interface. Originally developed at Xerox's Palo Alto Research Centre and commercially introduced on the Xerox Star computer in 1981; Apple later built a very similar version for the Macintosh. The Macintosh user interface has become a very popular method for commanding the computer. This style of user interface uses a graphical metaphor based on familiar office objects positioned on a two-dimensional "desktop" workspace. Programs and data files are represented on screen by small pictures (icons) that look like the actual objects. An object is selected by moving a mouse over the real desktop, which correspondingly moves the pointer on screen. When the pointer is over an icon on screen, the icon is selected by pressing a button on the mouse. A hierarchical file system is provided that lets a user "drag" a document (file) icon into and out of a folder (directory) icon. Folders can also contain other folders, and so on. To delete a document, its icon is dragged into a trash can icon. The Macintosh always displays a row of menu titles at the top of the screen. When a mouse button is pressed over a title, a pull-down menu appears below it. With the mouse button held down, an option within the menu is selected by pointing to it and then releasing the button. For people who were not computer enthusiasts, managing files on the Macintosh was easier than using the MS-DOS or Unix command line interpreter, and this fact explains the great success of the Macintosh. Unlike the IBM PC world, which prior to Microsoft Windows had no standard graphical user interface, Macintosh developers almost always conformed to the Macintosh interface. As a result, users feel comfortable with the interface of a new program from the start, even if it takes a while to learn all the rest of it. They know there will be a row of menu options at the top of the screen, and basic tasks are always performed in the same way. Apple also kept technical jargon down to a minimum.
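The hierarchical desktop metaphor described above (folders containing documents and other folders, with "dragging" as the only file-management operation) can be modelled in a few lines. This is a toy sketch for illustration only; nothing here reflects the actual Macintosh file system.

```python
# Toy model of the desktop metaphor (illustrative only): folders are
# dictionaries that may contain documents or further folders, and "dragging"
# an icon is simply a move between containers.

def move(item, source, target):
    """Drag `item` out of the `source` folder and drop it into `target`."""
    target[item] = source.pop(item)

desktop = {"Reports": {"draft.txt": "...", "final.txt": "..."},
           "Trash": {}}

# Dragging a document onto the trash can icon deletes it:
move("draft.txt", desktop["Reports"], desktop["Trash"])
```

The point of the metaphor is that one physical gesture (drag and drop) maps onto one uniform operation, whatever the source and destination containers are.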
MS-Windows emulator:
The last version of the Macintosh user interface provides some of the MS-Windows functions, and thus makes it possible to run software developed for MS-Windows. For instance, the PowerPC Macintosh has an easy way to switch instantly between the Macintosh operating system and the DOS or Windows 3.1 environment.
X-Windows emulators: some software packages emulate an X Windows system, in order to turn the Macintosh into an X terminal connected to an X server. MacX 1.5 is an enhanced, easy-to-use software package for high-performance X Window System computing on the Macintosh and other MacOS-based systems. MacX helps increase the productivity of Macintosh users in UNIX/VMS environments by enabling them to run both network-based X applications and Macintosh applications seamlessly on one Macintosh computer.
Virtual reality
Virtual Reality is a set of computer technologies whose combination provides an interface to a computer-generated world, and in particular provides such a convincing interface that the user believes he is actually in a three-dimensional computer-generated world. This computer-generated world may be a model of a real-world object, such as a house; it might be an abstract world that does not exist in a real sense but is understood by humans, such as a chemical molecule or a representation of a set of data; or it might be a completely imaginary science-fiction world. A key feature is that the user believes he is actually in this different world. This can only be achieved if the user's senses - sight, touch, smell, sound and taste, but primarily sight, touch and sound - give convincing evidence that he or she is in that other world. There must be no hint of the real world outside. Thus, when the person moves, reaches out, sniffs the air and so on, the effect must be convincing and consistent.
A second key feature of Virtual Reality is that if the human moves his head, arms or legs, the shift of visual cues must be the one he would expect in the real world. In other words, besides immersion, there must be navigation and interaction. Three key words characterise the current common Virtual Reality software packages:
- 3 dimensions,
- interaction,
- real time.
3 dimensions. The first condition for achieving a basis of reality and real immersion is to address sight. This sense takes on particular importance and refers to the nature of the reality being shown. One of the most important cues is the perception of depth. To achieve this illusion, it is necessary to have a 3D model of the world and to send slightly different images to the left and right eyes in order to mimic this sense. In this context, conventional displays are unsuitable tools, and a user who wants a very believable experience must use a specific headset. Recent research has shown that good-quality 'surround sound' can be just as important to the sense of reality as visual cues.
Interaction. VR allows the user to see how pieces inter-relate and how they interact with one another. Interactivity is considered the core element of VR. Rather than just passively observing the environment, users should be able to act on it, to control events which transform the world.
Real time. The speed of response of the system to movement and interaction is an essential aspect of how convincing the virtual world is. In order to obtain a natural speed, the system should perform a perfect synchronisation between the person moving his head, the visual cues following, and eventually the reactions of the other senses. Two fundamental aspects of VR can be identified: technology and software. True interactivity in immersive VR requires adapted feedback, i.e.
controlling the interaction not with a keyboard or mouse, but through physical feedback to the participant using special gloves, hydraulically controlled motion platforms, etc. The technology of VR includes headsets, data gloves, body suits and sound generation.
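The depth cue described above (sending slightly different images to the left and right eyes) amounts to projecting the same 3D scene twice, with the camera shifted by half the inter-ocular distance for each eye. The following is a minimal numerical sketch; the function names and the 64 mm eye separation are illustrative assumptions, not taken from any VR package.

```python
# Minimal sketch of the stereo depth cue: project a 3D point once per eye,
# with the camera shifted by half the inter-ocular distance (assumed ~64 mm).

def project(point, eye_x, focal=1.0):
    """Perspective-project a 3D point onto the image plane of a camera at
    (eye_x, 0, 0) looking down +z. Returns 2D image coordinates."""
    x, y, z = point
    return (focal * (x - eye_x) / z, focal * y / z)

def stereo_pair(point, eye_separation=0.064):   # metres
    half = eye_separation / 2.0
    return project(point, -half), project(point, +half)

left, right = stereo_pair((0.0, 0.0, 2.0))
# The horizontal disparity between the two images encodes depth: nearer
# points produce a larger disparity, which the brain reads as closeness.
disparity = left[0] - right[0]
```

A headset simply presents each of these two renderings to the corresponding eye, which is why the scene must be rendered twice per frame, doubling the computational load mentioned later in this chapter.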
Software packages propose two types of tools:
- Tools for modelling the virtual world through a 3D representation (also sound for the most capable ones), and for defining specific behaviour and interaction between objects. These tools, aimed at developers, are called world builders. Many different world builders are available for every kind of platform.
- Tools for representing the world and allowing user immersion. They control the specific devices. They are named world browsers.
At present each package has its own model representation and no de facto standard has emerged, except the Virtual Reality Modelling Language (VRML), which is a standard language for describing interactive 3-D objects and worlds delivered across the Internet. For now, this language limits VR applications to one sense: sight. In effect, user immersion is only performed in a 3D model of the world, which then opens virtual-world access to users with ordinary computers and simple devices (keyboard and mouse). The next version of VRML will extend its functionality by providing sensors and allowing 3D animations. In practice only a small number of applications are available so far, mainly for testing further developments. The VR industry is immature, but in the medium term it will become a key component of the human interface.
Other Interfaces
Multimedia. Multimedia information is usually defined as a combination of different data types, such as audio, voice, still and moving pictures, and drawings, as well as the traditional text and numerical data. Multimedia is a way of satisfying people's need for communication using the computer. One aspect of multimedia is the amount of control exercised by the user, i.e. how the user interacts and "browses" through the information presented in a multimedia fashion, such as video discs or information columns. This is why we cannot really consider multimedia as a human-computer interface.
It involves several techniques used for HCI, especially graphic representation and audio/video effects, and adapted devices.
Document recognition technology. Close to specific devices, several systems apply recognition technology to data entry and document management. Characters and graphical elements can now be recognised automatically. The main technologies are the following:
- Optical character recognition (OCR) is a process which transforms a page image into ASCII code. Scanning devices can capture a page of text in a few seconds with relatively little manual effort. Afterwards, OCR software can analyse the image and transform it into strings and text which can be edited, indexed, etc. Several methods of scanning text exist: analytical, morphological and statistical methods, and "neural network" technology. An advanced OCR package also uses lexical context to improve accuracy. Success depends in large part on the quality of the original text.
- Optical mark reading: this variant of OCR is suitable when the user can describe specific forms to search for.
- Intelligent Character Recognition (ICR): this technique is widely used for handwriting acquisition. The two main problems are the accuracy and the effective speed of ICR packages.
- Barcoding: a very efficient and suitable technology wherever data can be pre-coded, with a typical error rate of 0.05%.
Voice synthesis and recognition. Several years of research have been conducted with the goal of achieving speech recognition by computers. Whereas today's systems are far from reaching the goal of real-time recognition of unconstrained human language, the technology has evolved to a point where it is useful in a number of applications. There are several classifications of voice recognition technologies, as follows:
- Speaker Independent vs. Speaker Dependent. Speaker-independent systems are designed to work for anyone without having to be trained to a specific person's voice.
Speaker-dependent systems are trained to recognise a single person's voice. Speaker-independent systems work immediately for anyone, but usually have less accuracy and a smaller vocabulary than speaker-dependent systems. These systems are commonly used for commands such as "copy" or "paste", because the vocabulary is generally around 300 words. Speaker-dependent systems
require about one hour of training for a new user. They also require the user to make corrections while dictating, to help the system keep on learning. These systems have vocabularies of 30,000 to 60,000 words. There are also many specialised vocabularies available, mainly in the medical and legal areas. Vocabularies are often specific to a field such as radiology or emergency medicine. The accuracy of voice recognition can be up to 98% (the low 90s is more realistic).
- Continuous Speech vs. Discrete Speech. Continuous speech is basically talking at a normal rate. Most of the voice recognition industry is currently researching continuous-speech systems, but they are years away. There are optimistic predictions of having them by the year 2000, but there are also critics doubting this. Discrete speech means pausing between words. Voice recognition technology currently requires discrete speech. The pauses between words do not have to be long, but it definitely takes practice. Manufacturers claim up to 80 words per minute, but 60 wpm is probably more realistic. Speaking with pauses between words has the potential to be very distracting to the thought process.
- Natural Speech. Natural speech refers to the understanding of language. This is more of an ideal than a current goal. Voice recognition systems would then hypothetically function more like a human transcriptionist. This would allow saying things like "Replace those last two sentences with…" and knowing what to do with "um, ah".
There are several features in voice recognition systems. For example, some systems use the context to help select the correct word (night / knight). Some allow macros or templates that build a standard report, so that only the variances need to be dictated. There are a number of speech synthesis systems on the market today. It is interesting to note that in artificial speech generation there is a trade-off between intelligibility and naturalness.
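Small-vocabulary, speaker-dependent recognisers of the isolated-word kind described above typically match an incoming utterance against stored templates of each vocabulary word; dynamic time warping (DTW) is the classic matching technique, since it tolerates the speaker stretching or compressing parts of a word. The sketch below uses made-up 1-D feature sequences for readability; real systems compare frames of spectral features.

```python
# Sketch of template matching for isolated-word recognition using dynamic
# time warping (DTW). The feature sequences here are invented 1-D values;
# real recognisers use per-frame spectral feature vectors.

def dtw(a, b):
    """Cheapest alignment cost between two feature sequences."""
    INF = float("inf")
    cost = [[INF] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch a
                                 cost[i][j - 1],      # stretch b
                                 cost[i - 1][j - 1])  # step together
    return cost[-1][-1]

def recognise(utterance, templates):
    """Return the vocabulary word whose template aligns most cheaply."""
    return min(templates, key=lambda w: dtw(utterance, templates[w]))

# A two-word command vocabulary, as in the "copy"/"paste" example above:
templates = {"copy": [1, 3, 3, 1], "paste": [2, 2, 5, 5, 2]}
print(recognise([1, 3, 1], templates))   # -> copy
```

Training a speaker-dependent system essentially means recording these templates in the user's own voice, which is why an unconstrained, speaker-independent vocabulary is so much harder.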
Currently the industry has placed the emphasis on naturalness, with the unfortunate consequence that even the high-end systems are sometimes hard to understand. Speech synthesis is generally regarded as a secondary and much less complex issue compared to speech recognition and understanding. The current trends in the industry are towards speaker-independent systems, software-based systems, and extensive use of post-processing. Of great interest is the emergence of spoken-language systems which interpret spontaneous speech. Speech recognition systems are being widely used by telephone companies and banks, and as dictation systems in many offices. These applications are highly constrained and often require the user to pause between each word (isolated speech recognition). The main disadvantages of this kind of technology relate to:
- Accuracy: most speech understanding systems are capable of high accuracy given unlimited processing time. The challenge is to achieve high accuracy in real time.
- Ambient noise: speech recognisers are susceptible to ambient noise. This is especially true for speaker-independent systems, which use speaker models developed in quiet laboratories. The models may not perform well in noisy environments.
- Out-of-phraseology speech: speech recognisers are not yet capable of understanding unconstrained human speech. Accordingly, applications are developed based on constrained vocabularies. The challenge is to detect out-of-phraseology speech and reject it before it is post-processed.
It is clear that speech is the most familiar way for humans to communicate; unfortunately, speech recognition systems are not yet capable of perfect recognition of speech in real time.
Barriers to LSE uptake
GUIs have taken on a primordial importance in software development. Nowadays, old software using a simple alphanumeric interface (on VT100 terminals or the DOS operating system) is progressively being removed from companies.
Apple first with the Macintosh, and more recently Microsoft with MS-Windows, have contributed widely to the new look and feel of software packages. The restriction to a few de facto standards is a really good thing for users: it facilitates changing computers and reduces the adaptation period. However, current GUI need ever more memory and processing power, sometimes to the detriment of the pure engineering application. As for virtual reality, this new technology is confronted with several problems that currently limit its usage:
- Firstly, a technological problem due essentially to poor performance: the objects of the virtual world do not move or react fast enough to provide natural scene movement, or require a supercomputer to obtain a realistic base. The computations needed to render and display a 3D model are enormous, and consequently require a large investment.

- Secondly, the characteristics of the specific devices: the weight of headsets and gloves, communication between computer and devices through direct cable links, and the low resolution of the images are great disadvantages for realism.

- Thirdly, defining a virtual world model of high quality is very long and expensive work. A great part of this work is currently done by hand. Some software packages propose automatic generation from CAD data, but the latter were not designed for this purpose and are consequently not really adapted to it. Furthermore, for a whole virtual world, the model size is usually enormous and leads to problems of memory and computation time.

- Fourthly, the absence of a de facto standard limits investment in such technologies. In effect, only a big firm can invest in a software package which risks disappearing a few years later. If packages were standard compliant, modelled worlds, data and results could be reused within another package.

Regarding other HCI technologies, LSE, like other industries, is waiting for more efficient and suitable tools. This is the case for voice and handwriting recognition.

Trends and expectations

Optimal exploitation of computers can currently be achieved only if staff adapt to machines. Human Computer Interface (HCI) research aims to reduce this adaptation and produce a more 'user natural' environment. Five areas of HCI research can be defined as follows:

- Interactional hardware and software: developing, studying and providing guidelines for various ways of interacting with computers, both at the software and hardware level.
- Matching models: studying how users interact with computers.
- Task level: determining how well systems meet users' needs.
- Design and development: studying designers and the design process.
- Organisational impact: studying the impact of new systems on organisations, groups and individuals.

Human computer interfaces include speech technologies (speech synthesis and speech recognition), gesture recognition, natural language and virtual reality. There is a new emerging generation of intelligent multimedia human-computer interfaces with the ability to interpret some forms of multimedia input and to generate co-ordinated multimedia output: from images to text, from text to images, co-ordinating gestures and language, and integrating multiple media in adaptive presentation systems. Over the past years, researchers have begun to explore how to translate visual information into natural language and, inversely, how to generate images from natural language text. This work has shown how a physically based semantics of motion verbs and locative prepositions can be seen as conveying spatial, kinematic and temporal constraints, thereby enabling a system to create an animated graphical simulation of events described by natural language utterances. There is an expanding range of exciting applications for these methods, such as advanced simulation, entertainment, animation and CAD systems. The use of both gestures and verbal descriptions is of great importance for multimedia interfaces, because it simplifies and speeds up reference to objects in a visual context.
However, natural pointing behaviour can be ambiguous and vague, so that without a careful analysis of the discourse context of a gesture there is a high risk of reference failure. Practically, for the next ten years, realistic expectations exist for the following technologies:

Virtual reality

Two trends characterise the development of virtual reality software packages:

- Virtual reality in the pure sense, which includes total immersion addressing the three main senses (sight, sound and touch) using specific devices.
- Simplified virtual reality, based mainly on sight and sound and using an ordinary display. This trend will probably lead to very efficient tools. In fact it will add two fundamental functionalities to current graphical modellers: interactivity and pseudo-immersion in real time. It introduces the temporal dimension. Although these specifications are restrictive, the first efficient tools now available are really interesting.

At this time, it is very difficult to identify an emerging de facto standard for virtual reality models. If VRML version 2 makes a real and positive contribution to this technology, a big step will have been taken in this direction. Moreover, VRML offers a new dimension to virtual reality: multi-user capability. VRML will evolve into a multi-user shared experience in the near future, allowing complete collaboration and communication in interactive 3D spaces. Any proposal must anticipate the needs of multi-user VRML in its design, considering the possibility that VRML browsers might eventually need to support synchronisation of changes to the world, locking, persistent distributed worlds, event rollback and dead reckoning.

Voice recognition

In the past, we have been required to interact with machines in the language of those machines. With the advent of speech recognition and synthesis technology, humans can communicate with machines using constrained natural language, provided the environment is not noisy. Much research is being conducted in the area of spoken language understanding. Spoken language systems attempt to take the best possible result of a speech recognition system and process it further; a spoken language system is defined as a system understanding spontaneous spoken input. Spontaneous speech is both acoustically and grammatically challenging to understand. Speech recognition applications have a tremendous opportunity for growth.
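The template-matching principle behind isolated-word recognition can be sketched as follows. This is a deliberately naive illustration: the vocabulary and feature vectors are invented, and real recognisers compare sequences of acoustic features using dynamic time warping or hidden Markov models.

```python
# Stored reference templates, one per vocabulary word. The "features"
# here are invented toy vectors standing in for acoustic measurements.
TEMPLATES = {
    "start": [1.0, 4.0, 2.0],
    "stop":  [3.0, 1.0, 5.0],
}

def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def recognise(features):
    """Return the vocabulary word whose template is closest."""
    return min(TEMPLATES, key=lambda w: distance(TEMPLATES[w], features))

print(recognise([1.1, 3.9, 2.2]))  # start
```

The constrained-vocabulary limitation discussed above is visible here: an utterance outside the template set is still forced onto the nearest known word, which is why detecting and rejecting out-of-phraseology speech is a separate challenge.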
Beyond speaker-independent continuous speech, the future development of speech recognition technology will be dedicated to the following applications:

- Client/server speech recognition: client/server software will allow speech applications to work over a wired or wireless network. This means that after dictating at their workstation, users can press a "SEND" button and convey their recorded voice file to the server, where the speech engine will perform the recognition and return the text to them.

- PC-based computer-telephony applications: dictation, call processing and a personal information manager (PIM) will be integrated into one system installed on a computer. For example, users will use voice to update their daily schedule and a telephone voice-menu to order products.

Because the prices of central processing units and digital signal processors are coming down, speech recognition systems are seen as affordable and feasible equipment for computers. The goal of speech recognition is to speak to computers in a speaker-independent, continuous fashion. However, the most widely used speech recognition technology at present is "discrete": users need to pause between words so that the computer can distinguish the beginning and end of each word. In the future, continuous speaker-independent speech recognition will come from better processors and algorithmic breakthroughs.

Handwriting recognition

Optical character recognition (OCR) is already well established in commercial applications. The recognition of handwritten text is an important requirement for pen computing, but far from a trivial one. Many different initiatives have been taken to solve this problem; it is not yet known which is the best, and the results are still not perfect. It is important to stress two points:

- Handwriting recognition is no longer a new technology, but it did not gain public attention until recently.
- The ideal goal of designing a handwriting recognition method with 100% accuracy is illusory, because even human beings are not able to recognise every handwritten text without doubt; for example, most people sometimes cannot even read their own notes. There will always be an obligation for the writer to write clearly. Recognition rates of 97% and higher would be acceptable to most users.

The requirements for user interfaces supporting handwriting recognition can easily be deduced from the pen-and-paper interface metaphor. First, there should be no constraints on what and where the user writes. It should be possible to use any special character commonly used in the various languages; the constraint of using only ASCII characters is awkward, especially with the growing internationalisation that the world is facing these days. The ideal system is one that supports handwritten input of Unicode characters. The second requirement is that text
can be written along with non-textual input, e.g. graphics, gestures, etc.; the recogniser must separate these kinds of input. Off-line recognition is done by a program that analyses a given text once it has been completely entered, so there is no interaction with the user at that point. On-line recognition, on the other hand, takes place while the user is writing. The recogniser works on small bits of information (characters or words) at a time and the results of the recognition are presented immediately. The nature of on-line systems is that they must be able to respond in real time to a user's action, while off-line systems can take their time to evaluate their input: speed is only a measure of performance, not of quality. Until recently, only the recognition of handprinted writing had been studied, but with the growing widespread use of pen-based systems, cursive writing is becoming important. Most systems do not yet support the recognition of cursive (script) writing. Printed characters are much easier to recognise because they are effortlessly separated and the amount of variability is limited. Pen-based systems are already available: Microsoft (Windows for Pen Computing), GO (PenPoint), Apple (Newton), General Magic (Magic Cap, Telescript). In the next decade, new results in handwriting recognition research will make these systems more and more attractive.

Relevance to LSE

Computers are now imperative for realising LSE projects with high quality, for both companies and users. This means improving industrial actors' productivity and reducing time to market through prototyping, digital mock-ups, etc. In this context, everything which facilitates the usage of computers (and consequently increases productivity) is of interest to LSE.
In order to detail the relevance of HCI to LSE, let us focus on two important stages of LSE projects:

- Design stage: during the design phase, the first problem is the perception of the project, which consists of objects that have no real physical existence yet, i.e. virtual objects. We can consider two points of view at this stage: the designers' viewpoint, and that of the others (client, environment, ...). While designers have a relatively good understanding of the project, it is usually very difficult to convey this vision to the others. The assistance of virtual reality will represent a primordial advantage in terms of communication and client perception. There are also expectations for designers: VR will add a new dimension to their vision of the project, which is currently limited by display size. The notion of immersion will allow them better command and control of the project. VR will also enable a better relation to be made between the design stage and the construction stage, by assessing produceability / constructability through visualisation of the product design, i.e. closing in on design for construction. Obviously, designers are also interested in GUI for their daily work, in order to spend more time working on the project than communicating with the computer.

- Construction stage: here, the two main considerations relate to the location of the actors: in the office or on site. In the office, virtual reality is again interesting as a way of presenting a permanently updated view of the site, precisely to bridge the gap with reality due to distance or difficulty of access to the project. On the other side, actors on site, sometimes in difficult situations, need more ergonomic ways to communicate with a computer, and simply to use it to increase productivity and quality. All recognition techniques (voice, handwriting, gesture) will facilitate HCI in this context.

Summary

GUI have taken on a primordial importance in software development.
The restriction to a few de facto standards is a really good thing for users: it facilitates changing computers and reduces the adaptation period. On the other hand, these de facto standards introduce a dependence on specific platforms. The LSE industry, like other industrial domains, is still waiting for more ergonomic and new ways of interacting with computers.
- As regards input, recognition technologies (voice and handwriting) promise efficient applications soon.
- Virtual reality offers new possibilities which will completely change HCI, for input communication as well as for output.

LSE actors have to contribute to the development of virtual reality by experimenting with this technology in two directions:

- For whole-project perception: for project communication but also for engineering design. One of the biggest problems of IT is that the project model stored in the computer is more and more abstract, and more and more difficult to perceive as a whole. Virtual reality appears as a means of extending the limits of current displays, for instance to facilitate the presentation of the project to the client or its integration within the environment. The extension to other senses and the capacity for interaction will be a complementary advantage, notably for engineers, allowing them new possibilities of simulation.

- Virtual reality will probably stir up HCI: for instance, the virtual enterprise needs new kinds of ergonomics, and one can easily imagine software which proposes a virtual environment with virtual actors, meetings, and so on. The LSE industry has to explore this technology to be able to establish a set of specifications in order to influence future software tools.
Data Management

Introduction

First of all, it seems necessary to define the notion of data management. At the most basic level, the term DATA characterises numbers, characters, images or other recordings, in a form which can be accessed by a human or (especially) input into a computer, stored and processed there, or transmitted on some digital channel. Data on its own has no meaning; only when interpreted by some kind of data processing system does it take on meaning and become information. For example, the number 5.5 is data, but if it is the length of a beam, then it is information. People or computers can find patterns in data to perceive information, and information can be used to enhance knowledge. In the past, the data element, which is the most elementary unit of data identified and described in a dictionary or repository and which cannot be subdivided, evolved from a simple alphanumerical item to become more complex. Nowadays, data elements represent facts, text, graphics, bit-mapped images, sound, and analog or digital live-video segments. Data is the raw material of a system, supplied by data producers and used by information consumers to create information.

The term management has evolved too. At the beginning it mainly considered the data point of view: data management consists of controlling, protecting, and facilitating access to data in order to provide information consumers with timely access to the data they need. The notion of a database refers to a collection of related data stored in one or more computerised files in a manner that can be accessed by users or computer programs through a database management system. A database management system (DBMS) is then an integrated set of computer programs that provides the capabilities needed to establish, modify, make available, maintain the integrity of and generate reports from a database.
It has evolved through four generic forms:

1 Hierarchical DBMS (1960s) - records were organised in a pyramid-like structure, with each record linked to a parent.

2 Network DBMS (1970s) - records could have many parents, with embedded pointers indicating the physical location of all related records in a file.

3 Relational DBMS (1980s) - records were conceptually held in tables, similar in concept to a spreadsheet. Relationships between the data entities were kept separate from the data itself. Data manipulation created new tables, called views.

4 Object DBMS (1990s) - object oriented DBMS combine the capabilities of conventional DBMS and object oriented programming languages; data are considered as objects. Other classes of object DBMS associate relational DBMS and object capabilities.

In a parallel direction, the data modelling methods used to analyse data requirements and define the data structures needed to support the business functions and processes of a company have evolved considerably. These data requirements are recorded as a conceptual data model, a logical map that represents the inherent properties of the data independently of software, hardware or machine performance considerations. The model shows data elements grouped into records, as well as the associations around those records. Data modelling defines the relationships between data elements and structures, with associated data definitions. Database schemata have become more logical, meaning that some methodologies allow developers to elaborate schemata more and more independently of any specific DBMS. These methodologies are an essential part of data management technology.

Over the last ten years, the basic problems such as data access, security and sharing, which represent the fundamental functionalities of a DBMS, have been supplemented by an essential aspect: distribution, not in a homogeneous context (hardware and software) but in a heterogeneous environment.
The development of networks (intranet and Internet) has created new kinds of possibilities, users and needs for DBMS. Client/server systems and distributed systems using networks are the current challenge of data management technology.
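The contrast between the relational and object generations described above can be sketched as follows. The beam and material data are invented for illustration, and a real OODBMS would of course make the objects persistent rather than in-memory.

```python
# Relational view (1980s generation): flat tables, with the relationship
# kept separately as a foreign key (invented beam/material data).
beams_table = [
    {"id": 1, "length_m": 5.5, "material_id": 10},
]
materials_table = [
    {"id": 10, "name": "steel"},
]

# Object view (1990s generation): data and relationships are carried by
# the objects themselves, as an OODBMS would store them.
class Material:
    def __init__(self, name):
        self.name = name

class Beam:
    def __init__(self, length_m, material):
        self.length_m = length_m
        self.material = material   # direct reference, no join needed

steel = Material("steel")
beam = Beam(5.5, steel)
print(beam.material.name)  # steel
```

The object form handles the complex, composite data types mentioned earlier more naturally, since a relational query would have to reconstruct the beam-material link through a join on the foreign key.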
Given the increasing complexity of data, new definitions of data management systems have been proposed which are more end-user oriented. The objective is to give more sense to information, in order to take maximum benefit from the stored data. In effect, a company holds a lot of information that is not really useful for any goal other than the one it was designed for. It is then necessary to organise the whole of the information manipulated by a company or a project. Two main classes of such systems are now in rapid development: the first is product oriented, the second business oriented.

- Product Data Management Systems (PDMS) or Engineering Data Management Systems (EDMS): the fundamental role of these systems is to support the storage of technical documentation and the multiple processes related to products' design, manufacturing, assembly, inspection, testing and maintenance (the whole life-cycle). In projects characterised by globally distributed design and complex production processes, the task of managing all product-related information is of the first importance for project success. With the introduction of CAD/CAM/CAE systems, the amount of product information has increased tremendously, resulting in the need for management systems to control the creation and maintenance of technical documentation. This implies the use of well defined configuration management processes and software tools capable of controlling and maintaining all the data and processes needed for the development and implementation of the whole product. The product being a transversal object shared by several companies, P&EDMS involve heterogeneous environments and distributed data.

- Business oriented - Material Requirements Planning systems: the implementation of the material requirements planning process has been ongoing for several years.
MRP systems have evolved from pure material requirements planning into integrated business support systems, which integrate information about the major business processes involved in manufacturing industries. Support exists for the registration of prospects and customers, acquisition, proposals and sales orders, the associated financial procedures, inventory and distribution control, and global production and capacity planning, with a limited capability for planning, analysis and derivation of management information. The majority of these systems are registration systems by nature; advanced planning tools are lacking. Because of the registrative nature of the systems and their lack of flexibility in querying the available information, no management information can easily be produced which might be used to manage and control a business.

- Data warehouse management tools: a data warehouse is an implementation of an informational database that allows users to tap into a company's vast store of operational data to track and respond to business trends and to facilitate forecasting and planning efforts. It corresponds to a manager's viewpoint. From client management to stock management, the whole set of data of a company is distributed over several databases, managed by different DBMS. The objective of data warehousing is to arrange internal and external data so as to create pertinent information, easily accessible by managers to assist them in their decision-making activity. The aim is to extract maximum benefit from the huge mass of information available in the company. The two main characteristics of a data warehouse are, firstly, that it considers the whole company and not only a part of it such as a department, and secondly, that it analyses information through a time scale. Managers want to analyse the evolution of information and the origins of change; this means maintaining a historical record in order to have real decision information and better appraise the trends.
Data mining is a mechanism for retrieving data from the data warehouse. In fact, many companies would claim that they have been using this sort of tool for a number of years. However, when there is a significant amount of data, retrieval and analysis are complex and response can be poor. Data mining derives its name from the similarity between searching for valuable business information in a large database and mining a mountain for a vein of valuable ore: both processes require either sifting through a large amount of material or finding out where the value resides. Data mining tools allow professional analysts to make a structured examination of the data warehouse using proven statistical techniques. They can establish cross-correlations which would be difficult or impossible to establish using normal analysis and query tools. A number of companies claim to offer tools and services to implement data warehousing systems, e.g. IBM, Oracle, Software AG, OLAP databases, Business Objects, Brio Technology, etc.
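The kind of structured examination described above, exploiting the historical record that distinguishes a warehouse from an operational database, can be sketched with a toy table (all names and figures are invented for illustration):

```python
import sqlite3

# A tiny stand-in for a data warehouse: a table holding a historical
# record of sales per region over two years.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, year INT, amount REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("north", 1995, 100.0), ("north", 1996, 130.0),
    ("south", 1995, 80.0),  ("south", 1996, 70.0),
])

# Analysing evolution through the time scale: the change per region
# between the two recorded years, the kind of trend a manager appraises.
QUERY = """SELECT region,
                  SUM(CASE WHEN year = 1996 THEN amount ELSE -amount END)
           FROM sales GROUP BY region ORDER BY region"""
for region, change in con.execute(QUERY):
    print(region, change)
```

A real warehouse holds years of history across the whole company and applies statistical techniques well beyond this single aggregate, but the principle is the same: without the historical record, the trend cannot be computed at all.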
In spite of their different objectives, the problems involved in these two cases are relatively similar and can be summarised by several keywords:

- Meta model: integrating information from the different sources (marketing, production, etc., or design, manufacturing, etc.) at a conceptual level.
- Heterogeneous environment: this second aspect of integration takes place at the hardware level.
- Distributed databases: sharing of and access to information across multiple platforms.
- Historical records of data, in order to manage the evolution of the product or of company activities.
- Navigation, information filtering and analysis tools to exploit the mass of information.

System objectives induce specific tools: for instance, graphical capabilities are indispensable for EDMS, whereas financial analysis tools are specific to the data warehouse.

What exists today

In terms of technology, four points seem very interesting: data modelling, OODBMS, distributed database management and access to database applications running over the Internet.

Data Modelling

Data modelling aims at analysing and expressing in a formal way the data requirements needed to support the functions of an activity. These requirements are recorded as a conceptual data model with associated data definitions, including relationships. The following three points summarise the objectives of such a method:

1 To build data models: the first goal consists of defining a model which represents as faithfully as possible the information needed to perform an activity. It is important to place this work at a conceptual level, to be independent of any specific DBMS.

2 To communicate models: in order to validate the data organisation, as many readers as possible should analyse the proposed data model. A fundamental function of a method is thus to allow uninitiated people to get a global view of a data model, to perceive the main hypotheses and to analyse details very quickly.
3 To capitalise knowledge: the data model and its context (i.e. activities) represent an important part of the specific knowledge of an application domain. Taking place at a conceptual level, this analysis should increase the global knowledge and contribute to a better understanding of this domain, on condition that the modelling method ensures the durability of the knowledge.

These methods represent the keystone of data management: the efficiency of a system depends directly on the quality of its data models. Three classes of methods can be listed:

1 Methods for relational DBMS: these methods clearly consider two aspects, functions and data, and for the latter, entities and relationships.

2 Object oriented methodologies: these methods are characterised by an incremental and iterative approach. Certain methods are purely object oriented, whereas others use a functional approach associated with objects.

3 STEP methodologies: the international standard ISO 10303 STEP prescribes a specific methodology for analysing the activity of an application domain (IDEF0) and for the Application Reference Model (EXPRESS). This approach is completed by implementation specifications (SPF, SDAI).

All modelling methods have the same drawback: in fact they are not really independent of the software context. This is mainly a consequence of developers' interests: they are currently the actors most interested in the result of modelling, and the aspect of capitalising and increasing the application domain knowledge is not yet a primary objective of such an action.
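The idea of recording a conceptual model independently of any DBMS, in the spirit of the entity-relationship and EXPRESS-based methods above, can be sketched as follows. The entities, attributes and relationship are invented for illustration.

```python
# Sketch of a conceptual data model: entities, attributes and
# relationships declared as data, with no commitment to any DBMS.
MODEL = {
    "entities": {
        "Beam":     ["length", "section"],
        "Material": ["name", "grade"],
    },
    "relationships": [
        # (entity, role, target entity, cardinality)
        ("Beam", "made_of", "Material", "exactly one"),
    ],
}

def describe(model):
    """Give an uninitiated reader a global view of the model."""
    lines = [f"{name}: {', '.join(attrs)}"
             for name, attrs in model["entities"].items()]
    lines += [f"{src} {role} {dst} ({card})"
              for src, role, dst, card in model["relationships"]]
    return lines

for line in describe(MODEL):
    print(line)
```

Because the model is pure data, the same description could later be mapped to relational tables, objects or a STEP physical file, which is precisely the independence the methodologies aim at.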
OODBMS

Relational representation remains an interesting current technology and proposes efficient solutions for numerous data management problems. Object oriented DBMS is a more recent technology, developed to take maximum advantage of object oriented languages. OODBMS are attractive because of their ability to support applications involving complex data types, such as graphics, voice and text, which are not correctly handled by relational DBMS. A number of reasonably mature OODBMS products exist, and are widely used in certain specialised applications. Increasingly, these products are being enhanced with query and other facilities similar to those provided by relational DBMS, and are being applied in more general purpose applications. Another class of DBMS, the object-relational DBMS (ORDBMS), constitutes the most recent development. ORDBMS are based both on the results of research on extending relational DBMS and on the emergence of viable OODBMS. Current individual products represent different mixtures of these capabilities. The appearance of these two classes of product indicates a general convergence on the object concept as a key mechanism for extending DBMS capabilities. Without going further into the detail of OODBMS technology, it is important to note that considerable standardisation activity affects the development of this technology, including object extensions to SQL (the relational standard query language) being developed by the ANSI and ISO standards committees, and proposed standards for OODBMS defined by the Object Database Management Group.

Distributed Database Management (DDBM)

A distributed database manager is responsible for providing transparent and simultaneous access to several databases that are located on possibly dissimilar and remote computer systems. In a distributed environment, multiple users physically dispersed in a network of autonomous computers share information.
Given this objective, the challenge is to provide this functionality (the sharing of information) in a secure, reliable, efficient and usable manner that is independent of the size and complexity of the distributed system. An abundant literature is dedicated to DDBM, which can be summarised by the following functions:

- Schema integration. Different levels of schema can be identified:
  - a global schema, which represents an enterprise-wide view of data and is the basis for providing transparent access to data located at different sites, perhaps in different formats;
  - external schemata, which show the user view of the data;
  - local conceptual schemata, which show the data model at each site;
  - local internal schemata, which represent the physical data organisation at each computer.

- Location transparency and distributed query processing: in order to provide transparency of data access and manipulation, database queries do not have to indicate where the databases are located; they are decomposed and routed automatically to the appropriate locations, and access the appropriate copy if several copies of the data exist. This allows the databases to be located at different sites and moved around if needed.

- Concurrency control and failure handling, in order to synchronise updates of data for concurrent users and to maintain data consistency under failure.

- Administration facilities, which enforce global security and provide audit trails.

A DDBM will allow an end user to:

- create and store a new entity anywhere in the network;
- access an entity without knowledge of its physical location;
- delete an entity without having to worry about possible duplication in other databases;
- update an entity without having to worry about updating other databases;
- access an entity from an alternate computer.

The sharing of data using DDBM is already common and will become pervasive as these systems grow in scale and importance.
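The location-transparency function listed above can be sketched as follows. The catalogue, site and entity names are invented, and a real DDBM would route queries over a network to remote databases rather than through in-memory dictionaries.

```python
# Catalogue mapping each entity to the site holding it (invented names).
CATALOGUE = {"beam_42": "site_paris", "slab_7": "site_london"}

# The data actually held at each site, standing in for remote databases.
SITES = {
    "site_paris":  {"beam_42": {"length_m": 5.5}},
    "site_london": {"slab_7": {"thickness_m": 0.2}},
}

def fetch(entity_id):
    """Access an entity without knowledge of its physical location."""
    site = CATALOGUE[entity_id]      # the query is routed automatically
    return SITES[site][entity_id]

print(fetch("beam_42"))  # {'length_m': 5.5}
```

The caller names the entity, never the site, so the data can be moved between sites by updating the catalogue alone, which is exactly what allows the databases "to be located at different sites and moved around if needed".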
But the problem mainly consists in combining the object concept, distributed technology and standards. The generalisation of this kind of system depends directly on the development of standards. Three main candidate technologies address distributed object oriented techniques:
- Remote procedure call technologies such as DCE (Distributed Computing Environment): this technique provides low-level distribution semantics and requires a high programming effort.

- Microsoft's distribution technology (also called Cairo or Distributed COM): this will only be available on Microsoft systems, and is not yet really marketed today.

- OMG distributed objects: the Object Management Group aims at promoting the theory and practice of object technology for the development of distributed computing systems. OMG provides a forum and framework for the definition of a general architecture for distributed systems called the Object Management Architecture, which is divided into five components. The Object Request Broker (ORB), known through the CORBA specification, is the communication heart of the standard. It provides an infrastructure for object communication independent of any specific platform, and a technical foundation for distribution: it specifies a generic distributed programming interface defined in the Interface Definition Language (IDL), as well as the architecture and interface of the Object Request Broker software that provides the software backplane for object distribution. The other parts of the specifications concern Object Services (life-cycle, persistence, event notification, naming), Common Facilities (printing, document management, database, electronic mail), Domain Interfaces designed to perform particular tasks for users, and Application Objects. The market for CORBA products is evolving very fast, and several software products claim to be CORBA compliant, but they usually do not implement all CORBA functionalities. The technology now allows true distributed data management, and efficient DDBM will soon be marketed.

Database applications running over the Internet

The development of the Internet has led to new possibilities in terms of communication with databases: dynamic database interaction with Web pages.
These data-driven pages are created using technologies such as CGI and Java to link a server database with a client browser. CGI (Common Gateway Interface) provides a simple mechanism for web servers to run server-side programs or scripts. JavaScript is a compact, object-based scripting language for developing client and server Internet applications. The JavaScript language resembles Java, but without Java's static typing and strong type checking. JavaScript supports most of Java's expression syntax and basic control flow constructs. One of the main differences is that JavaScript is interpreted (not compiled) by the client, whereas Java is compiled on the server before execution on the client. Security is of the first importance over the Internet. The communication link can be made secure through the Secure Sockets Layer (SSL). SSL is an industry-standard protocol that makes substantial use of public-key technology. SSL provides three fundamental security services, all of which use public-key techniques:
- Message privacy. Message privacy is achieved through a combination of public-key and symmetric-key encryption. All traffic between an SSL server and an SSL client is encrypted using a key and an encryption algorithm negotiated during the SSL handshake.
- Message integrity. The message integrity service ensures that SSL session traffic is not altered on the way to its final destination. If the Internet is to be a viable platform for electronic commerce, we must ensure that vandals cannot tamper with message contents as they travel between clients and servers. SSL uses a combination of a shared secret and special mathematical functions called hash functions to provide the message integrity service.
- Mutual authentication. Mutual authentication is the process whereby the server convinces the client of its identity and (optionally) the client convinces the server of its identity.
These identities are coded in the form of public-key certificates, which are exchanged during the SSL handshake. SSL is designed to make its security services as transparent as possible to the end user. This technology is already available, and several products are marketed, based on different Web browsers, specific DBMSs and operating systems.
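The message-integrity service described above can be illustrated in miniature. The sketch below is not SSL itself (in SSL the key is negotiated during the handshake, and the record protocol is far more involved); it only shows the underlying idea of a keyed hash function detecting tampering, using Python's standard library. The secret value and messages are made up for the illustration.

```python
import hmac
import hashlib

# Secret shared by client and server. In SSL this would be
# negotiated during the handshake, never hard-coded.
SHARED_SECRET = b"negotiated-session-key"

def sign(message: bytes) -> bytes:
    """Compute a keyed hash (MAC) over the message."""
    return hmac.new(SHARED_SECRET, message, hashlib.sha256).digest()

def verify(message: bytes, mac: bytes) -> bool:
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(sign(message), mac)

message = b"transfer 100 ECU to account 42"
mac = sign(message)

assert verify(message, mac)                # intact message accepted
assert not verify(b"transfer 900 ECU to account 7", mac)  # tampering detected
```

A vandal who alters the message in transit cannot produce a matching MAC without the shared secret, which is the guarantee the SSL integrity service builds on.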
Barriers to LSE uptake

LSE already uses database management technology, but certainly not as much as it could. Although the development of certain techniques such as distributed databases is quite recent, the main barrier is probably a knowledge problem. Indeed, the keystone of data management is the data model. The latter is not the responsibility of the computer science side but the concern of the LSE industry. Consequently an important effort has to be made to formalise LSE activity and the LSE information flow. However, data modelling methodologies do not seem really adapted to end users. Three aspects of these methods have to be improved:
- Validation: no current methodology or tool allows end users to appraise the quality of a data model and to validate it. The model usually appears to end users as a very strange presentation of their own knowledge, and they are totally dependent on developers.
- Dynamic aspects: data evolution is a fundamental aspect of an LSE project, but data modelling currently has difficulty addressing it.
- Capitalisation: if the LSE industry wants to go forward, it is important to accumulate this kind of knowledge, which is not theoretical but representative of know-how, practical processes and company knowledge.
Another aspect of data modelling is the standardisation of data models, in order to facilitate communication between firms and even within a company. The international standard ISO 10303 (STEP) provides the international context in which such actions can be developed. Finally, the main problem of data management is the preservation of data during the life-cycle of a product. In many cases the life of a product, especially in construction and civil engineering, easily spans more than thirty or forty years, during which the data associated with the product must remain available and be kept up to date.
Not only must the data be available, it should also be possible to interpret it correctly. Until recently, drawings represented the design specification of a product. Although electronic CAD systems are used to produce drawings, storage and archiving are still done on paper drawings. The paper drawing is still the official version of a product design specification. It is the paper version of the drawing which should be available for the product's lifetime, including the results of design changes, maintenance adaptations and reworks. Today we are still able to read and interpret paper drawings made thirty years ago. We are even able to read and interpret the design drawings the Romans made for their public baths. Archiving paper drawings relies on preserving the medium bearing the drawing, i.e. the paper. Some decades ago, large paper archives were compacted using microfiche. The retrieval and viewing of microfiche archives now depends on some form of translation from a storage medium to a means of representation. Archiving is still done on the original data, i.e. the printed paper, reduced in size; no interpretation step is involved, and the compaction/de-compaction process is a purely physical one. Only quite recently did the notion of product models arise. A product model is a representation of a product in electronic, computer-readable form, used by various computer applications, each interpreting the model according to its own view and needs. To enable different programs to interpret the same product model, a standard and interchange mechanisms are needed. Storing a product model in electronic form over long periods also needs standards, to ensure the data will mean the same in, say, twenty years. Another concern then arises: the medium on which to save the data for archiving purposes. Today we are unable to read data produced by an office application of eight years ago.
It is also not guaranteed that we can still read 5¼" computer disks made four years ago, because of wear of the disk. This problem has been identified in several areas of industry with a high data density. One of the trends in this area is the foundation of small to medium companies who make a living from archiving and retrieving data for a purpose: data services companies. The idea is to enable the user to get hold of the correct product data throughout the lifetime of a product or service. The data services company, for a fee, takes care that upgrades to new versions of standards, tools, storage applications (e.g. databases) and new media are carried out. Such companies can only thrive on standards that are used and maintained.
Trends and expectations

Current trends aim at adding value to database management systems: on the one hand, by specifying more end-user-oriented DBMSs; on the other hand, by developing intelligent database environments. The goal is to manage higher-level information. The concepts of Data Warehouse and Engineering Data Management are at the beginning of their development, and will offer more powerful possibilities in the next years. The semantic aspect induced by a higher level of data makes it possible to develop reasoning capabilities in order to propose:
- declarative query languages, to ease interrogation,
- deductive systems; for instance, query association returns additional relevant information not explicitly asked for by users,
- query optimisation, to reduce transaction times,
- integration of heterogeneous information sources that may include both structured and unstructured data.
From a technical point of view, ODBMSs and distributed data management will evolve in performance and in security. The Object Database Management Group continues its standardisation activity on object definition and query languages. Numerous research projects deal with distribution. They aim at improving transaction times and security, and at establishing better connections using Web browsers. New technologies are in progress, especially persistent distributed stores, which provide a shared-memory abstraction.

Relevance to LSE

As for every kind of activity, data management represents a foundation of the LSE industry. Whether considered from a product viewpoint or a firm viewpoint, LSE activity constitutes a heterogeneous environment and manipulates information of a high semantic level, recorded on different supports. The association of data management, distribution and new communication technologies is therefore of the first importance. LSE is interested in DBMSs with added value, such as EDMS and Data Warehouses.
Engineering Data Management systems constitute an interesting response in terms of project information management, whereas the Data Warehouse will interest the firm as such. The LSE industry is also heavily involved in standards activities: not directly in computer science standards, but in establishing product data models. We have shown that data models are the keystone of data management. They contain an important part of LSE knowledge, still not yet formalised.

Summary

Data management is now based on sound technology, and many efficient software tools are available. The LSE industry already uses this technology in the different parts of its companies. Much progress is expected from object representation, probably better adapted to project design; consequently, great interest awaits the next evolution of OODBMSs. At present, exchange and distribution of information are too limited (mainly exchange of graphical documents). A higher semantic level within exchanges would allow the different actors of a project to draw more benefit from the new information technologies. Developing data models, the keystone of information exchange, needs sound standards at two levels:
- Data modelling methodologies, especially to facilitate the understanding of models and their validation. Communication is important for LSE actors, allowing them control of these models, but also guaranteeing the continuity of the knowledge contained within models. No current modelling methodology offers an efficient way of validating models: this validation phase rests only on the competence of the model designer, and no technical, analytical or mathematical validation is performed.
- Product data models: the LSE industry has to take responsibility for the development of its models. Actions have to be intensified to elaborate suitable models, notably in the context of the STEP standard.
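As a toy illustration of the product-model idea running through this chapter, the sketch below shows a neutral, schema-checked representation that two independent applications could both interpret. The entity and attribute names are hypothetical, and this is a few lines of Python, not STEP/EXPRESS; it only illustrates why a shared schema is what lets data keep its meaning across applications and over time.

```python
# Hypothetical mini "product model" schema shared by all applications.
# STEP plays this role for real product models, with far richer typing.
SCHEMA = {
    "Beam": {"length_mm": float, "material": str},
    "Column": {"height_mm": float, "material": str},
}

def validate(entity_type, attributes):
    """Check an instance against the shared schema before exchange."""
    expected = SCHEMA[entity_type]
    if set(attributes) != set(expected):
        raise ValueError("attributes do not match schema")
    for name, value in attributes.items():
        if not isinstance(value, expected[name]):
            raise TypeError(f"{name} must be {expected[name].__name__}")
    return {"type": entity_type, "attributes": attributes}

# Application A produces an instance; application B can interpret it
# because both applications agree on the same schema.
beam = validate("Beam", {"length_mm": 4500.0, "material": "steel"})
```

An instance that violates the schema is rejected at the boundary, which is exactly the kind of mechanical validation the chapter notes is missing from current data modelling practice.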
Knowledge management

Introduction

The previous chapter analysed data management; the subject here is knowledge management. But what is the difference between data and knowledge? Whereas data represent the basic pieces of information, the term knowledge covers several definitions, depending on the user's viewpoint. It can be assimilated to the process of knowing, or to a superior level of information. In this work we will not debate this definition, and will consider the whole sense of knowledge, including both the objects of knowledge and the process of knowing. Two terms can be associated with knowledge: management and engineering. The main difference is that the manager establishes the direction (administrative, executive) the process should take, whereas the engineer develops the means to accomplish this direction. Knowledge management thus relates to the knowledge needs of the firm: what decisions to make and what actions to enable. Knowledge engineering develops technologies satisfying knowledge management needs. Here we will discuss knowledge technologies, dependent on the knowledge engineer (usually a computer scientist), which allow the manager to develop policies for enterprise-level knowledge ownership. Understanding knowledge is key to the knowledge management process: how we acquire knowledge, how we represent it, how we store it, how we use it, and what the applications are. We therefore identify three main parts in knowledge management:
- Knowledge representation. The representation or modelling process consists of defining one or several models able to represent knowledge as faithfully as possible. The first result of the modelling process is to formalise the knowledge for accumulation, communication and sharing. The second aim is to establish a model which can be managed and used by computer programs.
- Knowledge acquisition. Even the smallest knowledge-based system needs a huge quantity of knowledge in order to be realistic. The knowledge acquisition process is critical to the development of any knowledge management programme. It requires a complete understanding of the nature and behaviour of knowledge, and then the development of specific procedures to acquire it. The objective of identifying and creating dynamic knowledge bases is itself a knowledge process: it involves knowledge learning and communication techniques, modelled on human (or child) behaviour.
- Knowledge usage. Accumulating knowledge is in itself essential for an enterprise: it represents the memory of the firm. But this interest alone is minor; the challenge is to develop a daily exploitation of the knowledge base in every kind of action: technical, management, financial, etc. The possibilities include decision making, diagnosis, control, training, command and so on. The number of potential applications is endless.
The next paragraphs present the existing technologies according to these three aspects.

What exists today

Knowledge representation

The first step in the knowledge management process consists of choosing a knowledge representation. Indeed, the natural language used by everybody is not adapted to developing a knowledge base. The problem is that today no universal model of knowledge representation exists. The existing formalisms depend on the application (diagnosis, decision making, ...) or on the knowledge domain (natural language, technical applications, medical science, etc.). In the past, two trends developed: the first oriented towards the knowing process (procedural knowledge representation), the second towards knowledge description (declarative knowledge). Nowadays the trend is to develop more universal knowledge representations.
Without going into details or purely computer-science considerations, it is possible to identify four main currents in knowledge representation.
- Production rules
- Logical form
- Conceptual graphs
- Object-oriented representation

Production rules

A rule consists of two parts:
- An antecedent or situation part, usually represented by IF <condition>. This condition is expressed by means of atomic formulae and logical statements formed using logical connectives such as AND, OR, NOT.
- A consequent or action part, represented by THEN <conclusion>. The consequent may include several actions.
Antecedents can be considered as patterns, and consequents as conclusions or actions to be taken. Each rule represents an independent grain of knowledge: a rule cannot refer to another one, and contains on its own all the conditions of its application. In addition, the rule base of a production system does not specify an order of application of the rules. Various methods have been used to deal with uncertain or incomplete information in such a knowledge base:
- Certainty factors consist of assigning a number to each rule, indicating the level of confidence in this piece of knowledge. This number, usually in the range -1 to 1, 0 to 1 or 0% to 100%, is combined with the factors of evidence associated with the facts to deduce a factor of evidence for the conclusion.
- Fuzzy logic: the theory of fuzzy sets can be used to represent information with blurry or grey boundaries. In this theory, an element may belong to a set with a degree of membership between 0 and 1. A fuzzy set contains elements with assigned degrees of membership. The main problem is constructing the membership function.
- Probabilities: Bayes' theorem is well known in the statistics and probability literature. It provides a method for calculating the probability of an event or fact based on knowledge of prior probabilities. The difficulty of this method is determining the probabilities of the observed facts independently of one another.

Logical form

This representation is based on mathematical logic.
The most basic logical system is propositional logic. Each basic element of the system, or proposition, can be either true or false. Propositions can be connected to each other using the connectives AND, OR, NOT, EQUIVALENT, IMPLIES, etc. First-order predicate logic is composed of:
- atoms (symbols),
- functions and predicates (a predicate is a function with one or more atomic arguments which produces a boolean value),
- two substatements joined by a conjunction, disjunction or implication,
- negated substatements,
- statements with an existential or universal quantifier (in this case, atoms in the statements can be replaced by variables bound by the quantifier).
For instance, "Bob has a bicycle" might be given the following logical form: ∃ B, Bicycle(B) ∧ Has(Bob, B). This is a very descriptive declarative representation with a well-founded method of deriving new knowledge from a database. Its flexibility makes it a good choice when more than one module may add to or utilise a common database. Unfortunately, this flexibility has limitations: to maintain consistency, learning must be monotonic, which limits its effectiveness when domain theories are incomplete. This representation has been extended in several directions to correct these weaknesses and to support new types of application. These extensions of logic include, but are not limited to, closed-world reasoning, defaults, epistemic reasoning, queries, constraints, temporal and spatial reasoning, and procedural knowledge.
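The "Bob has a bicycle" example above can be checked mechanically. The sketch below is a deliberately tiny evaluator over a finite set of ground facts, not a theorem prover: it evaluates the existential statement ∃ B, Bicycle(B) ∧ Has(Bob, B), and then derives a new fact with one production-style rule of the IF/THEN kind described earlier. The predicate and individual names beyond the text's own example are made up.

```python
# Ground facts over a finite universe, as in the text's example.
facts = {
    ("Bicycle", "b1"),
    ("Has", "Bob", "b1"),
}

def owns_a_bicycle(owner):
    """Evaluate  EXISTS B . Bicycle(B) AND Has(owner, B)  over the facts."""
    bicycles = {f[1] for f in facts if f[0] == "Bicycle"}
    return any(("Has", owner, b) in facts for b in bicycles)

assert owns_a_bicycle("Bob")        # the existential statement holds
assert not owns_a_bicycle("Alice")  # no supporting facts for Alice

# One production rule:  IF Bicycle(B) AND Has(X, B) THEN Cyclist(X)
def forward_chain():
    bicycles = {f[1] for f in facts if f[0] == "Bicycle"}
    for pred, *args in list(facts):
        if pred == "Has" and args[1] in bicycles:
            facts.add(("Cyclist", args[0]))  # derived knowledge

forward_chain()
assert ("Cyclist", "Bob") in facts
```

Note that the rule only ever adds facts, never retracts them: this is the monotonic learning the text mentions as a limitation when domain theories are incomplete.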
The integration of logical form with other formalisms, such as object-oriented languages and systems, type systems, constraint-based programming and rule-based systems, represents most of the current development of these concepts. Applications concern numerous areas such as natural language, planning, learning, databases, software engineering and information management systems.

Conceptual graphs

Conceptual graphs (CG) are a graph knowledge representation system invented by John Sowa which integrates a concept hierarchy with the logic system of Peirce's existential graphs. Conceptual graphs are as general as predicate logic and have a standard mapping to natural language. They are represented as finite, bipartite, connected graphs. The two kinds of nodes, concepts and conceptual relations, are connected by untyped arcs. Each conceptual graph forms a proposition. The main characteristics of conceptual graphs are the following:
- A concept is an instantiation of a concept type. A concept by itself is a conceptual graph, and conceptual graphs may be nested within concepts. A concept can be written in graphical notation or in linear notation.
- A pre-existing hierarchy of concept types is assumed to exist for each conceptual graph system. A relation < is defined over the set of concepts to show which concept types are subsumed by others.
- Two or more concepts in the same conceptual graph must be connected by links to a relation.
- For any given conceptual graph system there is a predefined catalogue of conceptual relations which defines how concepts may be linked with relations.
- Each concept has a referent field used to identify the concept specifically or to generalise it, to perform quantification, to provide a query mechanism, etc.
- Canonical graphs represent real or possible situations in the world.
Canonical graphs are formed through perception or insight, or are derived from other canonical graphs using formation rules. Canonical graphs are a restriction of what can be modelled using conceptual graphs, since their intent is to weed out absurdities or nonsense formulations.
- A canonical graph may be derived from other canonical graphs by the formation rules copy, restrict, join and simplify. The formation rules form one of two independent proof mechanisms.
- A maximal join is the join of two graphs followed by a sequence of restrictions, internal joins (joins on other concepts) and simplifications, until no further operation is possible. A maximal join acts as unification for conceptual graphs.
- New concept types and relations may be defined using abstractions. An abstraction is a conceptual graph with generic concepts and a formal parameter list.
- Schemata incorporate domain-specific knowledge about typical objects found in the real world. Schemata define what is plausible about the world. They are similar to type definitions, except that a type may have only one definition but many schemata.
- A prototype is a typical instance of an object. A prototype specialises concepts in one or more schemata. Its defaults are true of a typical case but may not be true of a specific instance. A prototype can be derived from schemata by maximal joins, then assigning referents to some of the generic concepts in the result.
Conceptual graphs are considered a good choice of knowledge representation for systems where knowledge sharing is a strong criterion. They have a formal, theoretical basis, support reasoning as well as model-based queries, and have a universe of discourse described by a conceptual catalogue.
They have no inherent mechanism for knowledge sharing or translation, but their flexible representation, their mapping to natural language and, especially, the work on their database semantics encourage considering CGs as a candidate representation. Conceptual graphs more or less subsume semantic networks, which consist of a collection of nodes representing concepts, objects, events, etc., and links connecting the nodes and characterising their interrelationships. Semantic networks also offer inheritance.
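The bipartite structure and type hierarchy described above can be sketched in a few lines. Below is a toy encoding (the helper names and the "Cat"/"Mat" example types are illustrative, and this is far from a full CG system) of the classic graph [Cat] -> (On) -> [Mat]: two concept nodes linked through one relation node, plus the < subsumption relation over concept types.

```python
# A conceptual graph as a bipartite graph: concept nodes and relation
# nodes, joined by untyped arcs. Each relation lists, in order, the
# concepts it links.
concepts = {"c1": "Cat", "c2": "Mat"}
relations = {"r1": ("On", ["c1", "c2"])}

def linear_notation(rel_id):
    """Render one relation and its two concepts in linear notation."""
    name, arcs = relations[rel_id]
    first, second = (f"[{concepts[c]}]" for c in arcs)
    return f"{first} -> ({name}) -> {second}"

assert linear_notation("r1") == "[Cat] -> (On) -> [Mat]"

# The pre-existing concept type hierarchy: subtype -> supertype.
supertype = {"Cat": "Animal", "Animal": "Entity", "Mat": "Entity"}

def is_subsumed(t, ancestor):
    """The < relation: walk up the hierarchy to test subsumption."""
    while t is not None:
        if t == ancestor:
            return True
        t = supertype.get(t)
    return False

assert is_subsumed("Cat", "Entity")       # Cat < Animal < Entity
assert not is_subsumed("Mat", "Animal")   # Mat is not an Animal
```

The hierarchy is what operations like restrict rely on: a concept may be restricted from a type to one of its subtypes, which this subsumption test would have to sanction.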