
FOCUS K3D Research Road Map




FOCUS K3D D1.4.1

Deliverable D1.4.1 of Task T1.4: Road map for Future Research

Circulation: PU (1)
Partners: ALL
Authors: Chiara Catalano (IMATI), Michela Spagnuolo (IMATI), Michela Mortara (IMATI), Bianca Falcidieno (IMATI), Andre Stork (FRAUNHOFER), Marianne Koch (FRAUNHOFER), Pierre Alliez (INRIA), Frederic Cazals (INRIA), Mariette Yvinec (INRIA), Wolfgang Huerst (UU), Remco Veltkamp (UU), Marios Pitikakis (CERETETH), Caecilia Charbonnier (MIRALab), Lazhari Assassi (MIRALab), Jinman Kim (MIRALab), Nadia Magnenat-Thalmann (MIRALab), Patrick Salamin (EPFL), Daniel Thalmann (EPFL), Tor Dokken (SINTEF), Ewald Quak (SINTEF)
Version: 02
Date: Thursday, 01 April 2010

(1) Please indicate the dissemination level using one of the following codes: PU = Public; PP = Restricted to other programme participants (including the Commission Services); RE = Restricted to a group specified by the Consortium (including the Commission Services); CO = Confidential, only for members of the Consortium (including the Commission Services).

01/04/2010
Copyright © 2010 The FOCUS K3D Consortium, consisting of:

CNR-IMATI-GE: C.N.R. – Istituto di Matematica Applicata e Tecnologie Informatiche, Dept. of Genova, Italy
CERETETH: Informatics and Telematics Institute – Center for Research and Technology Hellas, Greece
EPFL: Ecole Polytechnique Fédérale de Lausanne, Switzerland
FRAUNHOFER: Fraunhofer Institut für Graphische Datenverarbeitung, Germany
INRIA: Institut National de Recherche en Informatique et Automatique, France
MIRALab: Université de Genève, Switzerland
SINTEF: Stiftelsen SINTEF, Norway
UU: Utrecht University, the Netherlands

This document may not be copied, reproduced, or modified in whole or in part for any purpose without permission from the FOCUS K3D Consortium. An acknowledgement of the authors of the document should be clearly referenced. All rights reserved. This document may change without notice.

Project Coordinator: Dr. Bianca Falcidieno
Organisation: IMATI
Phone: +39 010 6475667
Fax: +39 010 6475660
E-mail:
Project web site:
Document History

Vers.  Issue Date          Stage  Content and changes
01     2nd February 2010   80%    First version submitted
02     31st March 2010     100%   Final submitted version

Executive Summary

This document contains deliverable D1.4.1 of the FOCUS K3D Coordination Action. Deliverable D1.4.1 is the final technical report of Task 1.4, entitled "Road map for Future Research". This deliverable defines the guidelines envisaged by the FOCUS K3D Consortium for future research, consolidating input from all the different kinds of AWG activities (i.e. ad hoc meetings, questionnaires, workshops, dissemination events) carried out during the two years of the project. The document also includes the outcomes of the discussion with AWG members about the challenges of the road map, as presented during the final FOCUS K3D conference.
Table of Contents

Chapter 0: Preface
Chapter 1: Visionary scenarios
  1.1 Visionary scenario in BioTech, health & medicine
  1.2 Visionary scenario in Robotics
  1.3 Visionary scenario in Business & education, home & leisure
  1.4 Discussion
Chapter 2: Definition of the field
  2.1 3D media representation: from geometry to semantics
  2.2 Knowledge sources in 3D applications
  2.3 Discussion
Chapter 3: 3D in application domains
  3.1 Medicine
  3.2 Bioinformatics
  3.3 CAD/CAE and Virtual Product Modelling
  3.4 Gaming and Simulation
  3.5 Cultural Heritage and Archaeology
  3.6 Discussion and conclusion
Chapter 4: The FOCUS K3D research road map
  4.1 Derive symbolic representations
    4.1.1 Time line and dependence diagram
  4.2 Goal-oriented 3D model synthesising
    4.2.1 Time line and dependence diagram
  4.3 Documenting the life cycle of 3D objects
    4.3.1 Time line and dependence diagram
  4.4 Semantic interaction and visualisation
    4.4.1 Time line and dependence diagram
  4.5 Standards
    4.5.1 Time line and dependence diagram
  4.6 Trends in Semantic Web research
  4.7 Final discussion
    4.7.1 Medicine scenario: liver segmentation
    4.7.2 Bioinformatics scenario: from models to annotation
    4.7.3 Gaming and Simulation scenario: vessel design workflow
    4.7.4 CAD/CAE and Virtual Product Modelling scenario: semantics-based virtual 3D product modelling
    4.7.5 Archaeology and Cultural Heritage scenario: large-scale repository of 3D digital artefacts
    4.7.6 Robust geometry processing: practical benefits across the AWGs
    4.7.7 Conclusion
References
Chapter 0: Preface

3D media are digital representations of either physically existing objects or virtual objects that can be processed by computer applications. They may be defined either directly in the virtual world with a modelling system, or acquired by scanning the surfaces of a real physical object. The production and processing of digital 3D content was and still is a traditional field of expertise of Computer Graphics, but 3D has entered the multimedia world only recently: only in the last decade, indeed, has Computer Graphics reached a mature stage where the fundamental problems related to the modelling, visualisation and streaming of static and dynamic 3D shapes are well understood and solved. Considering that nowadays most PCs connected to the Internet are equipped with high-performance 3D graphics hardware, it seems clear that in the near future 3D data will represent a huge share of the data stored and transmitted using Internet technologies.

3D media therefore introduce a new kind of content into the established multimedia scenario, where the term multimedia is characterised by the possible multiplicity of content, by its availability in digital form and by its accessibility via electronic media. Text, audio, animations, videos, and graphics are typical forms of content combined in multimedia, and their consumption can be either linear (e.g. sound) or non-linear (e.g. hypermedia), usually allowing for some degree of interactivity. At the same time, research on semantic multimedia, defined by the deep integration of Semantic Web techniques with multimedia analysis tools, has shown how to use and share content of multiple forms, endowed with some kind of intelligence, accessible in digital form and in distributed or networked environments.
The success of semantic multimedia largely depends on the extent to which we will be able to use it in systems that provide efficient and effective search capabilities, analysis mechanisms, and intuitive re-use and creation facilities, at the level of content, semantics and context (Golshani06). Going one step further, we can easily envisage semantic multimedia systems of a higher level of complexity, in which the generic architectural framework underpinning semantic multimedia systems is extended to knowledge- and data-intensive applications, historically developed in sectors quite far from multimedia and knowledge technologies. Throughout this document, we illustrate this idea by focusing on prospective applications of 3D content, a rapidly emerging new form of media in the semantic multimedia panorama and an extremely challenging application context for semantic multimedia.

3D media, indeed, encompass all forms of digital content concerning 3D objects used and managed in networked environments, not only the fancy-looking graphics used in entertainment applications. 3D media are endowed with a high knowledge value, carried either by the expertise needed to design them or by the information content itself. Currently, research on multimedia and semantic multimedia is largely devoted to pixel-based content, which is at most two-dimensional (e.g. images), possibly with the addition of time and audio (e.g. animations or videos), while 3D media are defined by vector-based representations. Due to their distinctive properties, 3D media make it necessary to develop ad hoc solutions for content analysis, content-based and context-based retrieval, modelling and presentation, simply because most 2D methods do not generalise directly to 3D (SpFa2009).

The AIM@SHAPE Network of Excellence was the first EC project addressing the issue of adding semantics to plain geometrical shapes.
The prototype infrastructure created by AIM@SHAPE and the ontologies it developed demonstrated the potential gain, for the Computer Graphics community, of having detailed metadata attached to 3D models. Building on the experience of AIM@SHAPE, the FOCUS K3D project tackled the more complex problem of raising the interest of users in a number of application domains for semantics-driven processing of 3D data. 3D graphics are key media in many
sectors, among them the ones we selected for the application working groups in FOCUS K3D: Medicine and Bioinformatics, Gaming and Simulation, CAD/CAE and Virtual Product Modelling, Archaeology and Cultural Heritage. In these areas, representing a complex shape in the various stages of its complete life-cycle is known to be highly non-trivial, due to the sheer mass of information involved and the complexity of the knowledge that a shape can reveal as the result of a modelling process.

The research road map presented in this deliverable is an attempt to synthesise the vision of the FOCUS K3D project on these themes, after extensive discussions with a variety of users in the application domains. The perspective of the road map is oriented towards challenges that lie in the content production and sharing phase rather than in the networking phase.

Thinking back to life fifteen years ago, one discovers how different it was from today and how some technologies we now use routinely, mobile phones and the Internet above all, were unimaginable at that time. We therefore decided to start the document with three visionary scenarios, which imagine different aspects of life in 2040. What emerges from these stories is that semantic 3D content will strongly permeate everyday life, even if not directly perceived. A discussion of where, how and which kind of 3D data will be involved follows, to make the link between geometry and semantics explicit. In the second chapter, we take a step backward and briefly define our view of semantic 3D media, as originated in AIM@SHAPE, while in the third chapter we describe their current status in the application working groups selected in FOCUS K3D.
The state-of-the-art reports produced by FOCUS K3D are an important source of information for interested readers (see deliverables D2.1.1, D2.2.1, D2.3.1, D2.4.1): they give an idea of the level of acquaintance with 3D semantics in these fields, with expected and natural differences between more established fields, such as CAD/CAE and Product Design, and fields such as the Gaming and Simulation domains, where the potential of a concurrent use of semantics and digital 3D data is very high but less perceived and less reflected in current practices.

Finally, chapter 4 provides the road map proper, in which we have identified high-level goals that represent the long-term challenges the Computer Graphics community should face as targets for new and disruptive research, with a strong need to breach the borders of a single discipline, calling for a truly multi-disciplinary effort. Within each of these high-level challenges, we have also identified a number of mid-term challenges that, without being exhaustive, could be required building blocks for further important research advances.

After the experience of FOCUS K3D, and as reflected by the research road map, the path towards really effective semantic 3D content is still complex, but feasible. As the research road map tries to convey, we believe that the potential of semantic multimedia technologies can be fully exploited in 3D and knowledge-intensive application areas, where processes deal with content of multiple forms and types, processing workflows are guided by knowledge and semantics, and the working environment is usually distributed. Computer Graphics is ready to meet these challenges, but it needs a closer connection with Knowledge Management, Machine Learning, and Cognitive Modelling.
Chapter 1: Visionary scenarios

To illustrate how pervasive semantic 3D technology will be in about thirty years, we present below three visionary scenarios and discuss which aspects related to 3D and semantics are involved and should be tackled in the future. The three scenarios cover areas where 3D media are a horizontal technology enabling the megatrends of the future, which are, as many experts agree (Fut09):

• BioTech, health, medicine;
• Robotics;
• Business & education, home & leisure.

1.1 Visionary scenario in BioTech, health & medicine

The scenario takes place in 2040. The current challenge for Dr Edwardes is to improve the quality of life of his patient Bill Thomson, who is suffering from a serious arthritic problem in one of his knees. Bill suffers not only from unbearable pain, but also from the serious discomfort of having limited flexion. To begin with, and to check whether the pathology is amenable to drug treatment, Dr Edwardes decides to have Bill's genome fully sequenced using the latest high-throughput sequencing technique. The output of the procedure being the collection of Bill's genes, Dr Edwardes aligns portions of this genome against those of other patients suffering from related disorders, so as to precisely identify the genetic determinants of the disease and check for possible treatments. The matching portions of the genome are visualised through a holographic projector. Interestingly for Dr Edwardes – but not for Bill – three genes appear as mutated, respectively HOOK-01, ESS-21, and RGid-07, each of them being involved in a different form of arthritis. While the first two mutations are amenable to drug treatment, no drug is known for the third one. Therefore, Dr Edwardes carries out a homology search in the Advanced Protein Data Bank to see whether the structure of any protein whose sequence is similar to that of the protein coded by RGid-07 is known, and indeed finds one such structure.
Screening a library of candidate drugs in a virtual panel against this protein, Dr Edwardes identifies two of them with good binding affinity. These molecules are candidate drugs, and he hands the investigation over to a colleague of his at the National Institute of Arthritis, so as to check whether the protein coded by RGid-07 can be patched by one of these drugs. Meanwhile, he and Bill decide to go for a surgical treatment.

The latest advances in computer-aided medicine enable Dr Edwardes to tackle the challenge through function-driven modelling and simulation, with precise estimates of the risks, so as to optimise both the decisions and the treatments. The novel methodology consists of driving the whole medical process by modelling, simulating and monitoring it with a constant eye on the end goal, here restoring the function (knee motion) while minimising the pain. The first crucial decision is to move from non-operative to operative treatment: on the one hand, pain medication controls pain but does not change the underlying arthritic problem; on the other hand, it is possible to go ahead with prosthesis surgery, albeit with some non-negligible risks. In the present case, the results of a simulation on patient-specific data led Bill to decide to go ahead with surgery in order to place an implant. Unlike in previous approaches, the shape of the implant is not chosen from a predefined set but instead optimised through the simulation so as to best restore the original motion amplitude and flexion of the knee. The implant is manufactured by a 3D printer, which can make physical 3D shapes from a variety of biologically compatible materials, while the actual surgical operation is robot-assisted. Another novelty is that the body is treated as a global system, in order to best engineer the implant together with the treatment that comes with it, so as to avoid infections and other complications such as late mechanical dysfunction of the implant.
The present system mixes knowledge and geometric modelling of the implant. This requires simulating and monitoring
Bill's organism at different scales, ranging from the proteins to the organs through the cells, tissues and ligaments. After surgery, Bill has to live with a machine that monitors his body until the risk of complication is considered low enough for him to return to a normal life. One year later, Bill asks Dr Edwardes whether or not he would be able to pursue an athletic activity, as when he was 25. To answer this question, Dr Edwardes runs a new series of acquisitions and simulations (not reimbursed by social security!), which provide precise quantitative risk estimates. While looking at the simulation, Bill has the strange feeling of being engineered as an industrial product by a computational engineer. This is, however, nothing compared to the fun of running the Boston marathon once more.

1.2 Visionary scenario in Robotics

It is Monday morning and Cynthia is starting her new working week. She is a product development engineer at Robotical Inc., a company focusing on the development of the next generation of humanoids. Cynthia is specialised in behaviour modelling and simulation, with a special focus on swarm behaviour. At the beginning of her career she developed a 3D sensor system, and the embedded software for the interpretation of hand gestures of human beings, for robot–human interaction. Her system was able to build a 3D model of the situation and to interpret the gestures in a semantically described context.

After breakfast, Cynthia gets into her car, the new AUTONObile 2040 model. Cynthia just tells the car the destination of her ride. The autonomous automobile checks the traffic situation, plans the trip and starts to move. All of a sudden, a child runs across the street and the AUTONObile brakes instantly. Its semantic reasoning systems have come to the decision to brake after analysing the 3D scene, which is constantly generated and updated by various 3D sensors (global positioning system, lasers, radars, and cameras).
The decision to brake is taken after considering the distance between the car and the child and rejecting the possibility of driving around the child because of oncoming traffic. After the situation is cleared, the car continues. Having 'delivered' Cynthia to her office, the car searches for a parking space and waits for the next order.

During the day, Cynthia tries to optimise the shape and behaviour of the new humanoid she is developing with her colleagues at Robotical Inc. She is using the newest version of CAHIA (Computer-Aided Humanoid Interactive Application), which provides function-based constraints on free-space deformation techniques. With CAHIA, all kinds of behaviour models can be embedded, and simulation models are generated on the fly and evaluated in real time. Not only can a single humanoid be optimised; strategies for jointly solving problems can also be explored. The humanoid is to work as a fire fighter, to replace humans in this dangerous job. To achieve this, group behaviour in 3D environments needs to be trained and simulated.

While waiting for Cynthia, her intelligent house communicates with AUTONObile and asks it to bring some beverages from the supermarket. The car uses the drive-through option of the supermarket, and a service robot puts the beverages into its trunk. In the meantime, the car's traffic system receives the information that weather conditions will change rapidly within the next hours and snowy roads are expected. So, on the way back to Cynthia's office, it drives to the next garage to have the tires changed.

After her working day, Cynthia calls the car to fetch her at the entrance of her office and they start their way home. During the drive, the car is informed about an accident that happened on the road ahead, so it starts to re-plan the trip. This is done in online communication with the other vehicles participating in traffic, to avoid jams due to too many cars planning the same route.
In addition to the street map, the 3D profile of the landscape is also taken into account, to optimise energy consumption. AUTONObile returns Cynthia safely to her home. Cynthia is very happy with her personal mobility assistant. Her children also appreciate their personal chauffeur, and Cynthia is unworried, since she knows that the grandparents will supervise the safe and comfortable drive she has already programmed to allow only certain routes, e.g. to their friends and the football stadium. AUTONObile recognises the people authorised to drive using 3D face recognition together with voice analysis, a system Cynthia developed herself after finishing the gesture
recognition system.

1.3 Visionary scenario in Business & education, home & leisure

When Semantha gets into her next sleeping phase, the mattress adjusts automatically and the room lighting changes slightly, to anticipate her waking up soon. After she gets up, she prepares to go to work and leaves a message for her daughter Alice, which will be synthesised by her private avatar with her own voice.

Semantha is a 3D modeller at the gaming company 'Iterative Life'. When she approaches the company building, her gait and face are recognised, and entrance is permitted. She is currently working on the design and modelling of a collection of characters for a new game: "Parallel Universe Loophole Paradise". The characters in the game, the PULPese, have feet like wheels, arms like fins, and ears like wings. In her 'Evolving Personalised Information Construct' (Epico, marketed by GoogleZone, the joint venture of Google and Amazone) Semantha sees that she modelled the feet last week. According to the schedule, she now must design the accompanying gadgets for chiropody. She invokes the encyclopaedic dictionary to learn about that, and the semantic network helps her to link these concepts to the maintenance tools for wheels. From the collection of instruments, she wants to select those that somewhat fit the feet and can serve as a start for modelling new ones. She then activates "FindMeShape", the shape-matcher tool able to select the appropriate models for the redesign process.

She is interrupted by her iWant, her personal digital life assistant, a seventh sense to her. She has to collect Alice from school. On her way, she consults City 3.0 to find a restaurant near the school and near the transportation dock, so that they do not lose time on their way to the afternoon event. One of her friends who is also currently signed in to City 3.0 tells her of a good place to eat.
They decide to spend the afternoon visiting the Clonehenge area, a replica of the Stonehenge site, which disappeared under water after the sudden ice-meltdown catastrophe in 2025. When Alice puts on her glasses, she sees the enhanced environment consisting of projected 3D media and the real world. Through the looking glass, Alice sees a wonderland with a lot of cultural information on this piece of heritage, and she is truly moved by experiencing the past. When they finally get back home, Semantha continues her work to meet the planning schedule, while Alice does her homework with her iCat.

1.4 Discussion

In the three stories above, we tried to envisage how technology will influence and support different aspects of our future life. Although it may not be obvious, 3D multimedia play a key role in the megatrends identified. In all three scenarios, futuristic technology makes strong use of 3D data: since our perception of the world is 3D, the advanced tools of the future should be able to support it as efficiently and effectively as (and together with) any other kind of multimedia. The interaction between the real and the virtual world will be tighter: virtual avatars will replace humans, while intelligent environments will become the human-computer interface. As a consequence, one of the main research trends concerns the description and understanding of the surrounding world, which have to rely on efficient methodologies for modelling and processing 3D data. Another prominent aspect is the personalisation of digital content and technology according to the context and the user. Therefore, the formalisation of the data cannot be static; it should vary dynamically according to the purpose, the context and the user.

In particular, the BioTech, health & medicine scenario implies a function-centric modelling and simulation of the human body, which relates different scales ranging from proteins to organs.
In such a context, where the human body is simulated as a complex system, semantics plays a key role in modelling the knowledge that links the various scales, and the border between medicine and computational engineering becomes fuzzier. The future of computer-assisted medicine will certainly require thinking in terms of goal-oriented modelling and simulation
(with the goal being a knee that functions well in terms of motion amplitude, and with no pain), where some parts of the data come from knowledge about generic human anatomy and other parts come from patient-specific acquisition and semantic annotation, both before and after surgery. Furthermore, as medicine is not an exact science and developed countries care increasingly about legal issues and quality assurance, precise semantic descriptions and estimates of the risks of any physical medical procedure will probably be of increasing importance. While generic knowledge about the human body model will certainly be of crucial importance for statistical risk estimation, a patient-specific estimate will require semantics-based analysis of massive data sets coming both from past simulations and from real experiments carried out on other patients who show some similarity.

In addition, the adoption of theragnostics, a treatment strategy that combines therapeutics with diagnostics, is increasingly investigated. It associates a (semantically rich) diagnostic test, which identifies the patients most likely to be helped or harmed by a new medication, with a targeted drug therapy based on the test results. Bioinformatics, genomics, proteomics, and functional genomics are molecular biology tools essential for the progress of molecular theragnostics. Generic human anatomy, as well as patient-specific data, will be modelled and integrated on 3D replicas of the physical parts and used to attach information and run simulation models tailored for each patient. Finally, instead of thinking of computer-aided medicine only when the disease or pain has already occurred, the future of computer-aided medicine will most probably consist of increasingly preventive medicine, where modelling and simulation will again be put to use for monitoring, predicting and avoiding dysfunction as much as possible.
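The idea of retrieving annotated cases from "other patients who show some similarity" can be sketched as a nearest-neighbour search over feature vectors, pre-filtered by semantic tags. The sketch below is purely illustrative: the case records, feature values and tag vocabulary are invented, and a real system would operate on far richer ontology-backed annotations.

```python
import math

# Hypothetical, purely illustrative patient cases: each pairs a few numeric
# descriptors (e.g. extracted from 3D acquisitions) with semantic tags.
CASES = [
    {"id": "case-A", "features": [62.0, 1.2, 0.35], "tags": {"arthritis", "knee"}},
    {"id": "case-B", "features": [45.0, 0.9, 0.10], "tags": {"healthy", "knee"}},
    {"id": "case-C", "features": [70.0, 1.4, 0.40], "tags": {"arthritis", "hip"}},
]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similar_cases(query, cases, required_tags=frozenset(), k=2):
    """Return the ids of the k nearest cases, pre-filtered by semantic tags.

    Filtering on tags first is the (toy) semantics-based step: geometry
    alone would happily match a knee against a hip.
    """
    candidates = [c for c in cases if required_tags <= c["tags"]]
    candidates.sort(key=lambda c: distance(query, c["features"]))
    return [c["id"] for c in candidates[:k]]

# A new patient with knee measurements: restrict the search to knee cases.
print(similar_cases([60.0, 1.1, 0.30], CASES, required_tags={"knee"}))
# → ['case-A', 'case-B']
```

The design point is simply that the semantic filter and the geometric distance cooperate: semantics narrows the candidate set to comparable anatomy, and geometry ranks what remains.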
In addition, robots will have an increasing role in assisting surgery with high precision. And not only in medical applications will the number of robots be huge in the future, as technology and market forecasts indicate. Not only will their population increase in sheer numbers, but they will also drive some key technology developments and become ever more intelligent. Marvin J. Cetron, President of Forecasting International and a member of the World Future Society board, forecasts (Fut09) that:

• by 2020 we will have self-diagnostic and self-repairing robots in self-monitoring infrastructures for almost any job in homes and hospitals;
• by 2030 the robot population will surpass the human population in the developed world; and
• by 2040 robots will completely replace humans in the workforce.

Robots can be regarded as intelligent machines, and semantic 3D models are key to their development and their successful use. As pointed out in the robotics scenario, to orient themselves in the real world and to communicate with human beings and other robots, they first and foremost need to build up an understanding of the scene that surrounds them. It is not sufficient to create just a point cloud or a static (or dynamic) distance field, e.g. to avoid collisions; the robot has to create a 'mental' model of the objects it is surrounded by. It needs to understand the meaning of those objects, to know the actions they may perform, and to understand their dynamics if they are moving, in order to be able to plan its own actions. These are issues related to understanding the 3D world, valid not only for robots but for all kinds of interaction between virtual entities and humans. The outdoor navigation domain is a great challenge for three-dimensional perception.
One key to successful navigation in real-world environments is the robot’s ability to reason about its environment in three dimensions, to handle unknown space in an effective manner and to detect and navigate in environments with obstacles of varying shapes and sizes. We believe that the problem should be addressed by a tighter coupling of geometry processing methods with cognitive science, so that the properties characterising the shapes can be better aligned with features that are used by humans to “understand and classify” shapes. Also machine learning or similar statistical
methods are expected to play a key role in coping with the complexity and variability of 3D object shapes, and as tools supporting the automatic segmentation of massive datasets. To be able to do so, the robot needs semantic and geometric 3D descriptions of reference objects (learnt/trained knowledge about objects potentially present in the scene and their functionality). Such models have to be ‘implanted’ into the robot, possibly along with self-learning mechanisms, i.e. they have to exist when the robot ‘comes to life’, which takes us to the development phase for robots. There is already a growing tendency to introduce high-level semantic information into robotic systems. This tendency is visible in many areas, including mapping and localisation, robotic vision, human-robot interaction, and the increasing use of ontologies in robotics. Today, robots are systems consisting of mechanics, electronics, and software. Some behave in a pre-programmed way, some react to their surroundings. In any case, the development process of robots also requires semantic 3D models (not only the use phase of a robot). Robots can be built more rapidly if function-oriented design and the reuse of functional components are intrinsically supported by virtual product development tools along with advanced simulation techniques. Ontological information is increasingly being used in distributed systems to allow for automatic reconfiguration in the areas of flexible automation and of ubiquitous robotics. There is a need to use ontological information to improve the interoperability of robotic components developed for different systems. 
Also in the Business & education, home & leisure scenario, we see that a number of new technologies will be present, which allow for an integrated approach to searching for and finding functionalities, are built on the massive semantic annotation of 3D media, exploit the re-use of personalised information, depend on the connectedness between real and virtual worlds, and utilise location-aware ambient intelligence and smart objects, which altogether make the production and management workflows much more effective, and the leisure and care tasks much more affective and satisfying than in the old days before 2010. Although some of the above technologies are currently being developed, there is still a long road ahead. Through professional production, user-generated content, and the digitisation and preservation of cultural heritage, large amounts of digital media such as images, video, music, three-dimensional objects and environments are becoming available. Tools for creation, editing, searching and interaction have been developed by now. New techniques are being developed to build games, model 3D worlds, and create virtual prototypes of new products or architectural plans. The next generation of applications and environments will require more natural interaction and intelligence to become more dynamic, affective, and satisfying. Tools are necessary to create virtual persons that are endowed with speech and social and emotional intelligence. Toolboxes and interfaces must be created to add ambient intelligence to working and living spaces. These will lead to newly created products and services for personalised information finding, connecting real and virtual worlds, experiencing past culture, and adopting its lifestyle. Functionality and elements that are currently missing are rich semantic annotations of the world and environments, massive semantic annotation of 3D objects, localisation in 3D environments, and tangible 3D visual interfaces. 
In the next chapter, our view of 3D semantic multimedia will be introduced and the current status of the field will be briefly described; chapter 3 will then give an overview of some application fields.
Chapter 2: Definition of the field In this chapter, we will introduce our view of semantic 3D media, taking into account the perspective of researchers in the field of Computer Graphics as well as the perspective of users in the reference application domains. Researchers in Computer Graphics are indeed key players in the process of formalising semantics through knowledge technologies, as they are the experts able to develop the appropriate tools and schemes for documenting and sharing 3D media representations, in close cooperation with experts in different domains. In the following, Section 2.1 will discuss the evolution of 3D modelling paradigms, from the traditional geometry-oriented to the emerging semantics-driven approaches. In Section 2.2, we will outline the sources of knowledge that we believe need formalisation and a more effective management, much more tightly integrated with the geometric modelling pipeline of the object’s shape. A discussion in Section 2.3 concludes this chapter. 2.1 3D media representation: from geometry to semantics Knowledge technologies can be effectively used in complex 3D application fields if the underlying 3D modelling approach is able to support and encapsulate the different levels of abstraction needed to describe an object’s form, function and meaning in a suitable way. The EC Network of Excellence project AIM@SHAPE proposed an evolution of the traditional geometric modelling paradigm towards a semantics-based modelling paradigm, which is consistent with the different description levels used for traditional 2D media and reflects an organisation of the information content that ontologies dealing with 3D application domains could nicely exploit. 
The use of computers has revolutionised the approach to shape modelling, opening new frontiers in research and application fields: Computer Aided Design, Computer Graphics and Computer Vision, whose main goal is to discover basic models for generating and representing shapes. In the beginning, this effort gave rise to research in geometric modelling, which sought to define the abstract properties that completely describe the geometry of an object, and the tools to handle its embedding into a symbolic structure. The visual aspect of shapes has deeply influenced the development of techniques for digital shape modelling, which have mainly focused on the definition of mathematical frameworks for approximating the outer form of objects using a variety of representation schemes. The same geometry can be represented by different approximations (e.g. meshes, parametric surfaces, unstructured point clouds, to cite a few), each representation being chosen according to its advantages and drawbacks with respect to the application's purposes. For example, complex triangulated meshes are badly suited to interactive animation; simpler representations such as point set surfaces can be sufficient for special classes of applications (e.g. ray-tracing techniques only require ray-surface interrogations). The conversion between distinct representations is still a delicate issue in most systems, but there are tools to derive triangular meshes from the majority of representation schemes. The selected representation model will eventually be mapped into an appropriate data structure, which is computer-understandable and devised according to optimisation and efficiency criteria. 
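The representation conversion mentioned above can be illustrated with a minimal, self-contained sketch (the function names are ours, not from any particular system): an exact parametric surface, a torus, is tessellated into an approximate closed triangle mesh whose vertices lie exactly on the true surface.

```python
import math

def torus_point(u, v, R=2.0, r=0.5):
    """Exact parametric representation of a torus."""
    x = (R + r * math.cos(v)) * math.cos(u)
    y = (R + r * math.cos(v)) * math.sin(u)
    z = r * math.sin(v)
    return (x, y, z)

def torus_mesh(nu=32, nv=16):
    """Convert the parametric representation into a closed triangle mesh
    by sampling a regular grid in the (u, v) parameter domain."""
    verts = [torus_point(2 * math.pi * i / nu, 2 * math.pi * j / nv)
             for i in range(nu) for j in range(nv)]
    tris = []
    for i in range(nu):
        for j in range(nv):
            # Indices of the four corners of one parameter-domain quad;
            # the modulo wraps the grid so the mesh closes up.
            a = i * nv + j
            b = ((i + 1) % nu) * nv + j
            c = ((i + 1) % nu) * nv + (j + 1) % nv
            d = i * nv + (j + 1) % nv
            tris += [(a, b, c), (a, c, d)]   # split each quad into two triangles
    return verts, tris

verts, tris = torus_mesh()
```

A real pipeline must of course also go the other way (fitting parametric patches or point-set surfaces to meshes), which is exactly where the delicate conversion issues arise.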
It is important to point out that for the development of semantic 3D media the conceptualisation of the geometry is not an issue: geometric modelling is, by itself, the answer of the Computer Graphics community to the need of defining and expressing in formal terms concepts and relationships concerning the geometric representation of 3D media. Shape models and related data structures encapsulate all the geometric and topological information needed to store and manipulate 3D content in digital form. While the technological advances in terms of hardware and software have made available plenty of tools for using and interacting with the geometry of shapes, the interaction with the semantic content of digital shapes is still far from being satisfactory: we still miss effective and robust tools to manage non-spatial information, i.e. the information not related to the shape of
the object (e.g. ownership, material, price) and the information related to the high-level conceptualisation of the object, either expressed as features (e.g. size, volume, compactness, presence of holes) or as synthetic concepts (e.g. what it is, what it is used for). Stated differently, the semantic gap – the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation (Smeulders00) – is still far from being filled. To tackle this problem we may follow two complementary paths: on the one hand, it is possible to integrate the missing information through the definition of concept-based metadata (either by following a formal conceptualisation encoded in an ontology or by allowing free tagging of the resources); on the other hand, it is possible to exploit state-of-the-art descriptors and analysis tools to extract content-based information from spatial information. Both of these approaches help to narrow the semantic gap: the design of ad-hoc ontologies and the development of tools for versatile annotation of 3D objects are part of the plan, but yet another element is crucial for the description process: the role of the user. In fact, it is not sufficient to provide a generic interpretation of the data; the interpretation should take the context into account. Awareness of the context should enable a dynamic and user-oriented description in which only the most relevant content-based descriptors are activated and combined. The integration of concept-based and content-based approaches and their embedding in a context dynamically defined by the user could be an efficient solution, whose evaluation was one of the main goals of FOCUS K3D through the Application Working Groups. 
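To make the content-based path concrete, the sketch below (function and variable names are our own) computes one of the simplest content-based 3D descriptors, the classical D2 shape distribution: a histogram of distances between randomly sampled surface points, invariant to rotation and, with relative binning, to uniform scaling.

```python
import numpy as np

def d2_descriptor(points, n_pairs=20000, n_bins=16, seed=0):
    """D2 shape distribution: normalised histogram of distances
    between randomly chosen pairs of surface points."""
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(points), n_pairs)
    j = rng.integers(0, len(points), n_pairs)
    d = np.linalg.norm(points[i] - points[j], axis=1)
    # Binning relative to the largest sampled distance makes the
    # descriptor invariant to uniform rescaling of the shape.
    hist, _ = np.histogram(d, bins=n_bins, range=(0.0, d.max()))
    return hist / hist.sum()

# Two toy "shapes": points on a unit sphere and points on a flat unit disc.
rng = np.random.default_rng(1)
sphere = rng.normal(size=(2000, 3))
sphere /= np.linalg.norm(sphere, axis=1, keepdims=True)
disc = np.zeros((2000, 3))
r, t = np.sqrt(rng.random(2000)), 2 * np.pi * rng.random(2000)
disc[:, 0], disc[:, 1] = r * np.cos(t), r * np.sin(t)

d_same = np.abs(d2_descriptor(sphere) - d2_descriptor(2 * sphere)).sum()
d_diff = np.abs(d2_descriptor(sphere) - d2_descriptor(disc)).sum()
```

The descriptor of a rescaled copy matches the original (d_same is essentially zero), while the sphere and the disc give clearly different histograms (d_diff is large): exactly the kind of signal a content-based retrieval or classification tool can exploit, and which a context-aware system would combine with concept-based metadata.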
This integration can be achieved if the 3D content is organised in a way that takes into account and supports reasoning at different levels of abstraction and that goes beyond the limits of pure geometry. The conceptualisation of the knowledge domain has to adhere to a suitable organisation of the geometric data about the shapes. In AIM@SHAPE, shapes were defined as characterised by a geometry (i.e. the spatial extent of the object), describable by structures (e.g. form features or part-whole decompositions), having attributes (e.g. colours, textures, or names, attached to an object, its parts and/or its features), having a semantics (e.g. meaning, purpose, functionality), and possibly having an interaction with time (e.g. history, shape morphing, animation) (Falcidieno et al. 2004). Leaving the traditional modelling paradigm, a simple one-level geometric model has to be replaced by a multi-layer view, where both the geometry and the semantics contribute to the representation of the shape. At the same time, the structure is seen as an important bridge towards the semantics, as it supports a natural manner of annotating the geometry with semantic information. The key idea is that, while the geometry of a shape is unique, its structure can be defined in many possible ways, each contributing to the creation of a specific structural model, or view, of the shape. These views of the shape make it easier to interpret the geometry according to different semantic contexts and to attach appropriate content to the relevant parts of the shape. To realise the shift towards this new modelling paradigm, it is also necessary to develop new and advanced tools for supporting semantics-based analysis, synthesis and annotation of shape models. They correspond to image analysis and segmentation for 2D media: features of a 3D model are the equivalent of regions of interest in images. 
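The multi-layer view can be sketched in a few lines of code. This is a toy illustration only: the class and attribute names are invented for this example and are not taken from AIM@SHAPE or any real system. It shows one unique geometry, several structural views of it, and semantic annotations that always attach to a part of a view, never directly to raw triangles.

```python
from dataclasses import dataclass, field

@dataclass
class Geometry:
    """The unique geometric layer: here just a list of face indices."""
    faces: list

@dataclass
class StructuralView:
    """One possible segmentation of the geometry into named parts."""
    name: str
    parts: dict  # part name -> set of face indices

@dataclass
class ShapeModel:
    geometry: Geometry
    views: dict = field(default_factory=dict)
    annotations: list = field(default_factory=list)  # (view, part, concept)

    def add_view(self, view):
        self.views[view.name] = view

    def annotate(self, view_name, part, concept):
        # Semantics attaches to a part of a structural view.
        if part not in self.views[view_name].parts:
            raise KeyError(part)
        self.annotations.append((view_name, part, concept))

    def faces_for_concept(self, concept):
        """Map a semantic concept back down to the geometric layer."""
        return set().union(*(self.views[v].parts[p]
                             for v, p, c in self.annotations if c == concept))

# One geometry, two independent structural views of the same chair.
chair = ShapeModel(Geometry(faces=list(range(8))))
chair.add_view(StructuralView("animation", {"seat": {0, 1}, "back": {2, 3}}))
chair.add_view(StructuralView("manufacturing", {"frame": {4, 5, 6, 7}}))
chair.annotate("animation", "seat", "graspable")
```

The point of the layering is visible in `faces_for_concept`: a query posed at the semantic level ("graspable") resolves through a structural view down to concrete geometry, while a different view of the same geometry can serve a different context untouched.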
There is, however, a different input and a slightly different target: image analysis tries to understand what kind of objects are present in the scene captured by the image, while in 3D segmentation the object is known and has to be decomposed into meaningful components that might be used to manipulate and modify the shape at later stages. From a high-level perspective, the main differences concern the different nature of the content: descriptors used for 2D images are concerned with colour, textures, and properties that capture geometric details of the shapes segmented in the image. While the one-dimensional boundaries of 2D shapes have a direct parameterisation (e.g. arc length), the boundary of an arbitrary 3D object cannot be parameterised in a natural manner, especially when the shape exhibits a complex topology, e.g. many through-holes or handles. Most notably, feature extraction for image retrieval is intrinsically affected by the so-called sensory gap: “The sensory gap is the gap between the
object in the world and the information in a (computational) description derived from a recording of that scene” (Smeulders00). This gap makes the description of objects an ill-posed problem and casts an intrinsic uncertainty on the descriptions, due to the presence of information that is only accidental in the image, or due to occlusion and/or perspective distortion. The boundary of a 3D model, however, is represented in vector form and therefore does not need to be segmented from a background. Hence, while understanding the content of 3D vector graphics remains an arduous problem, the initial conditions are different and allow for more effective and reliable analysis results, and they offer more potential for interactivity, since the models can be observed and manipulated from different viewpoints. Finally, at the semantic level, which is the most abstract level, there is the association of specific semantics to structured and/or geometric models through the annotation of shapes, or shape parts, according to the concepts formalised by a specific domain ontology. In the Figure below, a possible shape analysis pipeline is shown: (a) shows the digital model of a chair obtained by acquisition; (b) shows the shape represented in structural form, which can be analysed in the knowledge domain of Computer Animation; and (c) shows the regions of possible grasping annotated accordingly. The aim of structural models is, therefore, to provide the user with a rich geometry organisation, which supports the process of semantic annotation. Consequently, a semantic model is the representation of a shape embedded into a specific context, and the multi-layer architecture emphasises the separation between the various levels of representation, depending on the knowledge embedded as well as on their mutual relationships. 
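As a small illustration of the low-level geometric analysis on which such structural models are built (a self-contained sketch with invented function names; the technique itself is the standard angle-deficit estimate of discrete Gaussian curvature), mesh vertices can be classified as convex, flat or saddle-like features directly from the geometry:

```python
import math

def angle_at(p, q, r):
    """Interior angle at vertex p of triangle (p, q, r)."""
    def sub(a, b): return tuple(x - y for x, y in zip(a, b))
    def dot(a, b): return sum(x * y for x, y in zip(a, b))
    u, v = sub(q, p), sub(r, p)
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return math.acos(max(-1.0, min(1.0, dot(u, v) / (nu * nv))))

def angle_deficits(vertices, triangles):
    """Discrete Gaussian curvature per vertex: 2*pi minus the sum of
    the incident triangle angles."""
    total = [0.0] * len(vertices)
    for a, b, c in triangles:
        total[a] += angle_at(vertices[a], vertices[b], vertices[c])
        total[b] += angle_at(vertices[b], vertices[c], vertices[a])
        total[c] += angle_at(vertices[c], vertices[a], vertices[b])
    return [2 * math.pi - t for t in total]

# Toy mesh: a unit cube with each square face split into two triangles.
verts = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
quads = [(0, 4, 6, 2), (1, 5, 7, 3), (0, 4, 5, 1),
         (2, 6, 7, 3), (0, 2, 3, 1), (4, 6, 7, 5)]
tris = [t for a, b, c, d in quads for t in ((a, b, c), (a, c, d))]
deficits = angle_deficits(verts, tris)
labels = ["convex" if d > 0.1 else "saddle" if d < -0.1 else "flat"
          for d in deficits]
```

Every cube corner collects 3 × 90° = 270° of incident angle, so its deficit is π/2 and it is labelled "convex"; such low-level labels are the raw material from which segmentations into protrusions, depressions and through-holes are assembled.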
Figure: From geometry to semantics – (a) acquired model, (b) structural view, (c) semantic annotation. The multi-layer view of 3D media resembles the different levels of description used for other types of media, but there is a kind of conceptual shift when dealing with 3D media: here, we have the complete description of the object and we want to be able to describe its main parts, or features, usually in terms of low-level characteristics (e.g. curvature, ridges or ravines). These features define segmentations of the shape itself, which are independent of a specific domain of application but carry a geometric or morphological meaning (e.g. protrusions, depressions, and through-holes). 2.2 Knowledge sources in 3D applications Effective and efficient information management, knowledge sharing and integration have become an essential part of more and more professional tasks and workflows in Product Modelling, one of the first fields where semantics came into play. However, it is clear that the same applies also to other contexts, from the personal environment to other applied sectors. There is a variety of information related to the shape itself, to the way it has been acquired or modelled, to the style in which it is represented, processed, and even visualised, and many more aspects to consider. The description of a shape is intrinsically not unique and varies according to both the application and user contexts. Therefore, the abstraction levels used to process or reason
about 3D media should correspond to the mental models used to answer questions such as “what does it look like?” and “what is its function?”, thus making it possible to model, manipulate and compare the various 3D media in a semantics-oriented framework. The ingredients needed to implement a 3D semantic application should therefore definitely include a conceptualisation of the shape itself, in terms of geometry, structure and semantics, and of the knowledge pertaining to the application domain. In order to fulfil the requirements of complex 3D applications, we need tools and methods to formalise and manage knowledge related to the media content and to the application domain, at least at the following levels [SpFa2007]: • knowledge related to the geometry of 3D media: while the descriptions of digital 3D media can vary according to the contexts, the geometry of the object remains the same and is captured by a set of geometric and topological data that define the digital shape; • knowledge related to the application domain in which 3D media are manipulated: the application domain casts its rules on the way the 3D shape should be represented, processed, and interpreted. A big role is played by the knowledge of the domain experts, which is used to manipulate the digital model: for example, the correct manner of computing a finite element mesh of a 3D object represented by free-form surfaces is subject also to informal rules that should be captured in a knowledge formalisation framework; • knowledge related to the meaning of the object represented by 3D media: 3D media may represent objects that belong to a category of shapes, either in broad unrestricted domains (e.g. chairs and tables in home furniture) or narrow specific domains (e.g. T-slots and pockets in mechanical engineering). 
The shape categories can also be defined or described by domain-specific features, which are the key entities for describing the media content, and these are obviously dependent on the domain. The first bullet point of the list is concerned with knowledge that has geometry as its background domain. There is a variety of different representations for the geometry of 3D media that cannot simply be reduced to the classical Virtual Reality Modelling Language (VRML) descriptions and its variations, as currently supported by MPEG-4. There, the view of 3D media is more concerned with visualisation, streaming and interaction aspects than with the requirements imposed by applications. 3D geometry, as used in applications, has to do with a much richer variety of methods and models, and users might have to deal with different representation schemes for the same product within the same modelling pipeline. In this sense, describing the content of 3D media in terms of geometric data is much more complex for 3D than for 2D media. There are many attributes and properties of 3D models that scientists and professionals use to exchange, process and share content, and all of these have to be classified and formalised. The second bullet point refers to the knowledge pertaining to the specific application domain, but it has to be linked to the geometric content of the 3D media. Therefore, if we want to devise semantic 3D media systems with some reasoning capabilities, we also have to formalise the expert knowledge owned by the professionals of the field. Finally, the third bullet point has to do with the knowledge related to the existence of categories of shapes; as such, it is related both to generic and to specific domains. Usually in 3D applications, it is neither necessary nor feasible to formalise the rules that precisely define these categories in terms of geometric properties of the shape, except in very simple cases. 
However, due to the potential impact of methods for search and retrieval of 3D media, there is a growing interest in methods that can be used to derive feature vectors or more structured descriptors that could be used to automatically classify 3D media.
2.3 Discussion 3D media content is growing both in volume and in importance in more and more applications and contexts. In this chapter, we have introduced the issues related to handling this form of content from the point of view of semantic media systems, focusing on the level at which we should be able to capture knowledge pertaining to 3D media. 3D applications are characterised by several knowledge-intensive tasks that are not limited to browsing and retrieving properly annotated material, but deal with the manipulation, analysis, modification and creation of new 3D content out of existing content. While this is true also for traditional 2D media applications, the tools and expertise needed to manipulate and analyse 3D vector-based media are still the prerogative of a rather specialised community of professionals and researchers in the Computer Graphics field. In this sense, the Computer Graphics community could make a significant contribution: on the one hand, it could provide modelling and processing tools able to handle and exploit the semantics of the shape, as well as interaction and visualisation tools that allow users to interact with the shape in an intelligent way; on the other hand, it could contribute comprehensive schemes for documenting and sharing 3D media and related processing tools, to be linked and further specialised by experts in different domains. A semantics-aware classification of the available tools for processing the geometry and manipulating 3D shapes will be an important building block for the composition and creation of new easy-to-use tools for 3D media, whose use and selection can be made available also to professional users of 3D who are not experts in Computer Graphics. The use of 3D is indeed spreading beyond the traditional communities of professional users and will soon reach a general and inexperienced audience. 
The role of experts in 3D modelling in the development of semantic 3D media is twofold: on the one hand, the identification of key properties for the description of 3D media and processing tools; on the other hand, the contribution to the development of advanced tools for the interpretation, analysis and retrieval of 3D content. It is evident that if we want to be able to reason about shapes at the geometric, structural and semantic levels, then we also have to be able to annotate, retrieve and compare 3D content at each of the three layers (SpFa2009). In the next chapter, the current status of semantic 3D media in the four Application Working Groups explored during the FOCUS K3D project will be described, while in chapter 4 the grand challenges identified along the way will be stated, together with a list of open issues that must be addressed to reach these goals.
Chapter 3: 3D in application domains FOCUS K3D has identified four representative application areas in which the massive use of 3D digital resources coupled with knowledge is either consolidated (Medicine and Bioinformatics, and CAD/CAE and Virtual Product Modelling) or emerging (Gaming and Simulation, and Archaeology and Cultural Heritage). The selection of these different application areas was motivated by two reasons. First, all these fields face a huge amount of available 3D data; second, the use of 3D data is not only related to visual aspects and rendering processes, but also involves an adequate representation of domain knowledge to exploit and preserve both the expertise used for creating the data and the information content they carry. Among the first actions of FOCUS K3D, a state-of-the-art report (STAR) was produced for each of the four application areas to report on the differences and similarities in the approaches and desiderata of users in these representative areas (see deliverables D2.1.1, D2.2.1, D2.3.1, D2.4.1). These four STARs were meant to identify the use of current methodologies for handling 3D content and coding knowledge related to 3D digital content in the different contexts, and also to identify possible gaps. Each STAR therefore reflects a snapshot of the current situation in the four subfields, which all massively use 3D data but are distinct in their characteristics, requirements, status, and condition. In this chapter, we provide a summary of the STARs, discussed from the perspective of the research road map, aiming at describing each of the five application areas (Medicine and Bioinformatics are presented separately for the sake of clarity) and discussing the current situations and limitations as perceived by the users in the different domains. 
3.1 Medicine Computer-aided medicine is a rapidly growing field, and recent advances in medical imaging yield 3D shape data that are more and more reliable, accurate and massive. Different medical applications such as diagnosis, radiotherapy planning, image-guided surgery, prosthesis fitting or legal medicine now heavily rely on the analysis and processing of 3D information (e.g., data measuring the spatial extent of organs, data about the electrical activity of the brain), with different criteria for the specific 3D content they use. 3D content is sometimes used for shape measurements, as a basis for diagnostic information, or to establish statistical reference data. Therapy planning, image-guided surgery and the assessment of post-operative results are other typical examples of the use of 3D content. The starting point of a typical computer-assisted diagnosis session is often a set of image data acquired from CT, MRI, MicroCT or other medical imaging devices. The current goal is to generate a digital 3D model suited to visual inspection and analysis, but the goal can be much more ambitious, namely simulation: electromagnetic wave propagation and temperature distribution for hyperthermia, soft tissue deformation for cranio-maxillofacial surgery, biomechanics for orthopaedic surgery, computational fluid dynamics for rhino-surgery, electrical potentials for electro-cardiology, and many more. In between, we find not only a sequence of automatic image processing algorithms but a true reconstruction pipeline, which involves data registration, segmentation, 3D geometry generation and processing. While it is now commonplace to store medical images in digital form, the 3D content used in medical applications is far from easy to obtain: a labour-intensive pipeline leads from the raw images generated by the medical imaging devices to the digital reconstructed anatomical models suitable for medical applications such as therapy planning or surgical simulation. 
The current assessment reveals that the pipeline is far from being automatic and requires a
lot of knowledge-based interaction, in particular for the segmentation of the images. Even after decades of research, a major challenge is to recover the inherent 3D models of anatomical structures from 3D medical images as accurately and efficiently as possible. Surprisingly, the most common clinical practice for this task is the interactive correction, by a medical expert, of the output of an automatic segmentation algorithm, thereby incorporating some expert knowledge into the pipeline. One important thread of research to facilitate this process is therefore devoted to the elaboration of semi-automatic tools such as magic wands, intelligent scissors or deformable models. As the resolution of the images steadily increases, however, it is commonly acknowledged that such a labour-intensive interactive approach is not viable in the long term. This motivates another thread of research devoted to the elaboration of fully automatic methods, which incorporate knowledge through statistical or model-based approaches. One required feature is to leave the door open for the expert to adjust the segmentation as a post-process in the few cases where the automated process fails. Similar issues arise for geometry modelling and processing, where one of the goals is to preserve the multi-domain labels during surface extraction and mesh generation. Practitioners resort to creating 3D models of organs or of human beings for simulation purposes. Those models are frequently input to finite element software, for example to simulate problems related to the hip joint from patient-specific data. 
This is a typical example of goal-oriented modelling, where the geometry of the organ to be modelled needs to fulfil the specific requirements imposed by the technical solution chosen for the objective, or function, to be simulated, and these requirements might differ from those of simple visual inspection: for finite element analysis, for instance, a fine volumetric mesh representing the boundary and interior of the bone is required, while for the simple visualisation of the bone shape a coarser triangulation of its boundary surface would suffice. Another example is the simulation of human articulations for replacement by a prosthesis. The different steps are bone resection, prosthesis alignment, and the checking of motion amplitudes. In terms of knowledge, the parameters of interest are the contact pressure within the cartilages, forces and muscle biomechanics. The connections to physiology are related to material properties as well as to bone motion and geometry. The current state of the art shows that such a data process is considerably simplified when these parameters are stored in an ontology, and even more so if parts of the models of the geometry can be annotated with functional information or comments by the medical doctors. 3D data and knowledge are again tightly coupled in the simulation of laparoscopic surgery for training and practising. The goal is to simulate the planned surgery with a robot acting on a 3D model, so as to rehearse the gestures and to check the feasibility of the complete process. The knowledge involved is related to the anatomy, the forces applicable to the organs and the physical properties of a tumour. Virtual anatomy is perceived as a central problem, concerned with the elaboration of 3D geometry models of human anatomy. 3D models must be augmented with knowledge about the function of the organs, as the primary end goal in computer-aided medicine is a human body that functions well after surgery or treatment. 
The function itself cannot be decoupled from the topological and geometrical relationships between the organs. Virtual anatomy is important in medicine with respect to patient-specific computer-assisted therapy planning, as well as for industrial applications, such as virtual crash tests in which human models are involved. It is common to distinguish between individual anatomy and generic anatomic atlases, which demonstrate anatomical structures and their relationships. Individual anatomies are of special interest when pathological situations need specific attention with respect to a medical treatment. Generic anatomy models are carefully designed using modelling software, whereas patient-specific anatomy models come as the output of a scanning device, associated with point-set-based or image-based reconstruction algorithms. Another noticeable trend in simulating and monitoring surgery and therapy is the replacement of generic anatomic atlases by patient-specific digital anatomical models. This trend results in the generation of more and more massive 3D content, thus requiring the help of knowledge technologies for the classification, storage and retrieval of those data.
Monitoring therapy or following disease evolution requires storing and retrieving various data for each patient over long time scales. A typical workflow example is provided by a clinician in a department of neuro-rehabilitation that routinely performs transcranial magnetic stimulation (TMS). The data are initially acquired by MRI. The MRI images are then segmented and a so-called 3D anatomical project is created for each patient. Such a project, which corresponds to a geometry reconstruction process, includes surface reconstruction and mesh generation for the head as well as for the cortical surfaces. This anatomical project is stored and reused for repeated TMS sessions performed for the same patient. Overall, the evolution toward patient-specific modelling and simulation is recognised as an advance. However, the fact that little or no reuse of data and processes is currently performed across patients is recognised as being very detrimental to productivity. Intuitively, the more patients an expert processes, the more productive (and hence the more assisted) he or she would like to be. The experts and clinicians are conscious that the goal here is not only to reuse the data, but also to reuse the processes themselves by documenting the whole process. This challenge is where knowledge-based technologies must play a role. Knowledge technologies are increasingly required to manage structural, functional and topological information extracted from the medical data. More specifically, semantics is required at every step of the modelling and processing pipeline, which ranges from raw data to goal-oriented 3D numerical models through data segmentation, a crucial step that is currently a bottleneck. Similarly, visualisation is at the core of many medical applications such as diagnosis, surgical simulations, image-guided surgery and training.
In the context of knowledge-based computer-aided medicine, visualisation should be concerned not only with the plain rendering of the geometry, but also with semantics. While our current assessment is that geometry and semantics are most often decoupled when it comes to interaction and visualisation, some recent research experiments show that it would be very beneficial to combine both. For instance, the process of simulating a complex orthopaedic surgery is considerably simplified when both geometry and knowledge are stored in an ontology, as the practitioner can get an interactive, faithful simulation including both geometry and semantic information. Such simplification translates into a shorter surgery planning phase, which is critical for practitioners. Having knowledge such as organ dependency and material properties combined with geometry is already a significant step toward efficient and reliable surgery planning. What is missing in the current knowledge representation is a detailed description of the link between the geometry (and topology) of an organ or tissue and its function. For example, the shape of a joint (e.g., a spherical cap) partially explains its function in terms of motion. The geometry and topology of an organ also often explain its function, such as its connection to a bone or to another organ. The simulation could thus be improved by integrating into its model the inherent functional constraints of the musculoskeletal system. Standards already play a prominent role in what is currently the most important use of knowledge technology in medical applications: the description of anatomical features, pathologies and protocols. Examples are the Unified Medical Language System (UMLS), the Current Procedural Terminology (CPT), and the Foundational Model of Anatomy (FMA).
These standards serve different purposes such as training and education, communication among practitioners, statistical analysis for health insurance, preventive medicine and legal issues. Current standards are primarily required to ensure the continuity of medical imaging data and the sharing of those data among care and research centres. The development of standards will most likely continue fostering the mutual enhancement of different imaging technologies. Standards for 3D medical content are, however, currently lacking, although in the near future they are likely to play a fundamental role in the field of computer-aided medicine, where patient-specific anatomical modelling is of increasing importance. Similarly, the development of standardised procedures (including geometry and knowledge) for goal-oriented modelling and processing will be of increasing importance, both for quality assurance and for legal reasons, once knowledge-intensive modelling and processing techniques are routinely used.
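The geometry–function link discussed above (a joint shaped like a spherical cap suggesting rotational function) can be made operational by a geometric computation. A minimal sketch in pure Python, with illustrative sample data rather than output from any real anatomy pipeline, is a least-squares sphere fit to sampled surface points:

```python
import math

def fit_sphere(points):
    """Algebraic least-squares sphere fit: solve |p|^2 = 2 c.p + t, with t = r^2 - |c|^2."""
    ata = [[0.0] * 4 for _ in range(4)]
    atb = [0.0] * 4
    for x, y, z in points:                       # accumulate the normal equations
        row = [2 * x, 2 * y, 2 * z, 1.0]
        b = x * x + y * y + z * z
        for i in range(4):
            atb[i] += row[i] * b
            for j in range(4):
                ata[i][j] += row[i] * row[j]
    for col in range(4):                         # Gaussian elimination with partial pivoting
        piv = max(range(col, 4), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        atb[col], atb[piv] = atb[piv], atb[col]
        for r in range(col + 1, 4):
            f = ata[r][col] / ata[col][col]
            for j in range(col, 4):
                ata[r][j] -= f * ata[col][j]
            atb[r] -= f * atb[col]
    sol = [0.0] * 4
    for r in range(3, -1, -1):                   # back substitution
        sol[r] = (atb[r] - sum(ata[r][j] * sol[j] for j in range(r + 1, 4))) / ata[r][r]
    cx, cy, cz, t = sol
    return (cx, cy, cz), math.sqrt(t + cx * cx + cy * cy + cz * cz)

# Illustrative data: a spherical cap sampled from a sphere of centre (1, 2, 3), radius 2.
cap = [(1 + 2 * math.sin(0.08 * i) * math.cos(2 * math.pi * j / 12),
        2 + 2 * math.sin(0.08 * i) * math.sin(2 * math.pi * j / 12),
        3 + 2 * math.cos(0.08 * i)) for i in range(1, 6) for j in range(12)]
centre, radius = fit_sphere(cap)
print(centre, radius)   # recovers a centre close to (1, 2, 3) and a radius close to 2
```

A small fitting residual would then support annotating the region as approximately spherical, i.e., ball-and-socket-like, in an anatomy ontology.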
3.2 Bioinformatics While computer science is concerned with the development of (generic) algorithms, which in general may be applied in a variety of settings, biology is more of a system-centric activity, since in general a panel of methods is used to investigate a particular system (molecule, cell, tissue, organ, organism). In biology, knowledge technologies are especially suited to structuring the knowledge related to a particular piece of this multi-scale puzzle, as well as to integrating data across the different scales. In this broad context, knowledge technologies in general and ontologies in particular are characterised by two main features. First, their development is under the guidance of biologists, since a precise understanding of the systems under investigation is necessary in order to select relevant annotations. Second, the knowledge technologies used cover the full range, from simple databases all the way to complex ontologies, as evidenced by the numerous tools made available from the web portal of the Protein Data Bank in Europe (PDBe). If one narrows down the application scope of knowledge technologies to applications dealing with 3D shapes, the natural setting in biology is that of structural bioinformatics. Structural biology is the sub-domain of biophysics and biology concerned with the investigation of the connection between the structure of macromolecules and their function; structural bioinformatics is concerned with the development of methods and algorithms meant to complement experiments in this perspective. Structural biology poses a number of difficult problems, among them the acquisition of reliable physical signals, the reconstruction of macromolecular models from these signals, the description of these models so as to derive parameters that best describe their biophysical properties, and finally, the utilisation of these models for prediction.
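As a flavour of the model-describing parameters just mentioned, here is a minimal sketch computing a dihedral angle from four atomic positions given in Cartesian coordinates; the helper functions and the test geometry are illustrative, not taken from any particular toolkit:

```python
import math

def _sub(u, v): return (u[0] - v[0], u[1] - v[1], u[2] - v[2])
def _dot(u, v): return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four atom positions."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)     # normals of the two planes
    m = _cross(n1, n2)
    norm_b2 = math.sqrt(_dot(b2, b2))
    # The atan2 form is numerically robust for angles near 0 and 180 degrees.
    return math.degrees(math.atan2(_dot(m, b2) / norm_b2, _dot(n1, n2)))

# Planar trans (anti) arrangement of four points: dihedral of 180 degrees.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0)))  # 180.0
```

Such a conversion between Cartesian and internal coordinates is exactly the kind of routine parameter extraction on which both folding and docking computations rely.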
In particular, predicting the 3D shape of a protein (the folding problem) and predicting the interaction of two or several partners (the docking problem) are clearly two of the main challenges in structural biology. When working on these challenges, since the shape of molecules is so important, all scientists resort to 3D content. This 3D content consists of a mix of geometry and biochemistry. The geometric side corresponds to molecular representations and conformations, either represented as a collection of balls using Cartesian coordinates, or using internal coordinates (bond lengths, valence angles and dihedral angles). On the knowledge side, one specifies the types of atoms and bonds, as well as a number of biochemical annotations (positional fluctuations for experimentally resolved structures, known partners of a protein, affinity constants, diseases the protein is involved with, etc.). The complexity of the concepts dealt with can again be seen from the various tools accessible from the web portal of the European PDB. From a modelling perspective, the topics of interest depend on the application scenario. Experts in reconstruction from experimental measurements wish to output a model coherent with the physical measurements. Scientists designing drugs are especially interested in pockets on protein surfaces, since these accommodate drugs. Similarly, the affinity and specificity of protein–protein interactions are tightly coupled to geometric and biophysical properties of the molecules. Biologists wish to understand how the structure of molecules accounts for their function. This requires making progress on the two aforementioned problems, namely folding and docking, which involves manipulating a mix of quantitative and symbolic information. We substantiate this claim by examining four situations, which exhibit a wide range of difficulties: • Characterising the packing properties of atoms. 
Native forms of proteins, that is, biologically active forms, exhibit very specific patterns in terms of spatial properties, in particular regarding the packing properties of atoms. Packing properties are obviously numeric properties (for example atomic volumes). On the other hand, annotations of proteins are key: well-packed proteins are called native-like, while proteins with loose packing are called decoys. In between these two extremes, a wide variety of situations occur, and some of them are particularly important for the description of diseases (e.g.,
misfolding in amyloid diseases); • Computing the depressions and/or protrusions of molecules. When docking two molecules in a blind fashion, i.e., without any a-priori knowledge, knowing where the binding sites are is a must. Depressions/pockets and protrusions/knobs on molecular surfaces are good hints. In general, a pocket is defined in geometric terms (e.g., a concave region of sufficient volume, or the volume corresponding to the stable manifold of a maximum of the distance function to the molecular surface, etc.), but such quantitative models are not easily handled, and annotations such as a flat pocket, a pocket with two crevices, etc. would help the design process. • Developing proper notions of curvature, in conjunction with solvation models and binding modes. Curvature inherently refers to differential geometry, and there are indeed several strategies to quantify the curvature of a molecular surface. Aside from these quantitative statements, the description of regions of molecular surfaces in qualitative terms such as flat, convex, concave or saddle-like would help biochemists. • Partitioning a molecule into regions that are likely to be flexible or rigid. Such regions are of utmost interest for flexible docking algorithms. Rigidity characterisation typically resorts to normal modes, i.e., the eigenvalues and eigenvectors of the Hessian of the internal energy of the molecular system about a minimum. These quantities are used to define the molecular motions involved in deformations coupled to docking. Practically, annotations such as hinge motion, shear motion, etc. are more easily handled than the underlying geometric descriptions. Documenting the life cycle of 3D objects manipulated in bioinformatics naturally depends on this context. We shall discuss three examples illustrating the complexity of the documentation problem.
Consider first a biophysicist who is experimentally solving the 3D structure of a protein: he/she aims at producing a final product, namely the 3D structure of the molecule. The structure submitted to the Protein Data Bank contains specific pieces of information, namely those imposed by the PDB file format. However, the difficulties handled along the process may or may not be documented. To make a long story short, a biophysicist interested in solving at the atomic level the structure of a given protein using X-ray crystallography has to go through the following steps: (i) introduction of the gene coding for this protein into a host cell, (ii) large-scale protein expression and purification, (iii) crystallisation, i.e., crystal growth, (iv) X-ray diffraction analysis of this crystal and model reconstruction. Each of these steps might be challenging and may require years of work. Experiments along the way are written down in a Laboratory Information Management System (LIMS), which is the memory of the process followed. Still, for complex cases the whole process may resemble a handcrafted activity, and rationalising it may be very difficult. It is also important to note that a person who managed to carry out a complex process will first exploit it before releasing the details. Consider now a person dealing with molecular modelling, who is working on the problem of inferring a putative complex from two unbound partners. Key questions contributing to the ability to run a reliable prediction are: what evidence of interactions between homologous proteins does one find in the biochemistry literature (this requires a thorough literature search, together with a precise compilation of experimental conditions)? What are the key regions in the interaction? Are additional molecules involved to stabilise the interactions (ions or water molecules)? Are there flexible regions? 
What is the role of electrostatics, which often plays a crucial role at the early stages of the binding? These questions involve a subtle mix of relatively simple and complex tasks. Simple questions are those related to the search for proteins within a given homology threshold with respect to a given reference protein, the classification and search of protein folds, the search for protein atoms with prescribed biophysical properties (e.g., temperature factors), or the search for bibliographical information related to a molecule or a complex.
These tasks clearly benefit from knowledge technologies and can be easily documented. Nevertheless, there are other and clearly very difficult questions, for example handling flexible regions, computing free energies precisely, handling mutagenesis data, and understanding the precise connection between coarse and atomic modelling results. These questions in their full generality are still open. The route taken to produce a prediction cannot be fully and easily traced, and there is no standard way to rationalise and document it. Finally, consider a pharmaceutical company designing a drug. The docking process just described is now a mere step in a much longer process. To begin with, the chemist will aim at identifying candidate drugs for the disease of interest. Drugs are small molecules, which underwent a wealth of experiments in the chemistry and biochemistry communities over the past five decades. The zoo of small organic molecules is rather stable. Simple rules involving the molecular weight or the solubility of a drug allow one to infer pretty reliably the chances for a molecule to be a lead – cf. Lipinski's rule of five (Lipinski et al. 2007). These pieces of information can be rationalised and organised into ontologies, and can also be easily traced in Laboratory Information Management Systems. The second step, docking, is more difficult to handle, as seen above. Finally, upon identifying good hits thanks to a docking prediction, wet-lab tests must be carried out, and finally clinical trials implemented. Overall, the whole process lasts more than a decade, and there is no standard way to trace it thoroughly. In fact, Product Lifecycle Management within pharmaceutical companies is a major problem and certainly an obstacle to benefiting from the experience accumulated over a large variety of projects.
This observation motivates the strategy of companies such as Dassault Systemes: having to some extent mastered the life cycle of manufactured objects, Dassault Systemes now aims at providing its tools and environments to pharmaceutical companies. To conclude, there is no general and unified way to handle the documentation of the life cycle of 3D objects encountered in bioinformatics and its applications. In the most general setting, our understanding is not yet sufficient to easily elaborate and document even simple knowledge technology models. The semantic interaction found in structural bioinformatics is not different from that found in other domains. As mentioned before, a number of tools have actually been developed and made accessible. Let us mention three of them, which are available from the EBI web portal and allow running queries on the Protein Data Bank. The rationale for mentioning these three tools is that they are of increasing complexity in terms of knowledge technologies. The first one, PDBelite, allows the user to pre-define criteria to be used by a search engine. The second one, MSDpro, allows the user to define his or her own logical queries using a drag-and-drop interface. Finally, the third one, MSDmine, gives access to a complex data model allowing the user to take full advantage of the features of a relational database. Interestingly, the knowledge data made available from these services are rather elementary in terms of geometry, that is, no advanced annotation of a geometric nature is provided. This is consistent with the complexity of the aforementioned mechanisms. The role and status of visualisation in the field of structural bioinformatics is rather interesting. On the one hand, molecules are 3D objects, and it is rather tempting to inspect their geometry so as to guess what the binding patches might be, or to assess the complementarity of the molecular surfaces of interacting molecules.
On the other hand, the binding of two molecules is a very subtle mechanism, with a delicate interplay between enthalpic and entropic phenomena, and a mere 3D visualisation cannot account for these delicate mechanisms. However, the visualisation of 3D structures is important for developing one's intuition. It is also of prime importance to communicate with large audiences and promote this kind of research. Standards do not play a prominent role in structural bioinformatics. File formats, for example those used in the Protein Data Bank, come with standards. However, as seen above, the complexity of the biological processes involved, e.g., in the experimental resolution of a protein structure, is such that typically non-standard problems are faced, which calls for
an unstructured part in the files themselves. These difficulties, and also the variety of scenarios handled, are clearly a hindrance to the exchange of information. 3.3 CAD/CAE and Virtual Product Modelling CAD/CAE and Virtual Product Modelling cover all ranges of man-made objects: electronics, transportation devices, tooling machines and many more. There are important differences between CAD/CAE and the other fields considered in FOCUS K3D. Medicine and Cultural Heritage are of global interest, and collaboration amongst researchers, building ontologies, etc., has a long tradition, whereas product-developing companies face much harder competition with each other. Although the human body can be looked at as 'the most complex machine' known, it is just one type of object that Medicine and Bioinformatics are trying to decode, understand and conceptualise/formalise fully. In Virtual Product Modelling the objects (machines) modelled are much less complex than the human body, but the range of different types is enormous and cannot be easily mapped into one scheme. This explains to a certain degree the different status in terms of ontologies (in virtual product modelling there are few) and standards in these areas. Nevertheless, the main problems of CAD/CAE are still comparable to those of the other fields, e.g.: • How to generate a model from a digitally acquired object? • How to model a shape easily? • How to model function or behaviour? • How to retrieve a model with certain properties? • How to handle all the information involved? Virtual Product Modelling is part of the virtual product development process. The 3D modelling phase is typically fed with target market characteristics, market/cost figures, requirements (aesthetic, functional, environmental, safety, etc.), concept sketches, and so on.
Stylists and designers use CAD (computer-aided design) and CAS (computer-aided styling) systems to create virtual 3D models that foremost represent shape. The product functionality is usually modelled using a variety of behaviour/function modelling tools and is simulated with corresponding solvers. In the process, physical models still need to be created for styling reviews or to validate behaviour. In the styling phase, the shape of physical models is changed manually, which requires scanning the resulting 3D surface and re-creating a virtual model for it (reverse engineering). In the simulation phase, data from real experiments are used to tweak material models (to name just one example). The data created along the process typically reside in a multitude of data sources, with a tendency to integrate product-relevant data into a product data management or product lifecycle management system. Today most of the content of those systems is restricted to geometric 3D information, the whole semantics being in the heads of the engineers, and not explicitly documented in a computer-interpretable way. Although this is the field where 3D media really started off (Sutherland's Sketchpad in 1963, company-owned CAD systems in the 1960s, etc.), the software in use today still requires a lot of manual intervention and tedious work, and typically imposes certain concepts onto the user instead of giving him/her full freedom in choosing the best-suited tools at will. In the following we briefly present the state of play for the most important phases and kinds of software tools and give hints on current limitations.
CAD systems enable designers to create geometric models of products with the computer, to be reused and manipulated by the designers as needed. CAD systems were, and remain, highly technical software with an extremely rich feature set and functions for detailed design work. CAE tools are widely used in the automotive and aerospace industry. In fact, their use has enabled engineers to reduce product development cost and time while improving the safety, comfort, and durability of their products. The power of CAE tools has progressed to the point where much of the design verification is now done using computer simulations rather than physical prototype testing. Today's shape representations in CAD (mainly NURBS) do not serve all purposes well, especially not those of simulations. For designing, the concept of creating a larger base surface from which certain parts are cut out by trim curves to create the final piece of surface is certainly artificial: it has no natural equivalent. Other representation schemes are needed that allow for more intuitive and easy shape manipulation and also make it easy to integrate concepts of function-oriented design. Simulations typically require a different kind of shape representation than that provided by CAD systems, a discrete instead of a continuous one. This implies a tedious manual process of model preparation to suit different simulation needs. The integration of CAD and CAE using FEA (Finite Element Analysis) is today very complex due to the different principles of shape representation employed. In CAD, a volumetric object is described by shells, i.e., patchworks of mathematical surfaces representing the outer and inner hulls of the object, together with data structures conveying the volumetric relationships. There is no requirement that adjacent surfaces match exactly, only that they match within specified tolerances.
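This tolerance-based matching of adjacent surfaces can be illustrated with a small sketch; the boundary curves below are hypothetical stand-ins for real trimmed-patch edges, not calls into any actual CAD kernel:

```python
# Hedged sketch: checking that two adjacent patch boundaries agree within a
# modelling tolerance, in the spirit of tolerance-based CAD shell construction.

def max_boundary_gap(curve_a, curve_b, samples=50):
    """Maximum distance between two boundary curves sampled at matching parameters."""
    gap = 0.0
    for i in range(samples + 1):
        t = i / samples
        ax, ay, az = curve_a(t)
        bx, by, bz = curve_b(t)
        gap = max(gap, ((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2) ** 0.5)
    return gap

# Two representations of "the same" edge, differing by a tiny modelling error.
edge_a = lambda t: (t, 0.0, 0.0)
edge_b = lambda t: (t, 1e-5, 0.0)

tolerance = 1e-4
print(max_boundary_gap(edge_a, edge_b) <= tolerance)   # True: within tolerance
```

A watertight analysis-ready model, by contrast, would require this gap to be exactly zero, which is precisely the discrepancy discussed next.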
In FEA, the object has a complete mathematical description through watertight structures of trivariate parametric volumes. There is a fast-growing research field in both the US and Europe concerning this discrepancy, addressing how better to solve this great interoperability challenge of CAD and FEA. This research field, denoted "isogeometric representation and analysis", aims at a common shape representation for both CAD and FEA in order to integrate the two disciplines and drastically improve the quality of Finite Element Analysis. Consequently, there is a need for investigating the concepts of this new approach both from a purely mathematical and from a semantic perspective, in order to integrate it into current industrial information pipelines. Isogeometric representation and analysis employ new concepts in both the shape representation phase and the analysis phase. We will need ontologies that encompass these concepts and their relationships to the traditional concepts, to facilitate interoperability between the new isogeometric systems and traditional CAD and FEM. Today, many companies use product data management (PDM) systems to store product-related information, such as geometry, assembly structure, and materials. Considering the full product development process and the mechatronic disciplines (mechanics, electronics, software), there exists much more information, spanning from requirements, to specifications, to mechanical, electronic and software behaviour models, to cross-domain dependencies, influences and effects, largely carrying the semantics of the product and not represented in PDM. The question is whether all the data should be in one PDM system. A more promising idea is to bridge the existing islands with semantic linking to manage knowledge effectively, in order to reuse past positive experiences and solutions and avoid repeating past efforts or errors.
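The idea of bridging information islands with semantic links might, in a deliberately minimal form, be sketched as follows; the identifiers and link types are invented for illustration and do not correspond to any existing PDM system:

```python
# Hedged sketch: a minimal link graph between artefacts living in different
# systems (requirements management, simulation, PDM), enabling crude impact
# analysis across the islands.
links = [
    ("REQ-12", "is_verified_by", "SIM-7"),
    ("SIM-7",  "uses_geometry",  "PDM-part-42"),
    ("REQ-31", "constrains",     "PDM-part-42"),
]

def impacted_by(artifact):
    """All artefacts reachable by following links backwards from `artifact`:
    a first answer to 'what is affected if this part changes?'."""
    impacted, frontier = set(), {artifact}
    while frontier:
        nxt = {src for (src, _, dst) in links if dst in frontier} - impacted
        impacted |= nxt
        frontier = nxt
    return impacted

print(sorted(impacted_by("PDM-part-42")))   # ['REQ-12', 'REQ-31', 'SIM-7']
```

Even this toy traversal shows the payoff: a change to one geometric part surfaces the requirement and simulation artefacts that depend on it, without merging all data into a single system.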
When talking about knowledge in industrial companies, one of the important islands of information is the one that deals with the knowledge of the geometrical properties of the products manufactured, bought or integrated, and to be maintained during product changes. Therefore, models and methods for making explicit and maintaining links between geometric properties and domain-specific knowledge have to be developed, to improve both the specific process (whether computer-aided or manually performed by the operator) and the inter-process communication. This would be beneficial for the integration between design and simulation in product development, and in simulation during maintenance. Currently, searching in distributed information sources is hampering efficient work
procedures. When linking those information islands together, new search functionality is needed that spreads out over these sources, takes shape, function, metadata and semantics into account, and combines partial results into meaningful answers to the user's questions. New partial/local 3D retrieval approaches and combined searches contribute to overcoming the current limitations of text-based retrieval and of current global 3D content-based retrieval. Another aspect is the appropriate visualisation of, and navigation through, the results returned by combined searches, together with user feedback for self-improving and self-learning mechanisms. Another current limitation in the downstream process of virtual assembly is the tedious way of creating virtual kinematics and simulating them interactively within the virtual environment. Nowadays kinematics simulation is performed in the domain of CAD and CAE software packages, which need specially trained personnel and are designed to work on detailed product models without providing the degree of realism and interactivity that virtual environments offer. Here, using semantics from CAD and parametric feature-based design for virtual kinematics in a Virtual Conceptual Design (VCD) platform would ease the process and support the early conceptual design phase as well as the embodiment design, allowing the immediate validation of assemblies and mechanisms by experiencing their behaviour during the modelling, simulation and assembly process. 3D model exchange between different systems has been and still is a great challenge with respect to model quality, as different CAD systems and CAD-system kernels employ several approaches (and tolerances) to handle the inherent inaccuracies of STEP-type CAD models.
CAD-model check and repair have been on the agenda for many years and have resulted in a number of standardisation initiatives: • AIA/ASD EN-9300 Long Term Archiving (LOTAR), addressing high-quality geometry verification and validation and the rules to execute them. • ISO 14721.4, Open archival information system – Reference model. • ISO 10303-59 Product Data Quality (PDQ), Part 59: Integrated generic resource – Quality of product shape data. Since standards for CAD-model quality are in place, the focus of advanced industry has now turned to validating that the CAD-models exchanged (e.g., the STEP file, or the CAD-model generated by the receiving system) are an exact representation of, or equivalent to, the CAD-model in the sending system. This demand for equivalence checking poses new challenges to the CAD-model representation, to the semantic annotation of the CAD-model and to CAD vendors in general, as industry requires more than the basic CAD technology, CAD models, representations and algorithms to be available for validation. The 'standard file format problem' has to be addressed by academia and research institutes in the future. We need to develop a 3D file format addressing the following problems that are still open: • compatibility / interoperability / semantic mapping; • shape representation (procedural, parametric); • functional expressiveness; • transport of objects together with their interactively modifiable parameters.
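As a flavour of the parametric shape representations such a format would have to carry (and of the isogeometric direction discussed earlier), here is a minimal sketch of B-spline basis evaluation via the Cox-de Boor recursion; the knot vector is an illustrative example:

```python
def bspline_basis(i, p, t, knots):
    """Value of the i-th B-spline basis function of degree p at parameter t
    (Cox-de Boor recursion). Zero-divisions from repeated knots are guarded."""
    if p == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = 0.0
    if knots[i + p] != knots[i]:
        left = (t - knots[i]) / (knots[i + p] - knots[i]) \
               * bspline_basis(i, p - 1, t, knots)
    right = 0.0
    if knots[i + p + 1] != knots[i + 1]:
        right = (knots[i + p + 1] - t) / (knots[i + p + 1] - knots[i + 1]) \
                * bspline_basis(i + 1, p - 1, t, knots)
    return left + right

# Cubic basis on a clamped knot vector: the functions are non-negative and sum
# to 1 at any interior parameter (partition of unity).
knots = [0, 0, 0, 0, 1, 2, 3, 3, 3, 3]
n = len(knots) - 3 - 1            # number of degree-3 basis functions
print(sum(bspline_basis(i, 3, 1.5, knots) for i in range(n)))
```

The partition-of-unity and non-negativity properties checked here are among the representation-level invariants that an equivalence-checking or archiving standard would need to express and verify.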
3.4 Gaming and Simulation Compared to Biomedicine and CAD/CAE, the gaming industry is relatively young. In addition, it developed mainly independently of research in gaming technologies, which until recently was often not even considered an academic activity. Knowledge technologies such as taxonomies and ontologies also have limited application in the gaming-related industries. Relevant semantic information is thus mostly created manually. Mathematical descriptions of physical characteristics (e.g., max. rotation and state of a door handle) are often used to implicitly describe semantic characteristics instead of modelling them explicitly (e.g., the door opens to the inside, the door handle can only be pushed down). In simulations that are, for example, used for training and education (flight simulation, fire fighter training, etc.), realistic visualisation is obviously an important if not essential issue. However, it also has a high relevance in gaming, where having the latest, "coolest", state-of-the-art computer graphics is often an essential criterion to survive in a highly competitive market. One unique characteristic of gaming and simulation is that objects and characters are normally part of a virtual world in which they move, operate, and interact with each other. Thus, animations, emotions, changes over time, interactions between characters, etc., play an important role and require descriptions of, for example, material characteristics (flammability, behaviour under pressure, etc.), how subjects and objects react to each other (e.g., a car driving over an icy road, a game character getting punched in the face by another one) and so on. As mentioned above, although this is a classical example of the beneficial use of semantic knowledge, related techniques are hardly used, and a lot of this information is often created manually. Some semantic characteristics are, however, acquired from the real world.
Motion capturing is commonly used in the gaming industry to capture typical movements of a game character and thus create animations that better model how humans move and behave in the real world. In addition, research is working on methods to model and better describe related aspects. Examples include work on modelling emotions, such as crying and the realistic visualisation of tears, and more realistic simulation of how people move in crowds. Data from Geographic Information Systems (GIS) is sometimes used in modelling worlds and landscapes. As already said, providing a high level of realism is normally essential for simulations that try to mimic the real world and are used, for example, for training purposes and serious gaming applications. Leisure games can include real-world simulations as well (e.g., sports games, car races), but also fantasy worlds and characters that do not follow common physical rules, but are just the product of a game designer's imagination. In all cases though, high-quality graphics and realism (or a "convincing" look and behaviour of fantasy characters) are important. Standard 3D modelling and advanced computer graphics techniques are therefore widely used, including meshes of different sizes with texturing, bump mapping, skeleton-based modelling for characters that are, for example, animated based on motion capture data, statistical modelling of virtual worlds based on the likelihood principle, and so on. Related models are often created from scratch, but sometimes CAD models (cars, ships, etc.) are also used and integrated into the gaming environment. Similarly, but much more rarely, tools such as 3D scanners are used to create realistic models for simulations.
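The door-handle example given earlier, modelled explicitly rather than only implicitly through rotation limits, might look like the following sketch; all class, field and function names are illustrative inventions:

```python
from dataclasses import dataclass

# Hedged sketch: making door semantics explicit instead of encoding them only
# as rotation limits attached to the geometry.
@dataclass
class DoorSemantics:
    opens_inward: bool = True
    handle_action: str = "push_down"    # the only way the handle operates
    max_opening_deg: float = 90.0       # still useful as a physical constraint
    flammable: bool = False             # material-level annotation

def can_open(door, action, push_direction):
    """A game script or NPC planner can reason over the explicit annotations."""
    return action == door.handle_action and \
           push_direction == ("inward" if door.opens_inward else "outward")

door = DoorSemantics()
print(can_open(door, "push_down", "inward"))   # True
print(can_open(door, "pull_up", "inward"))     # False
```

With such annotations, behaviour that would otherwise have to be inferred from raw joint limits becomes directly queryable and reusable across games.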
Adding game-characteristic features to CAD models often means modelling the aforementioned issues, resulting from the fact that in games we are usually dealing with objects in an interacting virtual world where temporal changes, relationships, and interactive behaviour need to be considered. The strong need for realistic modelling and visualisation is often in conflict with the demand for real-time behaviour of the objects and characters in the virtual worlds. Objects such as fast
cars in a race simulation and characters such as fleeing people in a fire fighter training simulator do not only have to look and move realistically, they also have to do so in a fast and temporally appropriate way. Game and simulation developers therefore often have to make a decision about the level of detail (LOD) of a model and might even be forced to sacrifice its geometric quality for the sake of speed. Automatic tools supporting a 3D designer in such decisions, taking the semantic characteristics into consideration, would therefore be quite useful. Creating simulations and games that take place in large, highly complex virtual worlds can also require developers to reuse models (or parts thereof). However, the storage, management, and organisation of 3D data are surprisingly often realised using rather primitive approaches that rely on naming conventions, pre-specified folder structures, textual descriptions, etc., instead of tools designed for such purposes. To browse and investigate sets of 3D models, the modelling software that was used to create the models in the first place is often used instead of special browsing and search tools. Although knowledge technologies promise to assist significantly in such tasks and offer the potential for more effective management and re-use of existing data, they currently seem to be hardly used in this context. Reasons for this might include the sometimes still prototypical status of related approaches, heavy dependence on existing proprietary data and workflows, but also scepticism towards new ideas from academia, or simply a lack of knowledge of existing techniques. However, despite a seemingly hesitant attitude towards the adoption of knowledge technologies, the related community is well aware of the shortcomings and limitations of existing workflows and generally agrees that – if done right – related tools could be highly beneficial.
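A tool supporting the LOD decision discussed above might look like the following minimal sketch. The heuristic (a frame budget in triangles, scaled by a semantic importance score) is an assumption made up for illustration; real engines use far richer criteria.

```python
# Hypothetical heuristic: recommend a level of detail (LOD) for a model given
# a real-time triangle budget, letting semantically important objects (e.g.
# the player's car) claim a larger share of the budget than background props.

def choose_lod(lods, frame_budget_tris, semantic_importance):
    """lods: triangle counts, highest quality first.
    semantic_importance in [0, 1]."""
    allowed = frame_budget_tris * (0.2 + 0.8 * semantic_importance)
    for tris in lods:                 # try the highest-quality LOD first
        if tris <= allowed:
            return tris
    return lods[-1]                   # fall back to the coarsest LOD

lods = [200_000, 50_000, 10_000, 2_000]
print(choose_lod(lods, 100_000, 1.0))   # hero object: 50000
print(choose_lod(lods, 100_000, 0.0))   # background prop: 10000
```

The point of the sketch is that the choice is driven by a semantic attribute of the object, not by geometry alone.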
Semantically annotated libraries and databases with semantic search functionality, allowing designers to easily find and re-use existing models, or to obtain parts for building new ones that do not only “look good” but also fit together in terms of their semantic behaviour, would certainly be beneficial and have a positive impact on the design and development workflow. One critical issue for many people is that semantic annotation still requires a significant amount of manual labour. This is particularly critical for the parts of the gaming industry that are characterised by high competition, extremely short development cycles, and tight deadlines. Semantic annotation in such a scenario does indeed seem like a major obstacle to adopting knowledge technologies. However, one possible option for dealing with this issue is to transfer the related effort from the developers to the actual users. For example, online games often have a highly active and committed user community that is interested not only in playing games together, but also in contributing to, building, and refining the game itself. The tremendous success of social networking and tagging sites suggests that users might indeed be willing to provide such input, and first successful examples from the gaming area already exist. Aside from such aspects related to the synthesis and documentation of 3D models and their semantics, there is also an apparent need for semantic search and retrieval in this community, as illustrated by a quote from our state-of-the-art report: “a web-based search and retrieval system would be nicer, with also version control and more metadata, like polygon count, texture count, formats, texture size, and retrieval tags; info on which projects it was used for, and when; name of creator and people who worked on it.” However, people are usually quite sceptical as to whether such a goal can ever be achieved.
The reasons for this scepticism are not necessarily the tremendous technical challenges that would have to be solved, but to a large degree the nature of this industry and the related market. For example, although desirable, sharing data and semantic information across companies is unlikely to happen. Games and simulations usually have a temporal component and as such often rely heavily on real-time behaviour. Similarly, both are often highly interactive. For example, players of a game often take the role of a character or steer an object such as a car. They expect the virtual world and the characters and objects therein to react promptly and in a semantically reasonable way. Finding the right level of detail (LOD) that guarantees real-time behaviour while at the same time providing a high-quality visual representation is therefore a very critical issue. However, semantic technologies can be helpful not only in the visualisation of the game or simulation itself, but also in the production of material such as related documentation.
For example, creating icons of ship models is considered a separate and significant step in the workflow when adding such models to a ship simulator environment. Semantic approaches that automatically calculate the “best view” of a model promise to be very useful in this step, which is currently mostly done manually. From the users' perspective, semantic interaction involves not only the realistic visualisation and modelling of the virtual world and its characters, but also how users can interact with them and control them. While special hardware can often be used for serious simulations, interfaces for leisure games have long been dominated by game-specific controllers such as joysticks, steering wheels, light guns, etc., but also simple keyboard and mouse input. However, a recent trend in gaming interaction and input mechanisms suggests an increasing relevance of 3D semantics in this area as well. For example, the tremendously successful Wii remote controller maps the 3D motion that a user is making to related movements in the game. Other companies are working on 3D tracking by video-based interaction. Gaming on the iPhone is becoming increasingly popular and has pushed mobile gaming to a new level – partly because of the innovative use of new interaction modes that map the 3D orientation of the device to motion in the game. Related research is still in its infancy, but the need for better tracking of human motion, and for a more accurate and realistic mapping of it to actions in a game that also respects semantic restrictions and provides natural interaction, seems obvious. Like any other area dealing with large amounts of 3D data, gaming and simulation could benefit significantly from the introduction of general standards and commonly used file formats.
However, even if the need for such standards is often acknowledged by people working in these branches of the industry, and mostly considered an important objective, activities towards creating and establishing them are rather limited, with few notable exceptions, such as COLLADA, managed by the Khronos Group. One reason for this seems to be the large scope of the field, which includes areas as different as sports games, car race and flight simulators, realistic battlefield and war simulations, fantasy and adventure games, etc. Creating a common platform such as “a game character ontology” or the like does indeed seem quite unrealistic and not a reasonable effort. Focusing on sub-areas (e.g., taxonomies for sports games or ontologies for race cars) might seem more reasonable, but again the chances of success are questionable, given that people working in related sub-areas are often direct competitors and thus not interested in creating a common basis for exchange and sharing. Concerning file formats, the situation might be different, because a common format could be beneficial not only for data exchange among different users but also, for example, for re-use and exchange within one institution. It would also be of high relevance for the adoption of semantic processing tools, such as approaches that automatically separate complex models into smaller, semantically useful parts and automate the semantic annotation process.

3.5 Cultural Heritage and Archaeology

The Cultural Heritage and Archaeology domain has been characterised by increased 3D digitisation efforts and a growing volume of 3D content during the last few years, with several past and ongoing projects. The main interests of professionals (e.g., curators, archaeologists, historians) in this field are: firstly, the e-documentation of the past in 3D; secondly, an effective and exact organisation and presentation of archaeological/cultural heritage content to virtual visitors.
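To make the idea of e-documentation concrete, the following toy sketch shows what a machine-readable artefact record with descriptive metadata might look like, and how a collection could be searched over it. The field names are only in the spirit of Dublin Core-style descriptive metadata; the record structure and values are invented for illustration, not an existing schema.

```python
# Hypothetical e-documentation records for 3D-digitised artefacts.
artefacts = [
    {"title": "Red-figure amphora", "creator": "unknown",
     "date": "-0520", "material": "ceramic", "provenance": "Athens",
     "model_file": "amphora_scan.ply", "acquisition": "laser_scan"},
    {"title": "Bronze votive figurine", "creator": "unknown",
     "date": "-0700", "material": "bronze", "provenance": "Olympia",
     "model_file": "figurine.obj", "acquisition": "photogrammetry"},
]

def find(records, **criteria):
    """Exact-match search over the descriptive metadata of a 3D collection."""
    return [r["title"] for r in records
            if all(r.get(k) == v for k, v in criteria.items())]

print(find(artefacts, material="bronze"))          # ['Bronze votive figurine']
print(find(artefacts, acquisition="laser_scan"))   # ['Red-figure amphora']
```

Even this trivial structure already supports the two interests named above: each record documents an artefact, and the collection becomes searchable by semantic criteria rather than by file name.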
Another area of interest for the cultural heritage community is the development of educational applications connecting real and virtual artefacts. 3D models and virtual spaces have huge potential for enhancing the way people interact with museum collections and e-learning environments. The creation or capture of 3D models does not appear to trouble professionals (users, developers, scientists, creators of 3D content, publishers/dealers of 3D repositories, etc.). Laser scanners and photogrammetric methods are the most prominent techniques for acquiring artefacts or large structures and complex archaeological sites, while many models and 3D environments are built from scratch. However, the lack of widely accepted (by all
cultural heritage professionals) data formats, metadata structures and/or ontologies makes the semantic integration of 3D models very difficult. A significant amount of time is spent on manual model annotation and documentation. Moreover, the management of 3D collections is at an early stage and in many cases nonexistent. Our investigation confirmed the current situation and revealed some specific problems and shortcomings related to 3D knowledge management. Although great technological progress has been achieved, the involvement of cultural institutions with knowledge technologies related to 3D is still limited. 3D professionals mainly deal with the geometric and structural aspects of 3D models, while the semantic aspect appears only in the management of more complex models. However, in archaeology and cultural heritage, object semantics is typically just as important as the actual geometry. Thus, it is a key requirement to assign thematic information to entire objects and to individual geometric elements in a virtual environment. This also makes it possible to select, analyse, or edit the geometry and the appearance of objects based on semantic criteria. The benefits of adding semantic information in the different stages of the reconstruction and presentation of an artefact or an entire virtual space are twofold. From the perspective of the modeller/creator of 3D content, re-creating an ancient artefact or an ancient site (architectural space) in 3D is difficult and time-consuming work. One cannot simply use a scanner. 3D reconstruction is a long design process based on available data (pictures of the remains, archaeological research drawings and maps, etc.) and evolving in close collaboration with archaeologists and historians. For example, a scanned artefact may be incomplete or damaged, and needs to be reconstructed by applying rules such as repeating structures, symmetry or boundary conditions, knowledge of similar artefacts, etc.
The import of high-polygon models and rich object textures is a key issue for creating the realism necessary for a believable immersion into a virtual environment. Another perspective/area of interest is the organisation and presentation of archaeological and cultural heritage content to virtual visitors (virtual exhibits). Educational/training application scenarios and environments that are visually complex and information-rich are very effective learning tools. The overall experience of a student or virtual tourist is defined by the virtual reality representation/re-creation. The idea of learning as an active, self-directed, and context-dependent process (e.g., walking around, admiring the ancient buildings, interacting with objects, etc.) can greatly contribute to gaining new knowledge. Practices and methodologies obviously vary widely among scientists, researchers, developers, designers, project managers, and publishers and dealers of 3D content. Concerning the actual amount of data, there is huge variety, ranging from just a few models (e.g., ten) to very large collections (e.g., thousands of models). Most of the data are stored on file servers or in proprietary repositories. Often rather primitive approaches are used for handling 3D content (e.g., using file and folder names to encode information about the contained data), and there is an apparent lack of information about knowledge technologies, or at least a common misunderstanding or misuse of the related terminology. There is a lack of specialised tools particularly designed for organising, browsing, and searching 3D content. Concerning the e-documentation of 3D objects, a common practice is to attach annotations and other general information to the objects' geometry. However, the annotations frequently get lost when the geometry changes for various reasons (e.g., simplification, transformation, even standardisation), while preserving semantics in this process is crucial.
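One simple way to keep annotations from getting lost when the geometry changes is to anchor them to a spatial position on the artefact rather than to a vertex index, and re-attach them to the nearest surviving vertex after the geometry is simplified. The sketch below is purely illustrative of that idea, not a description of any particular system.

```python
# Toy illustration: an annotation anchored to a 3D point survives a
# simplification that removes or moves vertices, because it is re-bound
# to the nearest vertex of the new geometry.

def nearest_vertex(point, vertices):
    """Index of the vertex closest to a 3D anchor point."""
    return min(range(len(vertices)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(vertices[i], point)))

# The annotation stores a 3D anchor point, not a vertex id.
annotation = {"text": "restored handle", "anchor": (1.0, 0.2, 0.0)}

original   = [(0, 0, 0), (1.0, 0.2, 0.0), (0.5, 1, 0), (2, 2, 2)]
simplified = [(0, 0, 0), (0.9, 0.25, 0.05), (2, 2, 2)]   # one vertex collapsed/moved

# After simplification, the annotation re-binds to the closest surviving vertex.
print(nearest_vertex(annotation["anchor"], simplified))  # 1
```

An index-based annotation ("attached to vertex 1 of the original mesh") would have become meaningless after the simplification; the spatial anchor remains valid.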
Traceability to sources and data provenance must be documented and guaranteed. Embedding semantic information and descriptive metadata related to the original source, age, design and existing knowledge on associated artefacts can contribute both to the efficient work of a cultural heritage professional and to the overall experience of a virtual visitor. People from the Archaeology and Cultural Heritage domains claim to be dissatisfied with the way they manage and store 3D content and would welcome an improvement. Areas mentioned for improvement, or for which they feel that the use of 3D knowledge technologies
can provide a solution, include functionalities related to the documentation and identification of objects and their parts, automatic extraction of geometric and semantic information from models, better visualisation, and improved search using semantic and geometric criteria or a combination of both. A noteworthy observation is that 3D collections are becoming more and more demanding in terms of management, preservation, and delivery mechanisms. There is significant debate as to whether the approaches adopted for digital libraries are also suitable for this kind of digital object. 3D digital representations are often neglected in efforts to create large-scale libraries, and there is also a lack of effort by cultural heritage institutions to acquire and use knowledge about 3D content. An obvious shortcoming in the European Digital Library (EDL) project and its prototype implementation (the Europeana site) is the absence of 3D content. One of the reasons is the lack of widely accepted standards and protocols for knowledge management in relation to 3D objects. Whereas text and image digitisation and management are rather mature technologies, the technology necessary to benefit fully from 3D knowledge management is not yet within easy reach of most cultural institutions and professionals. A major challenge towards semantic 3D data management is to provide effective content-based and semantic-based organisation and searching. Although the Cultural Heritage community is well aware of knowledge technologies, which are used for other kinds of content (e.g., text, images), no particular methods/standards are used for managing and organising 3D content.
Concerning the adopted knowledge technologies, there is a wide preference for databases and metadata, and taxonomies and ontologies are quite often used, though not for 3D content. Current practices in knowledge management include the use of standards for interoperability, like CIDOC-CRM, Dublin Core metadata and MPEG-7, and standards for data exchange, like X3D, VRML and COLLADA. In general, there is no consensus in the Cultural Heritage world on metadata standards, which negatively affects long-term preservation. There are many proprietary data formats from hardware vendors (this mainly concerns captured data and is less important for born-digital objects).

3.6 Discussion and conclusion

In this chapter, we provided a snapshot of the current situation in the five application fields, as a result of the numerous project dissemination and networking activities. To sum up, the adoption of 3D data can be considered common practice in the different domains, both by acquisition and by modelling from scratch. In fact, a large number of acquisition techniques are used to obtain 3D digital resources, which require cumbersome manual labour and knowledge expertise in order to become suitable for downstream applications. In modelling, there exist numerous ways to represent shapes, but also different needs according to the applications (real-time vs. offline, low-resolution models, partial information, etc.). Moreover, during modelling and simulation, many factors must be taken into account, which requires user know-how. For both the acquisition and the modelling workflow pipelines, the importance of semantics is clear, and a strong effort has to be spent on devising more effective techniques, able to capture as much context information as possible. Semantically rich (procedural and feature-based) modelling seems not to be fully exploited (e.g., for high-level editable and adaptive models or for codifying functional properties of 3D objects).
Semantics would help in semi-automated processing procedures, providing relevant information at each step of the design phase. Moreover, annotations have been considered helpful for managing knowledge effectively in order to reuse past positive experiences and solutions. The annotation of a geometric 3D model with additional data (from geometric properties to information on its history, purpose or function) is a major semantic issue and was recognised as such. It is important to note that users (such as medical doctors, but also practitioners in the CAD industry)
accept and even welcome semi-automatic approaches for annotation (of organs, CAD features, etc.), which leave them the choice to pick the option that appears most probable to them as experts. In all the discussions, the key role of semantics became evident not only in the Knowledge Technology and Knowledge Management sense, as a best practice for 3D content documentation and sharing, but also as a driving factor for the development of new and more effective computational tools to process, manage and analyse 3D content. Finally, the increasing size and number of models is a major issue in all application areas. Using semantics would help to display the right information at the right time (decision support) and to improve the technologies with automated reasoning. Furthermore, standards are lacking to improve data exchange between the research and industrial communities. Even within one company, especially if it is a big one, data sharing is not a given. Software vendors (of equipment to acquire medical data, of scanners to acquire cultural heritage objects, etc.) are interested in keeping their data proprietary. The confidentiality of medical data must be respected. In addition, as one representative from industry noted at the CAD workshop, academic researchers favour research over file format development. Still, the development of standards and common file formats was consistently identified as an important objective. This issue is even more critical for standards to express the semantics of 3D data. This is a rather unexplored area and only partial solutions exist (see MPEG-7, for instance). The reason for this lack of solutions is probably the large variety of representation schemas used for 3D data and the necessity to make the semantic descriptions somehow independent of the geometric representation of the 3D models.
Chapter 4: The FOCUS K3D research road map

Delineating the research pathway towards semantic 3D media is not an easy task: the underlying grand challenge – understanding and documenting the meaning of 3D data – and the complexity of the application domains involved make it difficult to express an exhaustive set of issues that need to be solved to open the door to the next generation of 3D media. Nevertheless, the FOCUS K3D project was successful in gathering a large number of requirements from the user communities, via questionnaires and STARs, but especially via the discussions promoted during the thematic workshops, which conveyed a good picture of the open issues as perceived by the users in the application domains. The outcomes of the thematic workshops (see deliverable D1.3.2 for more details) were the primary source of information that the consortium used to match the open issues against the state of the art in Computer Graphics, and to assess, during ad hoc brainstorming meetings, which of these open issues were pointing to real research challenges and which were instead caused by a lack of cross-fertilisation between the various fields of expertise. As an intermediate stage towards the research road map, the FOCUS K3D Consortium also produced a list of open issues indicated as crucial in the various steps of a typical, yet generic, pipeline of 3D content creation and processing in the four representative fields chosen. This exercise, which was interesting in showing the relative importance of the issues in the various domains, led us to a better understanding of the differences and commonalities between the fields, and helped us in preparing the next step of brainstorming.
Given the workshop outcomes and the consortium's view of the open issues to be faced, we then tried to synthesise the high-level goals that lie beneath the visionary scenarios envisaged for the far future and expressed by the storytelling descriptions of Chapter 1. The resulting five high-level goals represent the long-term challenges that we think the Computer Graphics community should face as future targets for new and disruptive research, with a strong need to breach the borders of a single discipline and a call for a truly multi-disciplinary effort in 3D shape modelling. Within each of these high-level challenges, we have also identified a number of mid-term challenges that, without being exhaustive of course, are judged to be important building blocks for further relevant research advances towards the goals. Finally, we have tried to avoid presenting separate research road maps for each application domain, and preferred to give a general overview while focusing on application domain issues, when suitable, to underline problems arising from specific shapes introduced by the application domain. First of all, the adoption of acquisition techniques has become typical in the workflows of the four AWGs considered. In Medicine, MRI and CT are common tools for diagnostics, while in Gaming motion capture is used, for instance in movies, to give natural behaviour to virtual characters. In CAD, reverse engineering is a widely adopted approach to redesign, whereas in Cultural Heritage the digitisation of artefacts normally happens through laser scanning sessions. As we assessed in the thematic workshops, the passage from the initial point cloud or 3D image to a 3D model usually requires a lot of manual labour, while at the same time losing all the knowledge related to the acquired entity.
Moreover, handling 3D data is not as simple for users as handling 2D data: reconstructing high-quality surfaces/volumes is not possible as a direct result of the acquisition session, and geometric processing tools have to be used. A high-quality model is also needed to apply shape understanding methodologies to extract implicit models. Currently, the available tools are still too focused on geometry and not yet ready to extract many semantic features of an object. For instance, special areas can be identified via curvature analysis or segmentation techniques, but the intervention of the user is still fundamental to select the correct regions of interest. Some information simply cannot be derived from geometry. In Cultural Heritage, for example, it would be an improvement to also be able to acquire colours and materials; in CAD, the identification of components from their function has been pointed out as a substantial advance. More generally, tracing all the digital workflows of models becomes crucial if digitisation is to
serve for documenting, sharing and reusing knowledge. In fact, geometry alone is not enough for non-experts in Computer Graphics, while a semantic model, in which the context knowledge is explicitly related to the shape, is one of the big challenges to tackle. In the following, this challenge will be named Derive symbolic representations. Including specific knowledge is important not only when the model is created from acquisition, but also when it is created from scratch. Including and preserving the design intent would make the 3D model much more powerful in all the application fields. In Medicine, patient-specific data pave the way to precise diagnoses and personalised therapies; in CAD, efficient links between 3D product models, their behaviour/function and their properties have been identified as a strong need; in Gaming, creating autonomous virtual characters able to take decisions according to the surrounding scene would definitely be an advance. Among the desiderata collected during the thematic workshops, we can also mention: modelling shapes evolving in time (e.g., in medicine), coding construction rules of the shape in CAD, and obtaining a complete 3D anatomical human. The inclusion of knowledge directly in the modelling phase supports not only the modelling itself but also the simulation phase. In fact, simulations could be performed exploiting the semantics of the model, much more efficiently and effectively than nowadays, where the quality and the interpretation of the simulation rest solely on the expertise of the practitioner. The second big challenge we identified is thus Goal-oriented 3D model synthesising. The third big challenge is, predictably, Documenting the 3D lifecycle. In fact, efficiently interpreting, sharing, searching, and reusing 3D data has proven to be fundamental, together with the preservation of annotations along the digital shape workflow.
The discussions on this theme highlighted the common need for dynamic documentation, which should be able to trace source and data provenance, to preserve annotations whenever the geometry changes, and to propagate them to similar objects. Another issue raised was the integration of different information sources in the documentation process: 3D information, but also 2D images and text, should be interlinked to make the annotation of the object complete. Not only the modelling phase calls for semantics, but also visualisation and interaction. Due to the different devices supporting 3D, such as 3D TVs and mobiles, information should be visualised semantically, that is, according to the device and to the semantic priority given to the content. Interaction includes not only strict human-machine interaction but also how consumers are able to search and retrieve the shape data related to their current task. Among the issues collected from our AWG members, we can mention intuitive GUIs for practitioners who are not experts in Computer Graphics, the use of natural language to formulate queries, and consequently the need for multilingual and multimedia documents. On the one hand, a mechanism for users to express queries effectively, and on the other hand, the development of strong reasoning methodologies to retrieve resources more precisely, appeared crucial. The fourth challenge is thus named Semantic visualisation and interaction. Finally, from the FOCUS K3D experience, we can generalise that there is still a strong need for Standards in the different application fields. Interoperability and distributed design force users to make big efforts, when this is possible at all, to keep metadata and models consistent without losing information. Unfortunately, the FOCUS community confirmed that both vendors and researchers currently do not focus on tackling this issue.
In the next sections, all five big challenges will be treated in detail, pointing out a list of open issues to include in the research agenda in order to develop semantic 3D media.
4.1 Derive symbolic representations

Deriving symbolic representations mainly addresses the path from physically born objects to possibly symbolic 3D models in the digital domain. There are two main ways to construct a digital 3D representation of a physically born object: either the digital model is manually designed by a graphic designer using appropriate software and shaped so as to resemble the physical counterpart as much as possible (or to the desired extent), or an acquisition session is carried out in which data from the real object are captured with the help of an acquisition device. The choice of suitable acquisition techniques and devices depends on many factors, such as the size of the object to be acquired, its physical properties (e.g., colour and reflectance) and the accuracy required by the downstream application. The acquisition session generates raw data, typically millions of points in 3D space. In both cases, the knowledge related to the object is somehow lost in the modelling process: the operator performing the acquisition, as well as the designer, knows perfectly well what kind of object is being captured/designed, what its meaning is, what it can be used for and in which context; what the main features are and what the details are; if (and how) it can interact with other objects, for instance whether it can be grasped by the human hand and which parts offer the best grasping points, and so on. This information is not stored in the digital model, which only retains data on the geometric appearance of the object: basic geometric representations do not give explicit information about the content semantics, which one can grasp only by viewing the object. If we want machines to understand the content of digital 3D media, we need tools that automatically classify objects into semantic classes, extract salient features, and segment the model into meaningful parts.
In other words, the geometric model must be associated with an abstract and machine-understandable representation that embeds knowledge about its content (semantics/meaning). Some of this knowledge can be automatically extracted by processing, analysing and structuring the raw geometry: we call this process deriving symbolic representations. Building up symbolic representations out of acquired data is vital to make the vision of semantic 3D objects come true. This topic is relevant in all our application domains, as the following examples illustrate: in Medicine, to reconstruct anatomical structures and monitor morphological changes over time; in Virtual Product Modelling, to re-engineer physical models; in Cultural Heritage, to digitise cultural heritage artefacts; in Bioinformatics, to highlight protein binding interfaces; in Gaming and Simulation, to attach interaction rules to object parts. Although the digitised physical models may vary in shape, size, material properties and purpose (surgery planning, re-modelling, virtual exhibitions, etc.), the problems in generating a semantically rich description out of the scanned data are much the same. Considering also our autonomous robot scenario: without sensing the environment, understanding the surroundings, building up conceptual models and planning actions, robotics as described in Section 1.2 will not become reality in the future. To derive symbolic representations, a direct flow can be devised from geometry acquisition to building a semantics-aware model, where the necessary intermediate step is shape understanding: the process of making explicit the knowledge implicitly hidden in the geometry, through feature identification, structuring and classification. Acquiring geometric models can be simple, or extremely complex. In the simplest setting, a laser scanner operating in good conditions to scan a clay mock-up of a car will return a point set.
If the point set is noise free and the clay mock-up relatively smooth, the encoded geometry is essentially stand-alone and the modeling process is straightforward: it can start, for example, in terms of reverse engineering to build NURBS patches from the point set data. Conversely, most of the time the acquired data are raw in the sense of being sparse, irregularly sampled, and riddled with uncertainty, so that they require heavy processing and a substantial amount of manual intervention; and the amount of such data is becoming overwhelming.
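To make the notion of raw, uncertainty-ridden scan data concrete, the following is a minimal sketch (not part of the original deliverable) of one common cleanup step for acquired point sets: statistical outlier removal, which discards points whose mean distance to their k nearest neighbours is far above the global average. All names and parameter values are illustrative; a production pipeline would use a spatial index rather than brute-force distances.

```python
import numpy as np

def remove_outliers(points, k=8, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbours
    is far above the global average. Brute force O(n^2), for
    illustration only; real pipelines would use a k-d tree."""
    diff = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=2))
    # Mean distance to the k nearest neighbours (index 0 is the point itself).
    knn = np.sort(dists, axis=1)[:, 1:k + 1].mean(axis=1)
    keep = knn < knn.mean() + std_ratio * knn.std()
    return points[keep]

# A small synthetic "scan": a nearly flat patch plus two gross outliers.
rng = np.random.default_rng(0)
cloud = rng.uniform(0, 1, size=(200, 3))
cloud[:, 2] *= 0.01                       # flatten into a patch
noisy = np.vstack([cloud, [[5.0, 5.0, 5.0], [-4.0, 6.0, 2.0]]])
clean = remove_outliers(noisy)
print(len(noisy), "->", len(clean))
```

This filters only gross outliers; the harder problems the text describes (sparsity, irregular sampling, systematic noise) need the heavier processing discussed in the challenges below.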
The abundance of geometric data is explained by the recent, considerable advances in modeling paradigms, in acquisition technologies, and in the variety of automatic conversion methods. Measurement data are acquired with an increasing variety of acquisition technologies, whose evolution is characterised by a shift from contact to contact-free sensors and from short- to long-range sensing, culminating in satellite images.

Medical imaging is another domain where handling raw data from imaging devices is an intensive pipeline process with numerous steps, especially when heterogeneous data from multiple imaging technologies have to be merged together. The performance of non-invasive medical imaging modalities such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) is excellent in terms of speed and resolution, which is a critical requirement in the medical field. Quantitative and qualitative medical image interpretation is successfully used for diagnosis and to evaluate new therapies. In addition, 3D anatomical models (e.g., bones, cartilages, ligaments, muscles, tendons, etc.) reconstructed from imaging data have been used in different applications (e.g., endoscopic surgery, computer-aided surgery, surgery simulators for training, personalised prosthesis design, etc.). However, difficult challenges remain concerning the standardisation of the acquisition protocol, data heterogeneity, storage and interoperability, data processing and hardware limitations.

Ultimately, multi-modal images introduce a great amount of information that can be used to construct medical ontologies and derive symbolic representations. As multi-modality introduces complementary information, more details about the images can be represented, and thus translated into symbols. Currently, the most commonly used dual-modal scanner in clinical practice is the PET/CT.
With this scanner, a CT of the subject is acquired to provide the structural definition, followed by a PET scan that provides functional information. Together, it is now possible to visualise a functional structure, such as a lesion, in relation to its anatomical surroundings. Open problems in multi-modal imaging are associated with hardware registration, for example errors caused by patient breathing and movement, and with software-based registration, which typically relies on the similarity of spatial features between the images and involves image alignment, scaling, warping, and transformations.

Various applications in different fields are based on the acquisition of whole human body models, e.g. computer graphics animation, body shape analysis, sport, medicine, etc. A multitude of 3D scanning solutions and technologies (e.g., laser scanning, white light scanning, photogrammetry and image processing) have been proposed, which are characterised by high resolution (≈ 1 mm) and allow a fast (less than 6 seconds) digitisation of the human body. Again, and despite great progress in hardware, the proposed software solutions still have limitations, and obtaining a good quality model still requires a lot of human intervention during the acquisition pipeline.

When, in addition, specific movements of the human body need to be acquired from a real person, motion capture is the appropriate modality to record the human joint kinematics. The skeletal motion is recorded by infrared cameras that track the trajectories of skin markers attached to different portions of the human body (e.g., pelvis, thigh, etc.). In kinesiology, it is common to combine motion capture with electromyography (EMG) and force plates for assessing musculoskeletal dynamics. Motion capture is used in many domains such as sports, gaming and medical applications.
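As a hedged illustration of the kind of post-processing that such marker trajectories typically undergo, the sketch below applies a simple moving-average filter to a synthetic noisy trajectory. Real motion-capture pipelines use more sophisticated filters (e.g. Butterworth or Kalman), and all names and values here are illustrative only.

```python
import numpy as np

def smooth_trajectory(traj, window=5):
    """Moving-average filter applied per coordinate to a marker
    trajectory of shape (n_frames, 3); the normaliser shrinks the
    window near the edges instead of padding with zeros."""
    kernel = np.ones(window)
    norm = np.convolve(np.ones(len(traj)), kernel, mode='same')
    out = np.empty_like(traj, dtype=float)
    for c in range(traj.shape[1]):
        out[:, c] = np.convolve(traj[:, c], kernel, mode='same') / norm
    return out

# Synthetic marker: smooth motion plus measurement noise.
rng = np.random.default_rng(1)
t = np.linspace(0, 2 * np.pi, 200)
true = np.stack([np.sin(t), np.cos(t), 0.1 * t], axis=1)
measured = true + rng.normal(0, 0.05, true.shape)
smoothed = smooth_trajectory(measured)
err_raw = np.abs(measured - true).mean()
err_smooth = np.abs(smoothed - true).mean()
print(f"mean error: raw {err_raw:.4f}, smoothed {err_smooth:.4f}")
```

Such low-pass filtering attenuates sensor noise but cannot correct systematic errors such as skin deformation artefacts, which require model-based correction.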
However, this recording system also presents some limitations: for instance, intensive post-processing (filtering, noise removal, skin artefact correction, etc.) is needed to evaluate the true joint kinematics from the recorded data.

Basically, when classifying an object or parts of an object, one faces the recognition dilemma: one can only recognise things that are known a priori; to know a thing means to be able to recognise it. The same paradigm applies to 2D data, where the so-called sensory gap further complicates the problem due to the presence of occlusions in the images. To be able to classify object parts, the object has to be segmented, that is, decomposed into its main constituent parts. With a slightly different meaning, the term segmentation is also used for detecting objects of interest in a raw data set, typically an image. In both cases segmentation is fundamental for the extraction of knowledge from the available data.
In medicine, segmentation plays a key role in the construction of semantically meaningful models. Image segmentation is the process whereby regions of interest (ROI) are extracted from the image data based on domain-specific knowledge about the images. For example, tumours can be automatically identified in MRI data based on the unique intensity and texture characteristics of the pixels belonging to these tumours. There is extensive literature available on medical image segmentation, which can be found in (Charbonnier et al. 2009). However, with the rapid increase in image resolution and quality, the ability to automatically derive meaningful features increasingly needs to be assisted and improved. Although there are signs that segmentation automation is improving, it is still a significant challenge to produce reliable and robust results, and with the arrival of multi-modal imaging, even more sophisticated segmentation algorithms need to be developed.

Object segmentation is also a very complex problem. In this case the shape of the object is known, but not the meaning and location of its relevant parts, or features. The state of the art offers many segmentation algorithms tailored for different purposes, and most of them work on the basis of geometric features such as curvature, attempting to mimic high-level semantic conditions via low-level geometric properties. Only recently has the research community started to explore semantically rooted segmentation algorithms, e.g. based on knowledge about the object's class, or on object taxonomies. Another approach to object segmentation is to automatically propagate the segmentation of one manually annotated model to other models in the same class, via partial matching.
If we are given tools to automatically detect partial similarities between the object and a set of object parts (a database), then the annotation of one object can be transferred to all the others in the class. Again, this is a developing research field, and no generic solutions are available yet.

There is a whole set of objects, such as animals and articulated shapes, that can be effectively characterised by a skeletal structure. Configurations of tubular parts can be represented at an abstract level by a centreline skeleton, i.e., a graph whose nodes correspond to the joints and whose arcs correspond to parts of the shape. Skeletons are very useful for describing a wide range of shapes and are frequently used in application domains such as Gaming and Simulation to code information about the motion of the shape.

Another developing idea to derive a symbolic representation of raw data is to fit parametric models to the scanned geometry. All the semantics embedded in the parametric model, and expressed by rules and constraints, can be inherited by the scanned model, which is – after the fitting process – represented by the re-parameterised 3D reference model. Generative and procedural modeling approaches are especially well suited to this process. Finally, these models become editable through their embedded parameters, which have been used to perform the fitting. Preliminary research is being carried out in this field, but the idea is still largely unexploited; approaches to cope with the combinatorial complexity of the parameter space are needed.

In the field of structural biology, the construction of symbolic representations deserves a discussion of its own, starting from the acquisition phase, where data about biological signals result from a complex experimental process and the interpretation of those data is heavily reliant on a priori knowledge.
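Returning to the idea of fitting parametric models to scanned geometry discussed above, the following is a toy sketch (not from the deliverable) that fits the simplest possible template, a sphere, to noisy sample points by linear least squares. The semantic templates envisaged in the text would of course carry far richer parameters, rules and constraints; all names here are illustrative.

```python
import numpy as np

def fit_sphere(points):
    """Algebraic least-squares sphere fit: solve the linear system
    |p|^2 = 2 c . p + (r^2 - |c|^2) for centre c and radius r."""
    A = np.hstack([2 * points, np.ones((len(points), 1))])
    b = (points ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    centre, d = sol[:3], sol[3]
    radius = np.sqrt(d + centre @ centre)
    return centre, radius

# Noisy samples of a sphere with centre (1, 2, 3) and radius 2.
rng = np.random.default_rng(2)
dirs = rng.normal(size=(500, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
pts = np.array([1.0, 2.0, 3.0]) + 2.0 * dirs + rng.normal(0, 0.01, (500, 3))
centre, radius = fit_sphere(pts)
print(centre.round(3), round(radius, 3))
```

Once the fit is done, the scanned data are represented by the template's parameters (here, centre and radius), which is exactly what makes the model editable and semantically interpretable.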
Biological functions are carried out by biological complexes, and over the past five years structural proteomics projects have given access to the structures of biological complexes at an unprecedented pace. The complexes solved are stored in the Protein Data Bank, and provide a unique opportunity to learn the key determinants of the stability and the specificity of biological interactions. Molecules interact by forming a complex, in what is known as the docking process; understanding and predicting docking is still recognised as the major open problem in structural bioinformatics. While early docking models, such as the lock-and-key model proposed in 1894 by Emil Fischer, the second Nobel Prize winner in chemistry, stipulated that macro-molecules assemble rigidly, Koshland suggested in 1958 that complex formation is often accompanied by an induced fit (deformation) phenomenon. Indeed, proteins are intrinsically flexible, i.e. they continuously undergo conformational changes over time, or equivalently, exist at a
given time as an ensemble of conformations in equilibrium. This key property makes the life of structural biologists especially difficult. On the one hand, experimentalists working in crystallography often face situations where a structure cannot be resolved because flexibility impairs the biophysical signal in the X-ray diffraction spectrum. On the other hand, modellers investigating complexes whose partners undergo large conformational changes are in general unable to predict satisfactory conformations of the complex, a fact well established by the Critical Assessment of PRedicted Interactions (CAPRI) community-wide experiment.

Besides properly taking molecule flexibility into account, two main questions are posed. First, given two isolated proteins, one would like to know whether they interact, and if so, whether this interaction is specific and stable. Second, given two proteins known to interact, one would like to optimise their interaction, for example in the context of drug design. Answering these questions is a non-trivial task. So far, they have been approached from different perspectives, both theoretical and experimental. Bridging the gap between insights gained in the two realms will require a subtle mix of biophysics, geometric modeling, and knowledge management. The former two disciplines are naturally coupled, since the conformation of molecules is the fingerprint of the forces acting on the atoms. As for knowledge technologies, they provide the cement to annotate subtle and diverse properties of proteins and complexes, thus underpinning the modelling of interactions.

In the following, we describe some of the open issues that have been indicated as particularly relevant to the steps of data acquisition, shape understanding and semantics-aware representations, and that will contribute to the achievement of the global challenge of an automatic derivation of symbolic and semantic descriptions of 3D models.
These sub-challenges are each presented by a title, which synthesises the issue, and a short description.

Challenge: From offline to semantically annotated and nearly real-time interpretation in 3D data acquisition

The recent advances in the computational performance of multi- and many-core processors have the potential to provide the data acquisition operator with feedback on the completeness of the data and preliminary suggestions for interpretation, segmentation and semantic classification of the acquired data. Such feedback will enable the operator to augment or correct the proposed semantic classification, and further to complement the acquired 3D data with additional measurements during the actual acquisition session. As the data acquisition operator knows the context in which (s)he operates, this also opens the possibility for the operator to provide information on the context of the data acquisition, and thus to enable a near real-time data interpretation to use context-oriented rule sets for segmentation and interpretation. Examples of relevant contexts are: a traditional building (mostly horizontal or vertical planar surfaces, most often 90° angles between planar surfaces), a process plant (a combination of planar surfaces and piping structures), a cultural artefact (a sculptured shape degraded over the centuries), a cityscape (dominated by buildings and human-made objects and structures), nature without human footprint (sculptured and fractal shapes), nature influenced by humans (a landscape with houses and human-made objects), the exterior of a car (Class-A surfaces).

Challenge: Managing the abundance, heterogeneity and uncertainty of data

The recent advances in acquisition technologies have made a massive amount of data available about the geometry of objects; unfortunately, in the majority of cases the acquired data are raw in the sense of coming from different devices, being irregularly sampled, and being riddled with uncertainty (including noise and outliers).
Therefore, despite the expectation that technological advances should improve quality, geometric data sets are increasingly unfit for direct processing. In addition, geometric data are increasingly heterogeneous due to the combination of several distinct acquisition systems, each of them acquiring a certain type of feature and level of detail. Furthermore, we are observing a trend brought on by the speed of technological progress: while many practitioners use high-end acquisition systems, an increasing number of them turn to consumer-level acquisition devices such as digital cameras, willing to replace an accurate but expensive acquisition by a series of low-cost acquisitions. Overall, robustness to
defect-laden inputs is a largely unaddressed scientific challenge: there is a need for a unified theory and for algorithms capable of dealing with data hampered by a variety of imperfections. There is agreement that formalising the a priori knowledge about a 3D shape before acquisition is instrumental for reaching the required level of robustness. The challenge contributes to the creation of digital models that are as close as possible to the original physical shape, and therefore capable of replacing the original faithfully in digital processing and manipulation tasks.

Challenge: Overcoming hardware limitations in medical imaging and kinematic data acquisition

As a result of recent advances in technology, in terms of resolution capability and acquisition time, several techniques based on MRI data are now available for dynamic tissue analysis and especially for joint kinematic studies (Gilles et al. 2004). However, due to the limited acquisition volume caused by the reduced space in the scanner, the range of motion is restricted. The same issue holds for motion capture techniques, where the capture volume is limited by the amount of space that the cameras can "see". In the future, MRI, open MRI and MoCap technologies could provide more flexibility (e.g., portable systems) and new opportunities to record the subject in extreme postures. The challenge contributes to the acquisition of dynamic 3D data about the patient, which are essential for the definition of patient-specific models with the required functional data.

The subject is scanned in split position. This picture depicts the difficulties with current hardware to acquire new motions or positions (©MIRALab, University of Geneva).

Real-time dynamic MRI of the thigh. Image from (Gilles et al. 2004), used by permission.
Challenge: Improving the accuracy and automation of multi-modal biomedical image registration

Multi-modal medical imaging has made breakthrough advances in recent years, with the introduction of PET/CT, and with PET/MR to be made available for clinical use in the near future (Heitbrink 2008). These hardware-based dual-modality images use "pre-registered" data,
where the images are considered to be already registered since they were acquired sequentially from a single scanner (Rosset 2006), as exemplified in the picture below. Although such hardware methods produce good results, problems remain in multi-modal image analysis. Firstly, there will always be the unavoidable issue of patient movement, such as that caused by breathing and the beating heart. Secondly, only a limited number of types of multi-modal scanners are clinically utilised (PET/CT, SPECT/CT, US/PET, etc.), with new scanners only starting to be made available in clinical practice. For example, PET/CT has been around since the mid-2000s, while the first PET/MR will be put into clinical service in 2010. Fortunately, there appears to be a stream of new multi-modal devices entering the market within the next few years. Thirdly, hardware solutions cannot register more than two images, and thus do not support truly multi-modal registration, which often involves three or more image data sets to be fused. Currently (as of this road map), there are no hardware solutions that support more than two modalities; nevertheless, we expect such systems to exist in the future.

There has been a tremendous effort put into solving the first problem in particular, developing software solutions that use the hardware data; however, a lot of additional work still needs to be done. Software-based registration typically relies on the similarity of spatial features between the images, and typically involves image alignment, scaling, warping, and transformations. Software-based approaches have the advantage that, with manual intervention, almost any image modalities can be registered together, e.g. dynamic MRI/US and SPECT/MRI, thus addressing the second and third problems. Future work must lead towards further automation and improved accuracy in both the software and hardware registration solutions.
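One widely used similarity measure behind such software-based registration is mutual information, which rewards statistical dependence between intensities rather than their equality, and therefore works across modalities whose intensity scales are unrelated. The following is an illustrative sketch (not from the deliverable) using simple joint histograms; the synthetic "CT" and "MR" images and all parameter values are assumptions for demonstration.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Histogram-based mutual information between two equally shaped
    images: I(A;B) = sum p(a,b) log( p(a,b) / (p(a) p(b)) )."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                       # avoid log(0) on empty cells
    return (pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum()

# Two synthetic "modalities": same anatomy, unrelated intensity maps.
rng = np.random.default_rng(3)
anatomy = rng.integers(0, 4, size=(64, 64))               # 4 tissue classes
ct = anatomy * 50.0 + rng.normal(0, 2, anatomy.shape)
mr = (3 - anatomy) * 30.0 + rng.normal(0, 2, anatomy.shape)  # inverted map
aligned = mutual_information(ct, mr)
shifted = mutual_information(ct, np.roll(mr, 5, axis=0))     # misaligned
print(f"MI aligned {aligned:.3f} vs shifted {shifted:.3f}")
```

A registration algorithm would search over transformations (translation, rotation, warping) for the one maximising this score; note that even though the "MR" intensities are inverted with respect to the "CT" ones, the aligned pair still scores higher than the misaligned one.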
With hardware developers working in conjunction with software teams, a breakthrough solution is feasible. Ultimately, multi-modal images, obtained through either software or hardware registration, introduce a great amount of information that can be used to construct medical ontologies and derive symbolic representations. As multi-modality introduces complementary information, more details about the images can be represented, and thus translated into symbols. As noted above, the most commonly utilised dual-modal scanner in clinical practice is currently the PET/CT, which makes it possible to visualise a functional structure, such as a lesion, in relation to its anatomical surroundings, as in the example below from (Nestlea 2006).

The challenge contributes to the acquisition of highly complex phenomena that can be captured with different measuring devices and occur at specific locations in the human body. The registration of the measures and their fusion is essential for the creation of a patient-specific model, and for the derivation of symbolic representations of morphological and functional information.

FDG-PET-CT fusion images of a patient with [18F]-FDG-negative atelectasis. Pink structure: GTV derived from PET (source-background algorithm). Red contour: GTV derived from CT. Image from (Nestlea 2006), used by permission.
Challenge: Standardisation of the acquisition protocol for medical images and kinematic data to facilitate data exchange and comparison

The new scanning technologies are introducing multiscale capabilities, which allow living tissues to be imaged at radically different scales and links to be established between these levels of detail. There are equally important advances on the software side, as advanced processing and visualisation tools are directly integrated into the clinical scanners to improve the clinicians' diagnoses (see the picture below). Nonetheless, besides the data heterogeneity, storage and interoperability issues, difficult challenges remain concerning the standardisation of the acquisition protocol, not only for still medical images, but also for kinematic data acquired by motion capture systems. Kinematic acquisition via MoCap indeed requires specific protocols to set up the system (camera calibration, coordinate systems) and to define the number and placement of the skin markers. Intensive post-processing (filtering, noise removal, skin artefact correction, etc.) is also needed to evaluate the true joint kinematics from the recorded data (Charbonnier et al. 2009). Finally, skin deformation artefacts (Lafortune et al. 1992) are a major issue that could be solved by using bone implants as markers (a common practice in surgical navigation). Therefore, depending on the application and needs, different protocols and post-processing methods are used. Consequently, the exchange and comparison of data between different scientific communities becomes very difficult due to the lack of a common methodology. In the future, it is necessary to overcome these limitations by offering more flexibility (e.g., portable systems) and standard protocols. The use of semantics in this process could be beneficial to improve standardisation and data sharing.
Recently, motion capture system manufacturers have introduced new systems offering more flexibility, such as marker-less tracking, which simplifies the setup, and portable systems that can be used indoors and outdoors (see, for example, the Xsens company). However, these technologies are still not accurate enough, and ongoing work aims to resolve that issue in the coming years.

3D models of the hip reconstructed from MRI data (©MIRALab, University of Geneva).

Challenge: Knowledge-based X-ray crystallography in structural biology

The main acquisition techniques in structural biology are X-ray crystallography, nuclear magnetic resonance, and cryo-electron microscopy. While the first two are geared towards atomic-resolution models, the latter is dedicated to the reconstruction of larger assemblies. We use here the example of X-ray crystallography, and discuss the challenges faced in terms of algorithms and knowledge analysis. To make a long story short, a biophysicist interested in solving the structure of a given protein at the atomic level using X-ray crystallography has to go through the following steps: (i) introduction of the gene coding for this protein into a host cell, (ii) large-scale protein expression and purification, (iii) crystallisation, i.e. crystal growth, (iv) model reconstruction from the X-ray diffraction spectrum. In the following, we focus on two difficulties faced during this whole process, which are coupled to important knowledge about the protein and require the development of specific
algorithms. The first problem is that of protein flexibility. If the protein processed has one or more flexible regions, the signal associated with such a region will be low in the diffraction spectrum. In favourable cases, the biophysicist may be able to reconstruct a set of plausible conformations for a flexible loop, that is, a mixture of loop conformations; in the most difficult cases, (s)he may have to leave a blank in the reconstruction, meaning that the reconstruction of this particular region has been impossible. From an algorithmic standpoint, reconstructing a flexible loop is a challenging task as soon as the number of residues involved is larger than 10. This requires (i) solving an inverse problem, namely generating candidate conformations of the loop given fixed anchors at the extremities, and (ii) choosing the conformation achieving the best correlation with the experimental data. Knowing that a particular region is highly flexible is of course key in modeling, in particular for the docking problem.

The second problem is related to the biological significance of what is observed in the crystal. The output of the reconstruction process is the so-called asymmetric unit of the crystal, namely the smallest structural unit which, when operated upon by the symmetry elements of the space group, yields the total crystal structure. On the other hand, the biologist is interested in the biological unit, which is the unit responsible for a particular biological function. Establishing the relationship between the asymmetric unit and the biological unit is a non-trivial task. In a crystal structure, contacts between polypeptide chains are forced, so that most of the time the contacts observed do not correspond to protein complexes associated with biological reactions.
From an algorithmic perspective, distinguishing biological from crystal contacts requires characterising the packing properties of the atoms in contact, in conjunction with energetic functions assessing the stability of the crystal structure. This problem is still open in its full generality. From the knowledge standpoint, the information gained from it is crucial: for example, knowing that a protein is a hetero-dimer is a prerequisite to using this protein in any modeling or simulation process.

Challenge: Coupling theoretical and experimental approaches to unravel the structure of protein interfaces

Open issues about protein interactions have been approached from different perspectives, both theoretical and experimental. In the first realm, models and algorithms have been designed to investigate the geometric structure of interfaces, in conjunction with properties such as the phylogenetic conservation of amino acids and their polarity, or the dynamics of solvent molecules squeezed in between the partners. On the experimental side, techniques such as isothermal titration calorimetry or surface plasmon resonance allow a quantitative assessment of the binding affinity of an interaction, while directed mutagenesis provides a direct assessment of the importance of amino acids. Yet, the fine structure of macromolecular interactions still has to be unravelled. Bridging the gap between insights gained in the two realms will require a subtle mix of biophysics, geometric modeling, and knowledge management. It is a common opinion that knowledge technologies can provide the means to encode and correlate these subtle and diverse properties of proteins and complexes into a unique symbolic source, thus underpinning the modeling of interactions. In fact, one would ideally like to develop high-level descriptions of interfaces, providing unified and symbolic representations of key biophysical, phylogenetic, and geometric properties.
Such representations would ease the manipulation of proteins to model large networks of interacting proteins.
A protein-protein interface, as a function of the depth of interfacial atoms, from (Bouvier et al. 2009). Only one of the two proteins forming the complex (the bottom one) is shown. (a) Voronoi diagram restricted to the interface between the two proteins; the colour shade, from cold (dark blue) to hot (red) colours, measures the depth of a Voronoi polygon at the interface. (b) The depth mapped onto the atoms found at the interface. The depth map of the interface provides a unified way to apprehend correlations between a number of geometric and biophysical parameters. It paves the way to the development of abstract interface models that will be amenable to annotations and symbolic processing.

Challenge: Modeling molecule flexibility

Flexibility is actually a multi-scale phenomenon involving correlated motions of atoms interacting both covalently and non-covalently (atoms that do or do not share covalent bonds) in dense surroundings, so that the heart of the problem consists of disentangling these couplings. From the computer science perspective, this process is reminiscent of dimensionality reduction (finding couplings), but also of rigidity analysis (finding the hinges of a molecule). On the other hand, abstract mathematical models encoding the flexibility of molecules are not easily accessible to people designing experiments, for example the optimisation of a protein complex. Ideally, one would like to have a classification of proteins based on the flexibility properties of their domains or secondary structure elements, depending on certain additional parameters (such as the biological environment, for example). Deriving a symbolic representation based on flexibility would clearly provide a strategic advantage when docking molecules, and calls for further work on the problem of protein shape analysis and annotation (Guharoy et al. 2010).
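As a hedged sketch of the dimensionality-reduction view of flexibility, the code below applies principal component analysis to a toy ensemble of conformations and recovers a single dominant collective "hinge" motion. Real analyses operate on molecular dynamics or NMR ensembles with thousands of atoms; the 10-"atom" chain and all values here are illustrative assumptions.

```python
import numpy as np

def conformation_pca(ensemble):
    """PCA over an ensemble of conformations, shape
    (n_conformations, n_atoms, 3). Returns eigenvalues (variance
    captured per collective mode) and the modes as displacement
    fields of shape (n_modes, n_atoms, 3)."""
    n, a, _ = ensemble.shape
    flat = ensemble.reshape(n, a * 3)
    flat = flat - flat.mean(axis=0)
    cov = flat.T @ flat / (n - 1)
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]          # descending variance
    return evals[order], evecs[:, order].T.reshape(-1, a, 3)

# Toy ensemble: a 10-"atom" chain whose last atom swings around one
# hinge, plus small isotropic jitter on all atoms.
rng = np.random.default_rng(4)
base = np.stack([np.arange(10.0), np.zeros(10), np.zeros(10)], axis=1)
frames = []
for angle in rng.uniform(-0.5, 0.5, 100):
    conf = base.copy()
    conf[-1, 1] += 3 * np.sin(angle)         # hinge-like swing
    frames.append(conf + rng.normal(0, 0.01, conf.shape))
evals, modes = conformation_pca(np.array(frames))
print("variance captured by first mode:", evals[0] / evals.sum())
```

The first eigenvector localises on the swinging atom, i.e. PCA has "found the coupling"; a symbolic flexibility descriptor could then record which parts of the shape move together and with what amplitude.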
Challenge: Automatic semantic-driven segmentation of 3D objects

Many automatic segmentation methods already exist in the literature, suited to identifying different kinds of features on specific shape classes, while none of them is recognised to work better than the others in the general case. Since it seems unlikely that such a general segmentation method will ever be conceived, there is a trend towards using intelligent combinations of different methods, each optimising a different segmentation criterion. The challenge here is how to automatically select the proper segmentation methods and/or how to combine the outputs of a set of decomposition algorithms in order to return a meaningful segmentation of a given object according to its class, its meaning/function, or the particular application domain. This may require a formalisation of the correspondence between shape categories, the shape properties that characterise a shape in a specific category, and the segmentation methods able to identify such properties. Knowledge technologies are a promising way to perform this task. We believe that the problem should be addressed by a tighter coupling of geometry processing methods with cognitive science, so that the properties characterising the shapes can be better aligned with the features used by humans to "understand and classify" shapes. Also, machine learning and similar statistical methods are expected to play a key role in coping with the complexity and variability of 3D object shapes, and as a tool supporting the automatic segmentation of massive datasets.
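One very simple way to combine the outputs of several decomposition algorithms is their common refinement: two elements end up in the same segment only if every method groups them together. The sketch below is illustrative only (the challenge above calls for far smarter, knowledge-driven combinations); the two "segmenters" and their labels are hypothetical.

```python
import numpy as np

def common_refinement(*labelings):
    """Combine several segmentations of the same elements into their
    common refinement: each element is keyed by the tuple of labels
    it receives from all methods, and identical tuples form the
    refined segments."""
    stacked = np.stack(labelings, axis=1)     # (n_elements, n_methods)
    _, refined = np.unique(stacked, axis=0, return_inverse=True)
    return refined

# Two hypothetical segmenters on 8 mesh elements: one splitting by
# a curvature criterion, one by a coarse semantic label.
by_curvature = np.array([0, 0, 1, 1, 1, 2, 2, 2])
by_semantics = np.array([0, 0, 0, 0, 1, 1, 1, 1])
combined = common_refinement(by_curvature, by_semantics)
print(combined)   # each distinct label pair becomes its own segment
```

The refinement only splits, never merges, so it inherits every boundary any method proposed; selecting which boundaries are semantically meaningful is exactly where the knowledge-driven component described above comes in.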
Challenge: Derive expressive symbolic descriptions of 3D shapes

Provided that meaningful segmentation methods become available which, singly or combined in some smart way, provide all the important features and components of a shape, the challenge concerns how to organise them symbolically, encoding their metric and semantic properties, relevance, and level of detail in a hierarchy of symbols that fully represents the semantics of the object and its parts. Much as in linguistic morphology, where words and lexical groups are seen as interfaces between syntax and semantics and are organised in a semantic skeleton (Rochelle 2004), 3D shape features and components should be hierarchically arranged without losing the link with the source geometry of each entry. An example may be the shape graph proposed in (Mortara et al. 2004), where the skeleton is the adjacency graph of the segments given by a multi-scale decomposition into tubular features. Each segment retains geometric and topological attributes, but the only semantic knowledge encoded so far is whether a segment is a tube or not. The semantic skeleton envisaged by this challenge, however, should be expressive enough to make a 3D shape completely machine understandable.

4.1.1 Time line and dependence diagram

In this section, a diagram is presented showing the time lines of the proposed issues and the dependencies between the issues connected with the grand challenge Derive symbolic representations. The dark boxes cluster families of challenges that, in our opinion, are related to each other; we gave each cluster a name, i.e. DSR1, DSR2, DSR3 (Derive Symbolic Representation 1, 2, 3). The dependencies are marked with an arrow from A to B, which means "know-how from A will support the development of B". The boxes are aligned with a time line, which represents when we estimate each challenge will be achieved.
The names of the challenges in the boxes do not coincide exactly with the ones used in the text.
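The attributed shape graph discussed in the symbolic-description challenge above can be sketched as a small data structure; all class and attribute names here are illustrative, with `is_tube` standing in for the single semantic tag encoded in (Mortara et al. 2004).

```python
class Segment:
    """One node of the shape graph: a mesh segment with its attributes."""
    def __init__(self, seg_id, is_tube=False, **attrs):
        self.id = seg_id
        self.is_tube = is_tube   # the only semantic tag in (Mortara et al. 2004)
        self.attrs = attrs       # geometric/topological attributes (area, genus, ...)
        self.face_ids = []       # link back to the source geometry

class ShapeGraph:
    """Adjacency graph of segments from a multi-scale decomposition."""
    def __init__(self):
        self.segments = {}
        self.adjacency = set()   # unordered pairs of segment ids

    def add_segment(self, seg):
        self.segments[seg.id] = seg

    def connect(self, a, b):
        self.adjacency.add(frozenset((a, b)))

    def tubes(self):
        return [s.id for s in self.segments.values() if s.is_tube]

# A toy hand model: one palm segment with five tubular fingers attached
hand = ShapeGraph()
hand.add_segment(Segment("palm"))
for i in range(5):
    hand.add_segment(Segment(f"finger{i}", is_tube=True))
    hand.connect("palm", f"finger{i}")
```

The semantic skeleton envisaged by the challenge would enrich such nodes with a full hierarchy of concepts rather than a single boolean tag, while keeping `face_ids` as the link to the geometry.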
4.2 Goal-oriented 3D model synthesising

When creating a 3D model, the practitioner most often has an end goal, i.e., an application, in mind. For the computational engineer, the model must fulfil a number of functions while matching a set of constraints related to its size or manufacturing cost. For the medical practitioner, an organ must function well after surgery or drug treatment. In gaming and simulation, the model must interact well while providing a faithful or artistic depiction. For these reasons the computerised representation of the model is conceived with a substantial amount of knowledge which, among many application-dependent principles, relates the shape to its function. While this is currently satisfactory for a number of applications, the knowledge is often incorporated into the process only through the practitioner. In addition to the computerised representation of the object, an ambitious challenge consists of also formalising this knowledge in a computerised manner, such that the whole conception cycle (which comprises modelling and simulation) ultimately becomes driven solely by the targeted end goal. Furthermore, when acquisition is used instead of modelling, as in medical or reverse engineering applications, another challenge consists of elaborating an integrated process, again driven by the end goal (e.g., machining and quality control, or surgery), with a feedback loop relating acquisition to simulation. In this section, we draw a research road map geared toward goal-oriented model synthesising. To this end, we discuss the problems and algorithms required to elaborate models specifically tuned for the targeted applications, as well as the organisation of storage and retrieval for these models. One enduring challenge is the problem of synthesising a shape to fulfil a certain function.
Some examples of simulation-driven modelling exist, for instance in the field of computational fluid dynamics, where an optimal shape is generated through repeated simulations. Nevertheless, no general, knowledge-based approach exists to achieve a well-principled, function-oriented shape synthesis. A first step in this direction is to relate shapes to functions and vice versa, and to apply rule-based shape modelling. In mechanical engineering there exist well-established construction methodologies and principle solutions for achieving certain effects ("Wirkprinzipien", physical principles). Transforming these well-documented principle solutions into a computer-interpretable knowledge base can help in re-using these provably working solutions.

The first difficulty stems from the fact that the term function can have a variety of meanings across domains, and even within a single domain. In engineering, function may cover, e.g., mechanical, dynamic, hydraulic, and electronic behaviour. Furthermore, the design process must consider all aspects and phases of the product life cycle, such as tool making, production, use, environmental compliance and disposal. The second major difficulty stems from the fact that simulation-driven modelling requires looping between simulation and modelling. This requires many traversals of the geometry processing pipeline, which is notoriously difficult to streamline. Overall, the problem of synthesising "functional" shapes is very complex, and automation needs to be studied carefully depending on the context. Such contexts need to be conceptualised in order to build up knowledge bases. Semantics in this context must be interpreted as functional behaviour, and requires simulating the functions modelled. As of today there is neither a single modelling language covering all the different kinds of functional modelling, nor a single simulator able to simulate all kinds of models; all past attempts to find such a modelling language or to develop such a simulator have failed. Careful studies need to approach the problem field on a case-by-case basis, advancing step by step.
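The simulate-and-modify loop underlying simulation-driven modelling, as described above for computational fluid dynamics, can be sketched generically. The "drag" function and adjustment rule below are toy stand-ins for a real solver and a real shape-editing step, used only to make the control flow concrete.

```python
def simulation_driven_loop(params, simulate, adjust, max_iters=20, tol=1e-3):
    """Generic end-goal-driven loop: simulate the current model, check the
    objective, adjust the shape parameters, and repeat until the target
    performance is reached or the iteration budget is spent."""
    score = simulate(params)
    for _ in range(max_iters):
        if score < tol:          # end goal reached
            break
        params = adjust(params, score)
        score = simulate(params)
    return params, score

# Toy stand-in for a CFD run: "drag" is minimal for a radius of 2.0
drag = lambda r: (r - 2.0) ** 2
# Crude adjustment rule standing in for an actual shape modification step
step = lambda r, _: r - 0.5 * (r - 2.0)

radius, final_drag = simulation_driven_loop(10.0, drag, step)
```

In practice each `simulate` call hides a full traversal of the geometry processing pipeline (meshing, repair, solving), which is precisely why streamlining that pipeline, discussed next, is a prerequisite for this kind of loop.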
In some areas, e.g., in simulation, automation has to be interwoven with the process of finding the best representations and with the simulation process itself. In some cases it can only be decided during the simulation process where more detail is needed to produce a reliable simulation result. For all these reasons, the function should ideally be abstracted, in the sense that its description must be independent of the 3D representation. Finally, we need to conceptualise application and experience knowledge to bring the abstract concepts of shape and function into context, and to apply semantically aware algorithms to practical problems. In this context, trying to come up with one single magic solution is likely to fail. Nevertheless, semantics-based automation can generate high increases in productivity in industry, by giving engineers access to proven principles and by relieving them of repetitive manual modelling and processing stages. The potential benefits are multiple and affect almost all areas where application working groups have been active in FOCUS K3D: i) virtual products or anatomical models could be defined faster, transformed for use for different purposes, and re-used in a variety of evolving contexts; ii) virtual models for games could be derived from master models and automatically adapted to various hardware environments; iii) digital cultural heritage models could be more easily re-purposed for different applications such as research or virtual exhibitions. For goal-oriented model synthesising, an additional challenge is to provide practitioners with function-centric semantic facets and descriptors for browsing and retrieval from massive datasets. One example of a targeted application scenario is the conception of a mechatronic product, where the engineer would reuse existing knowledge by querying for a model that fulfils a set of specified functions.
Part of the problem is to obtain a computer-interpretable knowledge base together with instances of reusable models. Furthermore, in the context of simulation-driven modelling, another important aspect of the problem is how to formalise the knowledge about the results of the simulations themselves, which may constitute massive data as well. The latter models are in a sense already matured over time, while still being purely digital. We now list a number of focused challenges that we use to exemplify the proposed research road map.

Challenge: Streamlining the geometry processing pipeline for goal-oriented modelling and simulation

The geometry processing pipeline ranges from acquisition to machining, including registration, reconstruction, repair, simplification, analysis, editing, watermarking, storage, transmission, searching and browsing. The lack of robustness, generality, and guarantees of the algorithms makes streamlining the whole processing pipeline nearly impossible. While each building block along the pipeline is presented as a fully automatic process in academic papers, their requirements on inputs and lack of guarantees on outputs prevent these building blocks from working together seamlessly. Practitioners thus need to deal with a trial-and-error iterative process not conducive to efficient geometry processing. This lack of automation can have dire consequences for the computational engineer, especially in the context of goal-oriented modelling and simulation. For instance, an aircraft manufacturer may need to perform a computational fluid dynamics simulation on a production-level CAD model of a helicopter, which must first be converted into a watertight, intersection-free surface mesh. Unnecessary features (such as interior details) may also need to be removed from the CAD model before it is converted into a smooth surface mesh amenable to computations. The overall procedure, involving conversion, repair and defeaturing, currently takes an experienced engineer weeks, while the simulation itself takes one hour of parallel computation on a cluster of 2K computers. As this procedure must be repeated for each major edit of the CAD model, and as it is the wall-clock time of the whole process that matters in such industrial applications, it is of crucial importance to reduce its duration from weeks to hours through automation. The solution would be to elaborate algorithms that are extremely resilient with respect to their input while providing guarantees on their output.

Challenge: Reusing the knowledge in maintenance

Numerical behaviour simulation is fundamental for assessing new solutions in various engineering activities, as it allows physical validation tests to be avoided.
Numerical simulations are used in various product life cycle phases, from initial design to maintenance and other life cycle problem analyses. Such activities are frequently subject to various constraints that are crucial from the production point of view. In the case of maintenance, for instance, constraints related to production interruption normally arise that make it crucial to accelerate part substitution or adjustment. Today, product behaviour analyses are classically carried out in 5 steps: (1) CAD solution modelling; (2) mesh model preparation, which also includes some shape adaptation to make the model tractable; (3) FE model preparation, adding semantic information related to the behaviour modelling, such as boundary conditions and behaviour laws, which includes creating groups of mesh elements to support the specification of the simulation semantics; (4) FE simulation; (5) result analysis and optimisation loops. These optimisation loops normally require going back to the CAD model (step 1), adjusting the shape accordingly, and then performing all the preparation and simulation phases again. Therefore, the first four steps, shown in the figure (taken from Lou et al. 2009), are often executed several times in the loop until the results are satisfactory. It is then clear which advantages could be obtained if the behaviour-related semantic data could be maintained throughout all these modification activities; even more, it is evident that the possibility of acting directly on the FE mesh model, with efficient shape modification tools able to maintain such semantics, would drastically reduce the cost of evaluating different alternatives to find the optimal solution.
The development of such tools needs to: 1) provide efficient shape modelling capabilities working directly on 2D and 3D meshes, possibly with idealised elements; 2) maintain and update the geometric support of the associated behaviour-related semantics according to the shape changes.
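The second requirement, keeping behaviour-related semantics attached to mesh element groups across shape changes, can be sketched as follows. This is a minimal toy model, not an actual FE data structure: the class names, the `u=0` boundary condition, and the old-to-new element mapping are all illustrative.

```python
class FEModel:
    """Toy FE model: named element groups carry boundary conditions (the
    'simulation semantics'); remap_after_remesh keeps the conditions
    attached to the right elements when the mesh changes."""
    def __init__(self, elements):
        self.elements = set(elements)
        self.groups = {}       # group name -> set of element ids
        self.conditions = {}   # group name -> boundary condition description

    def add_group(self, name, elem_ids, condition):
        self.groups[name] = set(elem_ids)
        self.conditions[name] = condition

    def remap_after_remesh(self, old_to_new):
        """old_to_new maps each old element id to the new ids covering it;
        group membership follows the mapping, conditions stay untouched."""
        for name, elems in self.groups.items():
            self.groups[name] = {n for e in elems for n in old_to_new.get(e, ())}
        self.elements = {n for news in old_to_new.values() for n in news}

fe = FEModel({1, 2, 3})
fe.add_group("clamped", [1, 2], "u=0")        # hypothetical boundary condition
fe.remap_after_remesh({1: (10, 11), 2: (12,), 3: (13,)})
# the "clamped" group now covers elements {10, 11, 12}, still with u=0
```

The hard research problem is of course computing a reliable `old_to_new` correspondence after an arbitrary shape modification; the sketch only shows what needs to be preserved once it exists.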
Challenge: Simulating functions of the human body

The human body is composed of different complex and heterogeneous elements. Modelling this system remains a big challenge due to the complexity of its geometry, mechanical behaviour and interactions. Consequently, the study of the human body is divided into different disciplines and, despite the interdependencies, the models developed in the various domains each focus on one specific aspect. Nowadays, creating a complete and accurate human model is still a difficult task. Therefore, multi-disciplinary networks are being created in order to exchange and combine knowledge from different domains of expertise (e.g., clinicians and engineers), and to develop applied technologies for medical purposes. Simulating the functions of the human body requires multimodal data provided by various acquisition modalities. For instance, the 3D anatomical models (e.g., bones, cartilages, ligaments, muscles, tendons, etc.) are reconstructed from imaging data using segmentation techniques, while the functional and kinematic parameters (e.g., motion parameters, material properties, applied loads and forces, etc.) require several acquisition and post-processing techniques. A semantics-driven medical application could manage and structure the large amount of acquired data more efficiently (Charbonnier et al. 2007). As a result, the 3D visualisation, navigation and analysis of the human body would be improved, with a positive impact on medical procedures as well as on teaching and training. We believe that in the coming years the growing advances in hardware technologies will have a great impact on the simulation phase. However, the integration with medical devices (e.g. scanners, surgery simulators, etc.) and the modelling of human physiology need more time.
Challenge: Models combining geometry and biophysics knowledge for docking of proteins

Simulation using macromolecular models encompasses a huge number of scenarios, and we focus in the sequel on one of the most important ones, namely docking. Generically, docking aims at predicting an assembly from its unbound partners. The number of protein structures known to date is about 100 times smaller than the total number of non-redundant coding sequences. In fact, the 55,000 structures found in the Protein Data Bank contain a mere 1,000 or so biological complexes. In other words, there is an extreme paucity of known biological complexes, and figuring out such complexes from the unbound partners is the goal of docking. Practically, a number of variants of docking exist. First, one may distinguish according to the number and nature of the partners: binary protein-protein docking; binary protein-drug docking; docking of multiple-component assemblies; etc. Second, one may also distinguish according to the properties of these partners: rigid-body docking if they do not undergo any significant deformation upon binding, flexible docking otherwise. As a prerequisite to docking, structural biologists analyse data and identify amino acids, as well as regions on the protein surface, that may be involved in the binding process. These residues and regions often concentrate the bulk of the effort during the docking. But as of today, the signals conveyed by conservation and mutagenesis data are complex and poorly understood. In particular, the couplings between residues, both in terms of conservation and of variation of free energy, deserve further investigation; such investigation will undoubtedly yield advanced annotations of proteins. Rigid-body docking aims at finding the best relative position of two partners which do not deform upon binding.
This problem benefits greatly from the identification of the regions most likely to be in contact, since the bulk of the effort can be concentrated there. When the biological information is not precise enough, the geometric detection of pockets, and more generally of surface regions favourable to binding, is called for. As evidenced by the CAPRI community-wide experiment, rigid-body docking is pretty much under control. Yet in many cases the molecules deform in order to bind. First, rigid-body docking is applied to coarse representations of the molecules, so as to select approximate bound conformations and relative positions of the partners. Second, upon switching to atomic models, flexibility is investigated at a local scale, typically that of the side chains: their conformations are explored thanks to so-called rotamer libraries, which list the admissible conformations of the side chains. Third, the atomic positions are adjusted by running a molecular dynamics simulation. Finally, for each conformation retained, a score is computed so as to rank all the putative solutions. For a very flexible region, the question arises of generating families of conformations representing the admissible geometries. For cases where the flexibility is milder, developing scoring functions able to single out the near-native conformations is a central question. All these developments require a tight coupling between geometry, for the description of the models, and biophysics knowledge, for the calculation of energies.

4.2.1 Time line and dependence diagram

In this section we present a diagram showing the time lines of the proposed issues and the dependencies between the issues connected with the grand challenge Goal-oriented 3D model synthesising. The dark boxes cluster families of challenges that, in our opinion, are related to each other; we gave each cluster a name, i.e. G3DMS1, G3DMS2, G3DMS3 (Goal-oriented 3D Model Synthesising 1, 2, 3). The dependencies are marked with an arrow from A to B, meaning "know-how from A will support the development of B". The boxes are aligned with a time line, which represents the time by which we estimate the challenge will be achieved. The names of the challenges in the boxes do not coincide exactly with the ones used in the text, and we added a few more issues in the diagram.
4.3 Documenting the life cycle of 3D objects

The documentation of 3D objects is a continuation of traditional activities that scholars and scientists have been pursuing for several centuries. Whereas static 2D forms of documentation (plans, sections, elevations, reconstructions) created on paper and published in print have been used for a long time, 3D interactive digital tools are now increasingly employed. This transformation of the medium of expression and publication has spread rapidly through a large, well-established field, which has generally embraced the new technologies in recognition of their obvious superiority to what they have replaced. Ironically, scientists and practitioners are doing little to conserve their own digital products. There are a number of research challenges that should be addressed to realise an ideal cataloguing and documentation of the life cycle of 3D objects. These challenges include, for instance, the coding of data provenance, version control for 3D models, effective metadata structures, interoperability, and object- and part-based annotation. Central to documentation is the annotation of a 3D object. By 3D annotation we mean the process by which a text-based piece of information is linked/associated to the object and stored for subsequent use. We speak of semantic annotation because (or when) the associated text is meaningful in some context, and is used for understanding and storing information about the object that is not explicit in, or not contained in, the geometric data defining the digital object. 3D annotation is a fundamental step in the documentation of 3D objects. Annotation may be related, for instance, to the process/workflow that was used to create or process the object and/or its parts. Annotation may also be related to the meaning of an object or of an object's parts.
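The notion of part-based semantic annotation just introduced can be sketched as a minimal data model: a piece of text, an optional link into a domain ontology, and an explicit link to a selection on the geometry. All URIs, field names and the teapot example are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PartAnnotation:
    """A text-based piece of information linked to a selected part of a
    3D object and stored for subsequent use."""
    model_uri: str         # which digital object is being annotated
    face_ids: frozenset    # the selected part, as indices into the geometry
    label: str             # the semantic annotation itself
    concept_uri: str = ""  # optional link into a domain ontology
    author: str = ""       # who annotated (provenance)

# Hypothetical annotation of the handle of a teapot model
teapot_handle = PartAnnotation(
    model_uri="urn:example:teapot-042",
    face_ids=frozenset(range(120, 180)),
    label="handle",
    concept_uri="http://example.org/onto#Handle",
    author="curator-01",
)
```

Storing the selection as explicit face indices is only one possible anchoring; as discussed later in this section, keeping such links valid when the geometric representation changes is itself a research challenge.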
If the target of the annotation is a part of the object, this part has to be selected with appropriate tools before being associated with textual data. Selecting regions of interest in the manual annotation of 2D media is rather simple in terms of the user interface: dragging a selection box or a lasso tool over an image achieves the necessary functionality. The same does not hold for 3D media, or at least it is complicated by the nature of the data: parts might be out of reach for mouse interaction, and bounding a part can be rather complex. Therefore, the issue of smart identification of semantically relevant parts/features arises, either interactively or automatically. Part-based textual annotations might also be very useful to support 3D content-based retrieval relying on textual queries instead of, or in addition to, classical geometry-based techniques. One can easily imagine a process where a new shape is created by searching for, modifying and composing parts of already existing objects. This will be possible only if existing shapes have previously been segmented and annotated at the component level. The editing and assembling of parts may also be further supported by the semantics associated with them: the function or meaning of a feature is likely to give hints about where and how it should be placed on another object (e.g., where to place a handle on a bowl to shape a teapot). Annotation related to the meaning or functionality of objects and object parts could also support advanced mechanisms of interaction among objects in virtual worlds: not built-in behaviours, but reactivity and self-adaptability to the context. The duration of the life cycle of 3D objects varies a lot between the different application fields. In the aerospace industry all relevant data needs to be kept for more than 50 years; think of any 3D data format that has survived the last 50 years: there is none.
Long-term archiving is thus mainly done by storing blueprints or digital 2D drawings in TIFF. In the area of Cultural Heritage, digital representations of physical objects are to be provided to the next couple of generations. In games, 3D objects typically live only a few years, during the production process of a particular game. To come back to the aerospace example, it is necessary to document not only the status representing the airplane as built, but also all the maintenance work done over its 30-year use phase. Imagine all the data in various archives (in analogue and digital form), the value and knowledge this data represents, and the effort implied when searching for specific information in those archives, be they paper folders, TIFF files or product life cycle management systems. In the following, the challenges related to 3D documentation are described.

Challenge: Documenting provenance data

The documentation of provenance, that is, the source, origin or history of derivation of a particular object, is considered a critical requirement in many practical fields and is highly content- and application-dependent. The reason is that 3D content without the context of what is being represented is not very useful. It is important to demonstrate the value of 3D content for contextualising other information, which will benefit i) general users, as not everybody has the expertise to read lengthy descriptions of the physical parts, their history, and the meaning of each of the parts; and ii) scholars, who will be able to explore new methods to investigate and discover the objects' relevance and their context. Developing a data model, e.g. by creating an ontology or extending an existing one, that maps the most important concepts of provenance for both physical and born-digital 3D objects is a research challenge. In this context, provenance describes the events that occur during a digital object's life cycle, including the process of digitising and documenting the 3D content (Theodoridou et al. 2008). It should then be possible to record important information such as: who made the digitisation? What were the settings of the hardware? Who did the post-processing of the 3D content? Who interpreted the documentation?

Challenge: Version control for 3D models

Virtual environments are a particularly demanding class of 3D models with respect to versioning. For example, 3D models of archaeological reconstructions are often changed and evolve over long periods of time, as new knowledge comes to light from excavations and further scholarship.
Models themselves may depict changes across large temporal spans, with multiple versions corresponding to reconstructions at specific points in time. Additionally, the technological aspects of 3D modelling often require the development and distribution of multiple versions of a model. For example, 3D models constructed for use in fully interactive, real-time environments are usually of much lower geometric complexity than models used for static, high-quality renderings. The methods of version control applied in software development can in theory be adapted to 3D models, but no concerted research effort has yet studied the application of version control techniques to the development and dissemination of 3D models. Version control should be a fundamental feature of an effective archive for publishing 3D models. Future research should investigate the applicability of traditional revision control methods, and extend them to be suitable for managing large archives of 3D models. We anticipate (in the near future) the development of solutions for a 3D model version control system with the following features.

3D difference computation and visualisation. Traditional revision control systems for textual documents provide functionality for conveniently viewing the differences between two document versions, similar to the "diff" command available on UNIX systems. A "3D diff" function allowing rapid 3D visualisation of the difference between two versions of a 3D model calls for further investigation. This capability could use geometric analysis algorithms and information visualisation techniques to automatically identify, and then visually highlight, the variations among 3D models.

Compression for 3D models. For storage efficiency, revision control systems usually retain only the differences between successive or branched versions of documents; similar techniques for the compression of 3D geometric components should be investigated.
Tracking changes (addition, deletion, and modification) to 3D models. As with source code revision systems, each edit should be stored with metadata representing the time of the edit, the identity of the editor, and comments or annotations describing the nature of the modification.
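A first approximation of the "3D diff" feature discussed above can be sketched for the simplest possible setting: two versions of a mesh whose vertex indices correspond. This is an assumption a real system could not make; in general, computing the correspondence between versions is the hard part.

```python
def mesh_diff(old_vertices, new_vertices, tol=1e-6):
    """Naive '3D diff' assuming vertex correspondence by index:
    classify each vertex as kept, moved, added or deleted.
    Vertices are (x, y, z) tuples."""
    moved, kept = [], []
    common = min(len(old_vertices), len(new_vertices))
    for i in range(common):
        ox, oy, oz = old_vertices[i]
        nx, ny, nz = new_vertices[i]
        d2 = (nx - ox) ** 2 + (ny - oy) ** 2 + (nz - oz) ** 2
        (moved if d2 > tol * tol else kept).append(i)
    added = list(range(common, len(new_vertices)))
    deleted = list(range(common, len(old_vertices)))
    return {"kept": kept, "moved": moved, "added": added, "deleted": deleted}

old = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
new = [(0, 0, 0), (1, 0.5, 0)]           # vertex 1 moved, vertex 2 removed
delta = mesh_diff(old, new)
```

The per-vertex classification produced here is exactly what a visualisation layer would colour-code to highlight the variations between two versions, and what a delta-compression scheme would store instead of the full mesh.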
Challenge: Digital rights management for 3D models

Providing digital rights management for 3D models is of paramount importance in the development of 3D archives, and disseminating 3D content securely is a difficult problem. Most prior research in this area has focused on the a posteriori detection of piracy (e.g. watermarking), not on its a priori prevention. One such a priori approach is ScanView (Koller et al. 2004), which provides the user with only a very low-resolution version of the 3D model for navigation purposes, delegating high-quality renderings to a remote secured server that holds the high-resolution model. Other methods, which may hold better promise for the protected sharing of 3D models, should also be investigated. One such method is secure graphics hardware, using a decryption key embedded in the hardware, e.g. in 3D graphics cards. While this approach is certainly sound from a security point of view, the architectural implications of adding cryptographic capabilities to graphics hardware have not been studied. Another interesting possible approach for the protected distribution and usage of 3D models involves the application of encrypted computation techniques to digital rights management (PaFi2005). If such a scheme could be implemented, the 3D models would be distributed in encrypted form, along with a custom viewer performing the critical rendering functionality directly on the encrypted representation. The limitations of encrypted computation are currently an open research problem, and encrypted rendering appears far from practical at this time. Defending against reconstruction attacks, which draw on computer vision algorithms to recover the model from a sequence of high-quality renderings, is another research challenge.
Whereas the techniques described above are concerned primarily with the protected rendering and display of 3D models, an ideal secure dissemination system would also allow users a greater degree of geometric analysis of the protected 3D models without further exposing the data to theft. Database researchers have developed privacy-preserving data mining techniques that extract useful knowledge from databases without significantly compromising data security (Verykios et al. 2004). Applying similar methods to the problem of protecting 3D data could be a feasible solution.

Challenge: Workflow annotation

With the prevalence of distributed computing, a growing demand for tracking, recording and managing data sources and derivations has become a reality, and provenance-related requirements have been identified within the scope of many applications. However, there is no commonly agreed conception of provenance in the context of service-oriented computing, nor any concretely implemented prototype. Workflow provenance has been investigated in many contexts, using definitions such as audit trail, lineage, dataset dependence and execution trace. We usually refer to workflow provenance as the process that led to the data. This process-centred view of provenance is motivated by the observation that most scientific (and business) activities are usually accomplished by a sequence of actions. Process documentation and the preservation of 3D model annotations in workflows is a research challenge. The overall data flow (internal and external) characterises the process that led to a result. The key role of semantics becomes evident, not only in the Knowledge Technology and Knowledge Management sense, as a best practice for 3D content documentation and sharing, but also as a driving factor for the development of new and more effective computational tools, as well as for the peer review and validation of e-Science results.
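The process-centred view of workflow provenance, "the process that led to the data", can be sketched as an execution trace attached to a dataset. The actor names, action names and file name below are illustrative, and a real system would record much richer events (service calls, parameters, intermediate artefacts).

```python
import datetime

class WorkflowTrace:
    """Process-centred provenance: record each action applied to a dataset,
    so that the trace documents the process that led to the data."""
    def __init__(self, source):
        self.source = source
        self.steps = []

    def record(self, actor, action, params=None):
        self.steps.append({
            "actor": actor,
            "action": action,
            "params": params or {},
            "time": datetime.datetime.now().isoformat(),
        })

    def lineage(self):
        """The sequence of actions: an audit trail for the result."""
        return [s["action"] for s in self.steps]

trace = WorkflowTrace("scan_0042.ply")
trace.record("scanner-operator", "acquisition", {"device": "laser-scanner"})
trace.record("technician", "hole-filling")
trace.record("technician", "simplification", {"target_faces": 50000})
```

Such a trace makes the result reviewable: a peer can see not only the final 3D model but also which actions, by whom, and with which parameters, produced it.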
Challenge: Mark-up language to associate 3D geometry with its annotation

There is no clear standard way to link annotations to 3D geometry. Current standards for expressing geometric data, such as X3D, allow the possibility of describing compound scenes as assemblies of simpler ones, and of describing behaviours and interactions. X3D, however, is mainly used to encode information needed for interactive applications, rather than for a complete 3D semantic environment. Traditional geometric representations could be enhanced with tags, annotations, and even hyperlinks to other 3D models, making them evolve towards what Havemann and Fellner call generalised 3D documents (HaFe2007). The problem of defining a stable 3D mark-up has to be tackled in its generality. Preliminary work in this direction has been proposed with the ShapeAnnotator (Attene et al. 2009), which keeps the geometry and the annotation in two distinct files; the annotation produces a set of instances that, together with the domain ontology, form a knowledge base. Each instance is related to one part of the model, and is defined by its URI, its type (the class the feature belongs to), and other attribute values and relations.

Figure: The ShapeAnnotator.

An important point is that annotations, or tags attached to parts of 3D models, should survive changes in the geometric representation. The object is unique, and so should be its annotated features, while the representations used to manipulate the object may vary at different stages of the modelling workflows. Maintaining the annotations consistently is not trivial at all, even if we consider just one representation type, such as triangle meshes. Think of the part-based annotation of a 3D model representing a statue or a complex artefact. For visualisation, we need to simplify the model; that is, we need to remove a number of vertices and triangles. What happens to the annotations? How do we keep them consistent across resolution changes?
The problem becomes even more complicated if we think of completely changing the representation type, for instance switching from triangle to quadrilateral meshes. The statue's shape, together with its relevant features, remains the same, and the annotations should follow such changes accordingly and smoothly.
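One simple-minded way to make an annotation survive a change of representation is to anchor it in 3D space rather than to mesh indices: keep the positions of the annotated region and, for each new representation, re-select the nearest vertices. The sketch below illustrates the idea with a brute-force nearest-vertex search; it is an assumption-laden toy (a real system would use spatial indexing and handle regions, not point sets).

```python
def remap_annotation(region_points, new_vertices):
    """Re-anchor an annotated region on a new representation of the same
    shape: for each 3D point of the old region, pick the index of the
    nearest vertex of the new mesh. The annotation thus follows the
    shape rather than a particular triangulation."""
    def nearest(p):
        return min(range(len(new_vertices)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(p, new_vertices[i])))
    return sorted({nearest(p) for p in region_points})

# Annotated region sampled on the original triangle mesh
region = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
# Vertices of a remeshed (e.g. quadrilateral) version of the same shape
quad_vertices = [(0.1, 0.0, 0.0), (0.9, 0.0, 0.0), (5.0, 5.0, 5.0)]
new_anchor = remap_annotation(region, quad_vertices)
```

This only works when the two representations sample the same surface closely; for strong simplification or deformation, keeping annotations consistent remains the open problem described above.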
Figure: The Bimba model represented by a triangle mesh, by a simplified triangle mesh, by a quadrilateral mesh, and by a resampling.

Challenge: Massive semantic 3D annotation

More and more 3D models are becoming available. In order to recognise, retrieve, reuse, interact and design with 3D models, a semantic annotation of the whole model and of its parts is necessary. What we are looking for are 3D models that carry more and more information, attached to them or to their parts, in the form of annotations conveying knowledge about the object, a part of the object, or its purpose or function. Annotations support the machine-processable sharing of knowledge about semantically enriched objects, and therefore open new frontiers to the usage of 3D content in virtual and networked environments. Only recently, and especially in the CAx domain, have 3D models come to embed certain types of annotations, and data formats now exist which encode the 3D model together with its annotations. Those annotations are typically restricted to material information and manufacturing process information. In the Cultural Heritage field, 3D annotation is perceived as crucial but is not yet established. It offers Cultural Heritage experts the possibility to link their hypotheses to digital 3D artefacts, or to parts of them. When talking about massive amounts of 3D objects becoming available on the Internet, social tagging becomes the only reasonable way (besides automatic methods, which will have their limitations) to handle and enrich these models. Social tagging for 3D objects is also a relatively new field: the quality of data and tags, motivation schemes for users, etc., are largely unexplored in this area. For today's professional use, 3D objects from the Internet are not mature enough, unless they are provided by companies that ensure their quality. Nevertheless, 3D warehouses are developing further.
Exploiting mechanisms to stimulate social tagging for 3D models, along with developing better tools, forms an interesting research topic. Imagine for instance the impact that a massive annotation of 3D city models could have. A large number of 3D city models, produced from Lidar data, is available nowadays. To make sense of these data, semantic annotation is needed, but manual annotation of such datasets is implausible: literally millions of models must be semantically analysed and annotated in order to make it possible to use them in an intelligent way. The problems to be solved are to decide what kinds of annotations are desired, and how 3D shapes can be segmented in advance so that the individual parts of the shapes can be used. This requires both the development of effective segmentation algorithms to identify parts, and the development of a strategy to have those parts semantically annotated at a large scale (see also the challenge on automatic semantics-driven segmentation). Many tasks are trivial for humans but continue to challenge even the most sophisticated computer programs. It has been advocated to constructively channel human brainpower through a class of computer games called "games with a purpose", or GWAPs, in which people, as a side effect of playing, perform tasks computers are unable to perform (AhDa2008). The Entertainment Software Association has reported that more than 200 million hours are spent each day in the US playing computer and video games. Indeed, by age 21, the average American has spent more than 10,000 hours playing such games, equivalent to five years of working a full-time job 40 hours per week. What if this time and energy were also channelled
toward solving computational problems and training AI algorithms? People playing GWAPs perform basic tasks that cannot be automated. The challenge is to design a massive initiative that leads to semantically rich annotations of 3D models. For specific classes of shapes, there are promising approaches that deserve more investigation. The propagation of annotations is gathering interest within the research community. The approach is to manually segment a representative shape, annotate it with metadata, propagate both the segmentation and the annotation in a consistent way to all similar shapes, and finally refine the annotation for each specific shape, if needed. The problem of consistent segmentation is very recent, and preliminary approaches work by propagating segmentations within pre-defined classes of similar shapes (GoFu2009). The challenge for the future is then how to automatically assign a segmentation (and an annotation) to an input shape by automatically selecting the proper reference object, already segmented and annotated (see also the challenge on 3D search).

[Figure: Propagation of consistent segmentations across models of the same class (GoFu2009).]

Challenge: Long-term preservation of 3D data

One of the most pressing needs is to ensure the survivability of 3D models. More and more 3D models are being created; however, poor archival practices make them lose their original functionality and information richness after only a few years, or even sooner. A more active approach should be adopted to preservation and to the management of the life cycle of digital resources, from data creation and management to data use and rights management. The digital libraries community has made strong progress in technical solutions for preservation and reliability of access, and we recommend extending and applying these techniques to 3D archives.
Semantic web technologies can effectively be used to facilitate the integration and interchange of heterogeneous information and to generate metadata. We have to acknowledge again that end-users' contributions will be critical for acquiring and documenting 3D resources, just as they are now for images and videos on the web. Nevertheless, it is recognised that there are challenges involved in requiring non-experts to work with software and hardware for 3D graphics, which are not yet commonly used. The main challenges involve designing processes (software and hardware tools) which are easy to use and can be followed by users with a minimum amount of training. The final aim is to enable users to capture and document 3D representations of objects and their metadata. There is an urgent need for reliable documentation as the key step required for preservation (preservation through documentation). An example of this trend is the CyArk 3D Cultural Heritage Archive, an open access Internet archive where the stored data provide a valuable resource for preservation professionals (site managers, archaeologists, conservators, architects, and engineers). It uses an integrated methodology, 3D High Definition Documentation (HDD), that utilises advanced survey and imaging technologies. The core technology is 3D laser scanning combined with high-resolution photography, while other technologies may also include GPS, photogrammetry, GIS, and remote sensing.
Challenge: Operational 3D cultural repositories

The rapid spread of virtual heritage has inevitably created some new problems and set forth some major challenges facing the virtual heritage community. There are a number of technical research challenges that should be addressed (Koller et al. 2009) to realise an ideal 3D repository (digital archive). These challenges include digital rights management, version control for 3D models, effective metadata structures for publication, updating and long-term preservation of 3D models, interoperability, and 3D searching. Other critical functions associated with digital archives of 3D models that require further research are indexing and searching. An active area of research that is directly relevant is the development of shape-based retrieval methods that measure shape similarity between 3D models. Rich textual annotation of 3D models can often provide metadata for text matching and retrieval, but the value of semantic searching that combines both text and 3D shape similarity must be further investigated. Also, further research into algorithms for indexing and searching on inherent model attributes that do not depend on human annotation should be considered. Techniques for organising associated metadata and its effective presentation to users should be investigated, as should methods for managing and disseminating 3D models. One particularly interesting challenge is to provide methods for allowing interactive exploration of relevant metadata displayed in corresponding locations in the 3D model or virtual environment. For example, ancient cities may contain hundreds of individual buildings, each of which can have distinct associated metadata. Thus the metadata will need to be spatially referenced to the underlying 3D model, and the 3D exploration interface linked to an appropriate display methodology that combines presentation of 3D, 2D images and/or textual metadata elements.
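The spatial referencing of metadata just described can be prototyped very simply: each metadata record is anchored at a 3D position in the model, and a pick in the exploration interface returns the records near the picked point. This is a minimal illustrative sketch on toy data (the data layout and radius query are our own assumptions, not an archive design):

```python
import math

def metadata_at(annotations, picked_point, radius=1.0):
    """Return the metadata records whose anchor lies within `radius`
    of a 3D point picked by the user while exploring the model."""
    return [meta for anchor, meta in annotations
            if math.dist(anchor, picked_point) <= radius]

# Metadata spatially anchored to positions in a (toy) city model:
annotations = [
    ((0.0, 0.0, 0.0), "temple, 5th century BC"),
    ((50.0, 0.0, 0.0), "agora, rebuilt 2nd century AD"),
]
print(metadata_at(annotations, (0.3, 0.0, 0.1)))
# → ['temple, 5th century BC']
```

A real repository would replace the linear scan with a spatial index, but the principle, metadata keyed by position rather than by document, is the same.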
Current metadata structures are not likely to be appropriate for organising 3D metadata, and thus we suggest research into alternative models.

Challenge: Complete and detailed documentation of the anatomical model of the human being

In Medicine, validating results is an issue because a complete, detailed anatomical model does not exist. In order to validate results obtained with 3D models and simulation, the availability of a large set of reference digital body models would be highly beneficial. The Visible Human project already pointed to this problem more than 10 years ago (Spitzer et al. 1996, ImMo2005). Indeed, creating standards and norms for the organs of the human body will help to detect malformations, diseases, and injuries. Moreover, on the one hand, having a 3D digital representation of the human body that is accurate enough at different levels will lead to finding the "best diagnostic" to apply. On the other hand, such a model will also improve the communication between computer scientists and doctors, because they will have a common representation with semantics to treat the patient. The process needed to create these reference models is, however, not trivial at all, as data from a significant number of individuals in different conditions and with different characteristics are required. Also, non-invasive techniques for acquiring the data should be used.

[Figure: The Visible Human project.]

4.3.1 Time line and dependence diagram

In this section, a diagram showing the time lines of the proposed issues and the dependencies between the issues connected with the grand challenge Documenting the 3D
lifecycle is presented. The dark boxes cluster families of challenges that, in our opinion, are related to each other: we gave each cluster a name, i.e. D3DL1 and D3DL2 (Documenting the 3D lifecycle 1 and 2). The dependencies are marked with an arrow from A to B, which means "know-how from A will support the development of B". The boxes are aligned with a time line, which represents the time when we estimate the challenge will be achieved. The names of the challenges in the boxes do not coincide exactly with the ones used in the text.

4.4 Semantic interaction and visualisation

The most natural and traditional manner to interact with 3D content is by visualisation and rendering of the geometry that defines the objects. While we have plenty of tools for visualising, streaming and interacting with the geometric information related to 3D objects, tools for retrieving, manipulating and presenting the semantic content of 3D media are still far from satisfactory. In other words, current graphics systems are not conceived to give explicit information about the semantics of the 3D content, which can be grasped only by viewing the object itself. If we are able to embed semantics in 3D models with goal-oriented modelling methodologies, however, new approaches to interaction and visualisation can also be introduced. For instance, by semantic interaction we address the issue of interacting with 3D models/objects using the semantic description of the object and the intent of the user to drive the interaction process. Let us give an example in the CAD domain: parametric feature-based modelling is a first step in the direction of semantic interaction. Tools to create shapes by using terminology and
operations the user is familiar with are placed at his/her disposal. Amongst those features are simple concepts such as holes, ribs, or slots: the user does not need to model the geometry of the part he/she has in mind, but simply instantiates the class of the part with the required parameters. But what happens, for instance, if the user wants to model a building that has some arcades and he/she has to change the size of the building later on? Will the arcades be scaled automatically, or do new ones have to be introduced by manual copy operations? Semantic interaction, at the level of modelling tasks, would take the goal of the operation as well as the characteristics of the object into account and provide the correct modification possibilities, which comply with the user's intention. This requires both:
• Modelling the semantics of the shape, how it is structured, and what the construction rules are, and
• Modelling the semantics of the interaction, or in other words, the semantics of the intended change.
Today, both aspects are only in their infancy. Another example is interaction in virtual reality, which should respect and be guided automatically by the semantics of the 3D models (scene). Devising methods to automatically process and use the semantic description of 3D objects would open up new possibilities for Human Machine Interaction using natural human communication channels, such as speech. Understanding the meaning (semantics) of the virtual scene being manipulated and the statements of the users, and mapping them to each other, is a field of ongoing and future research. While speech understanding has evolved considerably over time, semantic scene descriptions are rather uncommon in 3D modelling applications. This field is further developed in the area of Computer Vision, which creates potential for synergies with Computer Graphics.
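The scaling question raised above can be made concrete with a toy parametric model. In the sketch below (all names are hypothetical), a plate carries "hole" features; scaling the plate moves the holes but deliberately preserves their drilled diameter - one possible, explicitly encoded interpretation of the user's intent, which is exactly the kind of decision semantic interaction must capture:

```python
from dataclasses import dataclass

@dataclass
class Hole:
    """A parametric 'hole' feature: the user states intent (drill a hole
    of a given diameter at a position) instead of modelling the cylinder
    geometry by hand."""
    x: float
    y: float
    diameter: float
    depth: float

@dataclass
class Plate:
    width: float
    height: float
    features: list

    def scale(self, factor):
        """Semantic scaling: the plate grows and hole positions follow it,
        but hole diameters are preserved (a drilled hole keeps its nominal
        size) - an explicit modelling of the intended change."""
        self.width *= factor
        self.height *= factor
        for f in self.features:
            f.x *= factor
            f.y *= factor

plate = Plate(100.0, 50.0, [Hole(10.0, 10.0, diameter=5.0, depth=3.0)])
plate.scale(2.0)
print(plate.width, plate.features[0].x, plate.features[0].diameter)
# → 200.0 20.0 5.0
```

A purely geometric scale operation would have doubled the hole diameter as well; encoding the rule in the feature class is the "semantics of the intended change" the text calls for.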
Moreover, if objects are properly described in terms of the meaning and functionality of their parts, we might even think of devising methods to support the autonomous interaction of objects or avatars in virtual worlds, thus opening an entirely new perspective on the development of simulation and gaming applications. Another aspect concerns semantic visualisation, which is a kind of analogue to semantic interaction on the 'output channel' from the model to the user. While semantic interaction aims at interpreting the inputs of a user in a goal-oriented and semantically reasonable way, semantic visualisation aims at providing the 'right' visual output given a certain visualisation 'problem' with properties such as:
• Model/data set to be visualised
• Visualisation purpose
• Kind of insight to be achieved
• Kind of decision to be taken.
While semantic visualisation can easily be imagined in a number of general purpose scenarios - automatic production of thumbnails for 3D models, automatic production of catalogues for shops selling 3D objects, furniture, cars, or other goods - CAE, with its multitude of simulation results, huge data sets, and largely varying kinds of analysis (post-processing) done based on simulation results, is the field where the benefits of semantic visualisation can be most easily understood. In CAE and scientific visualisation a lot of dedicated tools exist to fulfil specific data analysis (post-processing) requirements. These have been developed on a case-by-case basis. Little work exists that tries to conceptualise those analysis scenarios and build up a theory for the goal-driven visualisation of the semantics contained within data sets, while the semantics (here, meaning) contained within the data sets largely depends on the insight the user wants to gain by looking at the simulation results. Last but not least, we believe that semantic search of 3D models will soon become a key issue in 3D content management and consumption.
The success of 3D communities, gaming technologies and mapping applications (e.g., Second Life, Google Earth, 3D city modelling) is
causing a dramatic shift in the way people see and navigate the Internet. In this panorama, search and retrieval of 3D media is rapidly becoming a key paradigm of interaction with the huge amount of traffic and data stored in and shared over the Internet. New requirements for information representation, filtering, aggregation and networking need to be addressed, which are as intuitive as possible and effective in their capability of bringing users to the 3D content they wish to access. The problem of 3D retrieval can be rephrased as follows: given a certain shape, find similar shapes, or shapes of which the given shape is a part, and vice versa. Assessing the similarity among 3D shapes is a very complex and challenging research topic, and the computational aspects of 3D shape retrieval and matching have been addressed only recently (see TaVe2008 and Bustos et al. 2007 for recent surveys). The methods developed so far span from coarse filters suited to browsing very large 3D repositories on the web, to domain-specific approaches. The majority of the methods proposed in the literature mainly focus on the geometry of shapes, in the sense of considering their spatial distribution or extent in 3D space. Nevertheless, there is a consensus that the shape of objects is recognised and coded mentally in terms of relevant parts and their spatial configuration, or structure. The use of structural descriptions for shape similarity is very interesting, as it supports reasoning on shape similarity at a local and/or partial level. So far, however, there exist no solutions that work best for all possible cases, and in terms of recognition fidelity they all lag behind humans. New approaches need to be explored, possibly more deeply motivated by human perception principles, especially by further developing techniques for partial matching.
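As a concrete example of the purely geometric descriptors discussed above, the classical D2 shape distribution (Osada et al.) reduces a shape to a histogram of distances between randomly chosen surface points; two shapes are then compared by a distance between histograms. A minimal sketch, under the assumption that shapes are given as 3D point samples:

```python
import math, random

def d2_descriptor(points, n_pairs=1000, n_bins=8, max_dist=2.0, seed=0):
    """D2 shape distribution: a normalised histogram of distances between
    randomly chosen pairs of surface sample points."""
    rng = random.Random(seed)
    hist = [0] * n_bins
    for _ in range(n_pairs):
        a, b = rng.choice(points), rng.choice(points)
        d = min(math.dist(a, b), max_dist - 1e-9)  # clip into the last bin
        hist[int(d / max_dist * n_bins)] += 1
    return [h / n_pairs for h in hist]

def l1_distance(h1, h2):
    """Dissimilarity between two descriptors (L1 histogram distance)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Toy point samples: a unit circle vs. an elongated 'stick' of points.
circle = [(math.cos(t), math.sin(t), 0.0) for t in [i * 0.1 for i in range(63)]]
stick = [(x * 0.03, 0.0, 0.0) for x in range(63)]
print(l1_distance(d2_descriptor(circle), d2_descriptor(circle)))  # 0.0
print(l1_distance(d2_descriptor(circle), d2_descriptor(stick)))   # > 0
```

The descriptor is invariant to rotation and translation but, as the text notes for geometric methods generally, it is blind to part structure: it cannot say *which* parts of two shapes are similar.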
At the same time, semantic search is traditionally performed on text-based documents using knowledge representation technologies such as taxonomies, thesauri, and ontologies. Today the area of content-based 3D retrieval is orthogonal to traditional semantic search. In the future, different types of media as sources for searching and information/knowledge management have to be addressed together; in particular, 3D media have to be combined with what already exists in the field of text, image-based, audio, and video retrieval. To this end, new ways of combining, ranking, presenting and feeding back user input are required. In the following, we will specialise some of these issues in the presentation of the sub-challenges that we think are most relevant to the achievement of the main goal.

Challenge: User-guided modelling of 3D content

For generic usage scenarios, it is necessary to consider that inexperienced users are becoming more and more actively involved in the content creation pipeline and ask for more and more intuitive and effective tools for creating, sharing, retrieving and re-using 3D content. These issues are common to more established media types, such as images and videos, but 3D media open a number of specific challenges due to the geometric nature of the data involved. Even non-professionals can easily re-mix user-generated or broadcast 2D content with software like Photoshop or Final Cut, but the same level of functionality for 3D media is available only within complex software systems. Nevertheless, there are signs that these 3D content creation tools are getting easier to use and more accessible to the general audience; for example, video games such as Little Big Planet (Sony Computer Entertainment) introduce tool sets with which users without any 3D experience can build and play in a 3D world with 3D objects and interactions.
A similar Microsoft tool exists in the software named Kodu - research software that is designed for general users to create 3D content. In the near future, i.e. in 2-3 years, we will see a huge influx of user-generated content, which will benefit from semantic annotation and preservation.

Challenge: Automatic extraction of the best view of 3D shapes

The best view problem consists of automatically selecting the pose of a 3D object that corresponds to the most informative and intuitive 2D view of the 3D shape. Among the possible applications, the creation of thumbnails for huge repositories or catalogues of 3D models, shape recognition and classification in Computer Vision, and grasping or next-best
viewpoint selection for path planning in Robotics may be mentioned. There is agreement that the best view is closely related to the semantics of a shape and/or of its salient components, in a specific context or application domain, and that it should make it easier to recognise the 3D object according to its meaning or purpose. While previous work in this area dates back 10 years (Blanz et al. 1999), the actual use of semantic information has only recently been addressed, and has been demonstrated to improve the results significantly (MoSp09). Being driven by the semantics of a shape, and most of all by its meaningful features, the achievement of a best view tool naturally depends on the development of smart methods for the part-based annotation of shapes and/or for authoring objects and their semantics.

[Figure: Extraction of the best view adopting the method described in (MoSp09).]

Challenge: Semantic rendering of shapes in medical applications

The rendering of shapes for semantic representation has played an important role in virtual environments (Gutierrez 2005) and in the modelling and understanding of complex non-manifold shapes for analysis and retrieval (DeFloriani 2009). In medical applications, semantic shapes have been used successfully in both the teaching and the clinical understanding of medical images. In recent studies, due to their ability to convey information in an interactive manner, such systems have been integrated into 3D atlases of the human body, as exemplified in (Höhne 1995), where, by integrating concepts of computer graphics and artificial intelligence, novel ways of representing medical knowledge were developed. In another study (Ogiela 2009), the author presented a new approach for semantic descriptions and analysis of medical structures, especially coronary vessels (from CT spatial reconstructions), with the use of AI graph-based linguistic formalisms.
Such descriptions were suggested as being useful both for the smart ordering of images while archiving them and for semantic searches in medical multimedia databases. Existing applications are, however, typically based on a particular part of the body or an articulation, e.g. shoulders, brain, etc., while a true semantic representation of medical shapes needs a complete human body that includes all body parts. These challenges are openly acknowledged, and there is momentum to reach a complete semantic representation of the human body. The primary dependency is that all the separate body parts first need to be reconstructed, thus enabling whole-body integration. Because of its importance, this challenge is expected to be resolved in the next decade, partly due to progress in medical image acquisition and processing, as described previously.
[Figure: Exploration of the brain: arbitrary cutting reveals the interior. The user has gained access to the available information on the optic nerve concerning functional anatomy, which appears as a cascade of pop-up menus. He has asked the system to colour-mark some blood supply areas and cortical areas. He can derive from the colours that the visual world center is supplied by the left middle cerebral artery (Höhne 1995).]

Challenge: 3D retrieval based on multi-modal search

3D shape retrieval is a complex interaction process between the user and the 3D content, along with its semantics. The so-called semantic gap, that is, the gap between the visual data information and the meaning of the data for the user, can be closed by 3D search mechanisms able to integrate content-based criteria, driven by global or partial shape similarity, with concept-based, semantics-driven ones, supported by global and part-based annotations of 3D data. For example, it should be possible to pose queries such as "find the 3D models in the repository that represent a vase with handles, and whose handles are globally similar in shape to a given query model". In this example, "vase" and "handle" could refer to semantic annotations and be resolved via a semantic search, whereas "handles are globally similar in shape" would be resolved by applying a geometric search to the models selected by the semantic search. Current 3D search engines work with one single descriptor or a set of predefined alternative descriptors, sometimes as a pipeline of conservative filters to iteratively select smaller subsets of the candidates. It is well known, however, that a single shape descriptor cannot work equally well for all types of properties and all types of object shapes: even within the same application domain, the shape variability of the objects may be too large with respect to the capabilities of a single shape descriptor.
Imagine for instance the complexity and variability of the 3D shapes that we can expect in the medical domain, where new acquisition devices acquire data that are used in a combined manner; for instance, MRI and motion capture data are used together for gait analysis and should be indexed together. Semantics-guided 3D search and retrieval of the huge amount of data stored digitally in the medical domain could be of real benefit to clinicians and could be incorporated into routine clinical medicine or medical research. We believe a big challenge in visual search systems, and 3D ones in particular, is the development of smarter multi-modal search mechanisms that allow the user to customise each search session and tune the selection of the indices, or descriptors, to his/her search context, also exploiting semantic information attached as annotation tags to the objects and object parts.
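The kind of hybrid query envisaged in this challenge suggests a simple two-stage architecture: a concept-based filter on annotation tags followed by content-based ranking of the survivors by descriptor distance. A minimal sketch with toy descriptors (the repository layout and field names are our own assumptions):

```python
def multimodal_search(repository, required_tags, query_descriptor, top_k=3):
    """Hybrid query: first a concept-based filter on annotation tags,
    then content-based ranking of the survivors by squared descriptor
    distance. (Descriptors here are plain feature vectors; in practice
    they would come from a shape-analysis pipeline.)"""
    candidates = [m for m in repository if required_tags <= m["tags"]]
    candidates.sort(key=lambda m: sum((a - b) ** 2
                    for a, b in zip(m["descriptor"], query_descriptor)))
    return [m["name"] for m in candidates[:top_k]]

repository = [
    {"name": "amphora", "tags": {"vase", "handle"}, "descriptor": [0.9, 0.1]},
    {"name": "krater",  "tags": {"vase", "handle"}, "descriptor": [0.2, 0.8]},
    {"name": "bowl",    "tags": {"vase"},           "descriptor": [0.9, 0.1]},
]
# "Find vases with handles whose shape is close to the query":
print(multimodal_search(repository, {"vase", "handle"}, [0.85, 0.15]))
# → ['amphora', 'krater']  (the bowl lacks the 'handle' tag)
```

Note how the semantic filter and the geometric ranking answer different parts of the query, which is exactly the division of labour described for the vase-with-handles example.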
Challenge: User-centric semantic search by effective relevance feedback

Multi-modal search mechanisms are not enough to guarantee that the 3D search results fit the subjective ideas of the observer. To be really effective, the search requires a human in the loop, that is, the user should be an active player in the search process. In this context, relevance feedback plays an important role in 3D search, as it helps to bridge the semantic gap between the user and the system. By means of relevance feedback, the user can feed the retrieval system with his/her thoughts by repeating three steps: he/she submits a query, which is answered by a list of items; he/she then gives feedback about the relevance of some of the items, according to his/her needs; and the system refines its set of answers so that they better fit the user's similarity concept. Relevance feedback techniques help in understanding the semantics of similarity for an observer, in the context of a specific query. Hence, they attempt to bridge the semantic gap between description and meaning, between system and user. To improve retrieval effectiveness, we believe it is necessary to develop more flexible and effective user-system interaction methods, which make it possible to capture, through a user-friendly interface, as much information as possible about the user's perceptual world while minimising his/her effort. Therefore the need exists to develop new techniques for search refinement, through relevance-feedback methods that go beyond the traditional relevant/non-relevant assessment (multi-relevance methods) or even capture implicit feedback, and that support adjusting the query and its parameterisation towards the user's notion of similarity.
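A classical way to implement the query-refinement loop described above is Rocchio-style feedback from text retrieval, which moves the query descriptor toward the centroid of items the user marked relevant and away from the centroid of items marked non-relevant. A minimal sketch (the weights are conventional defaults, not prescriptions of this roadmap):

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style query refinement: shift the query descriptor toward
    the centroid of relevant items and away from the centroid of
    non-relevant items."""
    def centroid(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(col) / len(vectors) for col in zip(*vectors)]
    r, nr = centroid(relevant), centroid(non_relevant)
    return [alpha * q + beta * ri - gamma * ni
            for q, ri, ni in zip(query, r, nr)]

q = [0.5, 0.5]
refined = rocchio(q, relevant=[[1.0, 0.0]], non_relevant=[[0.0, 1.0]])
print(refined)  # [1.25, 0.25]
```

Each feedback round re-runs the search with the refined descriptor, so the result list drifts toward the user's subjective notion of similarity; multi-relevance or implicit feedback, as called for in the text, would replace the binary relevant/non-relevant split with graded weights.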
Challenge: 3D retrieval result visualisation

As an essential interaction layer, the visualisation of the results of a 3D search should provide novel result navigation facilities able to support the complexity of the envisaged search modalities. The rationale is that result visualisation must not simply present a sorted list of answers, but should communicate to the user why the specific results were retrieved. Considering partial similarity among objects, for example, it is important to be able to visualise the correspondences between similar parts: an aspect currently neglected by the few, prototypical partial 3D search engines. An important aspect of result visualisation is the representation of the database context in which the search results are obtained (focus and context). The context visualisation should be tightly coupled with free navigation and browsing functionality.

Challenge: Smart objects – interaction with the environment

It has been shown that using avatars in a virtual environment is mandatory to improve the subject's immersion. Nevertheless, these avatars must interact with the environment in order to provide the subject with a feeling of realism. They must in essence communicate with the objects of the virtual environment in order to know how to interact with them in the right way. The concept of smart objects appeared more than ten years ago (KaTh1999). The idea is to encapsulate semantic information in each object represented as a 3D model. On the one hand, there is information about the elements or parts of the object; on the other, there is also information about how to handle the object which, for instance, will tell the avatar how it can interact. Nevertheless, working with such an environment depends on semantic information being provided for each element of the environment (Abaci et al. 2005).
As shown in the picture above, in the kitchen made of smart objects (here oven, casserole, shelf, tap) the avatar knows how to interact with the objects. Inverse kinematics is then used to move from the current position to the position required to handle the object.
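The smart-object idea can be sketched as a 3D object that carries, besides its geometry, a grasp position and a table of supported interactions, so that an avatar queries the object instead of reasoning about raw geometry. All names below are illustrative, not the API of any cited system:

```python
class SmartObject:
    """A 'smart object' in the spirit of (KaTh1999): a 3D model carrying
    semantic information about how an agent should interact with it."""
    def __init__(self, name, grasp_position, actions):
        self.name = name
        self.grasp_position = grasp_position  # where the avatar's hand goes
        self.actions = actions                # verb -> resulting state

    def interact(self, verb):
        """The avatar asks the object what a verb does to it."""
        if verb not in self.actions:
            return f"{self.name}: don't know how to '{verb}'"
        return f"{self.name} is now {self.actions[verb]}"

oven = SmartObject("oven", grasp_position=(1.2, 0.9, 0.4),
                   actions={"open": "open", "switch_on": "heating"})
print(oven.interact("open"))  # oven is now open
print(oven.interact("lick"))  # oven: don't know how to 'lick'
```

The grasp position is what the inverse-kinematics step mentioned above would target: the semantics says *where* and *what*, and IK solves the *how* of the arm motion.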
Challenge: Preserve, find, and interpret 3D information in a lifetime of recordings

The term life-logging describes the process of capturing, storing, and distributing everyday experiences and information about objects and people, with the intention of preserving one's entire life or large portions of it. There exists some justifiable disbelief about the wide adoption of related techniques predicted in visionary scenarios, due to, for example, critical privacy issues. However, given the already existing trend to capture parts of one's everyday life through the widespread use of digital and cell phone cameras and private information sharing via various social networking platforms, there is no doubt that techniques to capture and preserve parts of one's personal life will play a significant role. Whereas the actual capturing and preservation process is a problem that should be solved within the next couple of years, the major critical issue with life-logging is how to access and manage the huge amount of resulting data. Searching millions of images and terabytes of audio and video data is a challenge that goes far beyond the possibilities of today's search and retrieval techniques and will certainly not be doable without a significant amount of semantic information being available. Because the related information is captured automatically from the three-dimensional world surrounding us, 3D information and semantics will be a major issue in this context. For example, knowing when and where a photo was taken will not be enough; we also need to know in which direction the camera was pointing. We need to automatically identify not only the relevant objects and people in the images, but also their positions in 3D, in relation to each other and over time, in order to track them across a sequence of captured images. In this context, we can learn from Google Maps' Street View feature.
In this system, camera-mounted cars were driven along streets around the world to acquire images, which were annotated with localisation tags (GPS coordinates), camera angle, resolution, etc.; these annotations were then used to reconstruct the view on the map. Such technologies can be made available, in the near future, on mobile phones that everyone can use.

[Figure: Street View in Google Maps, which uses semantic information about the images acquired by a camera.]

Current research results in semantically annotating life-logged data (e.g. by identifying locations, events, people) in personal archives of automatically captured images (cf., for example, Byrne et al. 2008) suggest that automatic semantic annotation, and search on a semantic level at a large scale, will be possible within the next five to ten years. However, considering 3D data and related semantic information, research is only in its beginnings and at a stage where related work on video and image data was about five years ago.

4.4.1 Time line and dependence diagram

In this section, a diagram showing the time lines of the proposed issues and the dependencies between the issues connected with the grand challenge Semantic interaction and
visualisation is presented. The dark boxes cluster families of challenges that, in our opinion, are related to each other: we gave each cluster a name, i.e. SIV1 and SIV2 (Semantic interaction and visualisation 1 and 2). The dependencies are marked with an arrow from A to B, which means "know-how from A will support the development of B". The boxes are aligned with a time line, which represents the time when we estimate the challenge will be achieved. The names of the challenges in the boxes do not coincide exactly with the ones used in the text, and we added one more issue to the diagram.

4.5 Standards

Taking the broader scope of Visual Computing into account, we need formats that are not only able to represent shape and semantics (as well as the visualisation aspects of the shape) but also formats that address the requirements of Computer Vision (CV) and Machine Learning (ML) approaches. Typically, computer vision applications perform certain tasks, e.g. person tracking. To achieve this, reference data are needed, be they codebook knowledge bases or similar data with known semantics. During the analysis/interpretation stage, hypotheses are built and confidence values are assigned to the probability that a certain hypothesis holds true. Should there not exist interoperable formats able to capture and represent both digitally born objects and scenes and interpreted scanned scenes? To accelerate and deepen the convergence of computer graphics and computer vision, we are convinced such formats would make a great contribution, and new research is needed in various fields.
Yet to achieve this goal of a semantic format we also need to develop a tool that has the power to express the goal of a CV task in a semantic and computer-understandable way.

• The possibility to describe reference data, codebooks, and templates in an exchangeable way would be a revolution for the development of software and its application. Exchanging this kind of data was never foreseen, but it would strengthen the software development process.
• The semantic representation of object relationships in a "scene graph" would make it possible to avoid an enormous effort for the re-use of models. If the semantics of objects could be preserved while data is exchanged, either between systems or between companies, we could retain the value of the invested work.
• The evolution of shapes from raw data to interpreted data is a long process in which users invest an enormous amount of time. If we solve hard research topics such as the segmentation and classification of objects, we naturally have to deal with the problem of representing this evolution process so as to keep the semantics in a standardised way.
• To derive digital 3D representations of physically born objects, scanning devices are usually applied, typically generating millions of points in 3D space. A related open question concerning challenges in standards is how we can represent and encode the semantics of these digitised objects in order not to lose the already existing semantics of the object.
• Because of the heterogeneity of representations and processing algorithms, we need a standard for capturing provenance and editing operations that offers both efficiency and authenticity in capturing editing steps (granularity).

Beside these fundamental challenges, ways have to be found to represent the visual(isation) aspects of shape, such as detailed material information (e.g. measured BTF data of physical props). The more semantics we put into formats, the more value we add.
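As an illustration of the provenance requirement listed above, the following minimal sketch records editing operations alongside a model. The schema is entirely hypothetical (operation names, fields and agents are made up, and no existing standard is implied); it only shows the kind of structured, serialisable editing history such a format would need to carry.

```python
from dataclasses import dataclass, field, asdict
from typing import List

# Hypothetical schema: each editing step applied to a shape is recorded
# with the operation name, its parameters, and the agent who performed it.
@dataclass
class EditStep:
    operation: str          # e.g. "decimate", "hole_fill"
    parameters: dict
    agent: str

@dataclass
class ProvenanceRecord:
    model_id: str
    source: str             # raw acquisition the model derives from
    steps: List[EditStep] = field(default_factory=list)

    def record(self, operation, parameters, agent):
        self.steps.append(EditStep(operation, parameters, agent))

prov = ProvenanceRecord("rim_042", "scan_session_7")
prov.record("decimate", {"target_faces": 10000}, "artist_a")
prov.record("hole_fill", {"max_diameter_mm": 2.0}, "artist_b")

# The record is plain data, so it can be serialised next to the shape itself.
print(len(prov.steps))                          # number of captured steps
print(asdict(prov)["steps"][0]["operation"])    # first recorded operation
```

Because each step stores its parameters, the granularity question raised above becomes a design choice of the schema rather than an afterthought.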
Thus, digital rights management has to be handled as an intrinsic part of the development of these new format(s). Having semantics at hand allows, for example, semantic filtering. Depending on the shape representation(s) chosen, 'unfolding' of information/shape can be supported by tools operating on these formats (e.g. a GML viewer). Access rights may determine how much of the semantics is exposed. An abstract semantic scene description could be used for imprecise rendering purposes. Beside the representational power of the format(s), we need to investigate how to implement such a format so as to guarantee desired properties such as interoperability, extensibility, self-descriptiveness, etc. To our knowledge, no format with the features described above exists. We are convinced that research in this field can deeply contribute not only to new kinds of formats, but also that such formats can considerably stimulate the field of Visual Computing by bringing computer graphics and computer vision aspects more closely together in one representation scheme and, finally, in a format.

4.5.1 Time line and dependence diagram

In this section, a diagram showing the time lines of the proposed issues and the dependencies between the issues connected with the grand challenge Standards is presented. The dark boxes cluster families of challenges that, in our opinion, are related to each other; we gave each cluster a name, i.e. STA1 and STA2 (Standards 1 and 2). The dependencies are marked with an arrow from A to B, which means "know-how from A will support the
development of B". The boxes are aligned with a time line, which represents when we estimate the challenge will be achieved. The challenges in the diagrams specify the general grand challenge described above in more detail.

4.6 Trends in Semantic Web research

Semantic web technologies, especially ontologies and machine-processable relational metadata, pave the way to knowledge management solutions that are based on semantically related knowledge pieces of different granularity: ontologies define a shared conceptualisation of the application domain and provide the basis for defining metadata that have precisely defined semantics and are therefore machine-processable. Although knowledge management approaches and solutions have shown the benefits of ontologies and related methods, a large number of open research issues still have to be addressed in order to make semantic web technologies a complete success. A seamless integration into the working environment of knowledge creation, e.g. content and metadata specification, and knowledge access, e.g. querying or browsing, is required. Strategies and automated methods are needed that support the creation of knowledge as a side effect of activities that are carried out anyway. This requires means for emergent semantics, e.g. through ontology learning, which reduce the overhead of building up and maintaining ontologies. Access to, as well as presentation of, knowledge has to be context-dependent. Knowledge management approaches able to manage knowledge pieces provide a promising starting point for "smart" services that will proactively deliver relevant knowledge for carrying out tasks related to the acquisition/creation, processing and documentation of 3D models. Contextualisation has to be supplemented by personalisation, taking into account the experience of the user and delivering knowledge at the right level of granularity.
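The machine-processable relational metadata mentioned above can be illustrated with a toy triple store. This is plain Python pattern matching, not a real RDF library, and every identifier is invented for the example; it only shows why subject-predicate-object metadata is directly queryable by a machine.

```python
# Toy metadata store: annotations as subject-predicate-object triples.
triples = [
    ("model:skull_17", "rdf:type", "med:AnatomicalModel"),
    ("model:skull_17", "med:depicts", "anat:Skull"),
    ("anat:Skull", "anat:partOf", "anat:Head"),
]

def query(s=None, p=None, o=None):
    """Return all triples matching the pattern; None matches anything."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# What does model skull_17 depict?
print(query(s="model:skull_17", p="med:depicts"))
```

A real system would add a shared ontology behind the predicate names, which is exactly what makes the metadata interoperable rather than merely structured.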
Knowledge management solutions will be based on a combination of internet-based and mobile functions in the very near future. Semantic web technologies are a promising approach to meet the needs of mobile environments, such as location-aware personalisation and the adaptation of the presentation to the specific needs of the user and the mobile device, i.e. the presentation of the required information at an appropriate level of granularity. In essence, users should have access to the knowledge management application anywhere and anytime.

Figure: The Semantic Web stack

There are a number of major research questions which have not yet been solved by the current efforts, as identified in (Euzenat 2002) and also in the final report of the EU-NSF strategic workshop on Research Challenges and Perspectives of the Semantic Web, Sophia-Antipolis, France, 3-5 October 2001. Also, as shown in the diagram above, the top layers of the Semantic Web stack contain technologies that are not yet standardised, or contain just ideas that should/will be implemented in the future.

Multiplicity of languages. Different languages apply to different situations. Some applications require expressive languages with expensive computational costs, while others need simple languages. Research should be carried out along the lines of (a) designing languages that stack easily, or at least can be combined and compared easily, and (b) making explicit the relations between the languages and the needs they can fulfil, so that application developers can choose the appropriate language. The current situation suffers rather from an explosion of languages. At some point in the future, a shakeout and reconciliation of many of these separate languages will be required.

Reconciling different modelling styles.
Different communities in different situations adopt different knowledge modelling styles: axioms (from logic), objects (from software engineering), constraints (from artificial intelligence), view-queries (from databases). It is important to know how to combine these modelling styles and how to implement one style in terms of another. This is partly a technical process requiring translations between the different modelling paradigms.

Different reasoning services. As with languages, different reasoning services are required by different applications. Examples of such reasoning services are: querying the consequences of a domain description, checking for consistency, matching between two separate descriptions, determining the degree of similarity between descriptions, detecting cycles and other possible anomalies, classifying instances in a given hierarchy, etc.
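Two of the listed services, classification in a hierarchy and cycle detection, can be sketched in a few lines. The hierarchy below is toy data invented for the example, and the code is a naive traversal, not a real description-logic reasoner.

```python
# Toy subsumption hierarchy: each class maps to its direct superclass.
subclass_of = {
    "Ferry": "Vessel",
    "Vessel": "Vehicle",
    "Vehicle": "Thing",
}

def ancestors(cls):
    """Classify cls by walking up the hierarchy; detect cycles on the way."""
    seen = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        if cls in seen:                    # anomaly: a class above itself
            raise ValueError("cycle in hierarchy at " + cls)
        seen.append(cls)
    return seen

print(ancestors("Ferry"))   # all classes a Ferry instance belongs to
```

Production reasoners perform the same two services over far richer logics, which is precisely why different applications need different reasoning engines.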
Semantic Web infrastructure/services. From the architectural point of view, it is necessary to take into account several requirements that will help to support storage, caching, optimisation, querying, inference, and the distribution of the computation over many computers (if necessary), and to design an infrastructure that scales, is efficient, and is robust.

Distributed systems for Web-scale reasoning. Current Semantic Web reasoning systems do not scale to the requirements of demanding applications, such as analysing data from millions of mobile devices, dealing with terabytes of scientific data, and enterprise content management. To go beyond the limited storage, querying and inference technology currently available for semantic computing, a platform for Web-scale reasoning must be developed that can handle very large volumes of data. The FP7 IP project LarKc is working in this direction. This vision can be achieved by:
- Enriching the current logic-based Semantic Web reasoning with methods from information retrieval, machine learning, information theory, databases, and probabilistic reasoning.
- Employing cognitively inspired approaches and techniques such as spreading activation, focus of attention, reinforcement, habituation, relevance reasoning, and bounded rationality.
- Building a distributed reasoning platform and realising it both on a high-performance computing cluster and via "computing at home".

Semantic transformations. A crucial need is the ability to merge ontologies for integrating data. This could be achieved through the development of transformation services (mediators) and libraries of transformations. However, not all transformations are suited to every task. A first research effort should provide a characterisation of transformations in terms of useful properties, and general schemes for expressing these transformations and properties in languages that the machine can understand.
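As a minimal sketch of such a transformation service, the following toy mediator rewrites annotated instances from one ontology version to the next via a declarative mapping. All class and property names are hypothetical; the point is that the mapping is data, so the mediator itself stays generic and the transformation can be shipped in a library.

```python
# Declarative v1 -> v2 mapping: renamed classes and properties are plain data.
CLASS_MAP = {"Artefact": "CulturalObject"}
PROPERTY_MAP = {"madeOf": "material"}

def migrate(instance):
    """Rewrite one annotated instance from v1 terms to v2 terms."""
    out = {"type": CLASS_MAP.get(instance["type"], instance["type"])}
    for prop, value in instance.items():
        if prop == "type":
            continue
        # unmapped properties pass through unchanged
        out[PROPERTY_MAP.get(prop, prop)] = value
    return out

vase_v1 = {"type": "Artefact", "madeOf": "terracotta", "era": "classical"}
print(migrate(vase_v1))
```

Real mediators must also handle splits, merges and semantic drift between versions, which is where the research questions on transformation composition and proof-carrying transformations come in.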
Taking the example of ontology versioning: it is very dangerous to use a knowledge base with a new version of the ontology. However, if the ontology maintainers provide a transformation from old to new versions, it becomes possible to use an old knowledge base once it has been transformed to the new ontology (and other new resources). Research is of course needed on transformation composition, proof-carrying transformations, proof languages and proof-checkers.

Developing ontologies - requirements for tools and methods. Building, and especially maintaining, ontologies will require tools and methodologies. Some requirements on these are listed below:
- Visualisation of complex ontologies is an important issue, because it is easy to be overwhelmed by the complexity of ontologies. Moreover, tools should be able to provide comparative displays of different ontologies (or ontology versions). Specific tools based on the objective semantics and the perceived meaning should be developed.
- Design rationales should be attached to ontologies so that application developers (and, maybe, the applications themselves) can use them to choose the adequate ontology.
- Support for the modularity and transformation of ontologies is needed, recording the links between modules and proposing transformations to export parts of ontologies into a context with specific requirements.

Some recommendations for future research include ontology acquisition from multiple primary sources (texts, multimedia, images) by means of learning techniques, as well as ontology comparison, merging, versioning, conceptual refinement, and evaluation.

Coping with "messy metadata" (from people and machines). When untrained people and imperfect machines generate metadata or populate ontologies, the results will be "messy". Depending on the context (e.g. the criticality of maintaining high-quality capture), three non-exclusive strategies are:
- Avoid it: the cost of incorrect metadata is too high (e.g. medicine or safety-critical applications).
- Tolerate it: the cost-benefit trade-off is good enough that imperfections do not become disruptive.
- Clean it: this may be a cooperative process between humans and intelligent agents, or a purely discursive process, e.g. expert conferences to decide on a new taxonomy or on how to codify new discoveries (as found in some areas of bioinformatics).

4.7 Final discussion

In a long-term perspective, achieving the five grand challenges and resolving the open issues discussed in this chapter will strongly support the realisation of the visionary scenarios presented in chapter 1. Deriving symbolic representations will guarantee very accurate and complete patient-specific data in medical diagnosis and treatment, and a trustworthy, rich digital reconstruction of the environment from 3D sensors as a support for city management systems, or for robots to move around and interpret the information around them. Goal-oriented 3D model synthesising will permit function-centric modelling and simulation on patient data to obtain very reliable assessments and predictions, and the creation of 3D models which have "self-consciousness": it would definitely benefit the design phase if it became possible to reuse past projects when modelling new robots or components, new characters for a virtual game, new behaviours to substitute humans with humanoids in dangerous tasks, or new affective cultural experiences. Documenting the 3D lifecycle will imply a 3D mark-up language for the annotation of 3D models, consistent with any other media, even on a massive scale. We can imagine a precise correspondence between the real semantic world and the digital semantic description, allowing suitable information to be understood, processed and retrieved not only by humans but also by robots and machines in general.
Semantic visualisation and interaction will permit a smooth interaction between the real and virtual worlds, and within the virtual world itself. Developing tools to make the required information accessible to any user (human, robot or machine), dynamically and according to the specific goal, will guarantee real-time decision support mechanisms and the feasibility of new interactions among real and virtual entities. The adoption of stable, rich and extensible Standards for semantic 3D media is the fundamental technological step required to make the visionary scenarios conceived truly operational. In the following figure, a time line diagram across the grand challenges is shown, which provides a visual and global road map to the realisation of 3D semantic media as proposed in the FOCUS K3D project. The clusters introduced previously illustrate the timing with respect to the four phases; moreover, clusters with the same colour placed horizontally indicate that they can be reached independently of each other.
In a mid-term perspective, achieving the proposed goals, even partially, will clearly improve the digital shape workflows in applications. To show this more practically, we describe four real-life application scenarios, which we outlined together with the AWG members and which show how the application fields considered in FOCUS K3D would benefit from the use of semantic 3D media. In the case of bioinformatics, some comments will be given on the feasibility of semantic annotation of biological data, which is still very far from being fulfilled.

4.7.1 Medicine scenario: liver segmentation

As explained by our AWG member IRCAD, the long-term grand challenge in medicine is to perform patient-specific modelling, ranging from the cell to the organs through globules. This requires new models and architectures to link all the models extracted from various modalities (functional, density, etc.). In addition, patient modelling must integrate therapy, which requires new models and architectures to plan, simulate, apply and follow up the application of a therapy. Looking at a much shorter-term goal, practitioners are currently facing a stringent real-life problem, considered a priority, in the specific task of liver segmentation and simulation. For applications such as resection, segmenting the liver is still a labour-intensive process, which takes a considerable amount of wall-clock time in a trial-and-error process by an expert (the algorithms are not fully automatic but rather devised to assist the expert). While the process is now well mastered and used routinely in medical applications, it is simply not economically sustainable in a normal production scenario to take one hour of an expert surgeon's time to accomplish this task. In addition, each new segmentation is restarted from scratch and does not benefit from the considerable experience of having performed the task on many patients before.
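To make the kind of algorithm involved concrete, the toy sketch below shows a seed-based region-growing step with an intensity tolerance on a hand-made 2D "slice". This is plain Python invented for illustration, not IRCAD's actual pipeline; the tolerance is exactly the sort of parameter that formalised organ knowledge (expected intensity ranges, shape constraints) could set automatically instead of the expert tuning it by trial and error.

```python
from collections import deque

def grow(image, seed, tol):
    """Grow a region from seed, accepting 4-neighbours whose intensity
    stays within tol of the seed intensity."""
    h, w = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region, frontier = {seed}, deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < h and 0 <= nc < w and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_val) <= tol):
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region

# Tiny synthetic slice: low values = organ tissue, high values = background.
liver_slice = [
    [10, 11, 50, 52],
    [12, 10, 51, 50],
    [11, 12, 13, 49],
]
print(len(grow(liver_slice, (0, 0), tol=5)))   # pixels in the grown region
```

In a knowledge-assisted workflow, the result of each expert correction could be fed back into the constraint model, so that later segmentations start from better parameters rather than from scratch.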
One solution to this problem would consist of formalising the knowledge about the functional constraints of the organ (which include geometrical, topological, mechanical and biological constraints) to assist the expert during the segmentation. In addition, it is a priority to formalise the interactive segmentation
process using knowledge technologies, so that each new segmentation learns from and reuses the segmentations performed in the past.

4.7.2 Bioinformatics scenario: from models to annotation

The application of knowledge technology techniques in structural bioinformatics faces major obstacles which are not present when the geometry stands alone: for macromolecules, the couplings between geometry, biology and biochemistry complicate things considerably. In addition, the diversity of alternatives, even for simple problems, is such that consensus is difficult to reach. In the sequel, we examine these issues and conclude with a list of plausible actions to be undertaken.

Coupling between geometry and biology. The central question in structural biology is the investigation of the structure-to-function relationship: given that the functions of a protein are accounted for by complexes formed with partners, one wishes to understand how the shape of the molecule accounts for the binding to these partners. This goal is such that the geometric description has to be coated with functionality-related annotations. In structural biology the focus is precisely on the contexts in which a molecule may be found, since these contexts provide the biological knowledge, i.e. the function. Phrased differently, purely geometric descriptions of shapes, de-correlated from their contexts, are of little interest to (structural) biologists. As a matter of fact, the data models used in structural bioinformatics relate the geometry of a molecule (admittedly simple facts at this stage) to the gene coding for it and to interactions with partners. That is, the structural, genomic and interaction levels are tightly coupled.

Coupling between geometry and biochemistry. Annotating requires a certain level of abstraction, and greatly benefits from parametrised models, whose development is in general impossible for molecules.
For example, adding, say, one carbon atom to an aliphatic chain, thus resulting in a longer chain, may radically change the chemical properties of the molecule, resulting in completely different abilities. In drug design, a number of cases are known where an elongation by one atom enables the formation of a hydrogen bond at the end of the chain, thus greatly stabilising interactions. The same holds for proteins, where a moderate elongation can trigger a folding event, completely changing the secondary and tertiary structures of a polypeptide chain. This complex coupling between geometry and biochemistry is a major hindrance to the development of parametric models.

On consensus models. Another difficulty is that even simple tasks can be performed in a number of ways. Consequently, consensus is hard to reach, and this penalises annotations. We substantiate this claim with a simple example, that of normal modes. Given a macromolecular system about a minimum of its potential energy, its normal modes encode the couplings between the atomic variables describing the flexibility of the structure. The normal modes are computed by diagonalising the quadratic form representing the energy well about the minimum. Yet, starting from a crystal structure extracted from the Protein Data Bank (PDB), computing the normal modes requires minimising the structure to bring it into a minimum of the potential energy. This requires choosing an energy minimisation method, based either on a force field or on a knowledge-based potential. For the former strategy, a solvent model must be chosen and molecular dynamics need to be run (with degrees of freedom regarding the length of the simulation and the stop criterion). For the latter, a zoo of potentials exists. Finally, the modes need to be computed, and a relevant number of them need to be selected.
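The diagonalisation step described above can be written down explicitly. With V the potential energy, x0 the minimised structure and M the diagonal matrix of atomic masses, the standard mass-weighted formulation (generic, not tied to any particular force field or potential) reads:

```latex
V(x) \approx V(x_0) + \tfrac{1}{2}\,(x - x_0)^{\top} H \,(x - x_0),
\qquad
H_{ij} = \left.\frac{\partial^{2} V}{\partial x_i\, \partial x_j}\right|_{x_0},

\widetilde{H} = M^{-1/2}\, H\, M^{-1/2},
\qquad
\widetilde{H}\, v_k = \omega_k^{2}\, v_k ,
```

where the eigenvectors v_k are the normal modes and the ω_k the corresponding angular frequencies. Every choice listed above (minimisation method, solvent model, potential) changes H, and hence the stored modes, which is precisely why consensus is so hard to reach.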
The complexity of this process is such that, even for the same system, different scientists will opt for different approaches, let alone for different systems. Therefore, no consensus strategy can in general be used to store normal modes in general-audience databases such as the PDB.

4.7.3 Gaming and Simulation scenario: vessel design workflow

In the domain of gaming and simulation, one of the members of our AWGs is VSTEP, a company that has developed a ship simulator game. Currently, there are 30 civilian vessels serving as player or autonomous vessels, and 40 additional autonomous vessels. The workflow of adding a new ship to the simulator was described in a presentation at one of the FOCUS K3D
events, namely the CASA Workshop on 3D Advanced Media In Gaming and Simulation (3AMIGAS), 16 June 2009, Amsterdam, The Netherlands. For the complete set of slides, see:

Figure: Ship simulator game. Courtesy VSTEP Serious Games.

The workflow consists of 47 discrete steps, exploiting 10 different tools and involving 8 different people. Some of these steps, such as the production of a press release, have little to do with 3D media, but others would benefit from more knowledge-intensive 3D media technologies, and such technologies would improve the quality of the simulation. Of the 47 steps in adding a ship to the simulator, 26 are related to shape processing, which can benefit from semantic 3D technology. In fact, the following steps could be automated: the eight processing steps that involve decimation, levels of detail, and collision-detection preparation; the six texturing steps needed for visualisations that depend on weather and sailing conditions; and two steps for generating icons and renderings. Achieving such automatic processing needs research and development in the direction of combining the geometry with properties and processing steps that depend on the context. Ten steps are involved in modelling the interior, ornaments, special objects, foam, etc. Techniques such as smart objects and procedural modelling involving semantics need to be developed to speed up the modelling phase and enhance the quality. In addition, in the modelling of a new vessel, parts of other ships are generally reused, such as bolts, anchors, and doors. Exploiting innovative, semantically meaningful retrieval capabilities could improve the efficiency and completeness of the search within a component library. Moreover, in the simulation, rich semantic annotations about combustibility, rigidity, and buoyancy would improve the simulation quality. There are also steps requiring user interaction and semantic visualisation aspects.
For instance, the generation of chart icons and thumbnail images of parts and models is currently done by visual inspection of alternative views. The automatic generation of information-rich and visually appealing views would speed up and ease the workflow. During rendering, smart level-of-detail algorithms will improve both the efficiency and the effectiveness of information conveyance, leaving out unnecessary detail. Autonomous vessels and characters must be able to navigate and plan routes in a natural way: smart objects such as doors, handles, hinges, and tools, which know how they are operated and handled, can improve interaction planning and the suspension of disbelief. Although a lot of work will remain labour-intensive, these examples of knowledge- and semantics-rich 3D media (models and environments) show how efficiency and quality may
be increased. There are two direct ways in which knowledge-intensive 3D media technology might have an impact in this case. First, a more efficient workflow could save many hours of work, leading to savings in the order of a hundred thousand Euro for the whole range of vessels. There are currently over half a million copies of the ship simulator game sold. Second, more advanced, intelligent, meaningful behaviour and functionality could lead to extra sales of the game and its add-ons, leading to extra revenues in the order of a million Euro.

4.7.4 CAD/CAE and Virtual Product Modelling scenario: semantics-based virtual 3D product modelling

Let us imagine that, in a world based on 3D shape semantics, a new rim is to be developed. Existing rims have been modelled using a procedural modelling approach and are kept in a knowledge base, along with both their geometric semantics and a computer-interpretable description of the requirements they were designed to fulfil. The designer/engineer starts his/her task by sketching a 6-spoke rim and stating some engineering requirements for the rim: the load (the weight of the vehicle to carry), the intended size and width, the speed the rim shall operate under, and the like. Since the company policy is not to start modelling a new product without first searching the knowledge base, a segmentation algorithm starts to interpret the designer's sketch. Knowing that the designer is looking for rims, the segmentation algorithm uses a semantic model of rims to perform the segmentation task, to extract the number of spokes, and finally to generate a combined query to the knowledge-based model repository, looking for similar rims that fulfil the stated requirements. The repository returns the best matches. The designer chooses the one that he/she likes best. A geometric modelling tool pops up and allows the shape of the existing rim to be adapted towards better matching the shape of the sketch.
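A toy version of the combined query to the repository might look as follows. All record fields and values are hypothetical; a real knowledge base would evaluate the requirements against a semantic model rather than flat records, but the filter-then-rank pattern is the same.

```python
# Hypothetical rim records: procedural parameters plus the requirements
# each stored design was validated against.
rims = [
    {"id": "rim_a", "spokes": 6, "max_load_kg": 450, "diameter_in": 17},
    {"id": "rim_b", "spokes": 5, "max_load_kg": 600, "diameter_in": 18},
    {"id": "rim_c", "spokes": 6, "max_load_kg": 700, "diameter_in": 18},
]

def best_matches(spokes, load_kg, diameter_in):
    """Filter by the sketch-derived and stated requirements, then rank."""
    candidates = [r for r in rims
                  if r["spokes"] == spokes
                  and r["max_load_kg"] >= load_kg
                  and r["diameter_in"] == diameter_in]
    # prefer designs that meet the load requirement most tightly
    return sorted(candidates, key=lambda r: r["max_load_kg"] - load_kg)

print([r["id"] for r in best_matches(spokes=6, load_kg=500, diameter_in=18)])
```

Here the spoke count would come from the sketch segmentation and the load from the stated requirements; combining both in one query is what makes the retrieval "semantic" rather than purely geometric.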
In parallel, since the functional requirements are known, a simulation engine starts to check whether the new design actually fulfils the requirements and provides hints to the designer if problems are to be expected in the use phase. The simulation engine leverages the semantic information contained in the model and the semantic description of the requirements to automatically generate corresponding discrete simulation models from the shape description. Understanding the problem, the simulation engine (actually its meshing component) can decide where which resolution is required, and can then optimise for performance and accuracy at the same time, relieving the user of tedious manual meshing work and algorithmically using experience knowledge collected on a broad basis. What is needed to let this scenario become reality?
1. Semantically rich procedural models;
2. Knowledge technologies to express, understand, and reason about requirements;
3. Improved 3D shape retrieval mechanisms that take additional information into account;
4. New modelling approaches to combine sketch-based and semantic geometric modelling;
5. New representation schemes to handle this extended set of information/knowledge;
6. Meshing and simulation algorithms that can be 'parameterised' by problem definitions;
7. Fast feedback loops between modelling and simulation to create function-driven models;
8. New ways to communicate results to the user on a semantic level if something fails.
Enabling all these points to interoperate requires new data formats (grand challenge: Standards). The grand challenge Deriving Symbolic Representations is not particularly addressed here, although symbolic representations are involved, e.g. in the semantically rich procedural models. Points 1, 4, 6, and 7 refer to the grand challenge Goal-oriented 3D
model synthesising; points 2 and 5 refer to Documenting the 3D lifecycle, while 3 and 8 belong to Semantic visualisation and interaction.

4.7.5 Archaeology and Cultural Heritage scenario: large-scale repository of 3D digital artefacts

A great part of Europe's heritage is linked to its historical and cultural evolution. Hence, it is clear that without preservation, documentation and easy reference to the various facets of this great cultural tradition, which moulded the life of our ancestors for many centuries, mankind cannot understand its past and thus cannot be led into the future. Therefore, a great benefit would come from the development of a large-scale, documented, distributed 3D/multimedia repository for cultural heritage and archaeology, in order to preserve, share and reuse knowledge. The main objective of this project will be to support the archaeology and cultural heritage community in the creation, interpretation and access of 3D content. The major challenge will be the creation of a large-scale distributed 3D digital library, developed using a novel, reliable, cost-effective 3D reconstruction system that can concurrently capture images, videos and 3D representations for the preservation of cultural artefacts. One of the main components of the technological infrastructure required to support such an initiative will be a Knowledge Management system. Such a system will support the creation and maintenance of artefacts, composed of the digitised assets as well as all the necessary metadata, which will store the information required for the proper documentation of each artefact or each collection of artefacts. All these tasks are related to the grand challenge Deriving Symbolic Representations.
Such an ideal system will also support the creation and maintenance of multilingual thesauri, which will be used for the deployment of cultural semantic web dictionaries, a feature that is necessary not only for advanced searching mechanisms but also for correct one-to-one information mapping, since the content spans multi-era, multinational and multilingual barriers. These objectives are clearly included in the grand challenges Documenting the 3D lifecycle and Standards. Moreover, users will be able to search and retrieve multimedia and scientific data from the distributed library using novel, sophisticated search techniques that involve both content and context, since purely content-based access to information has already started to become unsatisfactory. Thus, low-level geometric feature extraction algorithms will be developed, and knowledge extraction techniques will be introduced to extract semantic information/features from scientific documentation. This is related to the grand challenge Semantic Visualisation and Interaction. In archaeology and cultural heritage, object semantics is typically just as important as the actual geometry. Thus, it is a key requirement to assign thematic information to entire objects and to individual geometric elements. This also makes it possible to select, analyse or edit the geometry and the appearance of objects based on semantic criteria. One major issue is the (re-)creation, reuse and maintenance of the artefacts. Storing the models of those artefacts, along with additional semantic information and provenance data, can ease the procedure of creating, or even replicating, old masterpieces. For example, it could be used for the reconstruction and reproduction/repair of archaeological findings, e.g. the missing parts of a hand in a statue that was broken. An instructive counterexample is what happened to the Athina temple in Aigina, which was excavated by German archaeologists.
The findings were moved to Munich and the various pieces were reconstructed by actual sculptors. Later on it was discovered that many mistakes had been made, and the various statues constructed this way were disassembled. Thus, many ancient pieces were destroyed in the process. If this had been done digitally, many ancient pieces would have been saved. Partial matching is another application area. For example, if a handle is found and the rest is missing, after partial matching with already stored similar objects we can accurately guess the origin of the handle and reconstruct the artefact quite closely. Another strong interest of the archaeological/cultural heritage community is in the
organisation and presentation of content to virtual visitors (virtual exhibits) and in developing educational and training application scenarios connecting real and virtual artefacts. To re-create an architectural space one cannot simply use a 3D scanner. 3D reconstruction of architectural spaces is a long design process based on available data (pictures of the remains, archaeological research drawings and maps, etc.), evolving in close collaboration with archaeologists and historians, and taking advantage of any available knowledge (annotated 3D objects). A framework like the one proposed here will contribute to the reconstruction of such virtual environments. A success story example is the "Tholos", a Virtual Reality museum of the Foundation of the Hellenic World (FHW). Such a framework will also support collaboration communities, providing the required infrastructure to build targeted applications. It will also provide appropriate interfaces and tools to support the logging, merging and extraction of new knowledge, the sharing and reuse of existing content, and personalised services. 3D digitisation can be a key factor for the sustainability of the economic development of virtual museums. The on-line availability of heritage-related information allows better management of the impact of economic development on the cultural environment. The impact on cultural tourism is self-evident. Considering more practical, real-life scenarios, examples of tasks regarding 3D processing operations that are useful in the cultural heritage field include all the steps of the digital lifecycle of a resource, i.e. from acquisition and data preparation, to annotation and classification, to analysis and virtual restoration, reconstruction and visualisation.
A CH professional could be guided through these phases and supported by a technological platform providing tools for:
• data acquisition: e.g. 3D scanning, registration, and merging;
• archiving and retrieval: e.g. classification of data, storage of 3D content and metadata, and retrieval of data from a repository;
• virtual presentation: e.g. data healing, level-of-detail-driven simplification, association of material properties, texturing, and viewing;
• virtual reconstruction and restoration: e.g. composition/assembling of parts, data conversion, modification/deformation of models, and morphing between two different objects;
• prototyping and replication: e.g. fairing and generation of a uniform polygonal mesh, for instance in the STL format;
• deformation monitoring: comparison of the geometric deviation of artefacts or buildings over time.
Possible applications of the described ideal system include the following:
Acquisition and Reconstruction. The system will provide a set of functionalities that allow the generation of a digital model from a physical object (e.g. the front of a building, a fountain, a statue, etc.). The digital model may be further processed depending on the application task: visualisation, simplification at different levels of detail, deformation, prototype generation, etc. For instance, using a laser scanner to acquire the façade of a building, or a white-light pattern-projection scanner to scan 3D artefacts at close range, produces a huge amount of data, and usually more than one scanning session is necessary to acquire the whole façade or artefact, after which an accurate digital model is produced. It is important to choose the scanner device best suited to the specific scanning session, depending on the physical object.
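The registration step listed above can be sketched as a rigid least-squares alignment of two scans. This is a minimal illustration only: it assumes point correspondences between the two sessions are already known (real pipelines, e.g. ICP variants, estimate them iteratively), and uses the classical SVD-based (Kabsch) solution.

```python
import numpy as np

def rigid_register(P, Q):
    """Least-squares rigid alignment (Kabsch): find R, t with R @ P + t ~ Q.

    P, Q: (3, n) arrays of corresponding points from two scanning sessions.
    """
    cp, cq = P.mean(axis=1, keepdims=True), Q.mean(axis=1, keepdims=True)
    H = (P - cp) @ (Q - cq).T                  # cross-covariance of centred sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t

# Align a rotated and translated copy of a small point set back onto the original.
P = np.array([[0., 1., 0., 2.],
              [0., 0., 1., 1.],
              [0., 0., 0., 1.]])
theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.],
               [np.sin(theta),  np.cos(theta), 0.],
               [0., 0., 1.]])
Q = Rz @ P + np.array([[1.0], [2.0], [0.5]])
R, t = rigid_register(P, Q)
```

Once each new scan is aligned into a common frame this way, the merged point sets form the raw data from which the final digital model is built.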
Each scanning session produces a dataset (point cloud), and the different datasets have to be registered and merged in order to produce the whole digital raw data from which the final digital model is obtained. Several post-processing tools can be applied to the digital model, depending on the application task (e.g., visualisation, prototype generation, validation and certification).
Smart Archiving and Retrieval. Once the digital model of an object is available, it should be
classified and put into a repository to be made available to the community. The classification process can be supported by smart tools that automatically encode geometrical properties (e.g., this ancient column has a specific size, it is cylindrical, its average diameter is…) and structural properties (e.g., this amphora has three handles; this façade of a temple has four columns). Moreover, semantic information (e.g., this is a piece of an amphora and the material is pottery) and environmental information (e.g., this fountain is located in a specific square; it has been acquired using a specific device through a specific methodology) can also be encoded in order to enrich the cultural and scientific value of the digital content. All this information improves the quality of later retrieval: e.g., I am looking for a statue whose arms are missing (structural information); I am looking for a marble fountain (semantic information) whose maximum height is three metres (geometrical information).
Virtual Restoration of 3D artefacts. The framework could provide all the necessary scientific resources for (semi-)automatically completing the missing parts of a statue or an ancient building, based on the selection of semantically similar parts retrieved from a 3D repository. In this case, for instance, an analysis based on Virtual Human domain expertise could capture information on the body landmarks of the Venus de Milo. An arm could be selected elsewhere, and its measures could be adapted to fit the missing arm of the statue. One process could produce the fitting arm, and another could perform the actual merging. This could be done for scientific studies but also for training and entertainment ("Lend a hand to Venus!"). Another application, once a critical mass of CH content is achieved, could be to search for fitting parts and fragments of historical artefacts that are often distributed in locations all over the world.
Virtual Reconstruction.
Another application scenario could be the reconstruction of historical artefacts and whole historical environments. Virtual reconstruction is based on information sources such as historical texts, drawings and other 2D information, and on comparison with historically similar artefacts or buildings, which could be assessed by applying semantically driven queries to the historical information available in the repository. Moreover, existing 3D digital models and shape processing tools for modelling would enable economically more feasible approaches to such reconstructions.
Deformation Monitoring. Another application scenario is the monitoring of the erosion of statues and buildings, or of the deformation of artefacts over time. Such deformation takes place, for instance, because of changes in climatic conditions. Canvas and panel paintings are examples of artefacts sensitive to changes in air humidity. The shape processing tools would include the geometrical comparison of the deviation between two digitised models of an artefact obtained at different time stamps.
4.7.6 Robust geometry processing: practical benefits across the AWGs
Although the strong impact of robust geometric processing of 3D data was mentioned several times in this document, we motivate its practical importance in the applications in more detail by describing the context in terms of acquired and generated geometric data, and state-of-the-art methods for digital geometry processing. Geometric data are most commonly generated by modelling, by measurements or through automated processes. The abundance of data is explained by the recent considerable advances in modelling paradigms, in acquisition technologies, and in the variety of automatic conversion methods. In addition, numerous algorithms along the geometry processing pipeline generate new, processed geometric data. Measurement data range from point sets to depth images and contours.
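Both the deformation-monitoring scenario above and basic quality checks on such measurement data rest on computing the geometric deviation between two point sets. A minimal sketch follows: a brute-force symmetric Hausdorff distance between two sampled surfaces; the data (two digitisations of the same edge, the later one uniformly offset by 2 mm) are purely illustrative.

```python
import numpy as np

def directed_deviation(A, B):
    """For each point of A, the distance to its nearest point in B.

    A, B: (n, 3) and (m, 3) arrays of sample points.
    """
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # all pairwise distances
    return d.min(axis=1)

def symmetric_hausdorff(A, B):
    """Worst-case deviation between two sampled surfaces, in model units."""
    return max(directed_deviation(A, B).max(), directed_deviation(B, A).max())

# Two digitisations of the same edge, the second one offset by 2 mm.
scan_2000 = np.array([[x, 0.0, 0.0] for x in np.linspace(0.0, 1.0, 11)])
scan_2010 = scan_2000 + np.array([0.0, 0.0, 0.002])
dev = symmetric_hausdorff(scan_2000, scan_2010)
```

Production tools compute point-to-surface rather than point-to-point distances and use spatial indices to avoid the quadratic cost, but the reported quantity is the same kind of deviation map.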
As we have seen, they are acquired with an increasing variety of acquisition technologies, whose evolution is characterised by a shift from contact to contact-free sensors and from short- to long-range sensing, culminating with satellite images. In many cases the acquired data are "raw" in the sense of being sparse, irregularly sampled, and riddled with uncertainty (noise and outliers). Therefore, despite the expectation that technological advances should improve quality, geometric datasets are increasingly unfit for direct processing. This trend is explained by a drastic change of scale in geometric datasets: projects such as Google Earth, geophysical measurements, or climate analysis involve measurements from satellite images or seismologic
sensors - all of which contain a significant number of outliers. In addition, geometric data are increasingly heterogeneous due to the combination of distinct acquisition systems, each of them acquiring a certain type of feature and level of detail. Examples include the measurement of digital cities from satellites, from planes and by pedestrians. Other examples in computer-aided medicine include measurements of non-parallel slices, 3D volumes and diffusion tensors. Data are thus not only heterogeneous, but may also be redundant. Although beneficial from the sampling point of view, redundancy hampers data registration. Furthermore, we are observing a trend brought on by the speed of technological progress: while many practitioners use high-end acquisition systems, an increasing number of them turn to consumer-level acquisition devices such as low-cost free-hand medical devices or digital cameras, hoping to replace an accurate but expensive acquisition by a series of low-cost acquisitions. Another evolution consists in dealing with community data: acquisition of our physical world can be achieved by exploiting the massive data sets available online. Automatically generated data appear throughout the geometry processing pipeline every time a conversion occurs or the geometry is altered. Examples of generated data are surface meshes generated by marching cubes from 3D medical images, or by meshing parametric surfaces from CAD models. As the input data and the meshing algorithms are imperfect, the output meshes may contain spurious defects such as gaps and self-intersections. This type of raw mesh, referred to as a polygon soup, may severely hamper the robustness of geometry processing algorithms. In general, processed data are riddled with imperfections due to the lack of guarantees provided by the algorithms used along the processing pipeline.
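As a small illustration of the pre-filtering such outlier-ridden raw data typically requires, the following sketch drops points whose mean distance to their nearest neighbours is statistically anomalous. The parameters and data are illustrative, and the brute-force distance computation stands in for the spatial indices a production tool would use.

```python
import numpy as np

def remove_outliers(points, k=4, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbours exceeds
    the average such distance by more than `std_ratio` standard deviations."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    d.sort(axis=1)                               # row-wise: column 0 is the self-distance
    mean_knn = d[:, 1:k + 1].mean(axis=1)
    keep = mean_knn <= mean_knn.mean() + std_ratio * mean_knn.std()
    return points[keep]

rng = np.random.default_rng(0)
cloud = rng.normal(scale=0.01, size=(200, 3))    # dense, low-noise surface patch
cloud = np.vstack([cloud, [[5.0, 5.0, 5.0]]])    # one gross outlier
clean = remove_outliers(cloud)
```

Such statistical filtering removes gross outliers but, as the text argues, it remains a heuristic: it offers no guarantee for structured artefacts or heterogeneous sampling densities.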
Ironically, many algorithms do not even guarantee for their output the very properties they require of their input. Since geometric data are increasingly heterogeneous, as mentioned above, they require more conversions and processing, and are thus even more prone to contain flaws.
Methods. A major research effort in geometry processing in recent years has been to elaborate mathematical and algorithmic foundations for analysing and processing complex shapes through so-called discrete differential geometric operators. Significant theoretical and numerical advances have been made to show that these operators mimic their smooth counterparts when applied to discrete representations of 3D shapes. The main results are discrete equivalents of basic notions and methods of differential geometry, such as curvature and shape fairing of polyhedral surfaces. Another important research direction has been to elaborate methods for converting shapes from one representation to another. In particular, the issue of shape acquisition and reconstruction has stimulated a considerable number of contributions. In an AIM@SHAPE survey we listed more than 500 publications on this topic, revealing the lack of a unified solution. Specifically, it took 15 years to invent a provably correct reconstruction approach, based on Voronoi filtering, together with the first ingredients of a sampling theory for smooth shapes of arbitrary topology. As these foundations are valid only for input measurement data that are both dense and noise-free, they are of little practical use for practitioners, who increasingly have to deal with raw data. Consequently, a series of heuristics was devised to deal with noisy data sets, but, for example, no well-principled approach can deal simultaneously with piecewise-smooth reconstruction, outliers and heterogeneous inputs.
Targeted problem.
The initial grand promise of digital geometry processing (DGP) was to achieve what had been done in digital signal processing (DSP), but for the very special "signals" that shapes represent, whose highly distinctive properties - topology, dimensionality, and the lack of a global parameterisation - bring a wealth of challenges. In many ways, DGP has been successful, in the sense that a variety of methods have been developed to offer a vast array of geometry editing tools. However, digital geometry processing has yet to provide the type of robustness to which digital signal processing owes its success: to have a tangible impact at the scientific, technological and societal levels, DGP must offer practitioners a set of robust and reliable methods (as currently available for sounds or images). The level of robustness sought (resilience to uncertainty and heterogeneity) constitutes an enduring scientific challenge, even more difficult than for ordinary signals due to the variety of geometric data. While creative and exploratory, the resulting algorithms of the first wave of DGP research are too brittle and too input-dependent to offer solid foundations.
To fully realise the potential of geometry processing, one must tenaciously address the most enduring and fundamental problems, which hamper robustness to imperfect and heterogeneous input. One promising approach consists of formalising the a priori knowledge both about the measured shape and about the acquisition system, such that this knowledge is exploited as early as possible during the acquisition phase. Regarding the generality of heterogeneous data, the current approach consists of choosing the data structure best suited to the queries required by the algorithms at hand. Turning the problem around, we propose instead to adapt the algorithm queries to the geometric data, and to isolate a minimal set of so-called oracles required by the algorithms. In addition, these oracles may provide access not directly to the input data but instead to a more abstract class of objects, such as compact sets equipped with a metric. In this way the algorithm gets access to the sensed data only through an error metric, which can be made resilient to imperfect data once the metric is enriched with the knowledge (formalised with semantics) of what the sensed shape is. As these oracles are the only interface through which the input geometry is known (or sensed), the algorithm becomes independent of the representation of the input geometry: generality is gained and heterogeneous data can be handled gracefully. More concretely, for a scanner manufacturer this may translate into a sensor that is not data-centric but query-centric, where each query can be specialised to the type of shape of interest and enriched with the semantics of the sensed shape. We can think of, e.g., a free-hand ultrasound medical device which provides answers to geometric queries about a specific organ, such as non-parallel slices with topology and geometric regularity constraints specific to that organ.
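The oracle-based design described above can be sketched as follows. The interface and class names are hypothetical and serve only to show the key point: an algorithm that queries the shape exclusively through an oracle works unchanged whether the oracle is backed by an analytic model or by raw measurement data.

```python
from abc import ABC, abstractmethod
import math

class ShapeOracle(ABC):
    """Minimal query interface: algorithms see the shape only through this."""
    @abstractmethod
    def distance(self, x, y, z):
        """Distance from a query point to the sensed shape."""

class SphereOracle(ShapeOracle):
    """Analytic stand-in for a sensed shape (a sphere of radius r)."""
    def __init__(self, r):
        self.r = r
    def distance(self, x, y, z):
        return abs(math.sqrt(x * x + y * y + z * z) - self.r)

class PointCloudOracle(ShapeOracle):
    """The same interface backed by raw measurement data."""
    def __init__(self, points):
        self.points = points
    def distance(self, x, y, z):
        return min(math.dist((x, y, z), p) for p in self.points)

def on_shape(oracle, p, tol=1e-6):
    """Representation-independent predicate: works with either oracle."""
    return oracle.distance(*p) <= tol
```

Making the distance oracle resilient to noise and outliers (e.g. by enriching it with semantic knowledge of the sensed shape, as argued above) then benefits every algorithm built on top of it, without touching the algorithms themselves.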
In addition, a formalisation of the knowledge about the acquisition process itself can be instrumental in providing robustness to noise, outliers and, more generally, defect-laden data.
4.7.7 Conclusion
At the final FOCUS K3D Conference on Semantic 3D Media and Content we had the opportunity to discuss the grand challenges and open issues of this road map with many application experts, and we were able to compare our insight on the future research agenda with the perspectives of both academia and industry. The experts in Bioinformatics confirmed that knowledge can be related either to the shape or to the function/structure, but the former is useless without knowledge of the application domain. Therefore, the general target of semantics has to be contextualised, and the biggest challenge is relating the 3D shape to the biological context. Annotation is clearly important, but in this domain it is not obvious how to represent such knowledge and relations. In addition, the problem of preserving the annotation is also delicate because it depends on the individual case: even a minor change in the molecule geometry may trigger a major event in terms of annotations. For example, in the active sites of proteins you may have a certain geometry of residues which carry out a specific reaction; any slight change can either stop that reaction from happening or make it happen. The geometry of the active site is therefore extremely important, but it is not clear how these situations can be represented. The natural conclusion is in line with ours: the big challenge, which will have a very high impact, is modelling efficiently such biological knowledge and the relations between data. Until parameterised models able to include knowledge are conceived, annotation will not be possible. For Gaming and Simulation applications it was noted that research is often technology-driven while it should be more user-driven.
Semantic 3D media can be exploited in learning experiences, and semantics in general could be beneficial for learning models, personalisation, and user-environment interactions. There is a lot of potential in how to model towards the user, towards the learners; in other words, there is a real benefit of Knowledge Technology in the assessment of use/learning models and their capabilities. Interaction becomes crucial, where the interaction is between the user and the technology, but also the social interaction between users. An important issue in this perspective could be how to visualise datasets (textually or visually) to support interaction. The social and economic benefit lies in how this interaction between the users and the environment could be accelerated by the technology. The future of semantic 3D technology in Gaming applications could probably be in
the direction of providing very layered approaches to knowledge acquisition within a massive environment. For instance, the current attempts at geo-tagging the 3D environment, while also linking to text and objects coming from a whole range of different data, go in this direction. The case is different for the mechanical industry and telecommunications (CAD/CAE and Virtual Product Modelling), where 3D models and semantics have a long and successful tradition. In these sectors engineering drawings are fundamental to transferring a message within a community that shares the same culture: they contain a lot of implicit semantics which engineers are able to interpret. Hence, a big challenge is transferring the implicit semantics included in the drawings; this would result in huge economic value, since a sector like mechanical engineering does not evolve fast and the life of designs and architectures is very long. In the automotive sector, the design of a new car implies the reuse of 90% of old components. As a consequence, there is enormous value in drawings, and there is a lot of work to be done to extract the semantics from engineering drawings, where the geometry is just one of the aspects to consider. Another aspect is that CAD modelling systems cannot evolve as long as they are based on nominal shapes. What makes the difference is not the precision of the model but the tolerance the designer gives to it, since tolerances are connected to costs and functionality. This kind of information is not included in the idea of nominal geometry, so it is evident that deriving symbolic representations is a huge challenge with very good prospects. Embedding semantics in CAD models automatically, such that knowledge could be reused in downstream phases, is perceived as a challenge even in current scenarios.
In fact, the way semantics is included is currently up to the designer, and is thus difficult to control and reuse in the process. Although such an issue could be seen as a research objective for the near future, the annotation of past projects, even those dating back several years, appears unfeasible in practice. Moreover, considering the diversity of product modelling phases, it is clear that semantics not only has to be preserved along the pipeline but also has to be adapted when the geometry changes; for example, during model simplification and idealisation for simulation, it would be valuable to adapt the FEM semantics to the new simplified/idealised model. The benefits of an effective semantic search also emerged during the conference, considering the large amount of data already available now. It often happens that some useful information is deployed and available in the system, but there is no standard way to access this information. Knowledge technologies could offer such standards, so that generic "semantic" modules are plugged together (i.e., CAD systems cooperate with knowledge bases) to provide the same kind of functionality that is in use today while saving a lot of time. However, the gap between academic prototypes and real application systems is still perceived as a strong practical limitation by industry. In addition, software vendors have no real interest in investing in new tools unless customers explicitly ask for them. Only skilled customers, i.e. industrial companies aware of new technologies, can push software vendors to advance their tools. The CAD community includes very dissimilar profiles, which require ad hoc solutions. For instance, the styling department of a company works differently and uses different tools because it follows a completely different process.
Technical people usually design from technical constraints and try to find the best compromise; conversely, people creating shapes usually start from an idea and then develop a pleasing shape. Procedural shapes, as presented during the conference, support not stylists but engineers, who have to manage the design constraints and can do so more intuitively with such a methodology. Clearly, creativity cannot be decoupled from the producibility of the object. The high economic value of creative work in today's market was highlighted; such work needs freedom to evolve, and consequently a strong interaction in the modelling system between the stylists and the product they are creating. Cultural Heritage is also a very heterogeneous field, and here too semantics is strictly related to the context: for instance, the information needed by a curator operating in archaeology is different from that needed by the curator of a library. In CH, 3D models appear in many tasks, such as representation, documentation, video and educational material, but it is
important to notice that 3D cannot be considered the solution, but more properly an evolution of the traditional media, to be integrated with them. The benefit of the FOCUS K3D approach is considered valuable especially for the "scientific" cultural heritage community, which aims at studying, cataloguing and documenting the objects. A big issue is related to standards for data models, and this is also affected by multilingualism because in archaeology and architecture terminology is very close to the knowledge structure, especially when used to describe shapes. Since the domain does not share a single vocabulary, semantic representations should take care of this aspect and, more generally, of the context of use. Another challenge from the FOCUS road map which appeared important in the field was the integration of results coming from semantic-based classification and geometry-based classification. In this perspective, shape may be regarded as the common denominator between different knowledge universes. The vision of the Semantic Web community, also represented at the conference, is slightly different from that of FOCUS K3D. Semantics is not generally used in the sense of intent and knowledge but in the sense of a description of the relations between units, where the relationships are described so that there are mechanisms/algorithms to derive many types of inference (ontological description). The two definitions do not diverge but focus on different aspects: the semantic web is primarily a web of data, and effectively including 3D media as part of such data requires conforming to the development of the web. In more practical terms, whatever semantics one intends to formalise (both geometrical and contextual), the annotation should make use of http URIs to add metadata to objects and parts of objects, because this makes it possible to identify a resource uniquely through a shared syntax.
In addition, it is profitable to use public vocabularies when possible, to make our own 3D data publicly available, and to link them with any relevant resource accessible on the net, since this opens them up to the new application facilities available on the web for other kinds of data. Since the semantic web is focused mainly on relations between resources, the future challenges addressed by the community are related to reasoning about the data. One aspect to tackle is the introduction of uncertainty into the picture and the development of fuzzy description logic algorithms. In fact, the different inference mechanisms used today are all based on formal, binary logic. A few approaches try to solve this problem by adopting probabilistic techniques in description logic. Research is still far from a solution: the mechanisms have yet to be understood and then reach a state of equilibrium in order to be standardised. A second issue, which has been raised in the past few years and is relevant in the context of semantic 3D media, is the inability of description logic to find partial solutions to a query, since description-logic-based inference tries to find all the solutions of a search. At web scale, where the amount of data is now massive, this approach no longer works: it is preferable to retrieve partial data in a few seconds instead of all data (or even a failure) in days. Another difference between the professional users addressed by FOCUS K3D and the variety of possible users of the web is related to the freedom in documenting data. For the former, control over the annotation becomes crucial to the effective retrieval and reuse of very specialised data. Conversely, the web evolves in an organic way, and consequently it is fundamental to leave people free to find new ways to exploit relationships among data, which will produce new semantic data we did not even think of before.
It was suggested to leave room for such social mechanisms in the documentation of 3D data, and to learn from the way they emerged on the web, because the additional value in the retrieval of results lies in the combination of results coming from different kinds of annotation. Actually, this is one of the approaches we mentioned to tackle the massive-annotation issue: in fact, the way the web evolves today is extremely interesting beyond the technology, and is something new for us, classically trained scientists, since we were not educated in such an approach. Summing up, the audience confirmed that the grand challenges proposed by FOCUS K3D are perceived as the major issues for the future of semantic 3D media. Effective ways to extract and represent knowledge for an effective understanding of, and interaction with, the model are necessary, together with a standardisation of the terminology and of the documentation itself. Effective ways to visualise data and metadata also have to be investigated: the best way to categorise non-textual data may not be the
text we are used to now, as there could be better representations. Accomplishing these goals would benefit the sharing, transfer, retrieval and reuse of semantic data in the different contexts. Due to the explosion of data of different natures, all the application domains emphasised the urgency of handling a massive amount of heterogeneous and uncertain data, which indeed proves the high potential of such research themes in terms of technological and economic impact on the different user communities.
  • 83. FOCUS K3D D1.4.1 (Ogiela2009) M. R. Ogiela, Picture grammars in classification and semantic interpretation of 3D coronary vessels visualisations. Opto-Electronics Review, 2009 (PaFi2005) C. Paris and M. Fiedler Eds., Protection of stored data by system level encryption schemes. EuroNGI Deliverable D.WP.JRA.6.3.5, 2005. (Rochelle 2004) L. Rochelle, Morphology and Lexical Semantics.' Cambridge University Press, 2004. (Rosset 2006) A. Rosset, Navigating the Fifth Dimension: Innovative Interface for Multidimensional Multimodality Image Navigation. Radiographics, 26(1): pp. 299-308, 2006. (Smeulders00) A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain, Content-Based Image Retrieval at the End of the Early Years, IEEE Transaction on Pattern Analysis and Machine Intelligence, Vol. 22, N. 12, pp. 1349-1380, 2000. (SpFa2007) M. Spagnuolo, B. Falcidieno, The Role of Ontologies for 3D Media Applications. Semantic Multimedia and Ontologies, Part III, Chapter 7, pp. 185 - 205. Y. Kompatsiaris, P. Hobson (eds.), London Springer, 2007. (SpFa2009) M. Spagnuolo B. Falcidieno, 3D Media and the Semantic Web, IEEE Intelligent Systems, Ed. S. Staab, pp. 1-8, March/April 2009. (Spitzer et al. 1996) V. Spitzer, M.J. Ackerman, A.L. Scherzinger, D. Whitlock, The Visible Human Male: A technical report, JAMIA 3, 2, 1996. (TaVe2008) J.W.H. Tangelder, R.C. Veltkamp, A survey of content based 3D shape retrieval methods. Multimedia Tools and Applications, volume 39, 441-471, 2008. (Theodoridou et al. 2008) M. Theodoridou, Y. Tzitzikas, M. Doerr, Y. Marketakis, V. Melessanakis, Modeling and Querying Provenance using CIDOC CRM, December 2008 (Verykios et al. 2004) V.S. Verykios, E. Bertino, I.N. Fovino, L.P. Provenza, Y. Saygin, Y. Theodoridis, State-of-the-Art in privacy preserving data mining. ACM SIGMOD Rec. 3, 1, pp. 50–57, 2004. 01/04/2010 83