FOCUS K3D D1.4.1
Deliverable D1.4.1 of Task T1.4
Road map for Future Research
Authors: Chiara Catalano (IMATI), Michela Spagnuolo (IMATI), Michela
Mortara (IMATI), Bianca Falcidieno (IMATI), Andre Stork (FRAUNHOFER),
Marianne Koch (FRAUNHOFER), Pierre Alliez (INRIA), Frederic Cazals
(INRIA), Mariette Yvenec (INRIA), Wolfgang Huerst (UU), Remco
Veltkamp (UU), Marios Pitikakis (CERETETH), Caecilia Charbonnier
(MIRALab), Lazhari Assassi (MIRALab), Jinman Kim (MIRALab), Nadia
Magnenat-Thalmann (MIRALab), Patrick Salamin (EPFL), Daniel
Thalmann (EPFL), Tor Dokken (SINTEF), Ewald Quak (SINTEF)
Date: Thursday, 01 April 2010
Please indicate the dissemination level using one of the following codes:
PP=Restricted to other programme participants (including the Commission Services).
RE= Restricted to a group specified by the Consortium (including the Commission Services).
CO= Confidential, only for members of the Consortium (including the Commission Services).
Vers.  Issue Date         Stage  Content and changes
01     2nd February 2010  80%    First version submitted
02     31 March 2010      100%   Final submitted version
This document contains the deliverable D1.4.1 of the FOCUS K3D Coordination Action.
Deliverable D1.4.1 is the final technical report of Task 1.4 entitled “Road map for Future
Research”. This deliverable defines the guidelines envisaged by the FOCUS K3D Consortium for
future research, consolidating input from all the different kinds of AWG activities (i.e. ad hoc
meetings, questionnaires, workshops, dissemination events) carried out during the two years
of the project. The document also includes the outcomes of the discussion with AWG members
about the challenges of the road map, as presented during the final FOCUS K3D conference.
Table of Contents
Chapter 0 : Preface
Chapter 1 : Visionary scenarios
1.1 Visionary scenario in BioTech, health & medicine
1.2 Visionary scenario in Robotics
1.3 Visionary scenario in Business & education, home & leisure
1.4 Discussion
Chapter 2 : Definition of the field
2.1 3D media representation: from geometry to semantics
2.2 Knowledge sources in 3D applications
2.3 Discussion
Chapter 3 : 3D in application domains
3.1 Medicine
3.2 Bioinformatics
3.3 CAD/CAE and Virtual Product Modelling
3.4 Gaming and Simulation
3.5 Cultural Heritage and Archaeology
3.6 Discussion and conclusion
Chapter 4 : The FOCUS K3D research road map
4.1 Derive symbolic representations
4.1.1 Time line and dependence diagram
4.2 Goal-oriented 3D model synthesising
4.2.1 Time line and dependence diagram
4.3 Documenting the life cycle of 3D objects
4.3.1 Time line and dependence diagram
4.4 Semantic interaction and visualisation
4.4.1 Time line and dependence diagram
4.5 Standards
4.5.1 Time line and dependence diagram
4.6 Trends in Semantic Web research
4.7 Final discussion
4.7.1 Medicine scenario: liver segmentation
4.7.2 Bioinformatics scenario: from models to annotation
4.7.3 Gaming and Simulation scenario: vessel design workflow
4.7.4 CAD/CAE and Virtual Product Modelling scenario: semantics based virtual 3D product modelling
4.7.5 Archaeology and Cultural Heritage scenario: large-scale repository of 3D digital artefacts
4.7.6 Robust geometry processing: practical benefits across the AWGs
4.7.7 Conclusion
References
Chapter 0 : Preface
3D media are digital representations of either physically existing objects or virtual objects
that can be processed by computer applications. They may be defined either directly in the
virtual world with a modelling system, or acquired by scanning the surfaces of a real physical object.
The production and processing of digital 3D content was and still is a traditional field of
expertise of Computer Graphics, but 3D has only recently entered the multimedia world: only in
the last decade, indeed, has Computer Graphics reached a mature stage where fundamental
problems related to the modelling, visualisation and streaming of static and dynamic 3D
shapes are well understood and solved. Considering that nowadays most PCs connected to the
Internet are equipped with high-performance 3D graphics hardware, it seems clear that in the
near future 3D data will represent a huge amount of the data stored and transmitted using
Therefore, 3D media introduce a new kind of content in the established multimedia scenario,
with the term multimedia characterised by the possible multiplicity of content, by its
availability in digital form and its accessibility via electronic media. Text, audio, animations,
videos, and graphics are typical forms of content combined in multimedia, and their
consumption can be either linear (e.g. sound) or non-linear (e.g. hypermedia), usually allowing
for some degree of interactivity.
At the same time, research on semantic multimedia, defined by the deep integration of
semantic web techniques with multimedia analysis tools, has shown how to use and share
content of multiple forms, endowed with some kind of intelligence, accessible in digital form
and in distributed or networked environments. The success of semantic multimedia largely
depends on the extent to which we will be able to use them in systems that provide efficient
and effective search capabilities, analysis mechanisms, and intuitive re-use and creation
facilities, at the level of content, semantics and context (Golshani06). Going one step further,
we could easily envisage semantic multimedia systems of a higher level of complexity, in which
the generic architectural framework underpinning semantic multimedia systems could be
extended to knowledge and data intensive applications, which have been historically developed
in sectors that were quite far from multimedia and knowledge technologies.
Throughout this document, we will illustrate this idea focusing on prospective applications of
3D content, as a rapidly emerging new form of media in the semantic multimedia panorama
and an extremely challenging application context for semantic multimedia.
3D media, indeed, encompass all forms of digital content concerning 3D objects used and
managed in networked environments, and not only fancy-looking graphics used in
entertainment applications. 3D media are endowed with a high knowledge value carried either
by the expertise needed to design them or by the information content itself. Currently,
research on multimedia and semantic multimedia is largely devoted to pixel-based content,
which is at most two-dimensional (e.g. images), possibly with the addition of time and audio
(e.g., animations or videos), while 3D media are defined by vector-based representations. Due
to their distinctive properties, 3D media make it necessary to develop ad hoc solutions for
content analysis, content-based and context-based retrieval, modelling and presentation,
simply because most 2D methods do not generalise directly to 3D (SpFa2009).
The AIM@SHAPE Network of Excellence (www.aimatshape.net) was the first EC project
addressing the issue of adding semantics to plain geometrical shapes. The prototype
infrastructure created by AIM@SHAPE and the ontologies developed by it demonstrated the
potential gain for the Computer Graphics community of having detailed metadata attached to
Based on the experience of AIM@SHAPE, the FOCUS K3D project (www.focusk3d.eu)
tackled the more complex problem of raising interest in semantics-driven processing of 3D
data among users in a number of application domains. 3D graphics are key media in many
sectors, among them the ones we selected for the application working groups in FOCUS K3D:
Medicine and Bioinformatics, Gaming and Simulation, CAD/CAE and Virtual Product Modelling,
Archaeology and Cultural Heritage. In these areas, representing a complex shape in the
various stages of its complete life-cycle is known to be highly non-trivial, due to the sheer
mass of information involved and the complexity of the knowledge that a shape can reveal as
the result of a modelling process.
The research road map presented in this deliverable is an attempt to synthesise the vision
of the FOCUS K3D project on these themes, after extensive discussions with a variety of users
in the application domains. The perspective of the road map is oriented towards challenges
that exist more in the content production and sharing phase, rather than in the networking
Looking back at life fifteen years ago reveals how different it was from today and how
some technologies we now use regularly were unimaginable at that time, mobile phones and
the Internet above all. Therefore, we decided to start the document with three visionary scenarios,
which imagine different aspects of life in 2040. What emerged from the stories in the end is
that semantic 3D content will strongly permeate everyday life, even if not directly perceived. A
discussion of where, how and which kind of 3D data will be involved follows to make the link
between geometry and semantics explicit.
In the second chapter, we take a step backward and briefly define our view of semantic 3D
media, as originated from AIM@SHAPE, while, in the third chapter, we describe their current
status in the application working groups selected in FOCUS K3D. The state-of-the-art reports
produced by FOCUS K3D are an important source of information for the interested readers
(see deliverables D2.1.1, D2.2.1, D2.3.1, D2.4.1), as they give an idea of the level of
acquaintance with 3D semantics in these fields, with expected and natural differences between
more established fields, such as CAD/CAE and Product Design, and fields such
as the Gaming and Simulation domains, where the potential of a concurrent use of semantics
and digital 3D data is high but less widely perceived and less reflected in current practice.
Finally, chapter 4 provides the actual road map, in which we have identified high-level goals
representing the long-term challenges that the Computer Graphics community should face as
targets for new and disruptive research. These challenges require breaching the borders of a
single discipline and call for a truly multi-disciplinary effort. Within each of these
high-level challenges, we have also identified a number of mid-term challenges that, without
being exhaustive of course, could be required building blocks for further important research
After the experience of FOCUS K3D and as reflected by the research road map, the path
towards really effective semantic 3D content is still complex, but doable. As the research road
map tries to convey, we believe that the potential of semantic multimedia technologies could
be fully exploited in 3D and knowledge intensive application areas, where the processes deal
with contents of multiple form and type, the processing workflows are guided by knowledge
and semantics, and the working environment is usually distributed. Computer Graphics is
ready to answer these challenges, but it needs a closer connection with Knowledge
Management, Machine Learning, and Cognitive Modelling.
Chapter 1 : Visionary scenarios
To illustrate how pervasive semantic 3D technology will be in about 30 years, we present
three visionary scenarios below and discuss which aspects related to 3D and semantics
are involved and should be tackled in the future. The three scenarios cover some areas where
3D media are a horizontal technology enabling the megatrends of the future, which are, as
many experts agree (Fut09):
• BioTech, health, medicine;
• Robotics;
• Business & education, home & leisure.
1.1 Visionary scenario in BioTech, health & medicine
The scenario takes place in 2040. The current challenge for Dr Edwardes is to improve the
quality of life of his patient Bill Thomson who is suffering from a serious arthritic problem in
one of his knees. Bill not only suffers from unbearable pain, but also from the serious
discomfort of having limited flexion.
To begin with, and to check whether the pathology is amenable to drug treatment, Dr
Edwardes decides to have Bill's genome fully sequenced using the latest high-throughput
sequencing technique. With the collection of Bill's genes as the output of the procedure, Dr
Edwardes aligns portions of this genome against those of other patients suffering from related
disorders, so as to precisely identify the genetic determinants of the disease and check for
possible treatments. The matching portions of the genome are visualised through a
holographic projector. Interestingly for Dr Edwardes, but not for Bill, three genes appear
mutated, namely HOOK-01, ESS-21, and RGid-07, each of them being involved in a
different form of arthritis. While the first two mutations are amenable to drug treatment, no
drug is known for the third one. Therefore, Dr Edwardes carries out a homology search in the
Advanced Protein Data Bank to see whether the structure of proteins whose sequence is
similar to that of the protein coded by RGid-07 is known, and indeed finds one such structure.
Screening a library of candidate drugs through a virtual panel against this protein, Dr
Edwardes identifies two of them with good binding affinity. These molecules are candidate
drugs, and he hands the investigation to a colleague of his at the National Institute of Arthritis,
so as to check whether the protein coded by RGid-07 can be patched by one of these drugs.
Meanwhile, he and Bill decide to go for a surgical treatment.
The latest advances in computer-aided medicine enable Dr Edwardes to tackle the challenge
through function-driven modelling and simulation, with precise estimates of the risks so as to
optimise both the decisions and the treatments. The novel methodology consists of driving the
whole medical process by modelling, simulating and monitoring it with a constant eye on the
end goal, here restoring the function (knee motion) while minimising the pain. The first crucial
decision is to move from non-operative to operative treatment: on the one hand, pain
medication controls pain but does not change the underlying arthritic problem; on the other
hand, it is possible to go ahead with prosthesis surgery albeit with some non-negligible risks.
In the present case the results of a simulation on patient-specific data led Bill to decide to go
ahead with surgery in order to place an implant. Unlike in previous approaches, the shape
of the implant is not chosen from a predefined set but instead optimised through the
simulation so as to best restore the original motion amplitude and flexion of the knee. The
implant is manufactured by a 3D printer, which can make physical 3D shapes from a variety of
biologically compatible materials, while the actual surgical operation is robot-assisted. Another
novelty is that the body is treated as a global system in order to best engineer the implant
together with the treatment that comes with it in order to avoid infections and other
complications such as late mechanical dysfunction of the implant. The present system mixes
knowledge and geometric modelling of the implant. This requires simulating and monitoring
Bill’s organism at different scales ranging from the proteins to the organs through the cells,
tissues and ligaments. After surgery Bill has to live with a machine, which monitors his body
until the risk of complication is considered low enough to return to a normal life. One year later
Bill asks Dr Edwardes whether or not he would be able to pursue an athletic activity, like when
he was 25. To answer this question Dr Edwardes runs a new series of acquisitions and
simulations (not reimbursed by social security!), which provide precise quantitative risk
estimates. While looking at the simulation Bill has the strange feeling of being engineered as
an industrial product by a computational engineer. This is, however, nothing compared to the
fun of running the Boston marathon once more.
1.2 Visionary scenario in Robotics
It is Monday morning and Cynthia is starting her new working week. She is a product
development engineer at Robotical Inc., a company focusing on the development of the next
generation of humanoids. Cynthia is specialised in behaviour modelling and simulation with a
special focus on swarm behaviour. At the beginning of her career she developed a 3D sensor
system and the embedded software for the interpretation of hand gestures of human beings
for robot-human-interaction. Her system was able to build a 3D model of the situation and to
interpret the gestures in a semantically described context.
After breakfast Cynthia gets into her car. The car is the new AUTONObile 2040 model.
Cynthia just tells the car the destination of her ride. The autonomous automobile checks the
traffic situation, plans the trip and starts to move. All of a sudden a child runs across the street
and the AUTONObile brakes instantly. Its semantic reasoning systems have come to the
decision to brake after analysing the 3D scene, which is constantly generated and updated by
various 3D sensors (global positioning system, lasers, radars, and cameras). The decision to
brake is taken after considering the distance between the car and the child and rejecting the
possibility to drive around the child because of opposing traffic. After the situation is cleared,
the car continues. Having ‘delivered’ Cynthia to her office, the car searches for a parking space
and waits for the next order.
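The braking decision described above can be illustrated with a minimal sketch. It is not the reasoning system of the scenario: the class `Obstacle`, the functions `stopping_distance` and `decide`, the set of vulnerable classes and all numeric parameters (reaction time, deceleration, safety factors) are assumptions made purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical semantic classes that deserve a larger safety margin.
VULNERABLE = {"child", "pedestrian", "cyclist"}


@dataclass
class Obstacle:
    label: str         # semantic class produced by 3D scene understanding
    distance_m: float  # distance to the obstacle along the planned path


def stopping_distance(speed_ms, reaction_s=0.1, decel_ms2=8.0):
    """Reaction distance plus braking distance v^2 / (2a); parameters are assumed."""
    return speed_ms * reaction_s + speed_ms ** 2 / (2.0 * decel_ms2)


def decide(obstacle, speed_ms, opposing_lane_free):
    """Return 'continue', 'swerve' or 'brake' for one obstacle on the path."""
    needed = stopping_distance(speed_ms)
    margin = 2.0 if obstacle.label in VULNERABLE else 1.2
    if obstacle.distance_m > margin * needed:
        return "continue"  # far enough away: keep monitoring the scene
    if obstacle.distance_m < needed and opposing_lane_free:
        return "swerve"    # cannot stop in time, but evasion is possible
    return "brake"         # default, e.g. when opposing traffic blocks evasion
```

With a child detected 12 m ahead, a speed of 13.9 m/s (about 50 km/h) and opposing traffic, `decide(Obstacle("child", 12.0), 13.9, opposing_lane_free=False)` selects braking, which mirrors the choice made by the car in the story.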
During the day, Cynthia is trying to optimise the shape and behaviour of the new humanoid
she is developing with her colleagues at Robotical Inc. She is using the newest version of
CAHIA (Computer-Aided Humanoid Interactive Application), which provides function-based
constraints on free-space deformation techniques. With CAHIA all kinds of behaviour models
can be embedded and simulation models are generated on the fly and evaluated in real-time.
Not only can a single humanoid be optimised; strategies for jointly solving problems can be
used as well. The humanoid is to work as a fire fighter, replacing humans in this dangerous job. To
achieve this, group behaviour in 3D environments needs to be trained and simulated.
While AUTONObile is waiting for Cynthia, her intelligent house contacts it and asks it to
bring some beverages from the supermarket. The car uses the drive-through option of the
supermarket, and a service robot puts the beverages into its trunk. In the meantime the
car’s traffic system receives the information that weather conditions will change rapidly within
the next hours and snowy roads are expected. So on the way back to Cynthia’s office it drives
to the next garage to have the tires changed.
After her working day, Cynthia calls the car to fetch her at the entrance of her office and
they start their way home. During the drive, the car is informed about an accident that
happened on the road ahead. Thus, it starts to re-plan the trip. This is done in online
communication with the other vehicles participating in traffic, to avoid jams caused by too many
cars planning the same route. In addition to the street map, the 3D profile of the
landscape is also taken into account to optimise energy consumption.
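As a minimal sketch of how a 3D landscape profile could enter route planning, the example below runs Dijkstra's shortest-path algorithm on a small road graph whose edge cost grows with uphill climb. The energy model in `edge_energy`, its coefficients, and the function and parameter names are assumptions for illustration only; coordination with other vehicles is not modelled.

```python
import heapq


def edge_energy(length_km, climb_m, flat_cost=1.0, climb_cost=0.05):
    """Assumed energy model: cost grows with length and with uphill climb only."""
    return flat_cost * length_km + climb_cost * max(0.0, climb_m)


def plan_route(roads, elevation, start, goal):
    """Dijkstra search; `roads` maps a node to a list of (neighbour, length_km)."""
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # outdated queue entry
        for neighbour, length_km in roads.get(node, []):
            climb = elevation[neighbour] - elevation[node]
            new_cost = cost + edge_energy(length_km, climb)
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if goal not in best:
        return None, float("inf")  # goal not reachable
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1], best[goal]
```

In a toy network where the road over a hill is shorter (4 km, 200 m climb) than the road through a valley (6 km, flat), this cost function prefers the valley, whereas a purely length-based planner would choose the hill.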
AUTONObile returns Cynthia safely to her home. Cynthia is very happy with her personal
mobility assistant. Her children also appreciate their personal chauffeur, and Cynthia is
unworried since she knows that the grandparents will supervise the safe and comfortable drive
she already programmed to allow only for certain routes, e.g. to their friends and the football
stadium. AUTONObile recognises the people authorised to drive using 3D face recognition
together with voice analysis, a system Cynthia developed herself after finishing the gesture
1.3 Visionary scenario in Business & education, home & leisure
When Semantha gets into her next sleeping phase, the mattress adjusts automatically, and
the room lighting changes slightly, to anticipate her waking up soon. After she gets up, she
prepares for going to work, and leaves a message for her daughter Alice, which will be
synthesised by her private avatar with her own voice.
Semantha is a 3D modeller at the gaming company 'Iterative Life'. When she approaches
the company building, her gait and face are recognised, and entrance is permitted. She is
currently working on the design and modelling of a collection of characters for a new game:
"Parallel Universe Loophole Paradise". The characters in the game, the PULPese, have feet like
wheels, arms like fins, and ears like wings. In her 'Evolving Personalised Information
Construct' (Epico, marketed by GoogleZone, the joint venture of Google and Amazone)
Semantha sees that she has modelled the feet last week. According to the schedule, she now
must design the accompanying gadgets for chiropody. She invokes the encyclopaedic
dictionary to learn about that, and the semantic network helps her to link these concepts to
the maintenance tools for wheels. From the collection of instruments she wants to select those
that roughly fit the feet and can be used as a starting point for modelling new ones.
Then, she activates her “FindMeShape”, the shape matcher tool able to select the appropriate
models for the redesign process.
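One way a shape matcher of this kind could work is sketched below, using a D2 shape distribution: a histogram of distances between randomly chosen point pairs, compared across models. This is a swapped-in, well-known descriptor and not a description of the fictional tool. The function names, the bin count and the simplification of sampling from the given points rather than from the surface are assumptions; the input is assumed to contain at least two distinct points.

```python
import math
import random


def d2_descriptor(points, pairs=2000, bins=16, seed=0):
    """Histogram of distances between random point pairs, normalised by their mean."""
    rng = random.Random(seed)
    distances = []
    for _ in range(pairs):
        p, q = rng.sample(points, 2)  # two distinct sample points
        distances.append(math.dist(p, q))
    mean = sum(distances) / len(distances)
    histogram = [0.0] * bins
    for d in distances:
        # Dividing by the mean makes the descriptor independent of scale;
        # distances above twice the mean are collected in the last bin.
        index = min(bins - 1, int(d / (2.0 * mean) * bins))
        histogram[index] += 1.0 / pairs
    return histogram


def find_me_shape(query, collection):
    """Rank the models in `collection` (name -> points) by similarity to `query`."""
    target = d2_descriptor(query)

    def dissimilarity(name):
        candidate = d2_descriptor(collection[name])
        return sum(abs(a - b) for a, b in zip(target, candidate))

    return sorted(collection, key=dissimilarity)
```

Because the descriptor is normalised by the mean distance, a model that differs from the query only in scale should still rank close to it, which suits a search for instruments that can serve as a starting point for redesign.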
She is interrupted by her iWant, her personal digital life assistant, a seventh sense to her.
She has to collect Alice from school. On her way, she consults City 3.0 to find a restaurant
near school and near the transportation dock, so that they do not lose time on their way to the
afternoon event. One of her friends, who is also currently signed in to City 3.0, tells her of a
good place to eat. They decide to spend the afternoon visiting the Clonehenge area, a replica
of the Stonehenge site that disappeared under water after the sudden ice meltdown
catastrophe in 2025. When Alice puts on her glasses, she sees the enhanced environment
consisting of projected 3D media and the real world. Through the looking glass, Alice sees a
wonderland with a lot of cultural information on this piece of heritage, and she gets really
affected by experiencing the past. When they finally get back home, Semantha continues her
work to meet the planning schedule, while Alice is doing homework with her iCat.
1.4 Discussion
In the three stories above, we tried to envisage how technology will influence and support
different aspects of our future life. Although it may not be obvious, 3D multimedia play a key
role in the megatrends identified. In all the three scenarios futuristic technology makes strong
use of 3D data: since our perception of the world is 3D, the advanced tools of the future
should be able to support it as efficiently and effectively as (and together with) any other kind
The interaction between the real and virtual worlds will be tighter: virtual avatars will replace
humans, while intelligent environments will become the human-computer interface. As a
consequence, one of the main research trends refers to the description and understanding of
the surrounding world, which have to rely on efficient methodologies for modelling and
processing 3D data. Another prominent aspect is the personalisation of the digital content and
technology according to the context and the user. Therefore, the formalisation of the data
cannot be static, but should vary dynamically according to the purpose, the context and the user.
In particular, the BioTech, health & medicine scenario implies a function-centric modelling
and simulation of the human body, which relates different scales ranging from proteins to
organs. In such a context, where the human body is simulated as a complex system, semantics
plays a key role in modelling the knowledge that links the various scales, and the border between
medicine and computational engineering becomes fuzzier. The future of computer-assisted
medicine will certainly require thinking in terms of goal-oriented modelling and simulation
(with the goal being a knee that functions well in terms of motion amplitude, and with no pain),
where some parts of the data come from knowledge about the generic human anatomy and
other parts come from patient-specific acquisition and semantic annotation both before and
after surgery. Furthermore, as medicine is not an exact science and developed countries care
increasingly about legal issues and quality assurance, precise semantic description and
estimates of the risks of any physical medical procedure will probably be of increasing
importance. While generic knowledge about the human body model will certainly be of crucial
importance for statistical risk estimation, a patient-specific estimate will require semantics-
based analysis of massive data sets coming both from past simulations and real experiments
carried out on other patients, which show some similarity.
In addition, theragnostics, a treatment strategy that combines therapeutics with diagnostics,
is increasingly being investigated. It associates both a diagnostic
(semantically rich) test that identifies patients most likely to be helped or harmed by a new
medication, and a targeted drug therapy based on the test results. Bioinformatics, genomics,
proteomics, and functional genomics are molecular biology tools essential for the progress of
Generic human anatomy, as well as patient specific data, will be modelled and integrated on
3D replicas of the physical parts and used to attach information and run simulation models
that will be tailored for each patient. Finally, and instead of thinking of computer-aided
medicine only when the disease or pain has already occurred, the future of computer-aided
medicine will most probably consist of increasingly preventive medicine, where modelling and
simulation will again be put to use for monitoring, predicting and avoiding dysfunction as much
as possible. In addition, robots will have an increased role in assisting surgery with high
The number of robots will be huge in the future, and not only in medical applications, as
technology and market forecasts indicate. Their population will not only increase in
sheer numbers; robots will also drive some key technology developments and will become
ever more intelligent.
Marvin J. Cetron, President of Forecasting International and a member of the World Future
Society board, forecasts (Fut09) that:
• by 2020 we will have self-diagnostic and self-repairing robots in self-monitoring
infrastructures for almost any job in homes and hospitals;
• by 2030 the robot population will surpass the human population in the developed
• in 2040 robots will completely replace humans in the workforce.
Robots can be regarded as intelligent machines, and semantic 3D models are key to their
development and successful use.
As pointed out in the robotics scenario, to orient themselves in the real world and to
communicate with human beings and other robots, robots first and foremost need to build up an
understanding of the scene that surrounds them. It is not sufficient to create just a
point cloud or a static (or dynamic) distance field, e.g. to avoid collisions; the robot has to
create a ‘mental’ model of the objects that surround it. It needs to understand the
meaning of those objects, to know the actions they may perform and, if they are moving, to
understand their dynamics, in order to be able to plan its own actions.
These are issues related to understanding the 3D world, valid not only for robots but for all
kinds of interaction between virtual entities and humans. The outdoor navigation domain is a
great challenge for three-dimensional perception. One key to successful navigation in real
world environments is the robot’s ability to reason about its environment in three dimensions,
to handle unknown space in an effective manner and to detect and navigate in environments
with obstacles of varying shapes and sizes. We believe that the problem should be addressed
by a tighter coupling of geometry processing methods with cognitive science, so that the
properties characterising the shapes can be better aligned with features that are used by
humans to “understand and classify” shapes. Machine learning and similar statistical
methods are also expected to play a key role in coping with the complexity and variability of 3D
object shapes, and as tools supporting the automatic segmentation of massive datasets.
To be able to do so, the robot needs semantic and geometric 3D descriptions of reference
objects (learnt/trained knowledge about objects potentially being present in the scene and
their functionality). Such models have to be ‘implanted’ into the robot possibly along with self-
learning mechanisms, i.e. they have to exist when the robot ‘comes to life’, which takes us to
the development phase for robots. There is already a growing tendency to introduce high-level
semantic information into robotic systems. This tendency is visible in many areas, including
mapping and localisation, robotic vision, human-robot interaction, and the increasing use of
ontologies in robotics.
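A minimal sketch of what a combined semantic and geometric description of reference objects might look like is given below. The `ReferenceObject` fields, the bounding-box matching rule in `recognise` and the tolerance value are assumptions chosen for illustration; a real system would rely on far richer geometry and on an actual ontology.

```python
from dataclasses import dataclass, field


@dataclass
class ReferenceObject:
    concept: str   # name of the ontology concept, e.g. "FireExtinguisher"
    size_m: tuple  # geometric part: (width, depth, height) in metres
    movable: bool  # semantic part: can the object change position?
    affordances: set = field(default_factory=set)  # actions the object supports


def size_error(observed, reference):
    """Largest relative deviation between observed and reference dimensions."""
    return max(abs(o - r) / r for o, r in zip(observed, reference))


def recognise(observed_size, knowledge_base, tolerance=0.25):
    """Return the best-matching reference object, or None if nothing is close enough."""
    if not knowledge_base:
        return None
    best = min(knowledge_base, key=lambda ref: size_error(observed_size, ref.size_m))
    if size_error(observed_size, best.size_m) > tolerance:
        return None
    return best
```

Once an observed object has been linked to a reference concept, the robot can look up the attached knowledge, for instance whether the object is movable or which actions it affords, and use it when planning.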
Today, robots are systems consisting of mechanics, electronics, and software. Some behave
in a pre-programmed way, some react to their surroundings. In any case the development
process of robots also requires semantic 3D models (not only the use phase of a robot).
Robots can be built more rapidly if functional-oriented design and reuse of functional
components are intrinsically supported by virtual product development tools along with
advanced simulation techniques. Ontological information is increasingly being used in
distributed systems in order to allow for automatic reconfiguration in the areas of flexible
automation and of ubiquitous robotics. There is a need to use ontological information to
improve the inter-operability of robotic components developed for different systems.
Also in the Business & education, home & leisure scenario, we see that a number of new
technologies will be present, which allow for an integrated approach to search for and find
functionalities, are built on the massive semantic annotation of 3D media, exploit the re-use of
personalised information, depend on the connectedness between real and virtual worlds, and
utilise location-aware ambient intelligence and smart objects, which altogether make the
production and management workflows much more effective, and the leisure and care tasks
much more affective and satisfying than in the old days before 2010.
Although some of the above technologies are currently being developed, there is still a long
way to go. Through professional production, user content generation, and digitisation and
preservation of cultural heritage, large amounts of digital media such as images, video, music,
three-dimensional objects and environments are becoming available. Tools for creation, editing,
searching and interaction have been developed by now. New techniques are being developed
to build games, model 3D worlds, and create virtual prototypes of new products or
The next generation of applications and environments will require more natural interaction
and intelligence to become more dynamic, affective, and satisfying. Tools are necessary to
create virtual persons that are endowed with speech and social and emotional intelligence.
Toolboxes and interfaces must be created to add ambient intelligence to working and living
spaces. These will lead to newly created products and services for personalised information
finding, connecting real and virtual worlds, experiencing past culture, and adopting its lifestyle.
Functionality and elements that are currently missing are rich semantic annotations of the
world and environments, massive semantic annotation of 3D objects, localisation in 3D
environments, and tangible 3D visual interfaces.
In the next chapter, our view of 3D semantic multimedia will be introduced and the current
status of the field will be briefly described; chapter 3 will then give an overview of some
Chapter 2 : Definition of the field
In this chapter, we will introduce our view of semantic 3D media, taking into account the
perspective of researchers in the field of Computer Graphics as well as the perspective of the
users in the reference application domains. Researchers in Computer Graphics are indeed key
players in the process of formalising the semantics through knowledge technologies as they
are the experts able to develop the appropriate tools and schemes for documenting and
sharing 3D media representations, in close cooperation with experts in different domains. In
the following, Section 2.1 will discuss the evolution of 3D modelling paradigms, from the
traditional geometry-oriented to the emerging semantics-driven approaches. In Section 2.2,
we will outline the sources of knowledge that we believe need a formalisation and a more
effective management much more integrated with the geometric modelling pipeline of the
object’s shape. A discussion in Section 2.3 concludes this chapter.
2.1 3D media representation: from geometry to semantics
Knowledge technologies can be effectively used in complex 3D application fields if the
underlying 3D modelling approach is able to support and encapsulate the different levels of
abstraction needed for describing an object’s form, function and meaning in a suitable way.
The EC Network of Excellence project AIM@SHAPE proposed an evolution of the traditional
geometric modelling paradigm towards a semantics-based modelling paradigm, which is
consistent with the different description levels used for traditional 2D media and reflects an
organisation of the information content that ontologies dealing with 3D application domains
could nicely exploit.
The use of computers has revolutionised the approach to shape modelling, opening new
frontiers in research and application fields: Computer Aided Design, Computer Graphics and
Computer Vision, whose main goal is to discover basic models for generating and representing
shapes. In the beginning, this effort gave rise to research in geometric modelling, which
sought to define the abstract properties that completely describe the geometry of an object,
and the tools to handle this embedding into a symbolic structure. The visual aspect of shapes
has deeply influenced the development of techniques for digital shape modelling, which have
mainly focused on the definition of mathematical frameworks for approximating the outer form
of objects using a variety of representation schemes.
The same geometry can be represented by different approximations (e.g. meshes,
parametric surfaces, unstructured point clouds, to cite a few), each representation being
chosen according to its advantages and drawbacks with respect to application purposes. For
example, complex triangulated meshes are badly suited to interactive animation; simpler
representations such as point set surfaces can be sufficient for special classes of applications
(e.g. ray-tracing techniques only require ray-surface interrogations). The conversion between
distinct representations is still a delicate issue in most systems but there are tools to derive
triangular meshes from the majority of representation schemes. The selected representation
model will eventually be mapped into an appropriate data structure, which is computer-
understandable and devised according to optimisation and efficiency criteria.
It is important to point out that for the development of semantic 3D media the
conceptualisation of the geometry is not an issue: geometric modelling is, by itself, the answer
of the Computer Graphics community to the need of defining and expressing in formal terms
concepts and relationships concerning the geometric representation of 3D media. Shape
models and related data structures encapsulate all the geometric and topological information
needed to store and manipulate 3D content in digital form.
While the technological advances in terms of hardware and software have made available
plenty of tools for using and interacting with the geometry of shapes, the interaction with the
semantic content of digital shapes is still far from being satisfactory: we still lack effective and
robust tools to manage non-spatial information, i.e. the information not related to the shape of
the object (e.g. ownership, material, price) and the information related to the high-level
conceptualisation of the object, either expressed as features (e.g. size, volume, compactness,
presence of holes) or as synthetic concepts (e.g. what it is, what it is used for). Stated
differently, the semantic gap - the lack of coincidence between the information that one can
extract from the visual data and the interpretation that the same data have for a user in a
given situation (Smeulders00) - is still far from being filled.
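As a minimal sketch of what feature-level information of this kind could look like, the following Python fragment computes surface area, volume and compactness for a closed, consistently oriented triangle mesh. The function name `mesh_features`, the unit-cube test data and the particular compactness measure (36πV²/A³, which equals 1 for a sphere) are illustrative assumptions and are not taken from any FOCUS K3D tool.

```python
import math

def _cross(u, v):
    """Cross product of two 3D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def mesh_features(vertices, triangles):
    """Return (area, volume, compactness) of a closed triangle mesh.

    Assumes the mesh is watertight and all triangles share one orientation.
    """
    area = 0.0
    signed_volume = 0.0
    for i, j, k in triangles:
        a, b, c = vertices[i], vertices[j], vertices[k]
        ab = tuple(b[m] - a[m] for m in range(3))
        ac = tuple(c[m] - a[m] for m in range(3))
        normal = _cross(ab, ac)
        area += 0.5 * math.sqrt(sum(x * x for x in normal))
        # Signed volume of the tetrahedron (origin, a, b, c): a . (b x c) / 6
        bxc = _cross(b, c)
        signed_volume += sum(a[m] * bxc[m] for m in range(3)) / 6.0
    volume = abs(signed_volume)
    compactness = 36.0 * math.pi * volume ** 2 / area ** 3
    return area, volume, compactness

# Hypothetical test shape: a unit cube with outward-oriented triangles.
CUBE_VERTICES = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
                 (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
CUBE_TRIANGLES = [(0, 2, 1), (0, 3, 2), (4, 5, 6), (4, 6, 7),
                  (0, 1, 5), (0, 5, 4), (3, 6, 2), (3, 7, 6),
                  (0, 4, 7), (0, 7, 3), (1, 2, 6), (1, 6, 5)]

area, volume, compactness = mesh_features(CUBE_VERTICES, CUBE_TRIANGLES)
```

Such values are derived from the geometry alone; synthetic concepts (what the object is, what it is used for) cannot be computed in this way and require annotation or classification.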
To tackle this problem we may follow two complementary paths: on the one hand, it is
possible to integrate the missing information through the definition of concept-based metadata
(either by following a formal conceptualisation encoded in an ontology or by allowing a free
tagging of the resources); on the other hand, it is possible to exploit state-of-the-art
descriptors and analysis tools to extract content-based information from spatial information.
Both of the mentioned approaches help to narrow the semantic gap: the design of ad-hoc
ontologies and the development of tools for a versatile annotation of 3D objects are part of the
plan, but yet another important element is crucial for the description process: the role of the
user. In fact, it is not sufficient to provide a generic interpretation of the data; the
interpretation should take the context into account. The awareness of the context should
enable a dynamic and user-oriented description in which only the most relevant content-based
descriptors are activated and combined.
The integration of concept-based and content-based approaches and their embedding in a
context dynamically defined by the user could be an efficient solution, whose evaluation was
one of the main goals of FOCUS K3D through the Application Working Groups.
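The combination described above can be illustrated with a small sketch, under the assumption that concept-based metadata are free tags and that each user context simply names the content-based descriptors to activate. All names here (`CONTENT_DESCRIPTORS`, `CONTEXT_PROFILES`, `describe`, the two contexts) are hypothetical and serve only to make the idea concrete.

```python
def bounding_box_extent(points):
    """Content-based descriptor: axis-aligned bounding box size (dx, dy, dz)."""
    return tuple(max(p[i] for p in points) - min(p[i] for p in points)
                 for i in range(3))

def point_count(points):
    """Content-based descriptor: number of sample points."""
    return len(points)

# Registry of available content-based descriptors (hypothetical).
CONTENT_DESCRIPTORS = {
    "bbox_extent": bounding_box_extent,
    "point_count": point_count,
}

# Each user context activates only the descriptors relevant to it (hypothetical).
CONTEXT_PROFILES = {
    "furniture_retrieval": ["bbox_extent"],
    "repository_management": ["point_count"],
}

def describe(points, tags, context):
    """Merge concept-based tags with the descriptors active in the given context."""
    description = {"tags": sorted(tags)}
    for name in CONTEXT_PROFILES[context]:
        description[name] = CONTENT_DESCRIPTORS[name](points)
    return description

CHAIR_POINTS = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.5, 0.5, 0.0), (0.0, 0.5, 1.0)]
retrieval_view = describe(CHAIR_POINTS, {"chair", "furniture"}, "furniture_retrieval")
management_view = describe(CHAIR_POINTS, {"chair", "furniture"}, "repository_management")
```

The same resource thus receives two different descriptions, one per context, while the concept-based tags stay the same.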
This integration can be achieved if the 3D content is organised in a way that takes into
account and supports reasoning at different levels of abstraction and that goes beyond the
limits of the pure geometry. The conceptualisation of the knowledge domain has to adhere to a
suitable organisation of the geometric data about the shapes. In AIM@SHAPE, shapes were
defined as characterised by a geometry (i.e. the spatial extent of the object), describable by
structures (e.g. form features or part-whole decomposition), having attributes (e.g. colours,
textures, or names, attached to an object, its parts and/or its features), having a semantics
(e.g. meaning, purpose, functionality), and possibly having interaction with time (e.g. history,
shape morphing, animation) (Falcidieno et al. 2004).
Leaving the traditional modelling paradigm, a simple one-level geometric model has to be
replaced by a multi-layer view, where both the geometry and the semantics contribute to the
representation of the shape. At the same time the structure is seen as an important bridge
towards the semantics, as it supports a natural manner to annotate the geometry with
semantic information. The key idea is that, while the geometry of a shape is unique, its
structure can be defined in many possible ways, each contributing to the creation of a specific
structural model, or view, of the shape. These views of the shape make it easier to interpret
the geometry according to different semantic contexts and attach appropriate content to the
relevant parts of the shape.
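A possible data structure for this multi-layer view is sketched below, assuming that a structural view is a named grouping of triangle indices and that an annotation links one part of one view to a concept in a given context. The class and field names are hypothetical and do not reproduce the AIM@SHAPE ontologies.

```python
from dataclasses import dataclass, field

@dataclass
class StructuralView:
    """One structural model of the shape: part name -> set of triangle indices."""
    name: str
    parts: dict

@dataclass
class Annotation:
    """Semantic layer: a concept attached to one part of one view, in a context."""
    view: str
    part: str
    concept: str
    context: str

@dataclass
class ShapeModel:
    """Single geometry, many structural views, annotations on top of the views."""
    vertices: list
    triangles: list
    views: dict = field(default_factory=dict)
    annotations: list = field(default_factory=list)

    def add_view(self, view):
        self.views[view.name] = view

    def annotate(self, view, part, concept, context):
        if part not in self.views[view].parts:
            raise KeyError(f"unknown part {part!r} in view {view!r}")
        self.annotations.append(Annotation(view, part, concept, context))

    def faces_for(self, concept, context):
        """Collect the triangles annotated with a concept in a given context."""
        faces = set()
        for ann in self.annotations:
            if ann.concept == concept and ann.context == context:
                faces |= self.views[ann.view].parts[ann.part]
        return faces

# Geometry layer omitted in this sketch; the triangle indices are placeholders.
chair = ShapeModel(vertices=[], triangles=[])
chair.add_view(StructuralView("part_decomposition",
                              {"seat": {0, 1}, "back": {2, 3}, "legs": {4, 5, 6, 7}}))
chair.add_view(StructuralView("grasping_regions",
                              {"back_top": {3}, "seat_edge": {1}}))
chair.annotate("grasping_regions", "back_top", "GraspableRegion", "computer_animation")
chair.annotate("part_decomposition", "seat", "SittingSurface", "furniture_design")
```

The geometry is stored once, while each view offers a different decomposition to which context-specific content can be attached.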
To realise the shift towards this new modelling paradigm, it is also necessary to develop
new and advanced tools for supporting semantics-based analysis, synthesis and annotation of
shape models. They correspond to image analysis and segmentation for 2D media: features of
a 3D model are equivalent to regions-of-interests in images. There is, however, a different
input and a slightly different target, as image analysis is trying to understand what kind of
objects are present in the scene captured by the image, while in 3D segmentation the object is
known and it has to be decomposed into meaningful components that might be used to
manipulate and modify the shape at later stages. From a high-level perspective, the main
differences concern the different nature of the content: descriptors used for 2D images are
concerned with colour, textures, and properties that capture geometric details of the shapes
segmented in the image. While one-dimensional boundaries of 2D shapes have a direct
parameterisation (e.g. arc length), the boundary of arbitrary 3D objects cannot be
parameterised in a natural manner, especially when the shape exhibits a complex topology,
e.g. many through-holes or handles. Most notably, feature extraction for image retrieval is
intrinsically affected by the so-called sensory gap: “The sensory gap is the gap between the
object in the world and the information in a (computational) description derived from a
recording of that scene” (Smeulders00). This gap makes the description of objects an ill-posed
problem and casts an intrinsic uncertainty on the descriptions due to the presence of
information, which is only accidental in the image or due to occlusion and/or perspective
distortion. However, the boundary of 3D models is represented in vector form and therefore
does not need to be segmented from a background. Hence, while the understanding of the
content of 3D vector graphics remains an arduous problem, the initial conditions are different
and allow for more effective and reliable analysis results and offer more potential for
interactivity since they can be observed and manipulated from different viewpoints.
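The topological complexity mentioned above (through-holes or handles) is one property that can be read directly from a 3D boundary representation. As a minimal sketch, assuming a closed, connected, orientable triangle mesh, the number of handles (the genus g) follows from the Euler characteristic V - E + F = 2 - 2g. The helper `torus_grid` and the test shapes are illustrative assumptions.

```python
def genus(triangles):
    """Genus of a closed, connected, orientable triangle mesh via V - E + F = 2 - 2g."""
    vertices = {v for tri in triangles for v in tri}
    edges = {frozenset(e)
             for i, j, k in triangles
             for e in ((i, j), (j, k), (k, i))}
    euler = len(vertices) - len(edges) + len(triangles)
    return (2 - euler) // 2

def torus_grid(n, m):
    """Connectivity of an n x m quad grid with wrap-around, split into triangles."""
    def vid(i, j):
        return (i % n) * m + (j % m)
    tris = []
    for i in range(n):
        for j in range(m):
            a, b, c, d = vid(i, j), vid(i + 1, j), vid(i + 1, j + 1), vid(i, j + 1)
            tris.append((a, b, c))
            tris.append((a, c, d))
    return tris

# A tetrahedron has no handle; the wrapped grid has the connectivity of a torus.
TETRAHEDRON = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
sphere_like_genus = genus(TETRAHEDRON)
torus_genus = genus(torus_grid(4, 4))
```

Only the connectivity is needed here, not the vertex coordinates, which is why such a quantity is unaffected by the occlusion and perspective issues of image-based descriptions.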
Finally, at the semantic level, which is the most abstract level, there is the association of
specific semantics to structured and/or geometric models through annotation of shapes, or
shape parts, according to the concepts formalised by a specific domain ontology. In the figure
below, a possible shape analysis pipeline is shown: in (a) the digital model of a chair obtained
by acquisition is shown, in (b) the shape represented in structural form can be analysed in the
domain of knowledge related to Computer Animation and the regions of possible grasping are
annotated accordingly (see (c)). The aim of structural models is, therefore, to provide the user
with a rich geometry organisation, which supports the process of semantic annotation.
Consequently, a semantic model is the representation of a shape embedded into a specific
context, and the multi-layer architecture emphasises the separation between the various levels
of representations, depending on the knowledge embedded as well as on their mutual
Figure: From geometry to semantics, with panels (a), (b) and (c).
The multi-layer view of 3D media resembles the different levels of description used for other
types of media, but there is a kind of conceptual shift when dealing with 3D media: here, we
have the complete description of the object and we want to be able to describe its main parts,
or features, usually in terms of low-level characteristics (e.g. curvature, ridges or ravines).
These features define segmentations of the shape itself, which are independent of a specific
domain of application but carry a geometric or morphological meaning (e.g. protrusions,
depressions, and through holes).
2.2 Knowledge sources in 3D applications
Effective and efficient information management, knowledge sharing and integration have
become an essential part of more and more professional tasks and workflows in Product
Modelling, one of the first fields where semantics came into play. However, it is clear that the
same applies also to other contexts, from the personal environment to other applied sectors.
There is a variety of information related to the shape itself, to the way it has been acquired or
modelled, to the style in which it is represented, processed, and even visualised, and many
more aspects to consider.
The description of a shape is intrinsically not unique and varies according to both the
application and user contexts. Therefore, the abstraction levels used to process or reason
about 3D media should correspond to the mental models used to answer questions such as
“what does it look like?”, “what is its function?”, thus making it possible to model, manipulate
and compare the various 3D media in a semantics-oriented framework.
Therefore, the ingredients needed to implement a 3D semantic application should definitely
include a conceptualisation of the shape itself, in terms of geometry, structure and semantics,
and of the knowledge pertaining to the application domain. In order to fulfil the requirements
of complex 3D applications, we need tools and methods to formalise and manage knowledge
related to the media content and to the application domain, at least at the following levels:
• knowledge related to the geometry of 3D media: while the descriptions of digital 3D
media can vary according to the contexts, the geometry of the object remains the same
and it is captured by a set of geometric and topological data that define the digital
• knowledge related to the application domain in which 3D media are manipulated: the
application domain casts its rules on the way the 3D shape should be represented,
processed, and interpreted. A big role is played by the knowledge of the domain experts,
which is used to manipulate the digital model: for example, the correct manner to
compute a finite element mesh of a 3D object represented by free-form surfaces is
subject also to informal rules that should be captured in a knowledge formalisation
• knowledge related to the meaning of the object represented by 3D media: 3D media
may represent objects that belong to a category of shapes, either in broad unrestricted
domains (e.g. chair, table in home furniture) or narrow specific domains (e.g. T-slots,
pockets in mechanical engineering). The shape categories can also be defined or
described by domain-specific features, which are the key entities to describe the media
content, and these are obviously dependent on the domain.
The first bullet point of the list is concerned with knowledge, which has geometry as its
background domain. There are a variety of different representations for the geometry of 3D
media that cannot be simply reduced to the classical Virtual Reality Modelling Language
(VRML) descriptions and their variations, as currently supported by MPEG-4. Here, the view of 3D
media is more concerned with the visualisation, streaming and interaction aspects than with
requirements imposed by applications. 3D geometry, as used in applications, has to do with a
much richer variety of methods and models, and users might have to deal with different
representation schemes for the same product within the same modelling pipeline. In this sense,
describing the content of 3D media in terms of geometric data is much more complex for 3D
than for 2D media. There are many attributes and properties of 3D models that scientists and
professionals use to exchange, process and share content, and all these have to be classified
The second bullet point refers to the knowledge pertaining to the specific application domain,
but it has to be linked to the geometric content of the 3D media. Therefore, if we want to
devise semantic 3D media systems with some reasoning capabilities, we have to formalise also
expert knowledge owned by the professionals of the field.
Finally, the third bullet point has to do with the knowledge related to the existence of
categories of shapes; as such, it is related both to generic and specific domains. Usually in 3D
applications, it is neither necessary nor feasible to formalise the rules that precisely define
these categories in terms of geometric properties of the shape, besides very simple cases.
However, due to the potential impact of methods for search and retrieval of 3D media, there is
a growing interest in methods that can be used to derive feature vectors or more structured
descriptors that could be used to automatically classify 3D media.
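To make the notion of a feature vector concrete, the sketch below uses a normalised histogram of pairwise point distances, a simple descriptor in the spirit of shape distributions, together with a nearest-neighbour rule. This is an illustrative choice and not a method prescribed by FOCUS K3D; the function names, the bin count and the two toy categories are assumptions.

```python
import math
from itertools import combinations

def distance_histogram(points, bins=8):
    """Feature vector: normalised histogram of all pairwise point distances.

    Dividing by the largest distance makes the descriptor invariant to
    translation, rotation and uniform scaling of the point set.
    """
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    dmax = max(dists)
    hist = [0] * bins
    for d in dists:
        hist[min(int(bins * d / dmax), bins - 1)] += 1
    return [h / len(dists) for h in hist]

def classify(query_points, labelled_examples):
    """Return the label of the example whose histogram is closest in L1 distance."""
    query = distance_histogram(query_points)
    def l1_to_query(example):
        _, points = example
        return sum(abs(a - b) for a, b in zip(query, distance_histogram(points)))
    return min(labelled_examples, key=l1_to_query)[0]

# Two toy categories (hypothetical): an elongated rod and a compact block.
ROD = [(float(i), 0.0, 0.0) for i in range(10)]
BLOCK = [(float(x), float(y), float(z))
         for x in range(3) for y in range(3) for z in range(3)]
EXAMPLES = [("rod", ROD), ("block", BLOCK)]

# A rod that is twice as long and aligned with another axis.
QUERY = [(0.0, 2.0 * i, 0.0) for i in range(10)]
predicted = classify(QUERY, EXAMPLES)
```

No explicit geometric rule defining "rod" or "block" is written down; the category is inferred from labelled examples, which is the point made in the paragraph above.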
3D media content is growing both in volume and importance in more and more applications
and contexts. In this chapter, we have introduced the issues related to handling this form of
content from the point of view of semantic media systems, focusing on the level at which we
should be able to capture knowledge pertaining to 3D media.
3D applications are characterised by several knowledge-intensive tasks that are not limited
to browsing and retrieving properly annotated material, but deal with the manipulation,
analysis, modification and creation of new 3D content out of the existing one. While this is true
also for traditional 2D media applications, the tools and expertise needed to manipulate and
analyse 3D vector-based media are still the prerogative of a rather specialised community of
professionals and researchers in the Computer Graphics field. In this sense, the Computer
Graphics community could make a significant contribution: on the one hand, it could provide
modelling and processing tools able to handle and exploit the semantics of the shape, and
interaction and visualisation tools, which allow users to interact with the shape in an intelligent
way; on the other hand, it could contribute to comprehensive schemes for documenting and
sharing 3D media and related processing tools, to be linked and further specialised by experts
in different domains.
Semantics-aware classification of available tools for processing the geometry and
manipulating 3D shapes will be an important building block for the composition and creation of
new easy-to-use tools for 3D media, whose use and selection can be made available also to
professional users of 3D who are not experts in Computer Graphics. The use of 3D is indeed
spreading out of the traditional communities of professional users and will soon reach a
general and inexperienced audience.
The role of experts in 3D modelling for the development of semantic 3D media is twofold:
on the one side, the identification of key properties for the description of 3D media and
processing tools, and on the other side, the contribution to the development of advanced tools
for the interpretation, analysis and retrieval of 3D content.
It is evident that if we want to be able to reason about shapes at the geometric, structural
and semantic level, then we have to be also able to annotate, retrieve and compare 3D
content at each of the three layers (SpFa2009).
In the next chapter the current status of semantic 3D media in the four Application Working
Groups explored during the FOCUS K3D project will be described, while in chapter 4 the grand
challenges identified along the way will be stated together with a list of open issues necessary
to reach these goals.
Chapter 3 : 3D in application domains
FOCUS K3D has identified four representative application areas that are both consolidated
(Medicine and Bioinformatics and CAD/CAE and Virtual Product Modelling) and emerging
(Gaming and Simulation and Archaeology and Cultural Heritage) in the massive use of 3D
digital resources coupled with knowledge. The selection of these different application areas was
motivated by two reasons. First, all these fields face a huge amount of available 3D data;
second, the use of 3D data is not only related to visual aspects and rendering processes, but
also involves an adequate representation of domain knowledge to exploit and preserve both
the expertise used for their creation and the information content carried.
Among the first actions of FOCUS K3D, a state-of-the-art report (STAR) was produced for
each of the four application areas to report on the differences and similarities in the
approaches and desiderata of users in these representative areas (see deliverables D2.1.1,
D2.2.1, D2.3.1, D2.4.1). These four STARs were meant to identify the use of current
methodologies for handling 3D content and coding knowledge related to 3D digital content in
the different contexts, and also to identify possible gaps. Each STAR therefore reflects a
snapshot of the current situation in the four subfields that all massively use 3D data, but which
are distinct in their characteristics, requirements, status, and condition.
In this chapter, we provide a summary of the STARs discussed from the perspective of the
research road map, aiming at describing each of the five application areas (Medicine and
Bioinformatics are presented separately for the sake of clarity) and discussing the current
situations and limitations as perceived by the users in the different domains.
Computer-aided medicine is a rapidly growing field and recent
advances in medical imaging yield 3D shape data that are more and
more reliable, accurate and massive. Different medical applications
such as diagnosis, radiotherapy planning, image-guided surgery,
prosthesis fitting or legal medicine now heavily rely on the analysis
and processing of 3D information (e.g., data measuring the spatial
extent of organs, data about the electrical activity of the brain),
requiring different criteria for the specific 3D content they use. 3D
content is sometimes used for shape measurements, as basis for
diagnostic information, or to establish statistical reference data. Therapy planning, image-
guided surgery and assessments of post-operative results are other typical examples of use of
The starting point of a typical computer-assisted diagnosis session is often a set of image
data acquired from CT, MRI, MicroCT or other medical imaging devices. The current goal is to
generate a digital 3D model suited to visual inspection and analysis, but the goal could be
much more ambitious, such as the simulation of electromagnetic wave propagation and temperature
distribution for hyperthermia, soft tissue deformation for cranio-maxillofacial surgery,
biomechanics for orthopaedic surgery, computational fluid dynamics for rhino-surgery,
electrical potentials for electro-cardiology, and many more. In between, we find not only a
sequence of automatic image processing algorithms but also a true reconstruction pipeline,
which involves data registration, segmentation, 3D geometry generation and processing.
While it is now commonplace to store medical images in digital form, 3D content used in
medical applications is far from easy to obtain: there is a labour-intensive pipeline to get from
the raw images generated by the medical imaging devices to the digital reconstructed
anatomical models suitable to medical applications, such as therapy planning or surgical
The current assessment reveals that the pipeline is far from being automatic and requires a
lot of knowledge-based interaction, in particular for the segmentation of the images. Even after
decades of research, a major challenge is to recover the inherent 3D models of anatomical
structures from 3D medical images, as accurately and efficiently as possible. Surprisingly, the
most common clinical practice for this task is an interactive correction of an automatic
segmentation algorithm by a medical expert, which incorporates some knowledge throughout
the pipeline. One important thread of research to facilitate this process is therefore devoted to
the elaboration of semi-automatic tools such as magic wands, intelligent scissors or
deformable models. As the resolution of the images keeps increasing, however, it is commonly
accepted that such a labour-intensive interactive approach is not viable in the long term. This
motivates another thread of research devoted to the elaboration of fully automatic methods,
which incorporate the knowledge through statistical or model-based approaches. One required
feature is to leave the door open for the expert to adjust the segmentation as a post-process in the
few cases where the automated process fails. Similar issues arise for geometry modelling and
processing, where one of the goals is to preserve the multi-domain labels during surface
extraction and mesh generation.
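The interplay between an automatic step and an expert correction can be sketched with seeded region growing, a simplified stand-in for tools of the "magic wand" kind, followed by a manual adjustment. The 2D toy image, the intensity tolerance and the function names are assumptions made for illustration; clinical tools operate on 3D volumes and use far richer criteria.

```python
from collections import deque

def region_grow(image, seed, tolerance):
    """Grow a 4-connected region from `seed`, accepting pixels whose intensity
    differs from the seed intensity by at most `tolerance`."""
    rows, cols = len(image), len(image[0])
    reference = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - reference) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

def apply_expert_correction(region, add=(), remove=()):
    """Post-process: the expert adds or removes pixels the automatic step got wrong."""
    return (set(region) | set(add)) - set(remove)

# Toy image: pixel (1, 3) belongs to the structure but is cut off by bright pixels.
IMAGE = [
    [10, 11, 50, 50],
    [10, 12, 50, 12],
    [11, 10, 50, 50],
]
automatic = region_grow(IMAGE, seed=(0, 0), tolerance=3)
corrected = apply_expert_correction(automatic, add=[(1, 3)])
```

The automatic step misses the disconnected pixel at (1, 3), and the correction step restores it, mirroring the post-process adjustment described above.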
The practitioners resort to creating 3D models of organs or of human beings for simulation
purposes. Those models are frequently input to finite element software, for example to
simulate problems related to the hip joint from patient-specific data. This is a typical example
of goal-oriented modelling, where the geometry of the organ to be modelled needs to fulfil the
specific requirements imposed by the technical solution required to solve the objective, or
function to be simulated, which might differ from those of the simple visual inspection: for
finite element analysis, for instance, a fine volumetric mesh representing the boundary and
interior of the bone is required, while for the simple visualisation of the bone shape a coarser
triangulation of its boundary surface would suffice.
Another example is the simulation of the human articulations for replacement by prosthesis.
The different steps are bone resection, prosthesis alignment, and checking of motion
amplitudes. In terms of knowledge, the parameters of interest are contact pressure within the
cartilages, forces and muscle biomechanics. The connections to physiology are related to
material properties as well as to bone motion and geometry. The current state of the art
indicates that such data processing is considerably simplified when these parameters are stored
into an ontology, and even more so if parts of the models of the geometry could be annotated
with functional information or comments by the medical doctors.
3D data and knowledge are again tightly coupled in the simulation of laparoscopic surgery
for training and practising. The goal is to simulate the planned surgery with a robot acting on a
3D model, so as to rehearse the gestures and to check the feasibility of the complete process.
The knowledge involved is related to the anatomy, the forces applicable to the organs and the
physical properties of a tumour.
Virtual anatomy is perceived as a central problem concerned with the elaboration of 3D
geometry models of human anatomy. 3D models must be augmented with knowledge about
the function of the organs, as the primary end goal in computer-aided medicine is to obtain a
human body that functions well after surgery or treatment. The function itself cannot be
decoupled from the topological and geometrical relationships between the organs. Virtual
anatomy is important in medicine with respect to patient-specific computer-assisted therapy
planning, as well as for industrial applications, such as virtual crash tests where human models
are involved. It is common to distinguish between individual anatomy and generic anatomic
atlases, which demonstrate anatomical structures and their relationships. Individual anatomies
are of special interest when pathological situations need specific attention with respect to a
medical treatment. Generic anatomy models are carefully designed using modelling software,
whereas patient-specific anatomy models come as output of a scanning device, associated with
point set based or image-based reconstruction algorithms.
Another noticeable trend in simulating and monitoring surgery and therapy is the
replacement of generic anatomic atlases by patient-specific digital anatomical models. This
trend results in the generation of more and more massive 3D content, thus requiring the help
of knowledge technologies for classification, storage and retrieval of those data.
Monitoring therapy or following disease evolution requires storing and retrieving various
data for each patient over long time scales. A typical workflow example is provided by a
clinician in a department of neuro-rehabilitation, which routinely performs transcranial
magnetic stimulation (TMS). The data are initially acquired by MRI. The MRI images are then
segmented and a so-called 3D anatomical project is created for each patient. Such a project,
which corresponds to a geometry reconstruction process, includes surface reconstruction and
mesh generation for the head as well as for the cortical surfaces. This anatomical project is
stored and reused for repeated TMS sessions performed for the same patient.
Overall the evolution toward patient-specific modelling and simulation is recognised as an
advance. However, the fact that either little or no reuse of data and processes is currently
performed across patients is recognised as being very detrimental to productivity.
Intuitively, the more patients are processed, the more productive (and hence assisted) the expert
would like to be. The experts and clinicians are conscious that the goal here is not only to
reuse the data, but also to reuse the processes themselves through documenting the whole
process. This challenge is where knowledge-based technologies must play a role.
Knowledge technologies are increasingly required to manage structural, functional and
topological information extracted from the medical data. More specifically, semantics is
required at every step of the modelling and processing pipeline, which ranges from raw data to
goal-oriented 3D numerical models through the crucial and yet current bottleneck step of data
Similarly, visualisation is always at the core of many medical applications such as diagnosis,
surgical simulations, image-guided surgery and training. In the context of knowledge-based
computer-aided medicine, visualisation should be concerned not only with the plain rendering
of the geometry, but also with semantics. While our current assessment is that geometry and
semantics are most often decoupled when it comes to interaction and visualisation, some
recent research experiments show that it would be very beneficial to combine both. For
instance, the process of simulating a complex orthopaedic surgery is considerably simplified
when both geometry and knowledge are stored into an ontology, as the practitioner can get an
interactive, faithful simulation including both geometry and semantic information. Such
simplification translates into a reduced surgery planning duration, which is critical for the
practitioners. Having knowledge such as organ dependency and material properties combined
with geometry is already a significant step toward efficient and reliable surgery planning.
What is missing in the current knowledge representation is a detailed description of the link
between the geometry (and topology) of an organ or tissue and its function. For example, the
shape of a joint (e.g., a spherical cap) partially explains its function in terms of motion. The
geometry and topology of an organ also often explain its function, such as connection to a
bone or to another organ, etc. The simulation could thus be improved by integrating into its
model the inherent functional constraints of the musculoskeletal system.
Standards already play a prominent role in what is currently the most important use of
knowledge technology in medical applications: description of anatomical features, pathologies
and protocols. Examples are the Unified Medical Language System (UMLS), the Current
Procedural Terminology (CPT), and the Foundational Model of Anatomy (FMA). These standards
serve different purposes such as training and education, communication among practitioners,
statistical analysis for health insurance, preventive medicine and legal issues.
Current standards are primarily required to ensure the continuity of medical imaging data
and the sharing of those data among care and research centres. The development of standards
will most likely continue fostering the mutual enhancement of different imaging technologies.
Standards for 3D medical content are however currently lacking, although they are likely to
have in the near future a fundamental role in the field of computer-aided medicine where
patient-specific anatomical modeling is of increasing importance. Similarly, the development of
standardised procedures (including geometry and knowledge) for goal-oriented modelling and
processing will be of increasing importance both for the sake of quality and for legal issues,
when knowledge-intensive modeling and processing techniques are routinely used.
While computer science is concerned with the development of
(generic) algorithms, which in general may be applied in a variety of
settings, biology is more of a system-centric activity, since in general
a panel of methods is used to investigate a particular system
(molecule, cell, tissue, organ, organism). In biology, knowledge
technologies are especially suited to structure the knowledge relative
to a particular piece of this multi-scale puzzle, as well as to integrate
data across the different scales. In this broad context, knowledge
technologies in general and ontologies in particular are characterised
by two main features. First, their development is under the guidance of biologists, since a
precise understanding of the systems under investigation is necessary in order to select
relevant annotations. Second, the knowledge technologies used cover the full range, from
simple databases all the way to complex ontologies, as evidenced by the numerous tools which
are made available from the web portal of the Protein Data Bank in Europe (PDBe).
If one narrows down the application scope of knowledge technologies to applications dealing
with 3D shapes, the natural setting in biology is that of structural bioinformatics. With
structural biology being the sub-domain of biophysics and biology concerned with the
investigation of the connection between the structure of macromolecules and their function,
structural bioinformatics is concerned with the development of methods and algorithms meant
to complement experiments in this perspective. Structural biology poses a number of difficult
problems, among them the acquisition of reliable physical signals, the reconstruction of
macromolecular models from these signals, the description of these models so as to design
parameters best describing their biophysical properties, and finally, the utilisation of these
models for prediction. In particular, predicting the 3D shape of a protein (the folding problem)
and predicting the interaction of two or several partners (the docking problem) are clearly two
of the main challenges in structural biology.
When working on these challenges, since the shape of molecules is so important, all
scientists resort to 3D content. This 3D content consists of a mix between geometry and
biochemistry. The geometric side corresponds to molecular representations and conformations,
either represented as a collection of balls, using Cartesian coordinates, or using internal
coordinates (bond lengths, valence and dihedral angles). On the knowledge side, one specifies
the types of atoms and bonds, as well as a number of biochemical annotations (positional
fluctuations for experimentally resolved structures, known partners of a protein, affinity
constants, diseases the protein is involved with, etc.). The complexity of the concepts dealt
with can again be seen from the various tools, which are accessible from the web portal to the
European PDB at http://www.ebi.ac.uk.
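As a minimal sketch of how the two geometric representations relate, the following Python function computes one internal coordinate, a dihedral angle, from the Cartesian coordinates of four consecutive atoms. The function name, the sign convention and the sample coordinates are illustrative assumptions and are not taken from any of the tools mentioned here.

```python
import math


def dihedral_deg(p0, p1, p2, p3):
    """Return the dihedral angle in degrees defined by four 3D points.

    The angle is measured around the p1-p2 axis. The sign follows one
    common convention and may differ from that of a given package.
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p0, p1)  # bond vector pointing back to the first atom
    b1 = sub(p2, p1)  # central bond, used as rotation axis
    b2 = sub(p3, p2)  # bond vector pointing to the last atom

    norm = math.sqrt(dot(b1, b1))
    axis = (b1[0] / norm, b1[1] / norm, b1[2] / norm)

    # Project b0 and b2 onto the plane perpendicular to the axis.
    s0, s2 = dot(b0, axis), dot(b2, axis)
    v = (b0[0] - s0 * axis[0], b0[1] - s0 * axis[1], b0[2] - s0 * axis[2])
    w = (b2[0] - s2 * axis[0], b2[1] - s2 * axis[1], b2[2] - s2 * axis[2])

    x = dot(v, w)
    y = dot(cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


# Illustrative coordinates (not real atomic positions): a planar "cis" arrangement.
cis = dihedral_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
```

Converting a whole conformation to internal coordinates would repeat this computation along the chain, together with bond lengths and valence angles.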
From a modelling perspective, the topics of interest depend on the application scenario.
Experts in reconstruction from experimental measurements wish to output a model coherent
with the physical measurements. Scientists designing drugs are especially interested in
pockets on protein surfaces, since these accommodate drugs. Similarly, the affinity and
specificity of protein–protein interactions are tightly coupled to geometric and biophysical
properties of the molecules. Biologists wish to understand how the structure of molecules
accounts for their function. This requires making progress on the two aforementioned
mechanisms, namely folding and docking, which involves manipulating a mix of quantitative
and symbolic information. We substantiate this claim by examining four situations, which
exhibit a wide range of difficulties:
• Characterising the packing properties of atoms. Native forms of proteins, that is
biologically active forms, exhibit very specific patterns in terms of spatial properties, in
particular regarding the packing properties of atoms. Packing properties are obviously
numeric properties (for example atomic volumes). On the other hand, annotations of
proteins are key: well-packed proteins are called native-like, while proteins with loose
packing are called decoys. In between these two extremes, a wide variety of situations
occur, and some of them are particularly important for the description of diseases (e.g.,
misfolding in amyloid diseases);
• Computing the depressions and/or protrusions of molecules. When docking two
molecules in a blind fashion, i.e., without any a priori knowledge, knowing where the
binding sites are is a must. Depressions/pockets and protrusions/knobs on molecular
surfaces are good hints. In general, a pocket is defined in geometric terms (e.g., a
concave region of sufficient volume, or the volume corresponding to the stable manifold
of a maximum of the distance function to the molecular surface, etc.), but such
quantitative models are not easily handled, and annotations such as flat pocket, a
pocket with two crevices, etc. would help the design process.
• Developing proper notions of curvature, in conjunction with solvation models and binding
modes. Curvature inherently refers to differential geometry, and there are indeed
several strategies to quantify the curvature of a molecular surface. Aside from these
quantitative statements, the description of regions of molecular surfaces in
qualitative terms such as flat, convex, concave, saddle-like, etc. would help biochemists.
• Partitioning a molecule into regions that are likely to be flexible or rigid. Such
regions are of utmost interest for flexible docking algorithms. Rigidity characterisation
typically resorts to normal modes – the eigenvalues and eigenvectors of the Hessian of the
internal energy of the molecular system about a minimum. These quantities are used to define
molecular motion involved in deformations coupled to docking. Practically, annotations
such as hinge motion, shear motion, etc. are more easily handled than the underlying
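The qualitative curvature terms used in the list above (flat, convex, concave, saddle-like) can be derived from quantitative values. The following Python sketch maps the two principal curvatures at a surface point to such a label. The threshold eps and the sign convention (positive curvature meaning convex with respect to the outward normal) are assumptions made for illustration only.

```python
def curvature_label(k1, k2, eps=1e-3):
    """Map two principal curvatures to a qualitative annotation.

    Assumed convention: positive curvature means the surface is convex
    with respect to the outward normal. Values below eps in magnitude
    are treated as zero.
    """
    near_zero_1 = abs(k1) < eps
    near_zero_2 = abs(k2) < eps

    if near_zero_1 and near_zero_2:
        return "flat"
    if not near_zero_1 and not near_zero_2 and k1 * k2 < 0:
        # Opposite signs: negative Gaussian curvature.
        return "saddle-like"
    # Same sign, or one direction is (almost) straight: use the mean curvature.
    mean_curvature = 0.5 * (k1 + k2)
    return "convex" if mean_curvature > 0 else "concave"


# Illustrative values only, not measured on a real molecular surface.
labels = [curvature_label(0.5, 0.2), curvature_label(-0.5, -0.2), curvature_label(0.5, -0.3)]
```

Annotating a whole molecular surface would apply this function at every sample point and then group neighbouring points that share a label into regions.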
Documenting the life cycle of 3D objects manipulated in bioinformatics naturally depends on
this context. We shall discuss three examples illustrating the complexity of the documentation
Consider first a biophysicist who is experimentally solving the 3D structure of a protein:
he/she aims at producing a final product, namely the 3D structure of the molecule. The
structure submitted to the Protein Data Bank contains specific pieces of information, namely
those imposed by the PDB file format. However, the difficulties handled along the process may
or may not be documented. To make a long story short, a biophysicist interested in solving at
the atomic level the structure of a given protein using X-ray crystallography has to go through
the following steps: (i) introduction of the gene coding for this protein into a host cell, (ii) large
scale protein expression and purification, (iii) crystallisation, i.e., crystal growth, (iv) X-ray
diffraction analysis of this crystal and model reconstruction. Each of these steps might be
challenging and may require years of work. Experiments along the way are written down in a
Laboratory Information Management System, which is the memory of the process followed.
Still for complex cases the whole process might be more similar to a handcrafted activity, and
rationalising might be very difficult. It is also important to note that a person who managed to
carry out a complex process will first exploit it before releasing the details.
Consider now a person dealing with molecular modeling, who is working on the problem of
inferring a putative complex from two unbound partners. Key questions contributing to the
ability to run a reliable prediction are: which evidence of interactions between homologous
proteins does one find in the biochemistry literature (this requires a thorough literature search,
together with a precise compilation of experimental conditions)? What are the key regions in
the interaction? Are additional molecules involved to stabilise the interactions (ions or water
molecules)? Are there flexible regions? What is the role of electrostatics – which often plays a
crucial role at early stages of the binding?
These questions involve a subtle mix of relatively simple and complex tasks. Simple
questions are those related to the search for proteins within a given homology threshold with
respect to a given reference protein, the classification and the search of protein folds,
the search for protein atoms with prescribed biophysical properties (e.g., temperature factors),
or the search for bibliographical pieces of information related to a molecule or a complex.
These tasks clearly benefit from knowledge technologies and can be easily documented.
Nevertheless, there are other and clearly very difficult questions, for example handling flexible
regions, computing precisely free energies, handling mutagenesis data, and understanding the
precise connection between coarse and atomic modeling results. These questions in their full
generality are still open. The route taken to produce a prediction cannot be fully and easily
traced and there is no standard way to rationalise and document it.
Finally, consider a pharmaceutical company designing a drug. The docking process just
described is now a mere step in a much longer process. To begin with, the chemist will aim at
identifying candidate drugs for the disease of interest. Drugs are small molecules, which
underwent a wealth of experiments in the chemistry and biochemistry communities over the
past five decades. The zoo of small organic molecules is rather stable. Simple rules involving
the molecular weight or the solubility of a drug allow one to infer pretty reliably the chances
for a molecule to be a lead – cf. Lipinski's rule of five (Lipinski et al. 2007). These pieces of
information can be rationalised and organised into ontologies, and can also be easily traced in
Laboratory Information Management Systems.
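To illustrate how simple such rules are to encode, here is a minimal Python sketch of the rule of five in its commonly stated form (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The class and function names, the choice to tolerate one violation, and the descriptor values are assumptions for illustration and are not measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Pre-computed descriptors of a small molecule (hypothetical record)."""
    molecular_weight: float  # in daltons
    log_p: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d):
    """Return the list of rule-of-five criteria that the molecule violates."""
    violations = []
    if d.molecular_weight > 500:
        violations.append("molecular weight > 500 Da")
    if d.log_p > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("more than 5 hydrogen-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen-bond acceptors")
    return violations


def passes_rule_of_five(d, max_violations=1):
    """Assumed policy: accept a candidate with at most one violation."""
    return len(rule_of_five_violations(d)) <= max_violations


# Illustrative descriptor values only.
small_candidate = Descriptors(350.0, 2.1, 2, 5)
large_candidate = Descriptors(720.0, 6.3, 3, 8)
```

Because the result is a list of named violations rather than a bare yes/no, it can be stored as an annotation alongside the molecule and traced later.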
The second step, docking, is more difficult to handle as seen above. Finally, upon identifying
good hits thanks to a docking prediction, wet lab tests must be carried out, and finally clinical
trials implemented. Overall, the whole process lasts more than a decade, and there is no
standard way to trace it thoroughly. In fact, Product Lifecycle Management within
pharmaceutical companies is a major problem and certainly an obstacle to benefiting from the
experience accumulated over a large variety of projects. This observation motivates the
strategy of companies such as Dassault Systemes: having to some extent mastered the life
cycle of manufactured objects, Dassault Systemes now aims at providing their tools and
environments to pharmaceutical companies.
To conclude, there is no general and unified way to handle the documentation of the life
cycle of 3D objects encountered in bioinformatics and applications. In the most general setting,
our understanding is not yet sufficient to document and elaborate easily simple knowledge
The semantic interaction found in structural bioinformatics is not different from that found in
other domains. As mentioned before, a number of tools have actually been developed and
made accessible. Let us mention three of them, which are available from the European Bioinformatics Institute (EBI) at
http://www.ebi.ac.uk, and allow running queries on the Protein Data Bank. The rationale for
mentioning these three tools is that they are of increasing complexity in terms of knowledge
The first one, PDBelite, allows the user to pre-define criteria to be used by a search engine.
The second one, MSDpro, allows the user to define their own logical queries using a drag-and-
drop interface. Finally, the third one, MSDmine, gives access to a complex data model allowing
the user to take full advantage of the features of a relational database. Interestingly, the
knowledge data made available from these services are rather elementary in terms of
geometry, that is, no advanced annotation of geometric nature is provided. This is consistent
with the complexity of the aforementioned mechanisms.
The role and status of visualisation in the field of structural bioinformatics is rather
interesting. On the one hand, molecules are 3D objects, and it is rather tempting to inspect
their geometry so as to guess what the binding patches might be, or to assess the
complementarities of the molecular surfaces of interacting molecules. On the other hand, the
binding of two molecules is a very subtle mechanism, with a subtle interplay between enthalpic
and entropic phenomena, and a mere 3D visualisation cannot account for these delicate
mechanisms. However, the visualisation of 3D structures is important to develop one's
intuition. It is also of prime importance to communicate with large audiences and promote this
kind of research.
Standards do not play a prominent role in structural bioinformatics. File formats, for
example those that are used in the Protein Data Bank, come with standards. However, as seen
above, the complexity of the biological processes involved, e.g., in the experimental resolution
of a protein structure, is such that typically non-standard problems are faced, which calls for
an unstructured part in the files themselves. These difficulties and also the variety of scenarios
handled are clearly a hindrance to the exchange of information.
3.3 CAD/CAE and Virtual Product Modelling
CAD/CAE and Virtual Product Modelling cover all ranges of man-
made objects: electronics, transportation devices, tooling machines
and many more.
There are important differences between CAD/CAE and the other
fields considered in FOCUS K3D. Medicine and Cultural Heritage are
of global interest and collaboration amongst researchers, building
ontologies, etc., has a long tradition, whereas product-developing
companies face much harder competition amongst each other.
Although the human body can be looked at as 'the most complex
machine' known, it is just one type of object that Medicine and Bioinformatics are trying to
decode, understand and conceptualise/formalise fully.
In Virtual Product Modelling the objects (machines) modelled are much less complex than
the human body but the range of different types is enormous and cannot be easily mapped
into one scheme.
This explains to a certain degree the different status in terms of ontologies (in virtual
product modelling there are few) and standards in these areas. Nevertheless, the main
problems of CAD/CAE are still comparable to those of the other fields, e.g.:
• How to generate a model from a digitally acquired object?
• How to model a shape easily?
• How to model function or behaviour?
• How to retrieve a model with certain properties?
• How to handle all the information involved?
Virtual Product Modelling is part of the virtual product development process. The 3D modelling
phase is typically fed with target market characteristics, market/cost figures, requirements
(aesthetic, functional, environmental, safety, etc.), concept sketches, and so on.
Stylists and designers use CAD (computer-aided design) and CAS (computer-aided styling)
systems to create virtual 3D models that foremost represent shape. The product functionality
is usually modelled using a variety of behaviour/function modelling tools and is simulated with
corresponding solvers. In the process, physical models still need to be created for styling
reviews or to validate behaviour.
In the styling phase, the shape of physical models is changed manually, which requires
scanning the resulting 3D surface and re-creating a virtual model for it (re-engineering).
In the simulation phase, data from real experiments are used to tweak material models (to
name just one example).
The data created along the process typically reside in a multitude of data sources with a
tendency to integrate product relevant data into a product data management or product
lifecycle management system. Today most of the content of those systems is restricted to
geometric 3D information, the whole semantics being in the heads of the engineers, and not
explicitly documented in a computer-interpretable way.
Although this is the field where 3D media really started off (Sutherland's Sketchpad in 1963,
company-owned CAD systems in the 1960s, etc.), the software in use today still requires a lot
of manual intervention and tedious work and typically imposes certain concepts onto the user
instead of giving him/her full freedom in choosing the best-suited tools at will.
In the following we shortly present the state of play for the most important phases and
kinds of software tools and give hints on current limitations.
CAD systems enable designers to create geometric models of products with the computer,
to be reused and manipulated by the designers as needed. CAD systems were, and remain,
highly technical software with an extremely rich feature set and functions for detailed design
CAE tools are widely used in the automotive and aerospace industry. In fact, their use has
enabled engineers to reduce product development cost and time while improving the safety,
comfort, and durability of their products. The power of CAE tools has progressed to the point
where much of the design verification is now done using computer simulations rather than
physical prototype testing.
Today’s shape representations in CAD (mainly NURBS) do not serve all purposes well,
especially not those of simulations. For designing, the concept of creating a larger base surface
from which certain parts are cut out by trim curves to create the final piece of surface is
certainly artificial – it has no natural equivalent. Other representation schemes are needed that
allow for more intuitive and easy shape manipulation and also enable easily integrating
concepts of function-oriented design, etc.
Simulations typically require a different kind of shape representation than provided by CAD
systems, a discrete instead of a continuous one. This implies a tedious manual process of
model preparation to suit different simulation needs.
The integration of CAD and CAE using FEA (Finite Element Analysis) is today very complex
due to the different principles for shape representation employed. In CAD, a volumetric object
is described by shells composed of a patchwork of mathematical surfaces representing the
outer hull and inner hulls of the object, and data structures conveying the volumetric
relationships. There is no requirement that adjacent surfaces match exactly, only that they
match within specified tolerances. In FEA, the object has a complete mathematical description
through watertight structures of trivariate parametric volumes. There is a fast growing
research field in both the US and Europe concerning this discrepancy, addressing how better
to solve this great interoperability challenge of CAD and FEA. This research field can be denoted
as “Isogeometric representation and analysis” aiming at a common shape representation for
both CAD and FEA to integrate the two disciplines and drastically improve the quality of Finite
Element Analysis. Consequently, there is a need for investigating the concepts of this new
approach both from a purely mathematical and from a semantic perspective to integrate it into
current industrial information pipelines. Isogeometric representation and analysis employ new
concepts in both the shape representation phase and the analysis phase. We will need
ontologies that encompass these concepts and their relationships to the traditional concepts to
facilitate interoperability between the new isogeometric systems and traditional CAD and FEM.
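The tolerance-based matching of adjacent CAD surfaces mentioned above can be made concrete with a small Python sketch. It samples the shared edge of two neighbouring patches and reports the largest gap, which is then compared with a tolerance. The two edge functions, the sample count and the tolerance values are hypothetical, and the sketch assumes that both edges use the same parametrisation.

```python
import math


def max_edge_gap(edge_a, edge_b, samples=11):
    """Largest distance between two edge curves sampled at equal parameters in [0, 1]."""
    gaps = []
    for i in range(samples):
        t = i / (samples - 1)
        gaps.append(math.dist(edge_a(t), edge_b(t)))
    return max(gaps)


def edges_match(edge_a, edge_b, tolerance, samples=11):
    """True if the two edges coincide within the given tolerance."""
    return max_edge_gap(edge_a, edge_b, samples) <= tolerance


# Hypothetical boundary curves of two adjacent patches.
def edge_of_patch_a(t):
    return (t, 0.0, 0.0)


def edge_of_patch_b(t):
    # Deviates from patch A by at most 1e-4 in the middle of the edge.
    return (t, 1e-4 * math.sin(math.pi * t), 0.0)


accepted_by_cad = edges_match(edge_of_patch_a, edge_of_patch_b, tolerance=1e-3)
watertight_enough = edges_match(edge_of_patch_a, edge_of_patch_b, tolerance=1e-5)
```

The same pair of patches is accepted under a loose modelling tolerance and rejected under a tight one, which is the kind of mismatch that makes preparing CAD models for analysis laborious.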
Today, many companies use product data management (PDM) systems to store product
related information, such as geometry, assembly structure, and materials. Considering the full
product development process and the mechatronic disciplines (mechanics, electronics,
software), there exists much more information spanning from requirements, to specifications,
to mechanic, electronic and software behaviour models, to cross-domain dependencies,
influences and effects, largely carrying the semantics of the product and not represented in
The question is whether all the data should be in one PDM system. A more promising idea is
to bridge the existing islands with semantic linking to manage knowledge effectively in order to
reuse past positive experiences and solutions and avoid repeating past efforts or errors. When
talking about knowledge in industrial companies, one of the important islands of information is
the one that deals with the knowledge of the geometrical properties of the products
manufactured, bought or integrated, and to be maintained during product changes. Therefore,
models and methods for making explicit and maintaining links between geometric properties
and domain specific knowledge have to be developed for improving both the specific process –
either computer-aided or manually performed by the operator – and also the inter-process
communication. This would be beneficial for the integration between design and simulation in
product development, and in simulation during maintenance.
Currently, the need to search across distributed information sources hampers efficient work
procedures. When linking those information islands together, new search functionality is
needed that spreads out over these sources, takes shape, function, metadata and semantics
into account, and combines partial results into meaningful answers to the user’s questions. New
partial/local 3D retrieval approaches and combined searches contribute to overcoming the
current limitations of text-based or current global 3D content-based retrieval. Another aspect
is the appropriate visualisation and navigation through results being returned from combined
searches and user feedback for self-improving and self-learning mechanisms.
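One simple way to combine partial results from several information islands is a weighted sum of per-source relevance scores. The Python sketch below illustrates the idea. The source names, model identifiers, scores and weights are all hypothetical, and a real system would probably need score normalisation and more elaborate fusion.

```python
def combine_results(partial_results, weights):
    """Merge per-source scores into one ranking, best match first.

    partial_results maps a source name to {model_id: score in [0, 1]}.
    weights maps a source name to its relative importance.
    A model missing from a source simply contributes nothing for it.
    """
    combined = {}
    for source, scores in partial_results.items():
        weight = weights.get(source, 0.0)
        for model_id, score in scores.items():
            combined[model_id] = combined.get(model_id, 0.0) + weight * score
    # Sort by descending score, then by identifier for a stable order.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical partial answers from three separate sources.
partial_results = {
    "shape": {"bracket_A": 0.9, "bracket_B": 0.4},
    "metadata": {"bracket_B": 1.0, "gear_C": 0.5},
    "semantics": {"bracket_A": 1.0, "bracket_B": 0.5},
}
weights = {"shape": 0.5, "metadata": 0.25, "semantics": 0.25}

ranking = combine_results(partial_results, weights)
```

User feedback could be fed back into such a scheme by adjusting the weights over time.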
Another current limitation in the downstream process of virtual assembly is the tedious way
to create virtual kinematics and simulate this interactively within the virtual environment.
Nowadays kinematics simulation is performed in the domain of CAD and CAE software
packages, which need specially trained personnel and are designed to work on detailed product
models without providing the degree of realism and interactivity virtual environments offer.
Here, using semantics from CAD and parametric feature-based design for virtual kinematics in
a Virtual Conceptual Design (VCD) platform would ease the process and support the early
conceptual design phase as well as the embodiment design, allowing the immediate validation
of assemblies and mechanisms by experiencing their behaviour during the modelling,
simulation and assembling process.
3D model exchange between different systems has been and is still a great challenge with
respect to model quality, as different CAD-systems and CAD-system kernels employ several
approaches (and tolerances) to handle the inherent inaccuracies of STEP-type CAD-models.
CAD-model check and repair have been on the agenda for many years and resulted in a
number of standardisation initiatives:
• AIA/ASD EN-9300 Long Term Archiving (LOTAR) addressing High Quality Geometry
verification and validations and rules to execute.
• ISO 14721.4, Open archival information system – Reference model.
• ISO 10303-59 Product Data Quality (PDQ), Part 59: Integrated generic resource –
Quality of product shape data.
Since standards for CAD model quality are in place, the focus of advanced industry has now
turned to validating that the CAD-models exchanged (e.g., the STEP file, or the CAD-model
generated by the receiving system) are an exact representation of, or equivalent to, the
CAD-model in the sending system. This demand for equivalence checking poses new
challenges to the CAD-model representation, to the semantic annotation of the CAD-model and
to CAD-vendors in general, as industry requires more than the basic CAD-technology, CAD
models, representation and algorithms to be available for validation.
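A lightweight form of equivalence checking compares derived quantities of the model before and after exchange. The Python sketch below compares volume, surface area and centroid within tolerances. The record layout, the tolerance values and the sample numbers are assumptions for illustration. Agreement of such summaries is a necessary but not a sufficient condition for two models to be equivalent.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ShapeSummary:
    """Derived quantities of a solid model (hypothetical record)."""
    volume: float
    surface_area: float
    centroid: tuple  # (x, y, z)


def summaries_agree(sent, received, rel_tol=1e-6, centroid_tol=1e-3):
    """Check that two summaries agree within the given tolerances."""
    return (
        math.isclose(sent.volume, received.volume, rel_tol=rel_tol)
        and math.isclose(sent.surface_area, received.surface_area, rel_tol=rel_tol)
        and math.dist(sent.centroid, received.centroid) <= centroid_tol
    )


# Illustrative values only.
sent = ShapeSummary(1000.0, 600.0, (5.0, 5.0, 5.0))
received_ok = ShapeSummary(1000.0000001, 600.0, (5.0, 5.0, 5.0002))
received_bad = ShapeSummary(998.0, 600.0, (5.0, 5.0, 5.0))
```

A stricter validation would additionally compare the models point by point or feature by feature, which is where semantic annotation of the CAD-model becomes relevant.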
The ‘standard file format problem’ has to be addressed by academia and research institutes
in the future. We need to develop a 3D file format addressing the following problems that are
• compatibility / interoperability / semantic mapping;
• shape representation (procedural, parametric);
• functional expressiveness;
• transport of objects together with their interactively modifiable parameters.
3.4 Gaming and Simulation
Compared to Biomedicine and CAD/CAE, the gaming industry is
relatively young. In addition, it developed mainly independently of
research in gaming technologies – which until recently was often not
even considered as an academic activity. Also, knowledge
technologies, such as taxonomies and ontologies, have limited
application in the gaming-related industries. Relevant semantic
information is thus mostly created manually. Mathematical
descriptions of physical characteristics (e.g., max. rotation and state
of a door handle) are often used to describe implicitly semantic
characteristics instead of modelling them explicitly (e.g., door opens to the inside, door handle
can only be pushed down).
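The difference between implicit and explicit semantics in the door handle example can be sketched in Python as follows. The first record holds only numeric limits, the second states the meaning directly, and a helper derives one from the other. All names, and the convention that positive rotation means pushing the handle down, are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class OpeningDirection(Enum):
    INWARD = "inward"
    OUTWARD = "outward"


@dataclass(frozen=True)
class HandlePhysics:
    """Implicit description: only numeric rotation limits and state."""
    min_rotation_deg: float
    max_rotation_deg: float
    current_rotation_deg: float = 0.0


@dataclass(frozen=True)
class DoorSemantics:
    """Explicit description: the meaning is stated directly."""
    opens: OpeningDirection
    handle_actions: frozenset


def infer_handle_actions(physics):
    """Derive explicit action labels from numeric limits.

    Assumed convention: positive rotation means pushing the handle down.
    """
    actions = set()
    if physics.max_rotation_deg > 0:
        actions.add("push_down")
    if physics.min_rotation_deg < 0:
        actions.add("pull_up")
    return frozenset(actions)


handle = HandlePhysics(min_rotation_deg=0.0, max_rotation_deg=45.0)
door = DoorSemantics(OpeningDirection.INWARD, infer_handle_actions(handle))
```

The opening direction cannot be recovered from the handle limits at all, so it has to be supplied as an explicit annotation.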
In simulations that are, for example, used for training and education (flight simulation, fire
fighter training, etc.), realistic visualisation is obviously an important if not essential issue.
However, it also has a high relevance in gaming where having the latest, “coolest”, state-of-
the-art computer graphics is often an essential criterion to survive in a highly competitive
market. One unique characteristic in gaming and simulation is that objects and characters are
normally part of a virtual world in which they move, operate, and interact with each other.
Thus, animations, emotions, changes over time, interactions between characters, etc., play an
important role and require descriptions of, for example, material characteristics (flammability,
behavior under pressure, etc.), how subjects and objects react to each other (e.g., a car
driving over an icy road, a game character getting punched in the face by another one) and so
on. As mentioned above, although this is a classical example for the beneficial usage of
semantic knowledge, related techniques are hardly used, and a lot of this information is often
Some semantic characteristics are however acquired from the real world. Motion capturing
is commonly used in the gaming industry to capture typical movements of a game character
and thus create animations that better model how humans move and behave in the real world.
In addition, research is working on methods to model and better describe related aspects.
Examples include work on modelling emotions such as crying and realistic visualisation of tears,
and more realistic simulation of how people move in crowds. Data from Geographic Information
Systems (GIS) is sometimes used in modelling worlds and landscapes.
As already said, providing a high level of realism is normally essential for simulations that
try to mimic the real world and are used, for example, for training purposes and serious
gaming applications. Leisure games can include real world simulations as well (e.g., sports
games, car races), but also fantasy worlds and characters that do not follow common physical
rules, but are just the product of a game designer’s imagination. In all cases though, high
quality graphics and realism (or a “convincing” look and behaviour of fantasy characters) are
important. Standard 3D modelling and advanced computer graphics techniques are therefore
commonly used, including meshes of different sizes with texturing, bump mapping, skeleton-
based modelling for characters that are, for example, animated based on motion capture data,
statistical modelling of virtual worlds based on the likelihood principle, and so on.
Related models are often created from scratch, but sometimes also models from CAD are
used (cars, ships, etc.) and integrated into the gaming environment. Similarly, but much more
rarely, tools such as 3D scanners are used to create realistic models for simulations. Adding
game-characteristic issues to CAD models often means modelling the aforementioned issues,
resulting from the fact that in games we are usually dealing with objects in an interacting
virtual world where temporal changes, relationships, and interactive behaviour need to be
The high need for realistic modelling and visualisation is often in conflict with the demand
for real-time behaviour of the objects and characters in the virtual worlds. Objects such as fast
cars in a race simulation and characters such as fleeing people in a fire fighter training
simulator do not only have to look and move realistically, they also have to do so in a fast and
temporally appropriate way. Game and simulation developers therefore often have to make a
decision about the level of detail (LOD) of a model and might even be forced to sacrifice its
geometric quality for the sake of speed. Automatic tools supporting a 3D designer in such
decisions with consideration of the semantic characteristics would therefore be quite useful.
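As a rough illustration of what such automatic support could look like, the following Python sketch selects a level of detail from the viewing distance and a semantic importance weight. The class, the function, the thresholds, and the weighting rule are all assumptions made for this example; they are not taken from any tool discussed in this report.

```python
from dataclasses import dataclass


@dataclass
class LodLevel:
    name: str
    triangles: int
    max_distance: float  # farthest viewing distance at which this level is used


def choose_lod(levels, distance, importance=1.0):
    """Return the level whose distance range covers the object.

    An importance above 1 makes the object behave as if it were closer,
    so semantically important objects keep their detail for longer.
    (Hypothetical rule, chosen only for illustration.)
    """
    if importance <= 0:
        raise ValueError("importance must be positive")
    effective = distance / importance
    # Walk from the finest level (smallest range) to the coarsest.
    for level in sorted(levels, key=lambda lv: lv.max_distance):
        if effective <= level.max_distance:
            return level
    # Fall back to the coarsest level if no range covers the distance.
    return max(levels, key=lambda lv: lv.max_distance)


# Invented levels for a car model in a race simulation.
CAR_LODS = [
    LodLevel("high", 50_000, 20.0),
    LodLevel("medium", 10_000, 80.0),
    LodLevel("low", 1_500, float("inf")),
]

background_car = choose_lod(CAR_LODS, distance=100.0)
player_car = choose_lod(CAR_LODS, distance=100.0, importance=2.0)
```

In this sketch the player's own car keeps the medium level at a distance where a background car already drops to the low level; a real tool would have to derive the importance weight from the semantic annotation of the scene.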
Creating simulations and games that take place in large, highly complex virtual worlds can
also require developers to reuse models (or parts thereof). However, storage,
management, and organisation of 3D data are surprisingly often realised using rather primitive
approaches that rely on naming conventions, pre-specified folder structures, textual
descriptions, etc., instead of using tools designed for such purposes. To browse and investigate
sets of 3D models, the modelling software that was used to create the models in the
first place is often used instead of special browsing and search tools. Although knowledge
technologies promise to assist significantly in such tasks and offer the potential for a more
effective management and re-use of existing data, currently they seem to be hardly used in
this context. Reasons for this might include the sometimes still prototypical status of related
approaches, heavy dependence on existing proprietary data and workflows, but also scepticism
towards new ideas from academia, or simply a lack of knowledge of existing techniques.
However, despite a seemingly hesitant behaviour when it comes to the adoption of
knowledge technologies, the related community is well aware of the shortcomings and limitations
of existing workflows and generally agrees that, if done right, related tools could be highly
beneficial. Semantically annotated libraries and databases with semantic search functionality that
allow you to easily find and re-use existing models or give you parts that allow you to easily
create new ones that do not only “look good”, but also fit together considering their semantic
behaviour would certainly be beneficial and have a positive impact on the design and
One critical issue for many people is that semantic annotation still requires a significant
amount of manual labour. This is particularly critical for the parts of the gaming industries that
are characterised by high competition, extremely short development cycles, and tight
deadlines. Semantic annotation in such a scenario does indeed seem like a major burden in
adopting knowledge technologies. However, one possible option to deal with this issue is to
transfer the related effort from the developers to the actual users. For example, online games
often have a highly active and committed user community that is interested not only in playing
games together, but also in contributing to, building, and refining the game itself. The tremendous
success of social networking and tagging sites suggests that users might indeed be willing to
provide such input and first successful examples from the gaming area already exist.
Aside from such aspects related to the synthesis and documentation of 3D models and their
semantics, there is also an apparent need for semantic search and retrieval in this community,
as illustrated by a quote from our state-of-the-art report, saying that “a web-based search and
retrieval system would be nicer, with also version control and more metadata, like polygon
count, texture count, formats, texture size, and retrieval tags; info on which projects it was
used for, and when; name of creator and people who worked on it.” However, people are
usually quite sceptical about whether such a goal can ever be achieved. The reasons lie not only
in the tremendous technical challenges that would have to be solved, but also to a
large degree in the nature of this industry and the related market. For example, although
desirable, sharing data and semantic information across companies is something that is
unlikely to happen.
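To make the quoted wish list more concrete, the following Python sketch shows one possible shape for such a metadata record, together with a simple tag-based search. The record layout, field names, units, and sample entries are assumptions invented for this illustration; they do not describe an existing system.

```python
from dataclasses import dataclass


@dataclass
class AssetRecord:
    """Metadata for one 3D asset; the fields follow the wish list quoted above."""
    name: str
    polygon_count: int
    texture_count: int
    texture_size: int        # longest texture edge in pixels (assumed unit)
    formats: tuple           # file formats, e.g. ("dae", "obj")
    tags: frozenset          # retrieval tags
    creator: str = ""
    contributors: tuple = ()
    used_in: tuple = ()      # (project name, year) pairs
    version: int = 1


def search(records, tags=(), max_polygons=None, file_format=None):
    """Return records that carry all given tags and satisfy the optional limits."""
    wanted = set(tags)
    hits = []
    for rec in records:
        if not wanted <= rec.tags:
            continue
        if max_polygons is not None and rec.polygon_count > max_polygons:
            continue
        if file_format is not None and file_format not in rec.formats:
            continue
        hits.append(rec)
    return hits


# Invented sample library.
LIBRARY = [
    AssetRecord("tugboat", 42_000, 3, 2048, ("dae",), frozenset({"ship", "harbour"}),
                creator="A. Modeller", used_in=(("harbour-sim", 2009),)),
    AssetRecord("ferry", 120_000, 6, 4096, ("dae", "obj"), frozenset({"ship"})),
    AssetRecord("crane", 15_000, 2, 1024, ("obj",), frozenset({"harbour"})),
]

light_ships = search(LIBRARY, tags=["ship"], max_polygons=50_000)
```

A production system would add version control and persistent storage; the sketch only shows that the requested metadata fits a flat record that can be filtered without opening the modelling software.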
Games and simulations usually have a temporal component and as such often heavily rely
on real-time behaviour. Similarly, both are often highly interactive. For example, players of a
game often take the role of a character or steer an object such as a car. They expect the
virtual world and the characters and objects therein to react promptly and in a semantically
reasonable way. Finding the right level of detail (LOD) that guarantees real-time behaviour,
while at the same time providing a high-quality visual representation, is therefore a very
critical issue. However, semantic technologies can be helpful not only in the visualisation of the
game or simulation itself, but also in the production of material such as related documentation.
For example, creating icons of ship models is considered as a separate and significant step in
the workflow when adding such models to a ship simulator environment. Semantic approaches
that automatically calculate the “best view” of a model promise to be very useful in this step
that is currently mostly done manually.
If we look at the users’ perspective, semantic interaction does not only involve the related
realistic visualisation and modelling of the virtual world and its characters, but also how they
can interact with them and how they control them. While special hardware can often be used
for serious simulations, interfaces for leisure games have long been dominated by game-specific
controllers such as joysticks, steering wheels, light guns, etc., but also by simple
keyboard and mouse input. However, a recent trend in gaming interaction and input
mechanisms suggests an increasing relevance of 3D semantics in this area as well. For
example, the tremendously successful Wii remote controller maps the 3D motion that a user is
making to related movements in the game. Other companies are working on 3D tracking by
video-based interaction. Gaming on the iPhone is becoming increasingly popular and has
pushed mobile gaming to a new level – partly because of the innovative usage of new
interaction modes that map the 3D orientation of the device to motion in the game. Related
research is only just beginning, but there seems to be an obvious need for better tracking of
human motion and for a more accurate and realistic mapping of it to actions in a game, one that
also considers the semantic restrictions and provides a natural interaction.
Like any other area that is dealing with large amounts of 3D data, gaming and simulation
could benefit significantly from the introduction of general standards and commonly used file
formats. However, even if this is often acknowledged by people working in these branches of
the industry and mostly considered as an important objective, activities towards creating and
establishing them are rather limited, with few notable exceptions, such as COLLADA managed
by the Khronos Group. One reason for this seems to be the large scope of the field including
areas as different as sports games, car race and flight simulators, realistic battlefield and war
simulations, fantasy and adventure games, etc. Creating a common platform such
as “a game character ontology” or the like does indeed seem quite unrealistic and not worth
the effort. Focusing on sub-areas (e.g., taxonomies for sports games or ontologies for
race cars) might seem reasonable, but again success is questionable given that people
working in related sub-areas are often direct competitors and thus not interested in creating a
common basis for exchange and sharing. For file formats, the situation might be
different because a common format could be beneficial not only for data exchange among different
users but also, for example, for reuse and exchange within one institution. It would also be of high
relevance for the adoption of semantic processing tools, such as approaches to automatically
separate complex models into smaller, semantically useful parts and to automate the semantic
3.5 Cultural Heritage and Archaeology
The Cultural Heritage and Archaeology domain is characterised by increased 3D digitisation
efforts and an increasing volume of 3D content during the last few years with several past and
ongoing projects. The main interests of professionals (e.g., curators, archaeologists, historians)
in this field are: firstly, the e-documentation of the past in 3D; secondly, an effective and exact
organisation and presentation of archaeological/cultural heritage content to virtual visitors.
Another area of interest for the cultural heritage community is the development of educational
applications for connecting real and virtual artefacts. 3D models and virtual spaces have a huge
potential for enhancing the way people interact with museum collections and e-learning environments.
The creation or capture of 3D models does not appear to trouble professionals (users,
developers, scientists, creators of 3D content, publishers/dealers of 3D repositories, etc.).
Laser scanners and photogrammetric methods are the most prominent techniques used for
acquiring artefacts or large structures and complex archaeological sites, while many models
and 3D environments are built from scratch. However, the lack of widely accepted (by all
cultural heritage professionals) data formats, metadata structures and/or ontologies makes the
semantic integration of 3D models very difficult. A significant amount of time is spent on
manual model annotation and documentation.
Moreover, the management of 3D collections is at an early stage and in many cases
non-existent. Our investigation confirmed the current situation and revealed some specific
problems and shortcomings related to 3D knowledge management. Although great
technological progress has been achieved, the involvement of cultural institutions with
knowledge technologies related to 3D is still limited.
3D professionals mainly deal with the geometric and structural aspects of 3D models while
the semantic aspect appears only for the management of more complex models. However, in
archaeology and cultural heritage, object semantics is typically just as important as the actual
geometry. Thus, it is a key requirement to assign thematic information to entire objects and to
individual geometric elements in a virtual environment. This also makes it possible to
select, analyse, or edit the geometry and the appearance of objects based on semantic criteria.
The benefits of adding semantic information in the different stages of the reconstruction and
presentation of an artefact or an entire virtual space are twofold. From the perspective of the
modeller/creator of 3D content, re-creating an ancient artefact or an ancient site (architectural
space) in 3D is difficult and time-consuming work. One cannot simply use a scanner. 3D
reconstruction is a long design process based on available data (pictures of the remains,
archaeological research drawings and maps, etc.) and evolving in close collaboration with
archaeologists and historians. For example, a scanned artefact may be incomplete or damaged,
and needs to be reconstructed by applying rules like repeating structures, symmetry or
boundary conditions, knowledge of similar artefacts, etc. The import of high polygon models
and rich object textures is a key issue for creating the realism necessary for a believable
immersion into a virtual environment.
Another perspective/area of interest is the organisation and presentation of archaeological
and cultural heritage content to virtual visitors (virtual exhibits). Educational/training
application scenarios and environments that are visually complex and
information-rich are a very effective learning tool. The overall experience of a student or
virtual tourist is defined by the virtual reality representation/re-creation. The idea of learning
as an active, self-directed, and context-dependent process (e.g., walking around, admiring the
ancient buildings, interacting with objects, etc.) can greatly contribute to gaining new
Practices and methodologies obviously vary widely from scientists, researchers, developers,
designers, project managers, to publishers and dealers of 3D content. Concerning the actual
amount of data, there is a huge variety ranging from just a few models (e.g., ten) to very
large collections (e.g., thousands of models). Most of the data are stored on file servers or
proprietary repositories. Often rather primitive approaches are used for handling 3D content
(e.g., using file and folder names to encode information about the contained data), and there
is an apparent lack of information about knowledge technologies or at least a common
misunderstanding or misuse of the related terminology. There is a lack of specialised tools that
are particularly designed for organising, browsing, and searching 3D content.
Concerning the e-documentation of 3D objects, a common practice is to attach annotation
and other general information to the objects’ geometry. However, the annotations frequently
get lost when geometry changes due to various reasons (e.g., simplification, transformation,
even standardisation), while preserving semantics in this process is crucial. Traceability to
sources and data provenance must be documented and guaranteed.
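One way to keep annotations attached when the geometry changes is to carry them through the vertex correspondence produced by the geometric operation. The Python sketch below assumes, purely for illustration, that annotations are stored as sets of vertex indices and that a simplification step reports an old-to-new vertex map; neither assumption is drawn from a specific tool.

```python
def remap_annotations(annotations, vertex_map):
    """Carry annotations over to a modified mesh.

    annotations: dict mapping a label to the set of vertex ids it covers
                 on the original mesh.
    vertex_map:  dict mapping each original vertex id to its id on the new
                 mesh, or to None if the vertex was removed (assumed to be
                 reported by the simplification step).
    Returns the remapped annotations and the labels that lost all support,
    so that they can be reviewed manually instead of vanishing silently.
    """
    remapped = {}
    lost = []
    for label, vertices in annotations.items():
        kept = {vertex_map[v] for v in vertices if vertex_map.get(v) is not None}
        if kept:
            remapped[label] = kept
        else:
            lost.append(label)
    return remapped, lost


# Invented example: vertices 0 and 1 are merged, vertex 5 is removed.
annotations = {"handle": {0, 1, 2}, "crack": {5}}
vertex_map = {0: 0, 1: 0, 2: 1, 3: 2, 4: 3, 5: None}

remapped, lost = remap_annotations(annotations, vertex_map)
```

Here the "handle" annotation survives on the merged vertices, while "crack" is reported as lost. Logging each such call, together with the operation that produced the vertex map, would be one simple way to record provenance.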
Embedding semantic information and descriptive metadata related to the original source,
age, design and existing knowledge on associated artefacts can contribute both to the efficient
work of a cultural heritage professional and to the overall experience of a virtual visitor.
People from the Archaeology and Cultural Heritage domains report being dissatisfied with the
way they manage and store 3D content and expect an improvement. Areas
mentioned for improvement or for which they feel that the use of 3D knowledge technologies
can provide a solution include functionalities related to the documentation and identification of
objects and of parts of them, automatic extraction of geometric and semantic information from
models, better visualisation, and improved search using semantic and geometric criteria or a
combination of both.
A noteworthy observation is that 3D collections are becoming more and more demanding in
terms of management, preservation, and delivery mechanisms. Our investigation confirmed
the current situation and revealed some specific problems and shortcomings related to 3D
knowledge management. There is significant debate about whether the approaches adopted for
digital libraries are also suitable for this kind of digital object.
3D digital representations are often neglected in efforts to create large-scale libraries and
there is also a lack of effort by cultural heritage institutions to acquire and use knowledge
about 3D content. An obvious shortcoming in the European Digital Library (EDL) project and its
prototype implementation (Europeana site: http://www.europeana.eu/portal/) is the absence
of 3D content. One of the reasons is the lack of widely accepted standards and protocols for
knowledge management in relation to 3D objects. Whereas text and image digitisation and
management are rather mature technologies, the technology necessary to benefit from 3D
knowledge management fully is not yet within easy reach of most cultural institutions and
professionals. A major challenge towards semantic 3D data management is to provide effective
content-based and semantic-based organisation and searching.
Although the Cultural Heritage community is well aware of knowledge technologies, which
are used for other kinds of content (e.g., text, images), no particular methods/standards are
used for managing and organising 3D content. Concerning the adopted knowledge technologies,
there is a wide preference for databases and metadata, and many quite often use taxonomies
and ontologies, though not for 3D content.
Current practices in knowledge management include the use of standards for
interoperability like CIDOC-CRM, Dublin Core metadata and MPEG-7, and standards for data
exchange like X3D, VRML and COLLADA. In general, there is no consensus in the Cultural
Heritage world on metadata standards, which negatively affects long-term preservation. There
are many proprietary data formats by hardware vendors (this mainly concerns captured data
and is less important for born-digital objects).
3.6 Discussion and conclusion
In this chapter, we provided a snapshot of the current situation in the five application fields,
as a result of the numerous project dissemination and networking activities.
To sum up, the adoption of 3D data can be considered a common practice in the different
domains, both by acquisition and by modelling from scratch. In fact, a large number of
acquisition techniques are used to obtain 3D digital resources, which require cumbersome manual
labour and expert knowledge in order to become suitable for downstream applications. In
modelling there exist numerous ways to represent shapes, but also different needs according
to the applications (real-time vs. offline, low-resolution model, partial information, etc.).
Moreover, during modelling and simulation, many factors must be taken into account, which
require user know-how. For both the acquisition and the modelling workflow pipelines, the
importance of semantics is clear, and a strong effort has to be spent on devising more
effective techniques, able to capture as much context information as possible. Semantically
rich (procedural and feature-based) modelling seems not to be fully exploited (e.g., for
high-level editable and adaptive models or for codifying functional properties of 3D objects).
Semantics would help in semi-automated processing procedures, providing relevant
information at each step of the design phase.
Moreover, annotations have been considered helpful to manage knowledge effectively in
order to reuse past positive experiences and solutions. The annotation of a geometric 3D
model with additional data (from geometric properties to information on its history to
information on its purpose or function) is a major semantic issue and was recognised as such.
It is important to note that users (such as medical doctors, but also those in the CAD industry)
accept and even welcome semi-automatic approaches for annotation (of organs, CAD features,
etc.), which leave them the choice to pick the option that appears most probable to them as
In all the discussions the key role of semantics became evident not only in the Knowledge
Technology and Knowledge Management sense as a best practice for 3D content
documentation and sharing, but also as a driving factor for the development of new and more
effective computational tools to process, manage and analyse 3D content.
Finally, the increasing size and number of models is a major issue in all application areas.
Using semantics would help to display the right information at the right time (decision support)
and improve the technologies with automated reasoning. Furthermore, standards that would
improve data exchange between the research and industrial communities are lacking. Even within one
company, certainly if it is a big one, data sharing is not a given. Software vendors (for
equipment to acquire medical data, for scanners to acquire cultural heritage objects, etc.) are
interested in keeping their data proprietary. The confidentiality of medical data must be
respected. In addition, as one representative from industry noted at the CAD workshop,
academic researchers favour research over file format development. Still the development of
standards and common file formats was always identified as an important objective.
This issue is even more critical for standards to express semantics of 3D data. This is a
rather unexplored area and only partial solutions exist (see MPEG-7 for instance). The reason
for this lack of solutions is probably the large variety of representation schemas used for 3D
data and the necessity to make the semantic descriptions somehow independent of the
geometric representation of the 3D models.
Chapter 4 : The FOCUS K3D research road map
Delineating the research pathway to reach semantic 3D media is not an easy task: the
underlying grand challenge – understanding and documenting the meaning of 3D data – and
the complexity of the application domains involved make it difficult to express an exhaustive
set of issues that need to be solved to open the door to the next generation of 3D media.
Nevertheless, the FOCUS K3D project was successful in gathering a large number of
requirements from the user communities, via questionnaires and STARs, but especially via the
discussions promoted during the thematic workshops that conveyed a good picture of what the
open issues are, as perceived by the users in the application domains. The outcomes of the
thematic workshops (see deliverable D1.3.2 for more details) were the primary source of
information that the consortium used to match the open issues against the state of the art in
Computer Graphics, and to assess during ad hoc brainstorming meetings which of these
open issues pointed to real research challenges and which were instead caused by a lack
of cross-fertilisation between the various fields of expertise.
As an intermediate stage towards the research road map, the FOCUS K3D Consortium also
produced a list of open issues that were indicated as crucial in the various steps of a typical,
yet generic, pipeline of 3D content creation and processing in the chosen four representative
fields. This exercise, which was useful for showing the relative importance of the issues in the
various domains, led us to a better understanding of the differences and commonalities in the
fields, and helped us in preparing the next step of brainstorming.
Given the workshop outcomes and the consortium view of the open issues to be faced, we
then tried to synthesise the high-level goals that lie beneath the visionary scenarios envisaged
for the far future and expressed by the storytelling descriptions of Chapter 1. The resulting five
high-level goals represent the long-term challenges that we think the Computer Graphics
community should face as future targets for new and disruptive research, with a strong need
to breach the borders of a single discipline and a call for a truly multi-disciplinary effort to 3D
shape modelling. Within each of these high-level challenges, we have also identified a number
of mid-term challenges that, without being exhaustive of course, are judged as important
building blocks for further relevant research advances towards the goals. Finally, we have tried
to avoid the presentation of separate research road maps for each application domain, and
preferred to give a general overview while focusing on application domain issues, when
suitable, to underline problems arising from specific shapes introduced by the application
First of all, the adoption of acquisition techniques has become typical in the workflows of the
four AWGs considered. In Medicine, MRI and CT are common tools for diagnostics, while in
Gaming, motion capture is used, as for instance in movies, to give natural behaviour to virtual
characters. In CAD, reverse engineering is a widely adopted approach to redesign, whereas in
Cultural Heritage the digitisation of artefacts normally happens through laser scanning sessions.
As we assessed in the thematic workshops, the passage from the initial point cloud or 3D
image to a 3D model usually requires a lot of manual labour, while at the same time
all the knowledge related to the entity acquired is lost. Moreover, handling 3D data is not as simple
as 2D data for the users: reconstructing high quality surfaces/volumes is not possible as a
direct result of the acquisition session and geometric processing tools have to be used. A high-
quality model is also needed to apply shape understanding methodologies to extract implicit
models. Currently, the available tools are still too focussed on geometry and not ready yet to
extract many semantic features of the object. For instance, from curvature analysis or
segmentation techniques special areas can be identified but the intervention of the user is still
fundamental to select the correct regions of interest.
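To illustrate how far purely geometric analysis goes, the following Python sketch computes a basic discrete curvature measure, the angle deficit (2π minus the sum of the triangle angles meeting at a vertex), and flags vertices above a threshold as candidate regions. The threshold and the idea of handing the candidates to a user for confirmation are assumptions of this example. The measure as written applies to interior vertices of a closed triangle mesh without degenerate triangles.

```python
import math


def angle_deficits(vertices, triangles):
    """Return 2*pi minus the sum of incident triangle angles for each vertex."""
    deficits = [2.0 * math.pi] * len(vertices)
    for tri in triangles:
        for k in range(3):
            i, j, m = tri[k], tri[(k + 1) % 3], tri[(k + 2) % 3]
            # Edge vectors from vertex i to the other two corners.
            u = [vertices[j][c] - vertices[i][c] for c in range(3)]
            v = [vertices[m][c] - vertices[i][c] for c in range(3)]
            dot = sum(a * b for a, b in zip(u, v))
            norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
            # Clamp to guard against rounding slightly outside [-1, 1].
            cos_angle = max(-1.0, min(1.0, dot / norm))
            deficits[i] -= math.acos(cos_angle)
    return deficits


def candidate_vertices(deficits, threshold):
    """Indices whose absolute angle deficit exceeds the (assumed) threshold."""
    return [i for i, d in enumerate(deficits) if abs(d) > threshold]


# Regular tetrahedron: three 60-degree angles meet at every vertex.
TETRA_VERTICES = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
TETRA_TRIANGLES = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

deficits = angle_deficits(TETRA_VERTICES, TETRA_TRIANGLES)
candidates = candidate_vertices(deficits, threshold=1.0)
```

On the tetrahedron every vertex has a deficit of π, so all four are flagged. Whether a flagged region is, say, a decorative relief or scanning noise is exactly the kind of decision that still needs the user.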
Some information simply cannot be derived from geometry. In Cultural Heritage, being able
to acquire colours and materials as well would be an improvement, for example; in CAD, the
identification of components from their function has been pointed out as a substantial advance.
More generally, tracing all the digital workflows of models becomes crucial if digitisation is to
serve for documenting, sharing and reusing knowledge. In fact, geometry alone is not enough
for non-experts in Computer Graphics, while a semantic model, in which the context
knowledge is explicitly related to the shape, is one of the big challenges to tackle. In the
following, it will be named Derive symbolic representations.
Including specific knowledge is important not only when the model is created
from acquisition, but also when it is created from scratch. Including and preserving the design
intent would make the 3D model much more powerful in all the application fields. In Medicine,
patient-specific data pave the road to precise diagnoses and personalised therapies; in CAD
efficient links between 3D product models, their behaviour/function and their properties have
been identified as a strong need; in Gaming creating autonomous virtual characters able to
take decisions according to the surrounding scene would definitely be an advance. Among the
desiderata collected during the thematic workshops, we can mention also: modelling shapes
evolving in time (e.g. in medicine), coding construction rules of the shape in CAD, and
obtaining a complete 3D anatomical human. The inclusion of knowledge directly in the
modelling phase not only supports the modelling itself but also the simulation phase. In fact,
simulations could be performed exploiting the semantics of the model, performing much more
efficiently and effectively than today, when the quality and the interpretation of the
simulation depend solely on the expertise of the practitioner. The second big challenge we identified
is therefore Goal-oriented 3D model synthesising.
The third big challenge is predictably Documenting the 3D lifecycle. In fact, efficiently
interpreting, sharing, searching, and reusing 3D data has been proven to be fundamental,
together with the preservation of the annotation throughout the digital shape workflow. The
discussions on this theme brought out the common need for dynamic documentation, which
should be able to trace source and data provenance, to preserve the annotation whenever the
geometry changes, and to propagate it to similar objects. Another issue raised was related to the
integration of different information sources in the documentation process: 3D information but
also 2D images and text should be interlinked to make the annotation of the object complete.
Not only the modelling phase calls for semantics, but also visualisation and interaction. Due
to the different devices supporting 3D, such as 3D TV and mobiles, the information should be
visualised semantically, which means according to the device, and according to the semantic
priority given to the content. Interaction includes not only strict human-machine interaction
but also how consumers are able to search and retrieve the shape data related to their current
task. Among the issues collected from our AWG members, we can mention intuitive GUIs for
practitioners who are not experts in Computer Graphics, the use of natural language to
formulate queries, and consequently the need for multilingual and multimedia documents. A
mechanism to express queries by the user effectively, on the one hand, and on the other hand,
the development of strong reasoning methodologies to retrieve resources more precisely
appeared crucial. The fourth challenge is thus named Semantic visualisation and interaction.
Finally, from the FOCUS K3D experience, we can generalise that there is still a strong need
for Standards in the different application fields. Interoperability and distributed design force
users to make great efforts, where this is possible at all, to keep metadata and models consistent
without losing information. Unfortunately, the FOCUS community confirmed that both vendors and
researchers currently do not focus on tackling this issue.
In the next sections all five big challenges will be treated in detail, pointing out a list of open
issues to include in the research agenda in order to develop semantic 3D media.
4.1 Derive symbolic representations
Deriving symbolic representations mainly addresses the path from physically born objects to
possibly symbolic 3D models in the digital domain.
There are two main ways to construct a digital 3D representation out of a physically born
object: either the digital model is manually designed by a graphic designer using appropriate
software and shaped so as to resemble the physical counterpart as much as possible (or to the
desired extent), or an acquisition session is carried out where data from the real object are
captured with the help of an acquisition device. The choice of suitable acquisition techniques
and devices depends on many factors such as the size of the object to be acquired, its physical
properties (e.g. colour and reflectance) and the accuracy required by the downstream
application. The acquisition session generates raw data, typically millions of points in 3D space.
In both cases, the knowledge related to the object is somehow lost in the modelling
process: the operator performing the acquisition as well as the designer know perfectly well what
kind of object is being captured/designed, what its meaning is, what it can be used for and in
which context; what the main features are and what the details; if (and how) it can interact
with other objects, for instance if it can be grasped by the human hand and which parts offer
the best grasping points, and so on. This information is not stored in the digital model, which
only retains data on the geometric appearance of the object: basic geometric representations
do not give explicit information about the content semantics, which one can grasp only by
viewing the object.
If we want machines to understand the content of digital 3D media, we need tools that
automatically classify objects into semantic classes, extract salient features, and segment the
model into meaningful parts. In other words, the geometric model must be associated with an
abstract and machine-understandable representation that embeds knowledge about its content
(semantics/meaning). Some of this knowledge can be automatically extracted by processing,
analyzing and structuring the raw geometry: we call this process deriving symbolic
representations.
Building up symbolic representations out of acquired data is vital to let the vision of
semantic 3D objects come true. This topic is relevant in all our application domains as the
following examples illustrate: in Medicine to reconstruct anatomical structures and monitor
morphological changes over time; in Virtual Product Modeling to re-engineer physical models;
in Cultural Heritage to digitise cultural heritage artefacts; in Bioinformatics to highlight protein
binding interfaces; in Gaming and Simulation to attach interaction rules to object parts.
Although the digitised physical models may vary in shape, size, material properties and
purpose (surgery planning, re-modeling, virtual exhibitions, etc.), the problems in generating a
semantically rich description out of the scanned data are pretty much the same. Considering
also our autonomous robot scenario, without sensing the environment, understanding the
surroundings, building up conceptual models and planning actions, robotics as described in
section 1.2 will not become reality in the future.
To derive symbolic representations, a direct flow can be devised from geometry acquisition
to building a semantic-aware model, where the necessary intermediate step is
shape understanding, the process of making explicit the knowledge that is implicitly
hidden in the geometry, through feature identification, structuring and classification.
Acquiring geometric models can be simple, or extremely complex. In the simplest setting, a
laser scanner operating in good conditions so as to scan a clay mock-up of a car will return a
point set. If the point set is noise free and the clay mock-up relatively smooth, the geometry
encoded is stand-alone and the modeling process can start, for example in terms of reverse-
engineering to build NURBS patches from the point set data. In such a setting, the output of
the acquisition process is essentially stand-alone, and the modeling process straightforward.
Conversely, most of the time the acquired data are raw in the sense of being sparse,
irregularly sampled, and riddled with uncertainty, so that they require heavy processing and a
relevant amount of manual intervention; the amount of such data is getting overwhelming.
The abundance of geometric data is explained by the recent, considerable advances in the
modeling paradigms, in the acquisition technologies, and in the variety of automatic
conversion methods. Measurement data are acquired with an increasing variety of acquisition
technologies, whose evolution is characterised by a shift from contact to contact-free sensors
and from short to long range sensing, culminating with satellite images.
Medical imaging is another domain where handling raw data from imaging devices is an
intensive pipeline process with numerous steps, especially when heterogeneous data from
multiple imaging technologies have to be merged together. The performance of non-invasive
medical imaging modalities such as Computed Tomography (CT) and Magnetic Resonance
Imaging (MRI) is excellent in terms of speed and resolution, which is a critical requirement in
the medical field. Quantitative and qualitative medical image interpretation is successfully used
for diagnosis and to evaluate new therapies. In addition, the reconstructed 3D anatomical
models (e.g., bones, cartilages, ligaments, muscles, tendons, etc.) from imaging data have
been used in different applications (e.g., endoscopic surgery, computer-aided surgery, surgery
simulators for training, personalised prosthesis design, etc.). However, difficult challenges
remain concerning the standardisation of the acquisition protocol, data heterogeneity, storage
and interoperability, data processing and hardware limitations.
Ultimately, multi-modal images introduce a great amount of information that can be used to
construct medical ontologies and derive symbolic representations. As multi-modality introduces
complementing information, more details about the images can be represented, and thus
translated into symbols. Currently, the most commonly used dual-modal scanner in clinical
practice is the PET-CT. With this scanner, a CT of the subject is acquired to provide the
structural definition, followed by the PET scan that provides functional
information. Together, the two scans make it possible to visualise the functional structure, such as the lesion,
in relation to its anatomical surroundings. Open problems in multi-modal imaging are associated
with hardware registration, such as misalignment caused by patient breathing and movement, and with
software-based registration, which typically relies on the similarity of spatial features between
the images and involves image alignment, image scaling, warping, and
transformations.
Various applications in different fields are based on the acquisition of whole human body
models, e.g. computer graphics animation, body shape analysis, sport, medicine, etc. A
multitude of 3D scanning solutions and technologies (e.g., laser scanning, white light scanning,
photogrammetry and image processing) are proposed, which are characterised by high
resolution (≈ 1 mm) and allow a fast (less than 6 seconds) digitisation of the human body.
Again, and despite great progress in hardware, the proposed software solutions still have
limitations, and obtaining a good-quality model still requires a lot of human intervention during
the acquisition pipeline. When in addition specific movements of the human body need to be
acquired from a real person, motion capture is the appropriate modality to record the human
joint kinematics. The skeletal motion is recorded by infrared cameras that track the
trajectories of skin markers attached to different portions of the human body (e.g., pelvis,
thigh, etc.). In kinesiology, it is common to combine motion capture with electromyography
(EMG) and force plates for assessing musculoskeletal dynamics. Motion capture is used in
many domains such as sports, gaming and medical applications. However, this recording
system also presents some limitations: for instance, intensive post-processing (filtering, noise
removal, skin artefact correction, etc.) is needed to evaluate the true joint kinematics from
Basically, when classifying an object or parts of an object, one faces the recognition
dilemma: one can only recognise things that are known a priori, and to know a thing means to be
able to recognise it. The same paradigm applies to 2D data, where the so-called sensory gap
further complicates the problem due to the presence of occlusions in the images.
To be able to classify object parts, the object has to be segmented, that is, decomposed
into its main constituent parts. With a slightly different meaning, the term segmentation is
used also for detecting objects of interest in a raw data set, typically an image. In both cases
segmentation is fundamental for the extraction of knowledge from the data available.
In medicine segmentation definitely plays a key role in the construction of semantically
meaningful models. Image segmentation is the process where regions of interest (ROI) are
extracted from the image data based on domain-specific knowledge about the images. For
example, tumours in MRI can be automatically identified based on the characteristic intensity and
texture of the pixels belonging to these tumours. There is extensive literature
on medical image segmentation; see (Charbonnier et al. 2009).
However, with the rapid increase in image resolution and quality, the automatic
derivation of meaningful features increasingly needs to be assisted and improved. Although
there are signs that segmentation automation is improving, it is still a significant challenge
to produce reliable and robust results, and with the arrival of multi-modal imaging, even more
sophisticated segmentation algorithms need to be developed.
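As a minimal sketch of the intensity-based extraction of regions of interest described above, the following Python fragment thresholds a 2D intensity grid and groups the bright pixels into connected regions. The function name extract_rois, the threshold value and the toy image are illustrative assumptions, not taken from any clinical system; real tumour segmentation would also use texture and domain knowledge.

```python
from collections import deque

def extract_rois(image, threshold):
    """Return 4-connected regions of pixels whose intensity is >= threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill collects one connected bright region.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

# Hypothetical 4x5 intensity slice; the values are illustrative only.
image_slice = [
    [10, 12, 11, 10, 9],
    [11, 90, 95, 10, 9],
    [10, 92, 11, 10, 80],
    [10, 11, 10, 10, 85],
]
rois = extract_rois(image_slice, threshold=50)
```

Each returned region is a list of pixel coordinates, so a later step could attach a label or a measurement to it.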
Object segmentation is also a very complex problem. In this case the shape of the object is
known, but not the meaning and location of its relevant parts, or features. In the current state of the
art there are many segmentation algorithms tailored for different purposes, and most of
them work on the basis of geometric features such as curvature, attempting to mimic high-
level semantic conditions via low-level geometric properties. Only recently has the research
community started to explore semantically rooted segmentation algorithms, e.g. based on
knowledge about the object’s class, or object taxonomies. Another approach to object
segmentation is to automatically propagate the segmentation of one manually annotated
model to other models in the same class, via partial matching. If we are given tools to
automatically detect partial similarities between the object and a set of object parts (a database),
then the annotation of one object can be transferred to all the others in the class. Again, this is
a developing research field, and no generic solutions are available yet.
There is a whole set of objects that can be effectively characterised by a skeletal structure,
such as animals and articulated shapes. Configurations of tubular parts can be represented at
an abstract level by a centreline skeleton, i.e., a graph whose nodes correspond to the joints
and whose arcs correspond to parts of the shape. Skeletons are very useful to describe a wide
range of shapes and are frequently used in application domains such as Gaming and
Simulation to code information about the motion of the shape.
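The centreline skeleton described above can be sketched as a small graph structure in which nodes are joints and arcs are tubular parts. The class name Skeleton, its methods and the arm example are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, field

@dataclass
class Skeleton:
    """Centreline skeleton: nodes are joints, arcs are tubular parts."""
    joints: set = field(default_factory=set)
    arcs: dict = field(default_factory=dict)  # part name -> (joint_a, joint_b)

    def add_part(self, name, joint_a, joint_b):
        # Register a tubular part and the two joints it connects.
        self.joints.update((joint_a, joint_b))
        self.arcs[name] = (joint_a, joint_b)

    def parts_at(self, joint):
        """Names of the parts that meet at the given joint, in sorted order."""
        return sorted(name for name, ends in self.arcs.items() if joint in ends)

# Hypothetical articulated shape: a simplified arm.
arm = Skeleton()
arm.add_part("upper_arm", "shoulder", "elbow")
arm.add_part("forearm", "elbow", "wrist")
arm.add_part("hand", "wrist", "fingertips")
```

Querying arm.parts_at("elbow") returns the parts meeting at that joint, which is the kind of lookup a motion rule attached to the skeleton would need.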
Another developing idea to derive a symbolic representation of raw data is to fit parametric
models to the scanned geometry. All the semantics embedded in the parametric model, and
expressed by rules and constraints, can be inherited by the scanned model, which is – after
the fitting process – represented by the re-parameterised 3D reference model. Generative and
procedural modeling approaches are especially well suited to such a process. Finally,
these models become editable through their embedded parameters, which have been used to
perform the fitting process. Preliminary research is being carried out in this field, but the idea is
still largely unexploited. Approaches to cope with the permutative parametric complexity are
In the structural biology field the construction of symbolic representations deserves a
discussion of its own starting from the acquisition phase, where data about biological signals
result from a complex experimental process, the interpretation of those data being heavily
reliant on a priori knowledge.
Biological functions are accounted for by biological complexes, and over the past five years,
the structural proteomics project has given access to the structure of biological complexes at an
unprecedented pace. The complexes solved are stored in the Protein Data Bank, and provide a
unique opportunity to learn the key determinants of the stability and the specificity of
biological interactions. Molecules interact by forming a complex, which is known as the docking
process; understanding and predicting docking is still recognised as the major open problem in
While early docking models, such as the lock-and-key model proposed in 1894 by Emil Fischer, the second Nobel
Prize winner in chemistry, stipulated that macromolecules assemble rigidly,
Koshland suggested in 1958 that complex formation is often
accompanied by an induced fit (deformation) phenomenon. Indeed, proteins are intrinsically
flexible, i.e. they continuously undergo conformational changes over time, or equivalently, exist at a
given time as an ensemble of conformations in equilibrium. This key property makes the life of
structural biologists especially difficult. On the one hand, experimentalists working in
crystallography often face situations where a structure cannot be resolved because the
flexibility impairs the biophysical signal in the X-ray diffraction spectrum. On the other hand,
modellers investigating complexes involving partners undergoing large conformational changes
are in general unable to predict satisfactory conformations of the complex, a fact well
established by the Critical Assessment of PRedicted Interactions community-wide experiment
Besides taking properly into account molecule flexibility, two main questions are posed.
First, given two isolated proteins, one would like to know whether they interact, and if so,
whether this interaction is specific and stable. Second, given two proteins known to interact,
one would like to optimise their interaction, for example in the context of drug design.
Answering these questions is a non-trivial task. So far, they have been approached from
different perspectives, both theoretical and experimental. Bridging the gap between insights
gained in the two realms will require a subtle mix of biophysics, geometric modeling, and
knowledge management. The first two disciplines are naturally coupled since the
conformation of molecules is the fingerprint of forces acting on the atoms. As for knowledge
technologies, they provide the cement to annotate subtle and diverse properties of proteins
and complexes, thus underpinning the modelling of interactions.
In the following, we are going to describe some of the open issues that have been indicated
as particularly relevant to the steps of data acquisition, shape understanding and semantics-
aware representations, and that will contribute to the achievement of the global challenge of
an automatic derivation of symbolic and semantic descriptions of 3D models. These sub-
challenges are presented by a title, which synthesises the issue, and a short description.
Challenge: From offline to semantically annotated and nearly real-time
interpretation in 3D data acquisition
The recent advances in the computational performance of multi-core and many-core processors
have the potential to provide the data acquisition operator with feedback on the completeness
of the data and preliminary suggestions for interpretation, segmentation and semantic
classification of the data acquired. Such feedback will enable the operator to
augment or correct the proposed semantic classification, and to complement the 3D
data with additional scans while the acquisition session is still in progress. As the data acquisition
operator knows the context in which (s)he operates, this also opens the possibility for the
operator to provide information on the context of the data acquisition, thus enabling a
near real-time data interpretation that uses context-oriented rule sets for segmentation and
interpretation. Examples of relevant contexts are: a traditional building (mostly horizontal or
vertical planar surfaces, most often 90° angles between planar surfaces), a process plant
(combination of planar surfaces and piping structures), a cultural artefact (sculptured shape
degraded over the centuries), a cityscape (dominated by buildings and human-made objects
and structures), nature without human footprint (sculptured and fractal shapes), nature
influenced by humans (landscape with houses and human-made objects), exterior of car (Class
Challenge: Managing the abundance, heterogeneity and uncertainty of data
The recent advances in acquisition technologies have made a massive amount of data
available about the geometry of objects; unfortunately in the majority of cases, the acquired
data are raw in the sense of coming from different devices, being irregularly sampled, and
riddled with uncertainty (including noise and outliers). Therefore, despite the expectation that
technological advances should improve quality, geometric data sets are increasingly unfit for direct
processing. In addition, geometric data are increasingly heterogeneous due to the combination
of several distinct acquisition systems, each of them acquiring a certain type of feature and
level of detail. Furthermore, we are observing a trend brought on by the speed of technological
progress: while many practitioners use high-end acquisition systems, an increasing number of
them turn to consumer-level acquisition devices such as digital cameras, willing to replace an
accurate but expensive acquisition by a series of low-cost acquisitions. Overall robustness to
where the images are considered to be already registered since they were acquired from a
single scanner sequentially (Rosset 2006), as exemplified in the picture below. Although such
hardware methods are producing good results, there continue to be problems in multi-modal
image analysis. Firstly, there will always be the unavoidable issue of patient movement,
such as from breathing and the beating heart. Secondly, only a limited range of multi-
modal scanners is clinically utilised (PET/CT, SPECT/CT, US/PET, etc.), with new
scanners only starting to be made available in clinical practice. For example, PET/CT has been
around since mid-2000, while the first PET/MR will be put into clinical service in 2010.
Fortunately, a stream of new multi-modal devices appears set to enter the market
within the next few years. Thirdly, hardware solutions cannot register more than two images,
and thus do not support truly multi-modal registration, which often requires three or more image
data sets to be fused. Currently (as of this road map), there are no hardware solutions that support
more than two modalities. Nevertheless, we expect such systems to exist in the future.
A tremendous effort has been put into solving the first problem in particular, by
developing software solutions that use the hardware data; however, a lot of additional work
remains to be done. Software-based registration typically relies on the similarity of spatial
features between the images, and involves image alignment, image scaling, warping,
and transformations. Software-based approaches have the advantage that, with manual
intervention, almost any image modalities can be registered together, e.g. dynamic MRI/US and
SPECT/MRI, thus addressing the second and third problems. Future work must lead towards
further automation and improved accuracy in both the software and hardware registration
solutions. With the hardware developers working in conjunction with the software teams, it is
feasible for a breakthrough solution to be devised.
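To make the notion of similarity-driven software registration concrete, the sketch below aligns two small 2D images by exhaustively testing integer translations and keeping the one with the smallest sum of squared intensity differences. It is a deliberately reduced illustration: it handles translation only (no scaling or warping), assumes both images share the same intensity scale and a zero background outside the field of view, and all function names and pixel values are hypothetical. Multi-modal data would normally call for a different similarity measure.

```python
def sum_sq_diff(fixed, moving, dy, dx, background=0):
    """Sum of squared differences between `fixed` and `moving` shifted by (dy, dx).

    Pixels of the shifted image that fall outside the grid are assumed
    to have the background intensity.
    """
    rows, cols = len(fixed), len(fixed[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            my, mx = y - dy, x - dx
            inside = 0 <= my < rows and 0 <= mx < cols
            value = moving[my][mx] if inside else background
            total += (fixed[y][x] - value) ** 2
    return total

def register_translation(fixed, moving, max_shift=2):
    """Return the integer shift (dy, dx) that best aligns `moving` onto `fixed`."""
    shifts = [(dy, dx)
              for dy in range(-max_shift, max_shift + 1)
              for dx in range(-max_shift, max_shift + 1)]
    return min(shifts, key=lambda s: sum_sq_diff(fixed, moving, s[0], s[1]))

# Hypothetical images: the same 2x2 structure at two different positions.
fixed = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 9, 7],
    [0, 0, 0, 5, 3],
    [0, 0, 0, 0, 0],
]
moving = [
    [0, 0, 0, 0, 0],
    [0, 9, 7, 0, 0],
    [0, 5, 3, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
shift = register_translation(fixed, moving)
```

The exhaustive search is only practical for tiny search ranges; it is used here because it keeps the link between the similarity measure and the chosen alignment explicit.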
Ultimately, multi-modal images, either through software or hardware, introduce a great
amount of information that can be used to construct a medical ontology and derive symbolic
representations. As multi-modality introduces complementing information, more details about
the images can be represented, and thus translated into symbols. Currently, the most
commonly utilised dual-modal scanner in clinical practice is the PET-CT. With this scanner, a
CT of the subject is acquired to provide the structural definition, followed by the
PET scan that provides functional information. Together, the two scans make it possible to visualise the
functional structure, such as the lesion, in relation to its anatomical surroundings, as in the
example below from (Nestlea 2006).
The challenge contributes to the acquisition of highly complex phenomena that can be
captured with different measuring devices, and occur at specific locations in the human body.
The registration of the measures and their fusion is essential for the creation of a patient-
specific model, and for the derivation of symbolic representations of morphological and
FDG-PET-CT fusion images of patient with [18F]-FDG-negative atelectasis. Pink structure:
GTV derived from PET (source-background algorithm). Red contour: GTV derived from CT.
Image from (Nestlea 2006) used by permission.
The first problem is that of protein flexibility. If the protein processed has one or more
flexible regions, the signal associated with these regions will be low in the diffraction spectrum.
In favourable cases, the biophysicist may be able to reconstruct a set of plausible
conformations for a flexible loop, that is, a mixture of loop conformations; in the most
difficult cases, a blank may have to be left in the reconstruction, meaning that the
reconstruction of this particular region has been impossible. From an algorithmic standpoint,
reconstructing a flexible loop is a challenging task as soon as the number of residues involved
is larger than 10. This requires (i) solving an inverse problem, namely generating candidate
conformations of the loop given fixed anchors at the extremities, and (ii) choosing the
conformation achieving the best correlation with the experimental data. Knowing that a
particular region is highly flexible is of course key in modeling, in particular for the docking
The second problem is related to the biological significance of what is observed in the
crystal. The output of the reconstruction process is the so-called asymmetric unit of the crystal,
namely the smallest structural unit which, when operated upon by the symmetry elements of
the space group, yields the total crystal structure. On the other hand, the biologist is
interested in the biological unit, which is the unit responsible for a particular biological function.
Establishing the relationship between the asymmetric unit and the biological unit is a non-
trivial task. In a crystal structure, contacts between polypeptide chains are forced, so that
most of the time, the contacts observed do not correspond to protein complexes associated
with biological reactions. From an algorithmic perspective, distinguishing biological from
crystal contacts requires characterizing the packing properties of the atoms in contact, in
conjunction with the energetic functions assessing the stability of the crystal structure. This
problem is still open in its full generality. From the knowledge standpoint, the information
gained from it is crucial. For example, knowing that a protein is a hetero-dimer is a
prerequisite to using this protein in any modeling or simulation process.
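The relation between the asymmetric unit and the full crystal can be illustrated with a short sketch that applies symmetry operations, each a rotation matrix plus a translation in fractional coordinates, to the atoms of the asymmetric unit. The two operations used are those of space group P2_1 (identity and a two-fold screw axis along b); the atom coordinates and the function names are hypothetical, and the sketch says nothing about which of the generated contacts are biologically relevant.

```python
def apply_op(op, point):
    """Apply one symmetry operation (rotation R, translation t) to a point: R*p + t."""
    rot, trans = op
    return tuple(
        sum(rot[i][j] * point[j] for j in range(3)) + trans[i]
        for i in range(3)
    )

def expand(asymmetric_unit, operations):
    """Generate one copy of the asymmetric unit per symmetry operation."""
    return [[apply_op(op, atom) for atom in asymmetric_unit] for op in operations]

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
# Two-fold screw axis along b: (x, y, z) -> (-x, y + 1/2, -z)
TWO_FOLD_SCREW = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))

operations = [
    (IDENTITY, (0, 0, 0)),
    (TWO_FOLD_SCREW, (0, 0.5, 0)),
]

# Hypothetical fractional coordinates of two atoms in the asymmetric unit.
asymmetric_unit = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.25)]
copies = expand(asymmetric_unit, operations)
```

Deciding which of the resulting copies form the biological unit is the open problem discussed above; the code only generates the candidates.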
Challenge: Coupling theoretical and experimental approaches to unravel the
structure of protein interfaces
Open issues about protein interactions have been approached from different perspectives,
both theoretical and experimental. In the first realm, models and algorithms have been
designed to investigate the geometric structure of interfaces, in conjunction with properties
such as the phylogenetic conservation of amino acids and their polarity, or the dynamics of
solvent molecules squeezed in-between the partners. On the experimental side, techniques
such as isothermal titration calorimetry or surface plasmon resonance allow a quantitative
assessment of the binding affinity of an interaction, while directed mutagenesis provides a
direct assessment of the importance of amino acids. Yet, the fine structure of macromolecular
interactions still has to be unravelled.
Bridging the gap between insights gained in the two realms will require a subtle mix of
biophysics, geometric modeling, and knowledge management. It is a common opinion that
knowledge technologies can provide the means to encode and correlate these subtle and
diverse properties of proteins and complexes into a unique symbolic source, thus underpinning
the modeling of interactions. In fact, one would ideally like to develop high-level descriptions
of interfaces, providing unified and symbolic representations of key biophysical, phylogenetical,
and geometric properties. Such representations would ease the manipulation of proteins to
model large networks of interacting proteins.
A protein-protein interface, as a function of the depth of interfacial atoms, from (Bouvier et al.
2009). Only one of the two proteins forming the complex is shown, the bottom one. (a)
Voronoi diagram restricted to the interface between the two proteins. The colour shade, from
cold (dark blue) to hot (red) colours, measures the depth of a Voronoi polygon at the interface.
(b) The depth mapped onto the atoms found at the interface. The depth map of the interface
provides a unified way to apprehend correlations between a number of geometric and
biophysical parameters. It paves the way to the development of abstract interface models that
will be amenable to annotations and symbolic processing.
Challenge: Modeling molecule flexibility
Flexibility is actually a multi-scale phenomenon involving correlated motions of atoms
interacting both covalently and non-covalently (atoms that share or do not share covalent
bonds) in dense surroundings, so that the heart of the problem consists of disentangling these
couplings. From the computer science perspective, this process is reminiscent of
dimensionality reduction (finding couplings), but also rigidity analysis (finding the hinges of a
molecule). On the other hand, abstract mathematical models encoding the flexibility of
molecules are not easily accessible to people designing experiments, for example for the
optimisation of a protein complex. Ideally, one would like to have a classification of proteins
based on the flexibility properties of their domains or secondary structure elements, depending
on certain additional parameters (such as the biological environment). Deriving a
symbolic representation based on flexibility would clearly provide a strategic advantage when
docking molecules, and calls for further work on the problem of protein shape analysis and
annotation (Guharoy et al. 2010).
Challenge: Automatic semantic-driven segmentation of 3D objects
Many automatic segmentation methods already exist in the literature, each suited to
identifying different kinds of features on specific shape classes, but none of them is recognised
to work better than the others in the general case. Since it seems unlikely that such a general
segmentation method will ever be conceived, there is a trend towards using intelligent combinations of
different methods, each optimising a different segmentation criterion; the challenge here is
how to automatically select the proper segmentation methods and/or how to
combine the output of a set of decomposition algorithms in order to return a
meaningful segmentation of a given object according to its class, its
meaning/function, or to the particular application domain. This may require a
formalisation of the correspondence between shape categories, shape properties that
characterise a shape in a specific category, and segmentation methods able to identify such
properties. Knowledge technologies are a promising way to perform this task. We believe that
the problem should be addressed by a tighter coupling of geometry processing methods with
cognitive science, so that the properties characterizing the shapes can be better aligned with
features that are used by humans to “understand and classify” shapes. Also, machine learning
or similar statistical methods are expected to play a key role to cope with the complexity and
variability of 3D object shapes, and as a tool supporting the automatic segmentation of
Challenge: Derive expressive symbolic descriptions of 3D shapes
Provided that meaningful segmentation methods become available which, singly or
combined in some smart way, provide all the important features and components of a shape,
the challenge is how to organise them symbolically by encoding their metric
and semantic properties, relevance, and level of detail in a hierarchy of symbols that
fully represents the semantics of the object and its parts.
Much like in linguistic morphology, where words and lexical groups are seen as interfaces
between syntax and semantics and are organised in a semantic skeleton (Rochelle 2004), 3D
shape features and components should be hierarchically arranged without losing the link with
the source geometry of each entry. An example may be the shape graph proposed in (Mortara
et al. 2004) where the skeleton is the adjacency graph of segments given by a multi-scale
decomposition into tubular features. Each segment retains geometric and topological attributes,
but the only semantic knowledge encoded so far is whether a segment is a tube or not. The
semantic skeleton envisaged by this challenge, however, should be expressive enough to make a
3D shape completely machine-understandable.
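One possible reading of such a hierarchy of symbols is a tree in which every node carries a semantic label, metric attributes and a link back to the source geometry. The sketch below assumes that the geometry link is a list of mesh face indices and that depth in the tree stands for level of detail; the class SymbolNode, its fields and the example labels are hypothetical and are not the shape graph of (Mortara et al. 2004).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class SymbolNode:
    label: str                      # semantic label of the part
    face_ids: List[int]             # link to the source geometry (mesh face indices)
    attributes: Dict[str, float] = field(default_factory=dict)   # metric properties
    children: List["SymbolNode"] = field(default_factory=list)   # finer-level parts

    def find(self, label: str) -> Optional["SymbolNode"]:
        """Depth-first search for the first node carrying the given label."""
        if self.label == label:
            return self
        for child in self.children:
            hit = child.find(label)
            if hit is not None:
                return hit
        return None

    def labels_at_depth(self, depth: int) -> List[str]:
        """Labels visible at a given level of detail (0 = whole object)."""
        if depth == 0:
            return [self.label]
        return [name for child in self.children
                for name in child.labels_at_depth(depth - 1)]

# Hypothetical annotated shape; face indices and measures are illustrative.
hand = SymbolNode("hand", [40, 41, 42], {"length_cm": 18.0})
arm = SymbolNode("arm", [30, 31, 40, 41, 42], {"length_cm": 70.0}, [hand])
torso = SymbolNode("torso", [0, 1, 2], {"height_cm": 60.0})
body = SymbolNode("body", list(range(43)), children=[torso, arm])
```

Because each node keeps its face indices, a query on a symbol such as "hand" can always be traced back to the geometry it annotates.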
4.1.1 Time line and dependence diagram
In this section, a diagram showing the time lines of the proposed issues and the
dependencies between the issues connected with the grand challenge Derive symbolic
representations is presented.
The dark boxes cluster families of challenges that, in our opinion, are related to each other:
we gave each cluster a name, i.e. DSR1, DSR2, DSR3 (Derive Symbolic Representation 1, 2,
3). The dependences are marked with an arrow from A to B, which means "know-how from A
will support the development of B". The boxes are put in line with a time, which represents the
time when we estimate the challenge will be achieved.
The names of the challenges in the boxes do not coincide exactly with the ones used in the
4.2 Goal-oriented 3D model synthesising
When creating a 3D model, the practitioner most often has an end goal, i.e., an application,
in mind. For the computational engineer, the model must fulfil a number of functions while
matching a set of constraints related to its size or manufacturing cost. For the medical
practitioner an organ must function well after surgery or drug treatment. In gaming and
simulation, the model must interact well while providing a faithful or artistic depiction. For
these reasons the computerised representation of the model is conceived with a substantial
amount of knowledge, which, among many application-dependent principles, relates the
shape and its function. While this is currently satisfactory for a number of applications, the
knowledge is often incorporated into the process only through the practitioner. In addition to
the computerised representation of the object, an ambitious challenge consists of also
formalising the knowledge in a computerised manner such that the whole conception cycle
(which comprises modelling and simulation) ultimately becomes solely driven by the targeted
end goal. Furthermore, when acquisition is used instead of modelling, as in medicine or
reverse engineering applications, another challenge consists of devising an integrated
process, again driven by the end goal (e.g., machining and quality control, surgery),
with a feedback loop that relates acquisition to simulation.
In this section, we draw a research road map geared toward goal-oriented model
synthesising. To this end, we discuss the problems and algorithms involved in devising
models specifically tuned to the targeted applications, and the organisation of storage and
retrieval for these models.
One enduring challenge is the problem of synthesising a shape to fulfil a certain function.
Some examples of simulation-driven modelling exist for instance in the field of computational
fluid dynamics simulations, where an optimal shape is generated through repeated
simulations. Nevertheless, no general and knowledge-based approach exists to achieve a well-
principled function-oriented shape synthesis. A first step in this direction is to relate shapes to
functions and vice versa, and to apply rule-based shape modeling. In mechanical engineering,
the notions of construction methodology and of principle solutions for achieving certain effects
(“Wirkprinzipien”, physical principles) already exist. Transforming these well-documented principle
solutions into a computer-interpretable knowledge base can help to re-use these proven
solutions. The first difficulty stems from the fact that the term function can have a
variety of meanings in the different domains and even within a single domain. In engineering,
function may cover, e.g., mechanical, dynamic, hydraulic, and electronic behaviour.
Furthermore the design process must consider all aspects and phases of the product life cycle
such as tool making, production, use, environmental compliance and disposal. The second
major difficulty stems from the fact that simulation-driven modelling requires looping between
simulation and modelling. This requires many traversals of the geometry processing pipeline,
which is notoriously difficult to streamline.
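The loop between modelling and simulation can be sketched on a toy problem: find the radius of a closed cylinder of fixed volume that minimises its surface area. Here build_model stands for the modelling step and simulate is a stand-in for an expensive simulation; both names, the volume constraint and the use of golden-section search are assumptions made for illustration, and the method only works because this toy objective has a single minimum.

```python
import math

VOLUME = 1.0  # hypothetical design constraint

def build_model(radius):
    """Modelling step: derive the full shape from one design parameter."""
    height = VOLUME / (math.pi * radius ** 2)
    return {"radius": radius, "height": height}

def simulate(model):
    """Stand-in for a costly simulation: surface area of the closed cylinder."""
    r, h = model["radius"], model["height"]
    return 2 * math.pi * r * (r + h)

def optimise(lo, hi, tol=1e-6):
    """Golden-section search; every iteration rebuilds the model and re-simulates it."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if simulate(build_model(c)) < simulate(build_model(d)):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return build_model((a + b) / 2)

best = optimise(0.1, 2.0)
```

Even in this toy setting each iteration traverses the whole build-and-simulate chain, which hints at why streamlining the real geometry processing pipeline matters so much.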
Overall the problem of synthesizing “functional” shapes is very complex and automatisms
need to be studied carefully depending on the context. Such contexts need to be
conceptualised to build up knowledge bases. Semantics in this context must be interpreted as
being functional behaviour and requires simulating the functions modelled. As of today, no
single modelling language covers all the different kinds of functional modeling, and no single
simulator can simulate all kinds of models. All past attempts to find such a modeling language or to
develop such a simulator have failed. Careful studies need to approach the problem
field on a case-by-case basis, advancing step by step. In some areas, e.g., in simulation,
automatisms have to be interwoven with the process of finding the best representations and
the simulation process itself. In some cases it can be decided only during the simulation
process where more detail is needed to create a reliable simulation result. For all these reasons
the function should ideally be abstracted in the sense that its description must be independent
of the 3D representation. Finally, we need to conceptualise application and experience
knowledge to bring the abstract concepts of shape and function into context and apply
semantically aware algorithms to practical problems.
In this context, trying to come up with one single magic solution is likely to fail.
Nevertheless, semantically-based automatisms can yield large productivity gains in
industry by giving engineers access to proven principles and by relieving them of repetitive
manual modelling and processing stages. The potential benefits are multiple and affect almost
all areas where application working groups have been active in FOCUS K3D: i) virtual products
or anatomical models could be defined faster, transformed for use for different
purposes and re-used in a variety of evolving contexts; ii) virtual models for games could be
derived from master models and automatically adapted to various hardware environments; iii)
digital cultural heritage models could be more easily re-purposed for different applications such
as research or virtual exhibitions.
For goal-oriented model synthesis, a further challenge is to provide practitioners
with function-centric semantic facets and descriptors for browsing and retrieval from
massive datasets. One example of a targeted application scenario is the conception of a
mechatronic product where the engineer would reuse existing knowledge through querying for
a model which fulfils a set of specified functions. Part of the problem is to obtain a computer-
interpretable knowledge base together with instances of reusable models. Furthermore, in the
context of simulation-driven modelling another important aspect of the problem is how to
formalise the knowledge about the results of the simulations themselves, which may be
massive data as well. The latter models are in a sense already matured over time while still
being purely digital.
We now list a number of focused challenges that we use to exemplify the proposed research
Challenge: Streamlining the geometry processing pipeline for goal-oriented modeling
The geometry processing pipeline ranges from acquisition to machining, including
registration, reconstruction, repair, simplification, analysis, editing, watermarking, storage,
transmission, searching and browsing. The lack of robustness, generality, and guarantees
in the algorithms makes streamlining the whole processing pipeline nearly
impossible. While each building block along the pipeline is presented as a fully automatic
process in academic papers, the requirements on their inputs and the lack of guarantees on
their outputs prevent these building blocks from working together seamlessly. Practitioners
thus need to deal with an iterative trial-and-error process that is not conducive to efficient
geometry processing.
This lack of automation can have dire consequences for the computational engineer, especially
in the context of goal-oriented modelling and simulation. For instance, an aircraft
manufacturer may need to perform a computational fluid dynamics simulation on a
production-level CAD model of a helicopter, which first has to be converted into a watertight,
intersection-free surface mesh. Unnecessary features (such as interior details) may also need
to be removed from the CAD model before it is converted to a smooth surface mesh
amenable to computation. The overall procedure involving conversion, repair and defeaturing
currently takes weeks for an experienced engineer, while the simulation takes one hour of
parallel computation on a cluster of 2K computers. As this procedure must be repeated for
each major edit of the CAD model, and as it is the wall-clock time of a process that matters
in such industrial applications, it is of crucial importance to reduce its duration from weeks to
hours through automation. The solution would be to develop algorithms that are
highly resilient to imperfect input while providing guarantees on their output.
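As an illustration of the kind of output guarantee mentioned above, the following minimal sketch checks one necessary condition for a watertight triangle mesh: every edge must be shared by exactly two triangles. It is only a sketch under stated assumptions: the mesh is given as a list of vertex-index triples, the function names are ours, and the test says nothing about self-intersections, which would require a separate geometric check.

```python
from collections import Counter


def edge_use_counts(triangles):
    """Count how many triangles use each undirected edge."""
    counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1
    return counts


def is_watertight(triangles):
    """True if the mesh is non-empty and every edge is shared by exactly two triangles."""
    counts = edge_use_counts(triangles)
    return bool(counts) and all(n == 2 for n in counts.values())


def boundary_edges(triangles):
    """Edges used by a single triangle, i.e. the rims of holes to be repaired."""
    return sorted(e for e, n in edge_use_counts(triangles).items() if n == 1)


# A tetrahedron is closed; dropping one face opens a triangular hole.
tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
open_mesh = tetrahedron[:-1]
```

A repair stage could use `boundary_edges` to locate the holes it has to fill before the mesh is handed to the simulation.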
Challenge: Reusing the knowledge in maintenance
Numerical behaviour simulation is fundamental for assessing new solutions in various
engineering activities, as it allows physical validation tests to be avoided. Numerical
simulations are used in various product life cycle phases, from initial design to maintenance
and the analysis of other life cycle problems. Such activities are frequently subject to
constraints that are crucial from the production point of view. In maintenance, for instance,
constraints related to production interruption normally arise that make it crucial to
accelerate part substitution or adjustment.
Today product behaviour analyses are classically carried out in five steps: (1) CAD solution
modelling; (2) mesh model preparation, which also includes some shape adaptation processes to
make the model tractable; (3) FE model preparation, which adds semantic information related to
the behaviour modelling, such as boundary conditions and behaviour laws, and includes creating
groups of mesh elements to support the specification of the simulation semantics; (4) FE
simulation; (5) result analysis and optimisation loops. These optimisation loops normally
require going back to the CAD model (step 1), adjusting the shape accordingly and then
performing all the preparation and simulation phases again. Therefore, the first four steps,
shown in the figure (taken from Lou et al. 2009), are often executed several times in the loop
until the results are satisfactory.
Clear advantages could therefore be obtained if the behaviour-related semantic data
could be maintained during all these modification activities. Moreover, the
possibility of acting directly on the FE mesh model with efficient shape modification tools
able to maintain such semantics would drastically reduce the time needed to evaluate different
alternatives and find the optimal solution.
The development of such tools needs to: 1) provide efficient shape modelling capabilities
directly working on 2D and 3D meshes, possibly having idealised elements; 2) maintain and
update the associated geometric support for the associated behaviour-related semantics
according to the shape changes.
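The second requirement can be made concrete with a small sketch. It assumes, purely for illustration, that the behaviour-related semantics are attached to named groups of mesh element ids and that a shape modification replaces one element by several new ones; the group names and the function `refine_element` are hypothetical.

```python
def refine_element(groups, parent, children):
    """Return updated groups in which `parent` is replaced by `children`.

    `groups` maps a semantic label (e.g. a boundary condition) to a set of
    mesh element ids. Groups that did not contain `parent` are copied unchanged.
    """
    updated = {}
    for name, members in groups.items():
        if parent in members:
            updated[name] = (members - {parent}) | set(children)
        else:
            updated[name] = set(members)
    return updated


# Hypothetical groups carrying the simulation semantics
groups = {"fixed_support": {1, 2}, "pressure_load": {2, 3}}

# Element 2 is split into elements 10 and 11 by a local mesh modification
after = refine_element(groups, parent=2, children=[10, 11])
```

Because the groups follow the elements, the boundary conditions remain valid after the edit and the FE model preparation step does not have to be repeated by hand.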
Challenge: Simulating functions of the human body
The human body is composed of different complex and heterogeneous elements. Modelling this
system remains a big challenge due to the complexity of its geometry, mechanical behaviour and
interactions. Consequently, the study of the human body is divided into different disciplines.
Despite the interdependencies, the models developed in the various domains each focus on one
specific aspect. Creating a complete and accurate human model is still a difficult task.
Therefore, multidisciplinary networks are being created in order to exchange and combine
knowledge from different domains of expertise (e.g., clinicians and engineers), and to develop
applied technologies for medical purposes. Simulating the functions of the human body requires
multimodal data provided by various acquisition modalities. For instance,
the 3D anatomical models (e.g., bones, cartilages, ligaments, muscles, tendons, etc.) are
reconstructed from imaging data using segmentation techniques, while the functional and
kinematical parameters (e.g., motion parameters, material properties, applied loads and forces,
etc.) require several acquisition and post-processing techniques. A semantics-driven medical
application could manage and structure the large amount of acquired data more efficiently
(Charbonnier et al. 2007). As a result, the 3D visualisation, navigation and analysis of the
human body would be improved, with a positive impact on medical procedures
as well as on teaching and training. We believe that in the coming years the growing advances
in hardware technologies will have great impact on the simulation phase. However,
integration with medical devices (e.g. scanners, surgery simulators, etc.) and the modelling
of human physiology will need more time.
Challenge: Models combining geometry and biophysics knowledge for docking of
Simulation using macromolecular models encompasses a huge number of scenarios, and we
focus in the sequel on one of the most important ones, namely docking. Generically, docking
aims at predicting an assembly from the unbound partners. The number of protein structures
known to date is about 100 times smaller than the total number of non-redundant coding
sequences. In fact, the 55,000 structures found in the Protein Data Bank contain a mere 1000
biological complexes or so. In other words, known biological complexes are extremely
scarce. Figuring out such complexes from the unbound partners is the goal of
docking. In practice, a number of variants of docking exist. First, one may distinguish
variants by the number and nature of the partners: binary protein-protein docking;
binary protein-drug docking; docking of multiple component assemblies, etc. Second, one may
distinguish them by the properties of these partners: rigid body docking if they do
not undergo any significant deformation upon binding, flexible docking otherwise.
As a pre-requisite to docking, structural biologists analyze data and identify amino acids as
well as regions on the protein surface that may be involved in the binding process. These
residues and regions often concentrate the bulk of the effort during the docking. But as of
today, the signals conveyed by conservation and mutagenesis data are complex and poorly
understood. In particular, the couplings between residues, both in terms of conservation and
variation of free energy, deserve further investigation. Such investigation will
undoubtedly yield advanced annotations of proteins.
Rigid body docking aims at finding the best relative position of two partners which do not
deform upon binding. This problem greatly benefits from the identification of the regions most
likely to be in contact, since the bulk of the effort can be concentrated there. When the
biological information is not precise enough, the geometric detection of pockets and more
generally surface regions favourable to binding is called for. As evidenced by the CAPRI
community-wide experiment, rigid body docking is largely under control. Yet in many
cases, the molecules deform so as to bind. Such cases are typically handled in several stages.
First, rigid body docking is applied to coarse
representations of the molecules, so as to select approximate bound conformations and
relative positions of the partners. Second, upon switching to atomic models, flexibility is
investigated at a local scale, typically that of side chains. That is, the conformations of the
side chains are explored using so-called rotamer libraries, which list the admissible
conformations of the side chains. Third, the atomic positions are adjusted by running a molecular dynamics
simulation. Finally, for each conformation retained, a score is computed so as to rank all the
putative solutions. For a very flexible region, the question arises of generating families of
conformations, which represent the admissible geometries. For cases where the flexibility is
milder, the question of developing scoring functions able to single out the near native
conformations is a central one. All these developments require a tight coupling between
geometry for the description of the models and biophysics knowledge for the calculation of
4.2.1 Time line and dependence diagram
In this section, a diagram showing the time lines of the proposed issues and the
dependencies between the issues connected with the grand challenge Goal-oriented 3D model
synthesising is presented.
The dark boxes cluster families of challenges that, in our opinion, are related to each other;
we gave each cluster a name, i.e. G3DMS1, G3DMS2, G3DMS3 (Goal-oriented 3D model
synthesising 1, 2, 3). Dependencies are marked with an arrow from A to B, which means
"know-how from A will support the development of B". Each box is aligned with the point on the
time line at which we estimate the challenge will be achieved.
The names of the challenges in the boxes do not coincide exactly with the ones used in the
text, and we added a few more issues in the diagram.
4.3 Documenting the life cycle of 3D objects
Documentation of 3D objects is a continuation of the traditional activities that scholars
and scientists have been pursuing for several centuries. Whereas static 2D forms of
documentation (plans, sections, elevations, reconstructions) created on paper and published in
print have been used for a long time, 3D interactive digital tools are now increasingly
employed. This transformation of the medium of expression and publication has spread rapidly
through a large, well-established field, which has generally embraced the new technologies in
recognition of their obvious superiority to what they have replaced. Ironically, scientists and
practitioners are doing little to conserve their own digital products.
There are a number of research challenges that should be addressed to realise an ideal
cataloguing and documentation of the life cycle of 3D objects. These challenges include, for
instance, coding of the data provenance and version control for 3D models, effective metadata
structures, interoperability, object and part-based annotation.
Central to the documentation is the annotation of a 3D object. By 3D annotation we mean
the process by which a text-based piece of information is linked to the object and
stored for subsequent use. We speak of semantic annotation when the associated text
is meaningful in some context and is used for understanding and storing information
about the object that is not explicit or not contained in the geometric data that define the
3D annotation is a fundamental step in the documentation of 3D objects. Annotation may be
related, for instance, to the process/workflow that was used to create or process the object
and/or its parts. Annotation may also be related to the meaning of an object or an object’s
parts. If the target of the annotation is a part of the object, the part to be associated
with textual data has to be selected with appropriate tools. Selecting regions of interest in the
manual annotation of 2D media is rather simple in terms of the user interface: dragging a
selection box or lasso tool over an image achieves the necessary functionality. The same does
not hold for 3D media, or at least it is complicated by the data’s nature: parts might be out of
reach for mouse interaction, and bounding a part can be rather complex. Therefore, the issue
of smart identification of semantically relevant parts/features arises, either in an interactive or
in an automatic way.
Part-based textual annotations might be very useful also to support 3D content-based
retrieval relying on textual queries instead of, or in addition to, classical geometry-based
techniques. One can easily imagine a process where a new shape is created by searching for,
modifying and composing parts of already existing objects. This will be possible only if existing
shapes have previously been segmented and annotated at the component level. The
editing and assembling of parts may be further supported by the semantics associated with them:
the function or meaning of a feature is likely to give hints about where and how it should be
placed on another object (e.g. where to place a handle on a bowl to shape a teapot). The
annotation related to the meaning or functionality of objects and objects’ parts could also
support advanced mechanisms of interaction among objects in virtual worlds, based not on
built-in behaviours but on reactivity and self-adaptability to the context.
The duration of the life cycle of 3D objects varies a lot between the different application
fields. In the aerospace industry all relevant data needs to be kept for more than 50 years,
yet no 3D data format has survived the last 50 years. Long-term
archiving is thus mainly done by storing blueprints or digital 2D drawings in TIFF. In the area
of Cultural Heritage, digital representations of physical objects are to be provided to the next
couple of generations. In games, 3D objects typically live only a few years during the
production process of a particular game.
To come back to the aerospace example, it is necessary to document not only the status
that represents the airplane as built, but also all the maintenance work done over its 30-year
use phase. Imagine all the data in some archives (in analogue and digital form) and the value
and knowledge this data represents, as well as the effort involved in searching for specific
information in those archives, be it paper folders, TIFF files or product life cycle
In the following, the challenges related to 3D documentation are described.
Challenge: Documenting provenance data
The documentation of provenance, that is the source, origin or history of derivation of a
particular object, is considered a critical requirement in many practical fields and is highly
content- and application-dependent. The reason is that 3D content without the context of what
is being represented is not very useful for users. It is regarded as important to demonstrate
the value of 3D content for contextualizing other information that will benefit i) general users,
as not everybody has the same expertise for reading lengthy descriptions on the physical parts,
history and meaning of each of the parts; and ii) scholars, who will be able to explore new
methods to investigate and discover the objects’ relevance and their context.
Developing a data model, e.g. by creating an ontology or extending an existing one, that
maps the most important concepts of provenance for both physical and born digital 3D objects
is a research challenge. In this context, provenance describes the events that occur during a
digital object’s life cycle including the process of digitizing and documenting the 3D content.
This information is needed to record the provenance of the 3D content (Theodoridou et al.
2008). It should then be possible to record important information such as: Who performed the
digitisation? What were the settings of the hardware? Who post-processed the
3D content? Who interpreted the documentation?
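To make the questions above concrete, the following minimal sketch records provenance as a list of events attached to an object. It is a flat record rather than an ontology, and all class names, field names and example values are illustrative assumptions, not part of any existing provenance model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProvenanceEvent:
    activity: str                     # e.g. "digitisation", "post-processing"
    agent: str                        # who carried out the activity
    settings: Dict[str, str] = field(default_factory=dict)  # e.g. hardware settings


@dataclass
class ProvenanceRecord:
    object_id: str
    events: List[ProvenanceEvent] = field(default_factory=list)

    def add(self, activity: str, agent: str, **settings: str) -> None:
        """Append one event to the history of the object."""
        self.events.append(ProvenanceEvent(activity, agent, dict(settings)))

    def who(self, activity: str) -> List[str]:
        """Answer questions such as 'who performed the digitisation?'."""
        return [e.agent for e in self.events if e.activity == activity]


# Hypothetical example data
record = ProvenanceRecord("statue-001")
record.add("digitisation", "operator A", device="laser scanner", resolution_mm="0.5")
record.add("post-processing", "operator B", step="hole filling")
record.add("interpretation", "scholar C")
```

With such a record, `record.who("digitisation")` returns the agents responsible for the acquisition, and the `settings` of the same event hold the hardware configuration that was used.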
Challenge: Version control for 3D models
Virtual environments are a particularly demanding class of 3D models for managing
versioning. For example, 3D models of archaeological reconstructions often are changed and
evolve over long periods of time as new knowledge comes to light from excavations and
further scholarship. Models themselves may depict changes across large temporal spans, with
multiple versions corresponding to reconstructions at specific points in time. Additionally, the
technological aspects of 3D modelling often require the development and distribution of
multiple versions of models. For example, 3D models constructed for use in fully interactive,
real-time environments are usually of much lower geometric complexity than models used for
static, high-quality renderings.
The methods of version control as applied to software development can theoretically be
adapted to 3D models, but no concerted research effort has yet studied the application of
version control techniques to the development and dissemination of 3D models. Version
control should be a fundamental feature of an effective archive for publishing 3D models.
Future research should investigate the applicability of traditional revision control methods,
and further extend them to be suitable for managing large archives of 3D models. We
anticipate (in the near future) developing solutions for a 3D model version control system with
the following features:
3D Difference Computation and Visualisation. Traditional revision control systems for textual
documents provide functionality for convenient viewing of the differences between two
document versions, similar to the “diff” command available on UNIX systems. A “3D diff”
function that allows rapid 3D visualisation of the difference between two versions of a 3D
model calls for further investigation. This capability could use geometric analysis algorithms
and information visualisation techniques to automatically identify and then visually highlight
the variations among 3D models.
Compression for 3D Models. For purposes of storage efficiency, revision control systems
usually retain only the differences between successive or branched versions of documents;
similar techniques for compression of 3D geometric components should be investigated.
Tracking changes (addition, deletion, and modification) to 3D Models. As with source code
revision systems, each edit should be stored with metadata representing the time of the edit,
the identity of the editor, and comments or annotations describing the nature of the
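The three features listed above can be sketched together in a few lines. The sketch rests on a strong simplifying assumption: a model version is a mapping from stable vertex ids to positions, so that correspondences between versions are known. A real "3D diff" would first have to establish such correspondences geometrically. All names (`diff`, `Revision`, `make_revision`, `apply_revision`) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vertex = Tuple[float, float, float]
Mesh = Dict[int, Vertex]  # vertex id -> position (ids assumed stable across versions)


def diff(old: Mesh, new: Mesh, tol: float = 1e-9):
    """Return (added ids, removed ids, {moved id: displacement})."""
    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)
    moved = {k: math.dist(old[k], new[k]) for k in old
             if k in new and math.dist(old[k], new[k]) > tol}
    return added, removed, moved


@dataclass
class Revision:
    editor: str
    timestamp: str
    comment: str
    changed: Mesh        # only added or moved vertices are stored
    removed: List[int]


def make_revision(old: Mesh, new: Mesh, editor: str, timestamp: str, comment: str) -> Revision:
    """Store only the differences between two versions, plus edit metadata."""
    added, removed, moved = diff(old, new)
    changed = {k: new[k] for k in added + list(moved)}
    return Revision(editor, timestamp, comment, changed, removed)


def apply_revision(old: Mesh, rev: Revision) -> Mesh:
    """Rebuild the newer version from the older one and the stored differences."""
    result = {k: v for k, v in old.items() if k not in rev.removed}
    result.update(rev.changed)
    return result


# Hypothetical example: vertex 1 is raised, vertex 2 deleted, vertex 3 added
v1 = {0: (0.0, 0.0, 0.0), 1: (1.0, 0.0, 0.0), 2: (0.0, 1.0, 0.0)}
v2 = {0: (0.0, 0.0, 0.0), 1: (1.0, 0.0, 2.0), 3: (1.0, 1.0, 0.0)}
rev = make_revision(v1, v2, "editor A", "2010-03-31T12:00:00", "raised vertex 1")
```

The displacement values returned by `diff` could drive a colour-coded visualisation of the changes, while `Revision` keeps the storage cost proportional to the size of the edit. Vertices that moved by less than `tol` are not stored, so the reconstruction is exact only up to that tolerance.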
Challenge: Digital rights management for 3D models
Providing for the digital rights management of 3D models is of paramount importance in the
development of 3D archives, and disseminating 3D content securely is a difficult problem. Most
of the prior research in this area has focused on the a posteriori detection of piracy (e.g.
watermarking), not its a priori prevention.
One a priori approach is ScanView (Koller et al. 2004), which provides the user with
only a very low-resolution version of the 3D model for navigation purposes and delegates
high-quality renderings to a remote secured server that contains the high-resolution model.
Other methods should also be investigated, which may hold better promise for allowing
protected sharing of 3D models. One such method is secure graphics hardware, using a
decryption key that is embedded in the hardware, e.g. in 3D graphics cards. While this
approach is certainly sound from a security point of view, the architectural implications of
adding cryptographic capabilities to graphics hardware have not been studied.
Another interesting possible approach for allowing protected distribution and usage of 3D
models involves the application of encrypted computation techniques for digital rights
management (PaFi2005). If such a scheme could be implemented, the 3D models would be
distributed in an encrypted form, along with a custom viewer that performs the critical
rendering functionality directly on the encrypted representation. The limitations of encrypted
computation are currently an open research problem, and encrypted rendering appears far
from practical at this time.
Defending against reconstruction attacks that draw on computer vision algorithms to
recover the model from a sequence of high-quality renderings is another research challenge.
Whereas the techniques described before are concerned primarily with allowing protected
rendering and display of 3D models, an ideal secure dissemination system would also allow
users a greater degree of geometric analysis of the protected 3D models without further
exposing the data to theft. Database researchers have developed privacy-preserving data
mining techniques that extract useful knowledge from databases without significantly
compromising data security (Verykios et al. 2004). Applying similar methods to the problem of
protecting 3D data could be a feasible solution.
Challenge: Workflow annotation
With the prevalence of distributed computing, there is a growing demand for tracking,
recording and managing data sources and derivations, and provenance-related
requirements have been identified within the scope of many applications. However, there is no
commonly agreed conception of provenance in the context of service-oriented computing, nor are
there any concretely implemented prototypes.
Workflow provenance has been investigated in many contexts using definitions such as
audit trail, lineage, dataset dependence and execution trace. We usually refer to workflow
provenance as the process that led to the data. This process-centred view of provenance is
motivated by the observation that most scientific (and business) activities are
accomplished through a sequence of actions.
Process documentation and preservation of 3D model annotation in workflows is a research
challenge. The overall data flow (internal and external) characterises the process that led to a
result. The key role of semantics becomes evident, not only in the Knowledge Technology and
Knowledge Management sense as a best practice for 3D content documentation and sharing,
but also as a driving factor for the development of new and more effective computational tools
as well as for peer-reviewing and validation of e-Scientific results.
Challenge: Mark-up language to associate 3D geometry to its annotation
There is no clear standard way to link annotations to 3D geometry. Current standards for
expressing geometric data, such as X3D, allow the possibility of describing compound scenes
as assemblies of simpler ones, and describing behaviours and interactions. X3D, however, is
mainly used to code information needed for interactive applications rather than for a complete
3D Semantic environment. Traditional geometric representations could be enhanced with tags,
annotations, and even hyperlinks to other 3D models, making them evolve towards what
Havemann and Fellner call generalised 3D documents (HaFe2007).
The problem of defining a stable 3D markup has to be tackled in its generality. Preliminary
work in this direction has been proposed with the ShapeAnnotator (Attene et al. 2009), which
keeps the geometry and annotation in two distinct files; the annotation produces a set of
instances that, together with the domain ontology, form a knowledge base. Each instance is
related to one part of the model, and it is defined by its URI, its type (the class the feature
belongs to) and other attribute values and relations.
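The separation between geometry and annotation can be illustrated as follows. This is not the actual ShapeAnnotator file format: the field `faces`, the JSON layout, the file name and all URIs are assumptions made for the sake of the example.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class PartInstance:
    uri: str                      # identifier of the instance in the knowledge base
    type: str                     # ontology class the feature belongs to
    faces: list                   # indices into the separate geometry file (assumption)
    attributes: dict = field(default_factory=dict)
    relations: dict = field(default_factory=dict)


# Hypothetical instance describing the handle of a teapot
handle = PartInstance(
    uri="http://example.org/teapot#handle",
    type="Handle",
    faces=[10, 11, 12],
    attributes={"material": "ceramic"},
    relations={"attachedTo": "http://example.org/teapot#body"},
)

# The annotation file only refers to the geometry file; it does not contain it
annotation_file = json.dumps(
    {"geometry": "teapot.ply", "instances": [asdict(handle)]}, indent=2
)

# Reading the annotations back yields the same instances
restored = [PartInstance(**d) for d in json.loads(annotation_file)["instances"]]
```

Keeping the two files distinct means that the same geometry can carry annotations from several domain ontologies, but it also makes the link between them fragile: the face indices are valid only as long as the geometry file is unchanged.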
The ShapeAnnotator
An important point is that annotations, or tags attached to parts of 3D models, should
survive changes in the geometric representation. The object is unique, and so should be its
annotated features, while the representations used to manipulate the object may vary at
different stages of the modelling workflow. Maintaining the annotations consistently is far from
trivial, even if we consider just one representation type, such as triangle meshes.
Think of part-based annotation of a 3D model representing a statue or a complex artefact.
For visualisation, we need to simplify the models; that is, we need to remove a number of
vertices and triangles. What happens to the annotations? How do we keep them consistent
across resolution changes?
The problem gets even more complicated if we think of completely changing the
representation type - for instance, switching from triangle to quadrilateral meshes. The
statue’s shape, together with its relevant features, remains the same, and the annotations
should follow the scale changes accordingly and smoothly.
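One naive way to carry part labels from one representation to another is to give each face of the new mesh the label of the nearest face of the original mesh, measured between face centroids. The sketch below illustrates this idea only; it is not a method proposed in the works cited here, it assumes both meshes are given as lists of faces with explicit 3D points, and it can fail near part boundaries or on thin features.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def transfer_labels(src_faces, src_labels, dst_faces):
    """Label each destination face with the label of the nearest source face centroid."""
    src_centroids = [centroid(f) for f in src_faces]
    result = []
    for face in dst_faces:
        c = centroid(face)
        nearest = min(range(len(src_centroids)),
                      key=lambda i: math.dist(c, src_centroids[i]))
        result.append(src_labels[nearest])
    return result


# Hypothetical annotated triangle mesh: two triangles per part
triangles = [
    [(0, 0, 0), (1, 0, 0), (0, 1, 0)],
    [(1, 0, 0), (1, 1, 0), (0, 1, 0)],
    [(4, 0, 0), (5, 0, 0), (4, 1, 0)],
    [(5, 0, 0), (5, 1, 0), (4, 1, 0)],
]
labels = ["head", "head", "base", "base"]

# The same shape represented with one quadrilateral per part
quads = [
    [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)],
    [(4, 0, 0), (5, 0, 0), (5, 1, 0), (4, 1, 0)],
]
quad_labels = transfer_labels(triangles, labels, quads)
```

The same function applies to a simplified triangle mesh, since it only relies on face centroids and not on the number of vertices per face.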
The Bimba model represented by a triangle mesh, by a simplified triangle mesh, a
quadrilateral mesh and by resampling
Challenge: Massive semantic 3D annotation
More and more 3D models are becoming available. In order to recognise, retrieve, reuse,
interact, and design with 3D models, a semantic annotation of the whole and the parts of the
model is necessary. What we are looking for are 3D models that have more and more
information attached to them or parts of them in the form of annotations presenting
knowledge about the object, a part of the object, its purpose or function. Annotations support
machine-processable knowledge sharing of semantically enriched objects and therefore open
new frontiers to the usage of 3D content in virtual and networked environments.
Only recently, and especially in the CAx domain, have 3D models started to embed certain types
of annotations, and data formats now exist that encode the 3D model together with its annotations.
Those annotations are typically restricted to material information and manufacturing process
information. In the Cultural Heritage field, 3D annotation is perceived as crucial but is not
established yet. It offers Cultural Heritage experts the possibility to link their hypotheses with
digital 3D artefacts, or parts of them.
When massive amounts of 3D objects become available on the Internet,
social tagging becomes the only reasonable way (besides automatisms, which will have
limitations) to handle and enrich these models. Social tagging for 3D objects is also a relatively
new field. Quality of data, tags, motivation schemes for users, etc., are largely unexplored in
that area. For today’s professional use, 3D objects from the Internet are not mature enough,
unless they are provided by companies that ensure their quality. Nevertheless, 3D warehouses
are developing further. Exploring mechanisms to stimulate social tagging for 3D models, along
with developing better tools, forms an interesting research topic.
Imagine for instance the impact that a massive annotation of 3D city models could have.
3D city models produced from Lidar data are nowadays widely available.
To make sense of these data, semantic annotation is needed, but manual annotation of the
datasets is not feasible. Literally millions of models must be semantically
analysed and annotated in order to make it possible to use the models in an intelligent way.
The problems to be solved are to decide what kinds of annotations are desired, and how 3D
shape can be segmented in advance so that the individual parts of the shapes can be used.
This requires both the development of effective segmentation algorithms to identify parts, as
well as the development of a strategy to have those parts semantically annotated at a large
scale (see also the challenge on automatic semantics-driven segmentation).
Many tasks are trivial for humans but continue to challenge even the most sophisticated
computer programs. It has been advocated to constructively channel human brainpower
through a class of computer games called "games with a purpose", or GWAPs, in which people,
as a side effect of playing, perform tasks computers are unable to perform (AhDa2008). The
Entertainment Software Association has reported that more than 200 million hours are spent
each day in the US playing computer and video games. Indeed, by age 21, the average
American has spent more than 10,000 hours playing such games—equivalent to five years of
working a full-time job 40 hours per week. What if this time and energy were also channelled
toward solving computational problems and training AI algorithms? People playing GWAPs
perform basic tasks that cannot be automated. The challenge is to design a massive initiative
that leads to semantically rich annotations of 3D models.
For specific classes of shapes, there are promising approaches that deserve more
investigation. The propagation of annotations is gathering interest within the research
community. The approach is to segment a representative shape manually, annotate it with
metadata, then propagate both the segmentation and the annotation in a consistent
way to all the similar shapes, and finally refine the annotation for each specific shape, if
needed. The problem of consistent segmentation is very recent, and preliminary results work
by propagating segmentations within pre-defined classes of similar shapes (GoFu2009). The
challenge for the future is then how to assign automatically a segmentation (and an annotation)
to an input shape by automatically selecting the proper reference object (already segmented
and annotated); see also the challenge on 3D search.
Propagation of consistent segmentations across models of the same class (GoFu2009)
Challenge: Long-Term preservation of 3D data
One of the most pressing needs is to ensure the survivability of 3D models. More and more
3D models are being created; however, poor archival practices make them lose their original
functionality and information richness after only a few years, or even sooner. A more active
approach should be adopted for the preservation issue and the management of the life cycle of
digital resources, from data creation and management to data use and rights management.
The digital libraries community has made strong progress in technical solutions for
preservation and reliability of access, and we recommend extending and applying these
techniques to 3D archives. Semantic web technologies can effectively be used to facilitate the
integration and interchange of heterogeneous information and to generate metadata.
We have to acknowledge again that end-users’ contributions will be critical for acquiring and
documenting 3D resources, just as they are now for images and videos on the web. Nevertheless, it
is recognised that there are challenges involved in requiring non-experts to work with software
and hardware for 3D graphics, which are not yet commonly used. The main challenges involve
designing processes (software and hardware tools) that are easy to use and can be
followed by users with a minimum amount of training. The final aim is to enable users to
capture and document 3D representations of objects and their metadata.
There is an urgent need for reliable documentation as the key step required for preservation
(preservation through documentation). An example of this trend is the CyArk 3D Cultural
Heritage Archive (www.cyark.org), an open access Internet archive where the stored data
provide a valuable resource for preservation professionals (site managers, archaeologists,
conservators, architects, and engineers). It uses an integrated methodology - 3D High
Definition Documentation (HDD) - that utilises advanced survey and imaging technologies. The
core technology is 3D laser scanning combined with high-resolution photography, while other
technologies may also include GPS, photogrammetry, GIS, and remote sensing.
Challenge: Operational 3D cultural repositories
The rapid spread of virtual heritage has inevitably created some new problems and set forth
some major challenges facing the virtual heritage community. There are a number of technical
research challenges that should be addressed (Koller et al. 2009) to realise an ideal 3D
repository (digital archive). These challenges include digital rights management, version
control for 3D models, effective metadata structures for publication, updating and long-term
preservation of 3D models, interoperability, and 3D searching.
Other critical functions associated with digital archives of 3D models that require further
research are indexing and searching. An active area of research that is directly relevant is the
development of shape-based retrieval methods that measure shape similarity between 3D
models. Rich textual annotation of 3D models can often provide metadata for text matching
and retrieval, but the value of semantic searching that can combine both text and 3D shape
similarity must be further investigated.
Also, further research into algorithms for indexing and searching on inherent model
attributes that do not depend on human annotation should be considered.
Techniques for organising associated metadata and its effective presentation to users should
be investigated as well as methods for managing and disseminating 3D models. One
particularly interesting challenge is to provide methods for allowing interactive exploration of
relevant metadata displayed in corresponding locations in the 3D model or virtual environment.
For example, ancient cities may contain hundreds of individual buildings, each of which can
have distinct associated metadata. Thus the metadata will need to be spatially referenced to
the underlying 3D model, and the 3D exploration interface linked to an appropriate display
methodology that combines presentation of 3D, 2D images and/or textual metadata elements.
Current metadata structures are not likely to be appropriate for organizing 3D metadata, and
thus we suggest research into alternative models.
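One possible reading of "spatially referenced" metadata is sketched below: each record is attached to a region of the model, and the exploration interface asks which records apply to a picked 3D point. The class `SpatialRecord`, the axis-aligned box regions and the example buildings are all assumptions made for illustration; an actual repository would need richer regions and a spatial index.

```python
from dataclasses import dataclass

@dataclass
class SpatialRecord:
    title: str
    box_min: tuple  # (x, y, z) lower corner of the region, in model coordinates
    box_max: tuple  # (x, y, z) upper corner of the region, in model coordinates
    metadata: dict

    def contains(self, point):
        """True if the point lies inside the axis-aligned region of this record."""
        return all(lo <= c <= hi for lo, c, hi in zip(self.box_min, point, self.box_max))

def records_at(point, records):
    """Return every metadata record whose region contains the picked 3D point."""
    return [r for r in records if r.contains(point)]

# Hypothetical city model: a district containing two buildings.
city = [
    SpatialRecord("Forum", (0, 0, 0), (100, 80, 30), {"type": "district"}),
    SpatialRecord("Temple", (10, 10, 0), (30, 25, 20),
                  {"type": "building", "images": ["facade.jpg"]}),
    SpatialRecord("Basilica", (50, 10, 0), (90, 40, 25), {"type": "building"}),
]
hits = records_at((12.0, 15.0, 3.0), city)
print([r.title for r in hits])  # ['Forum', 'Temple']
```

Returning all enclosing regions, rather than only the smallest one, lets the interface show metadata at several levels of detail for the same location.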
Challenge: Complete and detailed documentation of the anatomical model of the human body
In medicine, validating results is problematic because no complete, detailed
anatomical model exists.
In order to validate results obtained with 3D models and simulation,
the availability of a large set of reference digital body models would be
highly beneficial. The Visible Human project already pointed to this
problem more than 10 years ago (Spitzer et al. 1996, ImMo2005).
Indeed, creating standards and norms about the organs of the human
body will help to detect malformations, diseases, and injuries.
Moreover, on one hand, having a 3D digital representation model of
the human body accurate enough at different levels will lead to finding
the “best diagnostic” to apply. On the other hand, such a model will
also improve the communication between computer scientists and
doctors because they will have a common representation with
semantics to treat the patient.
The process needed to create these reference models is, however,
not trivial at all, as a significant amount of data from individuals in
different conditions and with different characteristics is required. Also,
non-invasive techniques for acquiring the data should be used.
The Visible Human project
4.3.1 Time line and dependence diagram
In this section, a diagram showing the time lines of the proposed issues and the
dependencies between the issues connected with the grand challenge Documenting the 3D
lifecycle is presented.
The dark boxes cluster families of challenges that, in our opinion, are related to each other:
we gave each cluster a name, i.e. D3DL1 and D3DL2 (Documenting the 3D lifecycle 1 and 2).
The dependences are marked with an arrow from A to B, which means "know-how from A will
support the development of B". The boxes are put in line with a time, which represents the
time when we estimate the challenge will be achieved.
The names of the challenges in the boxes do not coincide exactly with the ones used in the
4.4 Semantic interaction and visualisation
The most natural and traditional manner to interact with 3D content is by visualisation and
rendering of the geometry that defines the objects. While we have plenty of tools for
visualising, streaming and interacting with the geometric information related to 3D objects,
tools for retrieving, manipulating and presenting the semantic content of 3D media are still far
from being satisfactory. In other words, current graphics systems are not conceived to give
explicit information about the semantics of the 3D content, which can be grasped only by
viewing the object itself.
If, however, we are able to embed semantics in 3D models with goal-oriented modelling
methodologies, new approaches to interaction and visualisation can also be
introduced. For instance, by semantic interaction we address the issue of interacting with 3D
models/objects using the semantic description of the object and the intent of the user to drive
the interaction process.
Let us give an example in the CAD domain: parametric feature-based modeling is a first
step in the direction of semantic interaction. Tools to create shapes by using terminology and
operations the user is familiar with are placed at his/her disposal. Amongst those features are
simple concepts such as holes, ribs, or slots: the user does not need to model the geometry of
the part he/she has in mind, but needs simply to instantiate the class of the part with the
But what happens – for instance - if the user wants to model a building that has some
arcades and he/she has to change the size of the building later on? Will the arcades be scaled
automatically or do new ones have to be introduced by manual copy operations?
Semantic interaction, at the level of modelling tasks, would take the goal of the operation
as well as the characteristics of the object into account and provide the correct modification
possibilities, which comply with the user intention. This requires both:
• Modelling the semantics of the shape, how it is structured, what the construction rules
• Modelling the semantics of the interaction, or in other words, the semantics of the
Today, both aspects are only in their infancy.
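The arcade example can be made concrete with a small sketch in which the arcades are not stored as geometry but derived from a construction rule, so that resizing the building adds or removes arches instead of stretching them. The class `ArcadedFacade` and the rule "one bay per roughly 3 m of facade" are hypothetical and serve only to illustrate the idea of modelling the semantics of the shape.

```python
from dataclasses import dataclass

@dataclass
class ArcadedFacade:
    width: float            # facade width in metres
    bay_width: float = 3.0  # preferred width of one arcade bay (assumed construction rule)

    def arcades(self):
        """Derive the arcade layout from the rule instead of storing arch geometry."""
        count = max(1, int(self.width // self.bay_width))
        actual = self.width / count  # bays are widened slightly to fill the facade exactly
        return [(i * actual, (i + 1) * actual) for i in range(count)]

facade = ArcadedFacade(width=12.0)
print(len(facade.arcades()))  # 4 bays of 3.0 m each

facade.width = 20.0           # the user resizes the building
print(len(facade.arcades()))  # 6 bays, each about 3.33 m wide
```

A purely geometric scaling from 12 m to 20 m would instead have produced four arches of 5 m each, which is the behaviour the text questions.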
Another example is interaction in virtual reality that should respect and be guided
automatically by the semantics of the 3D models (scene). Devising methods to automatically
process and use the semantic description of 3D objects would open up new possibilities for
Human Machine Interaction using natural human communication channels, such as speech.
Understanding the meaning (semantics) of the virtual scene being manipulated and the
statements of the users and mapping them to each other is a field of ongoing and future
research. While speech understanding has largely evolved over time, semantic scene
descriptions are rather uncommon in 3D modeling applications. This field is more developed
in the area of Computer Vision, which creates potential for synergies with Computer
Graphics. Moreover, if objects are properly described in terms of meaning and functionality of
their parts, we might even think of devising methods to support the autonomous interaction of
objects or avatars in virtual worlds, thus opening an entirely new perspective to the
development of simulation and gaming applications.
Another aspect concerns semantic visualisation, which is the analogue of semantic
interaction on the ‘output channel’ from the model to the user. While semantic interaction aims
at interpreting the inputs of a user in a goal-oriented and semantically reasonable way,
semantic visualisation aims at providing the ‘right’ visual output given a certain visualisation
‘problem’ with properties such as:
• Model/data set to be visualised
• Visualisation purpose
• Kind of insight to be achieved
• Kind of decision to be taken.
While semantic visualisation can easily be imagined in a number of general purpose
scenarios – automatic production of thumbnails for 3D models, automatic production of
catalogues for shops selling 3D objects, furniture, cars, or other goods –, CAE with its
multitude of simulation results, huge data sets, and largely varying kinds of analysis (post-
processing) done based on simulation results, is the field where the benefits of semantic
visualisation can be easily understood. In CAE and scientific visualisation a lot of dedicated
tools exist to fulfil specific data analysis (post-processing) requirements. These have been
developed on a case-by-case basis. Little work exists that tries to conceptualise those
analysis scenarios and build up a theory for goal-driven visualisation of the semantics contained
within data sets, even though these semantics (here: the meaning of the data) largely
depend on the insight the user wants to gain by looking at the simulation results.
Last but not least, we believe that semantic search of 3D models will soon become a key
issue in 3D content management and consumption. The success of 3D communities, gaming
technologies and mapping applications (e.g., Second Life, Google Earth, 3D city modelling) is
causing a dramatic shift in the way people see and navigate the Internet. In this panorama,
search and retrieval of 3D media is rapidly becoming a key paradigm of interaction with the
huge amount of traffic and data stored in and shared over the Internet. New requirements for
information representation, filtering, aggregation and networking need to be addressed, which
are as intuitive as possible and effective in their capability of bringing the users to the 3D
content they wish to access.
The problem of 3D retrieval can be rephrased as follows: given a certain shape, find
similar shapes, or shapes of which the given shape is a part, and vice versa. Assessing the
similarity among 3D shapes is a very complex and challenging research topic and the
computational aspects of 3D shape retrieval and matching have been only recently addressed
(see TaVe2008 and Bustos et al. 2007 for recent surveys). The methods developed so far
range from coarse filters suited to browsing very large 3D repositories on the web, to domain-
specific approaches. The majority of the methods proposed in the literature mainly focus on
the geometry of shapes, in the sense of considering its spatial distribution or extent in the 3D
space. Nevertheless, there is a consensus that the shape of objects is recognised and coded
mentally in terms of relevant parts and their spatial configuration, or structure. The use of
structural descriptions for shape similarity is very interesting as it supports reasoning on shape
similarity at a local and/or partial level. So far, however, no single solution works
best in all possible cases, and in terms of recognition fidelity they all lag behind humans. New
approaches need to be explored, possibly motivated more deeply by human perception
principles, and techniques for partial matching in particular need to be developed further.
At the same time, semantic search is traditionally performed in text-based documents using
knowledge representation technologies such as taxonomies, thesauri, and ontologies. Today
the area of content-based 3D retrieval is orthogonal to the traditional semantic search. In the
future, different types of media as sources for searching and information/knowledge
management have to be addressed together; in particular, 3D media have to be combined with
what already exists in the field of text, image-based, audio, and video retrieval. To this end,
new ways of combining, ranking, presenting and feeding back user input are required.
In the following, we will specialise some of these issues in the presentation of the sub-
challenges that we think are most relevant to the achievement of the main goal.
Challenge: User-guided modelling of 3D content
For generic usage scenarios, it is necessary to consider that inexperienced users are
becoming more and more actively involved in the content creation pipeline and ask for more
and more intuitive and effective tools for creating, sharing, retrieving and re-using 3D content.
These issues are common to more established media types, such as images and videos, but 3D
media open a number of specific challenges due to the geometric nature of the data involved.
Even non-professionals can easily re-mix user generated or broadcast 2D content with
software like Photoshop or Final Cut, but the same level of functionalities for 3D media is
available only within complex software systems. Nevertheless, there are signs that these 3D
content creation tools are getting easier to use and more accessible to the general audience.
For example, video games such as LittleBigPlanet (Sony Computer Entertainment) introduce tool
sets with which users without any 3D experience can build and play in a 3D world with
3D objects and interactions. Microsoft offers similar tools in Kodu,
research software designed for general users to create 3D content. In the near future,
i.e. in 2-3 years, we will see a huge influx of user-generated content, which will benefit from
semantic annotation and preservation.
Challenge: Automatic extraction of the best view of 3D shapes
The best view problem consists in the automatic selection of the pose of a 3D object that
corresponds to the most informative and intuitive 2D view of the 3D shape. Among the
possible applications, the creation of thumbnails for huge repositories or catalogues of 3D
models, shape recognition and classification in Computer Vision, and grasping or next-best
viewpoint for path planning in Robotics may be mentioned.
There is agreement that the best view is closely related to the semantics of a shape and/or
of its salient components, in a specific context or application domain, and it should make it
easier to recognise the 3D object according to its meaning or purpose. While previous work in
this area dates back more than 10 years [Blanz et al. 99], the actual use of semantic information
has only recently been addressed, and demonstrated to improve the results significantly
[MoSp09]. Being driven by the semantics of shape and most of all by meaningful features of a
shape, the achievement of a best view tool is naturally dependent on the development of
smart methods for part-based annotation of shapes and/or for authoring objects and their
Extraction of the best view adopting the method described in (MoSp09)
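To show how semantic information could enter a best-view computation, the sketch below scores each candidate viewing direction by the semantically weighted area of the parts that face the camera. It is a deliberately simple stand-in and not the method of [MoSp09]: the part list, the weights and the use of one normal per part are assumptions, and occlusion between parts is ignored.

```python
def view_score(view_dir, parts):
    """Sum the semantically weighted area of the parts facing the camera.

    `view_dir` is a unit vector pointing from the object towards the camera.
    """
    score = 0.0
    for part in parts:
        facing = sum(n * v for n, v in zip(part["normal"], view_dir))
        score += part["weight"] * part["area"] * max(0.0, facing)
    return score

def best_view(candidates, parts):
    """Return the name of the candidate direction with the highest score."""
    return max(candidates, key=lambda name: view_score(candidates[name], parts))

# Hypothetical annotated head model: the face is considered the most meaningful part.
parts = [
    {"label": "face", "normal": (0, 0, 1), "area": 1.0, "weight": 5.0},
    {"label": "back of head", "normal": (0, 0, -1), "area": 1.0, "weight": 1.0},
    {"label": "ear", "normal": (1, 0, 0), "area": 0.2, "weight": 2.0},
]
candidates = {"front": (0, 0, 1), "back": (0, 0, -1), "side": (1, 0, 0)}
print(best_view(candidates, parts))  # front
```

With all weights set to the same value the front and back views would tie, which illustrates why purely geometric criteria are not enough to pick the view a human would choose.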
Challenge: Semantic rendering of shapes in medical applications
The rendering of shapes for semantic representation has played an important role for virtual
environments (Gutierrez 2005) and in the modeling and understanding of complex non-
manifold shapes for analysis and retrieval (DeFloriani 2009). In medical applications, semantic
shapes have been used successfully in both the teaching and clinical understanding of medical
images. In recent studies, due to their ability to convey information in an interactive manner,
such systems have been integrated into 3D atlases of the human body, as exemplified in
(Höhne 1995), where by integrating concepts of computer graphics and artificial intelligence,
novel ways of representing medical knowledge were developed. In another study (Ogiela2009),
the author presented a new approach for semantic descriptions and analysis of medical
structures, especially coronary vessels (from CT spatial reconstructions), with the use of AI
graph-based linguistic formalisms. Such descriptions were suggested as being useful for both
smart ordering of images while archiving them and for semantic searches in medical
multimedia databases. Existing applications are, however, typically based on a particular part
of the body or an articulation, e.g. shoulders, brain, etc., while for a true semantic
representation of medical shapes, there needs to be a complete human body that includes all
body parts. These challenges are openly acknowledged and there is momentum to reach a
complete semantic representation of the human body. The primary dependency is that all
the separate body parts first need to be reconstructed, thus enabling the whole body
integration. Because of its importance, this challenge is expected to be resolved in the next
decade, partly due to progress in medical image acquisition and processing, as described
Exploration of the brain: Arbitrary cutting reveals the interior. The user has gained access to
the available information on the optic nerve concerning functional anatomy, which appears as
a cascade of pop-up menus. He has asked the system to colour-mark some blood supply areas
and cortical areas. He can derive from the colours that the visual word center is supplied by
the left middle cerebral artery (Höhne 1995).
Challenge: 3D retrieval based on multi-modal search
3D shape retrieval is a complex interaction process between the user and the 3D content,
along with its semantics. The so-called semantic gap, that is, the gap between the visual data
information and the meaning of the data for the user, can be closed by 3D search mechanisms
able to integrate content-based criteria, driven by global or partial shape similarity, with
concept-based semantics-driven ones, supported by global and part-based annotations of 3D
data. For example, it should be possible to pose queries such as “find the 3D models in the
repository that represent a vase with handles, and whose handles are globally similar in shape
to a given query model”. In the example “vase” and “handle” could refer to semantic
annotations and be resolved via a semantic search, whereas “handles are globally similar in
shape” will be resolved by applying a geometric search to the models selected by the semantic
Current 3D search engines work with one single descriptor or a set of predefined alternative
descriptors, sometimes as a pipeline of conservative filters to select iteratively smaller subsets
of the candidates. It is well known, however, that a single shape descriptor cannot work
equally well for all types of properties and all types of object shapes: even within the same
application domain, the shape variability of the objects may be too large with respect to the
capabilities of a single shape descriptor. Imagine for instance the complexity and variability of
3D shapes that we can expect in the medical domain, where the new acquisition devices
acquire data that are used in a combined manner; for instance MRI and motion capture data
are used together for gait analysis and should be indexed together. Semantics-guided 3D
search and retrieval of the huge amount of data stored digitally in the medical domain could be
really beneficial to the clinicians and incorporated into routine clinical medicine or medical
We believe a big challenge in visual search systems, and 3D in particular, is the
development of smarter multi-modal search mechanisms to allow the user to customise each
search session and tune the selection of the indices, or descriptors, to his/her search context
by exploiting also semantic information attached as annotation tags to the objects and object
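A query such as the "vase with handles" example can be sketched as a two-stage search: a semantic filter on annotation tags followed by a geometric ranking in which the user tunes the weight of each descriptor for the current session. The repository layout, the descriptor names `handle_shape` and `global`, and the function `multimodal_search` are assumptions made for this illustration.

```python
import math

def multimodal_search(repository, required_tags, query_descriptors, weights):
    """Filter by semantic tags first, then rank the survivors by weighted descriptor distance."""
    candidates = [m for m in repository if required_tags <= m["tags"]]

    def distance(model):
        return sum(
            w * math.dist(model["descriptors"][name], query_descriptors[name])
            for name, w in weights.items()
        )

    return [m["name"] for m in sorted(candidates, key=distance)]

# Hypothetical repository with hand-made two-dimensional descriptors.
repository = [
    {"name": "amphora", "tags": {"vase", "handle"},
     "descriptors": {"handle_shape": (0.2, 0.9), "global": (0.5, 0.5)}},
    {"name": "jug", "tags": {"vase", "handle"},
     "descriptors": {"handle_shape": (0.8, 0.1), "global": (0.8, 0.8)}},
    {"name": "plain_vase", "tags": {"vase"},
     "descriptors": {"handle_shape": (0.0, 0.0), "global": (0.5, 0.5)}},
]
query = {"handle_shape": (0.25, 0.85), "global": (0.9, 0.9)}

# Session tuned to the handles only: the amphora's handle is the closest match.
print(multimodal_search(repository, {"vase", "handle"}, query,
                        {"handle_shape": 1.0, "global": 0.0}))  # ['amphora', 'jug']
# Session tuned to the overall shape: the ranking flips.
print(multimodal_search(repository, {"vase", "handle"}, query,
                        {"handle_shape": 0.0, "global": 1.0}))  # ['jug', 'amphora']
```

The same repository thus answers differently depending on the weights, which is the kind of per-session customisation the challenge asks for.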
Challenge: User-centric semantic search by effective relevance feedback
Multi-modal search mechanisms are not enough to guarantee that the 3D search results fit
the subjective ideas of the observer. To be really effective, the search must include a
human in the loop, that is, the user should be an active player in the search process. In this
context, relevance feedback plays an important role in 3D search, as it helps to bridge the
semantic gap between the user and the system. By means of relevance feedback, the user can
feed the retrieval system with his/her thoughts via the repetition of three processes: he/she
submits a query, which is answered by a list of items; then he/she gives feedback about the
relevance of some of the items, according to his/her needs; the system refines its set of
answers, so that they better fit the user's similarity concept.
Relevance feedback techniques help in understanding the semantics of similarity for an
observer, in the context of a specific query. Hence, they attempt to solve the semantic gap
between description and meaning, between system and user.
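The query-feedback-refinement loop described above can be illustrated with a Rocchio-style update, a classical relevance-feedback technique from text retrieval that is used here only as a stand-in. The sketch assumes that the query and the items are represented by descriptor vectors of equal length; the default values of `alpha`, `beta` and `gamma` are conventional choices, not values taken from this report.

```python
def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query descriptor towards relevant items and away from non-relevant ones."""
    def centroid(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    pos, neg = centroid(relevant), centroid(non_relevant)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]

# One feedback round: the user marks two results as relevant and one as non-relevant.
query = [0.5, 0.5]
refined = rocchio_update(query, relevant=[[1.0, 0.0], [0.8, 0.2]], non_relevant=[[0.0, 1.0]])
print(refined)  # approximately [1.175, 0.425]
```

The refined vector is then used to query the repository again, and the loop repeats until the user is satisfied. A binary relevant/non-relevant judgement is the simplest case; graded or implicit feedback would replace the two centroids with weighted ones.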
To improve the retrieval effectiveness we believe it is necessary to develop more flexible
and effective user-system interaction methods, which make it possible to capture, through a
user-friendly interface, as much information as possible about the user's perceptual world while
minimising his/her effort. There is therefore a need to develop new techniques for search
refinement, through relevance-feedback methods that go beyond the traditional relevant/non-relevant
assessment (multi-relevance methods) or even capture implicit feedback, and that
support adjusting the query and its parameterisation towards the user’s notion of similarity.
Challenge: 3D retrieval result visualisation
As an essential interaction layer, the visualisation of the results of a 3D search should
provide novel result navigation facilities able to support the complexity of the envisaged search
modalities. The rationale is that result visualisation must not simply present a sorted
list of answers, but should communicate to the user why the specific results were retrieved.
Considering partial similarity among objects, for example, it is important to be able to visualise
the correspondences between similar parts, an aspect that is currently neglected by the few
prototype partial 3D search engines. An important aspect of result visualisation is the
representation of the database context in which the search results are obtained (focus and
context). The context visualisation should be tightly coupled with free navigation and browsing
Challenge: Smart objects – interaction with the environment
It has been shown that using avatars in a virtual
environment is mandatory to improve the subject’s
immersion. Nevertheless, these avatars must interact
with the environment in order to provide the subject
with the feeling of realism. They must in essence
communicate with the objects of the virtual
environment in order to know how to interact with
them in the right way. The concept of smart objects
appeared more than ten years ago (KaTh1999). The
idea is to encapsulate semantic information in each
object represented as a 3D model. On the one hand,
there is information about the elements or parts of this
object; on the other hand, there is information about how to
handle this object, which, for instance, tells the avatar how it can interact. Nevertheless,
working with such an environment depends on semantic information provided for each element
of the environment (Abaci et al. 2005). As shown in the picture above, in the kitchen made of
smart objects (here oven, casserole, shelf, tap) the avatar knows how to interact with the
objects. Inverse kinematics is then used in order to link the current position to the required
position to handle the object.
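A smart object in this sense can be pictured as a record that bundles the parts of the object with a description of how each supported action is performed. The sketch below is an assumption-laden illustration, not the data model of KaTh1999 or Abaci et al. 2005: the class `SmartObject`, the posture names and the coordinates are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class SmartObject:
    name: str
    parts: dict = field(default_factory=dict)         # part name -> position in the scene
    interactions: dict = field(default_factory=dict)  # action -> (part to use, hand posture)

    def how_to(self, action):
        """Tell an avatar which part to reach for and how, or None if unsupported."""
        if action not in self.interactions:
            return None
        part, posture = self.interactions[action]
        return {"target": self.parts[part], "posture": posture}

oven = SmartObject(
    name="oven",
    parts={"door_handle": (1.2, 0.9, 0.4), "knob": (1.2, 1.1, 0.5)},
    interactions={"open": ("door_handle", "grasp"), "switch_on": ("knob", "pinch")},
)
print(oven.how_to("open"))   # {'target': (1.2, 0.9, 0.4), 'posture': 'grasp'}
print(oven.how_to("paint"))  # None
```

The returned target position is what an inverse kinematics solver would take as the goal for the avatar's hand; the solver itself is outside the scope of this sketch.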
Challenge: Preserve, find, and interpret 3D information in a lifetime of recordings
The term life-logging describes the process of capturing, storing, and distributing everyday
experiences and information for objects and people with the intention of preserving one’s
entire life or large portions of it. There is some justifiable scepticism about the wide
adoption of such techniques predicted in visionary scenarios, due to, for
example, critical privacy issues. However, given the already existing trend to capture parts of
one’s everyday life through the widespread use of digital and cell phone cameras and private
information sharing via various social networking platforms, there is no doubt that techniques
to capture and preserve parts of one’s personal life will play a significant role.
Whereas the actual capturing and preservation process is a problem that should be solved
within the next couple of years, the major critical issue with life-logging is how to access and
manage the huge amount of resulting data. Searching in millions of images and terabytes of
audio and video data is a challenge that goes way beyond the possibilities of today’s search
and retrieval techniques and will certainly not be doable without a significant amount of
semantic information being available.
Because the related information is captured automatically from the 3-dimensional world
surrounding us, 3D information and semantics will be a major issue in this context. For
example, knowing when and where a photo was taken will not be enough; we
also need to know in which direction the camera was pointing. We need not only to
identify relevant objects and people in the images automatically, but also to determine their position in 3D, in
relation to each other, and over time in order to track them over a sequence of captured
images. In this context, we can learn from the new Street View feature of Google Maps. In this
system, cars mounted with cameras were driven along streets around the world to acquire images,
which were annotated with localisation tags (GPS co-ordinates), camera angle, resolution,
etc., and then used to reconstruct the view on the map. Such technologies can be made
available, in the near future, on mobile phones that everyone can use.
Street View in Google Maps that uses semantic information of the image acquired by a
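Why the camera direction matters can be shown with a small sketch that decides whether a known landmark could appear in a captured image, given the position and heading stored with it. The class `CapturedImage`, the local planar coordinate frame in metres and the 60 degree field of view are assumptions made for illustration; real GPS co-ordinates would first have to be projected, and occlusion is ignored.

```python
import math
from dataclasses import dataclass

@dataclass
class CapturedImage:
    x: float           # east offset of the camera in metres (local planar frame, assumed)
    y: float           # north offset of the camera in metres
    heading: float     # viewing direction in degrees, clockwise from north
    fov: float = 60.0  # horizontal field of view in degrees

    def sees(self, px, py):
        """True if the point (px, py) lies inside the horizontal field of view."""
        bearing = math.degrees(math.atan2(px - self.x, py - self.y))
        offset = (bearing - self.heading + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
        return abs(offset) <= self.fov / 2.0

photo = CapturedImage(x=0.0, y=0.0, heading=90.0)  # camera pointing east
print(photo.sees(10.0, 1.0))  # True: the landmark lies almost due east
print(photo.sees(0.0, 10.0))  # False: the landmark lies due north, outside the view
```

Two photos taken at the same place and time can therefore show entirely different content, which is why position and time stamps alone do not suffice for indexing.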
Current research results in semantically annotating life-logged data (e.g. by identifying
locations, events, people) in personal archives of automatically captured images (cf., for
example, Bryne et al. 2008) suggest that automatic semantic annotation and search on a
semantic level on a large scale will be possible within the next five to ten years. However,
considering 3D data and related semantic information, research is only beginning and is at
the stage that related work on video and image data had reached about five years ago.
4.4.1 Time line and dependence diagram
In this section, a diagram showing the time lines of the proposed issues and the
dependencies between the issues connected with the grand challenge Semantic interaction and
visualisation is presented.
The dark boxes cluster families of challenges that, in our opinion, are related to each other:
we gave each cluster a name, i.e. SIV1 and SIV2 (Semantic interaction and visualisation 1 and
2). The dependences are marked with an arrow from A to B, which means "know-how from A
will support the development of B". The boxes are put in line with a time, which represents the
time when we estimate the challenge will be achieved.
The names of the challenges in the boxes do not coincide exactly with the ones used in the
text and we added one more issue in the diagram.
Taking the broader scope of Visual Computing into account, we need formats that are
able not only to represent shape and semantics (as well as the visualisation aspects of the shape)
but also to address the requirements of Computer Vision (CV) and Machine
Learning (ML) approaches.
Typically, computer vision applications perform certain tasks, e.g. person tracking. To
achieve this, reference data is needed, be it codebook knowledge bases or similar data with
known semantics. During the analysis/interpretation stage, hypotheses are built and
confidence values are assigned to express how likely each hypothesis is to hold true.
Should there not be interoperable formats that are able to capture and represent both
digitally-born objects and scenes as well as interpreted scanned scenes?
To accelerate and deepen the convergence of computer graphics and computer
vision, we are convinced such formats would make a great contribution and new research is
needed in various fields.
Yet to achieve this goal of a semantic format we also need to develop a tool that has the
power to express the goal of a CV task in a semantic and computer-understandable way.
• A possibility to describe reference data, codebooks, and templates in an exchangeable
way would be a revolution for the development of software and its application. It was
never foreseen to exchange this kind of data but it would strengthen the software
• The semantic representation of object relationships in a “scene graph” would
avoid an enormous effort in the re-use of models. If the semantics of
objects could be preserved while data is exchanged between systems or companies,
we could preserve the value of the invested work.
• The evolution of shapes from raw data to interpreted data is a long process
in which users invest an enormous amount of time. If we could solve ambiguous
research topics like segmentation and classification of objects, we would naturally have
to deal with the problem of representing this evolution process so as to keep the semantics in
a standardised way.
• To derive digital 3D representations out of physically born objects, usually scanning
devices are applied, typically generating millions of points in 3D space. Another
upcoming question concerning challenges in standards is how we can represent and
encode the semantics of these digitally-born objects in order not to lose already
existing semantics of the object.
• Because of the heterogeneity of representation and processing algorithms we need a
standard for capturing provenance and editing operations that offers efficiency and
authenticity for capturing editing steps (granularity).
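To make these requirements more tangible, the sketch below encodes a tiny semantic scene graph in which each node carries a label, competing interpretation hypotheses with confidence values, and a provenance log of the operations that produced it. The field names, the tool names and the choice of JSON as the carrier are assumptions made purely for illustration; the text does not propose a concrete format.

```python
import json

def make_node(label, children=(), hypotheses=(), provenance=()):
    """Build one node of a semantic scene graph as plain, JSON-serialisable data."""
    return {
        "label": label,
        "hypotheses": [{"label": name, "confidence": conf} for name, conf in hypotheses],
        "provenance": list(provenance),
        "children": list(children),
    }

# Hypothetical scanned object whose interpretation is still uncertain.
scan = make_node(
    "scanned_object",
    hypotheses=[("chair", 0.82), ("stool", 0.15)],
    provenance=[
        {"step": 1, "operation": "laser_scan", "tool": "hypothetical-scanner"},
        {"step": 2, "operation": "segmentation", "tool": "hypothetical-segmenter"},
    ],
)
scene = make_node("office", children=[scan])

encoded = json.dumps(scene, indent=2)  # what would be exchanged between systems
restored = json.loads(encoded)
print(restored["children"][0]["hypotheses"][0])  # {'label': 'chair', 'confidence': 0.82}
```

Keeping the hypotheses and the provenance inside the exchanged data, rather than in the producing application, is what would let a receiving system re-use the interpretation work instead of repeating it.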
Beside these fundamental challenges, ways have to be found to represent the
visual(isation) aspects of shape, such as detailed material information (e.g. measured BTF
data of physical props).
The more semantics we put into formats the more value we add. Thus, digital rights
management has to be handled as an intrinsic part of the development of these new format(s).
Having semantics at hand allows, for example, semantic filtering. Depending on the shape
representation(s) chosen, ‘unfolding’ of information/shape can be supported by tools operating
on these formats (e.g. GML viewer). Access rights may determine how much semantics are
An abstract semantic scene description could be used for imprecise rendering purposes.
Beside the representation power of the format(s), we need to investigate how to implement
such a format to guarantee desired properties such as: interoperability, extensibility, self-
To our knowledge, there exists no format with the features described above. We are
convinced that research in this field can deeply contribute not only to new kinds of formats but
also that such formats can considerably stimulate the field of Visual Computing by bringing
computer graphics and computer vision aspects more closely together in one representation
scheme and finally in a format.
4.5.1 Time line and dependence diagram
In this section, a diagram showing the time lines of the proposed issues and the
dependencies between the issues connected with the grand challenge Standards is presented.
The dark boxes cluster families of challenges that, in our opinion, are related to each other:
we gave each cluster a name, i.e. STA1 and STA2 (Standards 1 and 2). The dependences are
marked with an arrow from A to B, which means "know-how from A will support the
development of B". The boxes are put in line with a time, which represents the time when we
estimate the challenge will be achieved.
The challenges in the diagrams specify the general grand challenge described above in more detail.
4.6 Trends in Semantic Web research
Semantic web technologies, especially ontologies and machine-processable relational
metadata, pave the way to knowledge management solutions that are based on semantically
related knowledge pieces of different granularity: Ontologies define a shared conceptualisation
of the application domain and provide the basis for defining metadata that have precisely
defined semantics and are therefore machine-processable. Although knowledge management
approaches and solutions have shown the benefits of ontologies and related methods, there
still exist a large number of open research issues that have to be addressed in order to make
semantic web technologies a complete success.
A seamless integration of knowledge creation, e.g. content and metadata specification, and
knowledge access, e.g. querying or browsing, into the working environment is required.
Strategies and automated methods are needed that support the creation of knowledge as side
effects of activities that are carried out anyway. This requires means for emergent semantics,
e.g. through ontology learning, which reduces the overhead of building-up and maintaining
Access to, as well as presentation of, knowledge has to be context-dependent. Knowledge
management approaches being able to manage knowledge pieces provide a promising starting
point for “smart” services that will proactively deliver relevant knowledge for carrying out tasks
related to the acquisition/creation, processing and documentation of 3D models.
Contextualisation has to be supplemented by personalisation, taking into account the
experience of the user and delivering knowledge on the right level of granularity.
Knowledge management solutions will be based on a combination of internet-based and
mobile functions in the very near future. Semantic web technologies are a promising approach
to meet the needs of mobile environments, e.g. location-aware personalisation and
adaptation of the presentation to the specific needs of the user and the mobile device, i.e. the
presentation of the required information at an appropriate level of granularity. In essence,
users should have access to the knowledge management application anywhere and anytime.
The Semantic Web stack
There are a number of major research questions which have not been solved yet by the
current efforts, as identified in (Euzenat 2002) and also in the final report of the EU-NSF
strategic workshop on Research Challenges and Perspectives of the Semantic Web, Sophia-
Antipolis, France, 3-5 October, 2001. Also, as shown in the diagram above, the top layers of
the Semantic Web stack contain technologies that are not yet standardised or contain just
ideas that should/will be implemented in the future.
Multiplicity of languages. Different languages apply to different situations. Some
applications require expressive languages with expensive computational costs, while others
need simple languages. Research should be carried out on the lines of (a) designing languages
that stack easily or at least that can be combined and compared easily and (b) making explicit
the relations between the languages and which needs they can fulfil, so that application
developers can choose the appropriate language. The current situation suffers rather from an
explosion of languages. At some point in the future, a shakeout and reconciliation of many of
these separate languages will be required.
Reconciling different modelling styles. Different communities in different situations adopt
different knowledge modelling styles: axioms (from logic), objects (from software engineering),
constraints (from artificial intelligence), view-queries (from databases). It is important to know
how to combine these modeling styles and how to implement one style into another. This is
partly a technical process requiring translations between the different modeling paradigms.
Different reasoning services. As with languages, different reasoning services are required by
different applications. Examples of different reasoning services are: querying consequences of
a domain description, checking for consistency, matching between two separate descriptions,
determining the degree of similarity between descriptions, detecting cycles and other possible
anomalies, classifying instances in a given hierarchy, etc.
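Two of the reasoning services listed above, detecting cycles and classifying instances in a given hierarchy, can be sketched in a few lines. The toy class hierarchy below is a hypothetical example of ours, and the functions are a plain transitive-closure illustration rather than the behaviour of any particular reasoner.

```python
from typing import Dict, List, Set

# Hypothetical toy hierarchy: class -> list of direct superclasses.
SUBCLASS_OF: Dict[str, List[str]] = {
    "Amphora": ["Vessel"],
    "Vessel": ["Artefact"],
    "Statue": ["Artefact"],
    "Artefact": [],
}


def superclasses(cls: str, hierarchy: Dict[str, List[str]]) -> Set[str]:
    """Return all direct and inferred superclasses of cls (transitive closure)."""
    seen: Set[str] = set()
    stack = list(hierarchy.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(hierarchy.get(parent, []))
    return seen


def has_cycle(hierarchy: Dict[str, List[str]]) -> bool:
    """Detect an anomaly: some class is (indirectly) its own superclass."""
    return any(cls in superclasses(cls, hierarchy) for cls in hierarchy)


def classify(instance_type: str, hierarchy: Dict[str, List[str]]) -> Set[str]:
    """Return every class an instance of instance_type belongs to."""
    return {instance_type} | superclasses(instance_type, hierarchy)


amphora_classes = classify("Amphora", SUBCLASS_OF)
hierarchy_is_cyclic = has_cycle(SUBCLASS_OF)
```

Even this small sketch shows why different applications need different services: classification only needs the closure, whereas anomaly detection has to inspect the whole hierarchy.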
Semantic Web infrastructure/services
From the architectural point of view, it is necessary to take into account several
requirements that will help to support storage, caching, optimisation, query, inference, and
distribution of the computation on many computers (if necessary), and to design an
infrastructure that scales, is efficient and robust.
Distributed systems for Web-scale reasoning. Current Semantic Web reasoning systems do
not scale to the requirements of demanding applications, such as analysing data from millions
of mobile devices, dealing with terabytes of scientific data, and enterprise content
management. To go beyond the limited storage, querying and inference technology currently
available for semantic computing, a platform for Web-scale reasoning must be developed that
can handle very large volumes of data. The FP7 IP project LarKC (http://www.larkc.eu) is
working in this direction. This vision can be achieved by
- Enriching the current logic-based Semantic Web reasoning with methods from information
retrieval, machine learning, information theory, databases, and probabilistic reasoning.
- Employing cognitively inspired approaches and techniques such as spreading activation,
focus of attention, reinforcement, habituation, relevance reasoning, and bounded
- Building a distributed reasoning platform and realising it on both a high-performance
computing cluster and via “computing at home”.
Semantic transformations. A crucial need is to be able to merge ontologies for integrating
data. This could be achieved through the development of transformation services (mediators)
and libraries of transformations. However, not all transformations are suited to every task. A
properties and providing general schemes for expressing these transformations and properties
in languages that the machine can understand.
Taking the example of ontology versioning, it is very dangerous to use a knowledge base
with a new version of the ontology. However if the ontology maintainers provide a
transformation from old to new versions, it becomes possible to use an old knowledge base
once transformed with the new ontology (and other new resources). Research is of course
needed on transformation composition, proof-carrying transformations, proof languages and
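The ontology versioning example can be illustrated with a small sketch in which transformations between ontology versions are ordinary functions that can be composed. The fact representation, the predicate names and the helper functions are hypothetical choices made only for this illustration; real ontology transformations would be considerably richer.

```python
from typing import Callable, Dict, List

Fact = Dict[str, str]  # a toy statement with subject, predicate and object
Transformation = Callable[[Fact], Fact]


def rename_predicate(old: str, new: str) -> Transformation:
    """Build a transformation that renames one predicate and keeps the rest."""
    def apply(fact: Fact) -> Fact:
        if fact["predicate"] == old:
            return {**fact, "predicate": new}
        return fact
    return apply


def compose(*steps: Transformation) -> Transformation:
    """Chain version-to-version transformations into a single transformation."""
    def apply(fact: Fact) -> Fact:
        for step in steps:
            fact = step(fact)
        return fact
    return apply


# Hypothetical ontology versions v1 -> v2 -> v3.
v1_to_v2 = rename_predicate("madeOf", "hasMaterial")
v2_to_v3 = rename_predicate("hasMaterial", "material")
v1_to_v3 = compose(v1_to_v2, v2_to_v3)

old_kb: List[Fact] = [
    {"subject": "rim42", "predicate": "madeOf", "object": "aluminium"},
    {"subject": "rim42", "predicate": "spokes", "object": "6"},
]
new_kb = [v1_to_v3(fact) for fact in old_kb]
```

The old knowledge base is left untouched and a transformed copy is produced, so that the same data remains usable with both the old and the new ontology.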
Developing ontologies - requirements for tools and methods. Building and especially
maintaining ontologies will require tools and methodologies. Some requirements on these are
- Visualisation of complex ontologies is an important issue because it is easy to be
overwhelmed by the complexity of ontologies. Moreover, tools should be able to provide
comparative displays of different ontologies (or ontology versions). Specific tools based on
the objective semantics and the perceived meaning should be developed.
- Design rationales should be attached to ontologies so that the application developers (and,
maybe, the applications) can use them to choose the adequate ontology.
- Support for modularity and transformation of ontologies by recording the links between
modules and proposing transformations to export parts of ontologies in a context with
Some recommendations for future research include ontology acquisition from multiple
primary sources (texts, multimedia, images) by means of learning techniques, ontology
comparison, merging, versioning, conceptual refinement, and evaluation.
Coping with “messy metadata” (from people and machines). When untrained people and
imperfect machines generate metadata or populate ontologies, the results will be “messy”.
Depending on the context (e.g. the criticality of maintaining high quality capture), three non-
exclusive strategies are:
- Avoid it: the cost of incorrect metadata is too high (e.g. medicine or safety-critical
- Tolerate it: the cost-benefit trade-off is good enough that imperfections do not get
- Clean it: this may be a cooperative process between humans and intelligent agents, or a
purely discursive process, e.g. expert conferences to decide on a new taxonomy, or how to
codify new discoveries (e.g. found in some areas of bioinformatics).
4.7 Final discussion
In a long-term perspective, achieving the five grand challenges and the open issues
discussed in this chapter will be a strong support to the realisation of the visionary scenarios
presented in chapter 1. Deriving symbolic representations will guarantee very accurate and
complete patient-specific data in medical diagnoses and treatment, a trustworthy and rich
digital reconstruction of the environment from 3D sensors as a support for city management
systems or for robots to move and interpret information around them. Goal-oriented 3D model
synthesising will permit a function-centric modelling and simulation on patient data to obtain
very reliable assessments and predictions, the creation of 3D models which have “self-
consciousness”: it would definitely benefit the design phase if it becomes possible to reuse past
projects for modelling new robots or components, new characters of a virtual game, new
behaviours to substitute humans with humanoids in dangerous tasks, new affective cultural
experiences. Documenting the 3D lifecycle will imply a 3D mark-up language for the
annotation of 3D models consistently with any other media, even massively. We can imagine a
precise correspondence between the real semantic world and the digital semantic description,
allowing for understanding, processing and retrieval of suitable information not only by
humans but also by robots and machines in general. Semantic visualisation and interaction will
permit a smooth interaction between the real and virtual worlds, and within the virtual world
itself. Developing tools to make the required information accessible to any user (either human,
robot or machine), dynamically according to the specific goal, will guarantee real time decision
support mechanisms and the feasibility of new interactions among real and virtual entities. The
adoption of stable, rich and extensible Standards for semantic 3D media is the fundamental
technological step required to make the visionary scenarios conceived truly operational.
In the following figure, a time line diagram across the grand challenges is shown, which
provides a visual and global road map to the realisation of 3D semantic media as proposed in
the FOCUS K3D project. The clusters introduced previously illustrate the timing with respect to
the four phases; moreover, clusters with the same colour placed horizontally indicate that they
can be reached independently from each other.
In a mid-term perspective, achieving the proposed goals, even partially, will clearly
improve the digital shape workflows in applications. To show this more practically we describe
four real life application scenarios, which we outlined together with the AWG members and
which prove how the application fields considered in FOCUS K3D would benefit from the use of
semantic 3D media. In the case of bioinformatics, some comments will be given on the
feasibility of semantic annotation of biological data, still very far from being fulfilled.
4.7.1 Medicine scenario: liver segmentation
As explained by our AWG member IRCAD, in medicine the long-term grand challenge is to
perform patient-specific modeling which ranges from the cell to the organs through globules.
This requires new models and architecture to link all extracted models from various modalities
(functional, density, etc). In addition, the patient modeling must integrate therapy, which
requires new models and architecture to plan, simulate, apply and follow therapy application.
Looking at a much shorter-term goal, the practitioners are currently facing a stringent real-
life problem, which is considered a priority for the specific problem of liver segmentation and
simulation. For applications such as resection, segmenting the liver is still a labour-intensive
process, which takes a considerable amount of wall clock time for a trial-and-error process by
an expert (the algorithms are not fully automatic but rather devised to assist the expert).
While the process is now well mastered and used routinely in medical applications, it is simply
economically not sustainable in a normal production scenario to take one hour of an expert
surgeon’s time to accomplish this task. In addition, each new segmentation is restarted from
scratch and does not benefit from the considerable experience of having performed the tasks
on many patients before. One solution to this problem would consist of formalising the
knowledge about the functional constraints of the organ (which include geometrical,
topological, mechanical and biological constraints) to assist the expert during the
segmentation. In addition, it is a priority matter to formalise the interactive segmentation
process using knowledge technologies so that each new segmentation learns from and reuses
the past segmentations performed.
4.7.2 Bioinformatics scenario: from models to annotation
The application of knowledge technology techniques in structural bioinformatics faces major
obstacles, which are not present when the geometry is stand-alone: for macromolecules, the
couplings between geometry, biology and biochemistry complicate matters considerably. In addition, the
diversity of alternatives even for simple problems is such that consensus is difficult to reach. In
the sequel, we examine these issues and conclude with a list of plausible actions to be
Coupling between geometry and biology. The central question in structural biology is the
investigation of the structure-to-function relationship: given that the functions of a protein are
accounted for by complexes formed with partners, one wishes to understand how the shape of
the molecule accounts for the binding to these partners. This goal is such that the geometric
description has to be coated with functionality-related annotations. In structural biology the
focus is precisely on the contexts in which this molecule may be found since these contexts
provide the biological knowledge, i.e. the function. Phrased differently, purely geometric
descriptions of shapes de-correlated from the contexts are of little interest for (structural)
biologists. As a matter of fact, the data models used in structural bioinformatics relate the
geometry of a molecule (admittedly simple facts at this stage), to the gene coding for it, and
to interactions with partners. That is, the structural, genomic and interaction levels are tightly
Coupling between geometry and biochemistry. Annotating requires a certain level of
abstraction, and greatly benefits from parametrised models, whose development is in general
impossible for molecules. For example, adding, say, one carbon atom to an aliphatic chain, thus
resulting in a longer chain, may radically change the chemical properties of the molecule, thus
resulting in completely different abilities. In drug design, a number of cases are known where
elongation of one atom enables the formation of a hydrogen bond at the end of the chain, thus
greatly stabilizing interactions. The same holds for proteins, where a moderate elongation can
trigger a folding event, thus completely changing the secondary and tertiary structures of a
polypeptide chain. This complex coupling between the geometry and biochemistry is a major
hindrance for the development of parametric models.
On consensus models. Another difficulty is that even simple tasks can be performed in a
number of ways. Consequently, consensus is hard to reach, and this penalises annotations. We
substantiate this claim with a simple example, that of normal modes. Given a macromolecular
system about a minimum of its potential energy, its normal modes encode the couplings with
the atomic variables describing the flexibility of the structure. The normal modes are computed
by diagonalising the quadratic form representing the energy well about the minimum. Yet
starting from a crystal structure extracted from the Protein Data Bank (PDB), computing the
normal modes requires minimising the structure to bring it into a minimum of the potential
energy. This requires choosing an energy minimisation method, either based on a force field,
or on a knowledge-based potential. For the former strategy, a solvent model must be chosen,
and molecular dynamics need to be run (with degrees of freedom regarding the length of the
simulation and the stop criterion). For the latter, a zoo of potentials exists. Finally, the modes
need to be computed, and a relevant number of them need to be selected. The complexity of
this process is such that even for the same system different scientists will opt for different
approaches, let alone the case of different systems. Therefore, no consensus strategy can in
general be used to store normal modes in general audience databases such as the PDB.
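The central computational step mentioned above, diagonalising the quadratic form that represents the energy well, can be shown on a deliberately tiny example. The sketch below assumes a toy system of two unit masses connected to each other and to two walls by three identical springs; it is not a macromolecular computation, and the spring constant and function name are our own illustrative choices. With unit masses, the Hessian of the potential energy at the minimum is [[2k, -k], [-k, 2k]], and its eigenvectors are the normal modes.

```python
import math
from typing import List, Tuple

Mode = Tuple[float, Tuple[float, float]]  # (eigenvalue, unit eigenvector)


def normal_modes_2x2(a: float, b: float, c: float) -> List[Mode]:
    """Diagonalise the symmetric Hessian [[a, b], [b, c]] in closed form."""
    if b == 0.0:
        # Already diagonal: the coordinate axes are the modes.
        return sorted([(a, (1.0, 0.0)), (c, (0.0, 1.0))])
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    modes: List[Mode] = []
    for lam in (mean - radius, mean + radius):
        vx, vy = b, lam - a  # solves (H - lam * I) v = 0 when b != 0
        norm = math.hypot(vx, vy)
        modes.append((lam, (vx / norm, vy / norm)))
    return modes


k = 1.0  # hypothetical spring constant
modes = normal_modes_2x2(2.0 * k, -k, 2.0 * k)

# With unit masses, the angular frequency of each mode is sqrt(eigenvalue).
frequencies = [math.sqrt(lam) for lam, _ in modes]
```

In this toy case the low-frequency mode moves both masses in phase and the high-frequency mode moves them in opposition. For a real macromolecule, each choice listed above (minimisation method, force field or potential, solvent model, number of modes retained) changes the Hessian being diagonalised, which is why different scientists obtain different modes for the same system.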
4.7.3 Gaming and Simulation scenario: vessel design workflow
In the domain of gaming and simulation, one of the members of our AWGs is VSTEP, a
company that developed a ship simulator game. Currently, there are 30 civilian vessels as
player vessels or autonomous vessels, and 40 additional autonomous vessels. The workflow of
adding a new ship to the simulator was described in a presentation at one of the FOCUS K3D
events, namely the CASA Workshop on 3D Advanced Media In Gaming and Simulation
(3AMIGAS), 16 June 2009, Amsterdam, The Netherlands. For the complete set of slides, see
Ship simulator game. Courtesy VSTEP Serious Games.
The workflow consists of 47 discrete steps, exploiting 10 different tools, involving 8 different
people. Some of these steps, such as the production of a press release, have little to do with
3D media, but some would benefit from more knowledge intensive 3D media technologies, and
such technologies would improve the quality of the simulation.
Of the 47 steps in adding a ship to the simulator, 26 steps are related to shape processing,
which can benefit from semantic 3D technology. In fact, the following steps will be able to be
automated: the eight processing steps that involve decimation, levels of detail, collision
detection preparation, the six texturing steps needed for visualisations that depend on weather
and sailing conditions, and two steps about generating icons and renderings. Achieving such
automatic processing needs research and development in the direction of combining the
geometry with properties and processing steps depending on the context.
Ten steps are involved in modeling interior, ornaments, special objects, foam, etc.
Techniques such as smart objects and procedural modelling involving semantics need to be
developed to speed up the modelling phase and enhance the quality. In addition, in the
modelling of a new vessel, parts of other ships are generally reused, such as bolts, anchors,
and doors. Exploiting innovative, semantically meaningful retrieval capabilities could
improve efficiency and completeness of the search within a component library. Moreover, in
the simulation, rich semantic annotations about combustibility, rigidity, and buoyancy would
improve the simulation quality.
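As a rough illustration of semantically meaningful retrieval within a component library, the sketch below stores a few annotated parts and selects them by category and by limits on their annotations. The part names, the annotation scale from 0 to 1 and the function `find_parts` are hypothetical and are not taken from the VSTEP workflow.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Part:
    """A reusable ship component with hypothetical semantic annotations."""
    name: str
    category: str  # e.g. "door", "anchor", "bolt"
    annotations: Dict[str, float]  # assumed physical properties in [0, 1]


LIBRARY: List[Part] = [
    Part("steel_door_a", "door",
         {"buoyancy": 0.0, "combustibility": 0.0, "rigidity": 0.9}),
    Part("wooden_door_b", "door",
         {"buoyancy": 0.7, "combustibility": 0.8, "rigidity": 0.5}),
    Part("stock_anchor", "anchor",
         {"buoyancy": 0.0, "combustibility": 0.0, "rigidity": 1.0}),
]


def find_parts(library: List[Part], category: str,
               **max_values: float) -> List[Part]:
    """Return parts of a category whose annotations do not exceed the limits.

    A part without a requested annotation is treated as not matching.
    """
    hits = []
    for part in library:
        if part.category != category:
            continue
        if all(part.annotations.get(key, float("inf")) <= limit
               for key, limit in max_values.items()):
            hits.append(part)
    return hits


fireproof_doors = find_parts(LIBRARY, "door", combustibility=0.1)
```

The same annotations could then be handed to the simulation, so that a retrieved door already carries the combustibility, rigidity and buoyancy values the simulator needs.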
There are also steps requiring user interaction and semantic visualisation aspects. For
instance, the generation of chart icons, thumbnail images of parts and models, is currently
done by visual inspection of alternative views. The automatic generation of information rich
and visually appealing views would speed up and ease the workflow. During rendering, smart
level-of-detail algorithms will improve both efficiency and effectiveness of information
conveyance, leaving out unnecessary detail. Autonomous vessels and characters must be able
to navigate and plan routes in a natural way: smart objects such as doors, handles, hinges, and
tools, which would know how they are operated and treated, can improve interaction planning
and suspension of disbelief.
Although a lot of work will remain labour intensive, these examples of knowledge and
semantically rich 3D media (models and environments) show how efficiency and quality may
be increased. There are two direct ways in which knowledge intensive 3D media technology might
have an impact in this case. First, a more efficient workflow could save many hours of work,
leading to a saving in the order of a hundred thousand Euro for the whole range of vessels.
There are currently over half a million copies sold of the ship simulator game. Second, a more
advanced, intelligent, meaningful behavior and functionality could lead to extra sales of the
game and add-ons, leading to extra revenues in the order of a million Euro.
4.7.4 CAD/CAE and Virtual Product Modelling scenario: semantics based
virtual 3D product modelling
Let us imagine that, in a 3D shape semantics based world, a new rim is to be developed.
Existing rims have been modelled using a procedural modeling approach and are kept in a
knowledge base, along with both their geometric semantics and a computer-interpretable
description of the requirements they are designed to fulfil.
The designer/engineer starts his/her task by sketching a 6-spoke rim and stating some
engineering requirements towards the rim: the load (weight of the vehicle to carry), the
intended size and width, the speed the rim shall operate under, and the like.
Since the company policy is not to start modeling a new product without researching the
knowledge base, a segmentation algorithm starts to interpret the designer’s sketch. Knowing
that the designer is looking for rims, the segmentation algorithm uses a semantic model of
rims to perform the segmentation task, to extract the number of spokes and finally to generate
a combined query to the knowledge-based model repository to look for similar rims which fulfil
the stated requirements.
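The combined query described above can be sketched as a filter on the stated engineering requirements followed by a ranking on shape similarity. Everything in the sketch is an assumption made for illustration: the attribute names, the sample rims and, in particular, the crude distance on spoke count and diameter, which merely stands in for a real 3D shape retrieval measure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rim:
    """A rim stored in the (hypothetical) knowledge-based repository."""
    name: str
    spokes: int
    max_load_kg: float
    max_speed_kmh: float
    diameter_inch: float


@dataclass
class Query:
    """Shape features from the sketch plus the stated requirements."""
    spokes: int  # assumed to be extracted by the segmentation step
    load_kg: float
    speed_kmh: float
    diameter_inch: float


def search(repo: List[Rim], q: Query, top_k: int = 3) -> List[Rim]:
    """Keep rims that fulfil the requirements, then rank them by similarity."""
    feasible = [r for r in repo
                if r.max_load_kg >= q.load_kg
                and r.max_speed_kmh >= q.speed_kmh]

    def distance(r: Rim) -> float:
        # Crude stand-in for a shape similarity measure.
        return abs(r.spokes - q.spokes) + abs(r.diameter_inch - q.diameter_inch)

    return sorted(feasible, key=distance)[:top_k]


REPOSITORY = [
    Rim("classic5", 5, 600.0, 210.0, 17.0),
    Rim("sport6", 6, 550.0, 250.0, 18.0),
    Rim("light6", 6, 400.0, 200.0, 18.0),
    Rim("heavy8", 8, 900.0, 180.0, 19.0),
]

best_matches = search(REPOSITORY, Query(spokes=6, load_kg=500.0,
                                        speed_kmh=200.0, diameter_inch=18.0))
```

Treating the requirements as hard constraints and the shape as a soft ranking criterion is one possible design choice; a real system might instead weight both, or relax a requirement when no rim fulfils it.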
The repository returns the best matches. The designer chooses the one that she/he likes the
best. A geometric modeling tool pops up and allows the shape of the existing rim to be adapted
to better match the shape of the sketch.
In parallel, since the functional requirements are known, a simulation engine starts to check
whether the new design actually fulfils the requirements and provides hints to the designer if
problems are to be expected in the use phase.
The simulation engine leverages the semantic information contained in the model and the
semantic description of the requirements to automatically generate corresponding discrete
simulation models out of the shape description. Understanding the problem, the simulation
engine (actually its meshing component) can decide which resolution is required where and
then it can optimise for performance and accuracy at the same time, relieving the user from
tedious manual meshing work and algorithmically using experience knowledge collected on a
What is needed to let this scenario become reality?
1. Semantically rich procedural models;
2. Knowledge technologies to express, understand, and reason about requirements;
3. Improved 3D shape retrieval mechanisms that take additional information into account;
4. New modelling approaches to combine sketch-based and semantic geometric modelling;
5. New representation schemes to handle this extended set of information/knowledge;
6. Meshing and simulation algorithms that can be ‘parameterised’ by problem definitions;
7. Fast feedback loops between modelling and simulation to create functional-driven
8. New ways to communicate results to the user on a semantic level if something fails.
To enable all these points to inter-operate requires new data formats (grand challenge:
Standards). The grand challenge Deriving Symbolic Representations is not particularly
addressed here although symbolic representations are involved, e.g. in the semantic-rich
procedural models. The points 1, 4, 6, and 7 refer to the grand challenge Goal-oriented 3D
model synthesising; the points 2 and 5 refer to Documenting the 3D lifecycle, while 3 and 8
belong to Semantic visualisation and interaction.
4.7.5 Archaeology and Cultural Heritage scenario: large-scale repository of
3D digital artefacts
A great part of Europe’s heritage is linked to historical and cultural evolution. Hence, it is clear
that without preservation, documentation and easy reference to various facets of this great
cultural tradition, which moulded the life of our ancestors for many centuries, humankind cannot
understand its past and thus cannot find its way into the future. Therefore, a great benefit
would come from the development of a large-scale, documented, distributed 3D/multimedia
repository for cultural heritage and archaeology in order to preserve, share and reuse
The main objective of this project will be to support the archaeology and cultural heritage
community in the creation, interpretation and access of 3D content. The major challenge will
be the creation of a large-scale distributed 3D digital library which will be developed using a
novel reliable cost-effective 3D reconstruction system that can concurrently capture images,
videos and 3D representations for the preservation of cultural artefacts. One of the main
components of the technological infrastructure required to support such an initiative will be a
Knowledge Management system. Such a system will support the creation and maintenance of
artefacts, which will be composed of the digitised assets as well as all the necessary metadata,
which will store the information required for the proper documentation of each artefact or each
collection of artefacts. All these tasks are related to the grand challenge Deriving Symbolic
Representations.
Such an ideal system will also support the creation and maintenance of multilingual
thesauri, which will be used for the deployment of cultural semantic web dictionaries, a feature
that is necessary not only for advanced searching mechanisms but also for the correct one-to-
one information mapping since the content spans multi-era, multinational and multilingual
barriers. These objectives are clearly included in the grand challenges Documenting the 3D
lifecycle and Standards.
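The one-to-one mapping that a multilingual thesaurus has to guarantee can be sketched as follows. The concept identifiers and the small set of labels are hypothetical examples; a real thesaurus would be far larger and would follow an agreed standard.

```python
from typing import Dict, Optional

# Hypothetical thesaurus: concept identifier -> label per language code.
THESAURUS: Dict[str, Dict[str, str]] = {
    "concept:amphora": {"en": "amphora", "it": "anfora", "de": "Amphore"},
    "concept:statue": {"en": "statue", "it": "statua", "de": "Statue"},
}


def build_index(thesaurus: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Map every lower-cased label, in any language, to exactly one concept."""
    index: Dict[str, str] = {}
    for concept, labels in thesaurus.items():
        for label in labels.values():
            key = label.lower()
            if index.get(key, concept) != concept:
                raise ValueError(f"label {label!r} maps to two concepts")
            index[key] = concept
    return index


INDEX = build_index(THESAURUS)


def translate(term: str, target_lang: str) -> Optional[str]:
    """Translate a term via its concept, or return None if it is unknown."""
    concept = INDEX.get(term.lower())
    if concept is None:
        return None
    return THESAURUS[concept].get(target_lang)


german_label = translate("anfora", "de")
```

Routing every lookup through a concept identifier, rather than translating label to label, is what keeps the mapping one-to-one across languages; the index construction fails loudly if a label would point to two different concepts.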
Moreover, users will be able to search and retrieve multimedia and scientific data from the
distributed library, using novel sophisticated search techniques that involve both content and
context, since content-based access to information has already started to become
unsatisfactory. Thus, low-level geometric feature extraction algorithms will be developed and
knowledge extraction techniques will be introduced to extract semantic information/features
from scientific documentation. This is related to the grand challenge Semantic Visualisation
and Interaction.
In archaeology and cultural heritage, object semantics is typically just as important as the
actual geometry. Thus, it is a key requirement to assign thematic information to entire objects
and to individual geometric elements. This also makes it possible to select, analyze or edit the
geometry and the appearance of objects based on semantic criteria. One major issue is the
(re-)creation, reuse and maintenance of the artefacts. Storing the models of those artefacts,
along with additional semantic information and provenance data, can ease the procedure of
creation or even replication of old masterpieces. For example, it could be used for the
reconstruction and reproduction/repair of archaeological findings, e.g. missing parts of a hand
in a statue that was broken, and/or any such application. An instructive counterexample is
what happened to the Athina temple in Aigina, which was excavated by German
archaeologists. The findings were moved to Munich and the various pieces were reconstructed
by actual sculptors. Later on it was discovered that many mistakes had been made and the
various statues constructed this way were disassembled. Thus, many ancient pieces were
destroyed in the process. If this had been done digitally, many ancient pieces would have been
saved. Partial matching is another application area. For example, if a handle is found and the
rest is missing, after partial matching with already stored similar objects we can accurately
guess the origin of the handle and reconstruct the artefact quite closely.
Another strong interest of the archaeological/cultural heritage community is in the
organisation and presentation of content to virtual visitors (virtual exhibits) and developing
educational and training application scenarios for connecting real and virtual artefacts. To re-
create an architectural space one cannot simply use a 3D scanner. 3D reconstruction of
architectural spaces is a long design process based on available data (pictures of the remains,
archaeological research drawings and maps, etc.), evolving in close collaboration with
archaeologists and historians, and taking advantage of any available knowledge (annotated 3D
objects). A framework like the one proposed here will contribute to the reconstruction of such
virtual environments. A success story example is the "Tholos", a Virtual Reality museum of the
Foundation of the Hellenic World (FHW). http://www.fhw.gr/index_en.html.
Such a framework will also support collaboration communities providing the required
infrastructure to build targeted applications. It will also provide appropriate interfaces and tools
to support the logging, the merging and the extraction of new knowledge, the sharing and
reuse of existing context, and personalised services.
3D digitisation can be a key factor for the sustainability of the economic development of
virtual museums. The on-line availability of heritage-related information allows better
management of the impact of economic development on the cultural environment. The impact
on cultural tourism is self-evident.
Considering more practical and real life scenarios, examples of tasks regarding 3D
processing operations that are useful in the cultural heritage field include all the steps of the
digital lifecycle of a resource, i.e. from the acquisition and data preparation, to the annotation
and classification, and also to the analysis and virtual restoration, reconstruction and
visualization. A CH professional could be guided in the phases and supported with a
technological platform able to provide tools for performing steps that deal with:
• data acquisition: e.g. 3D scanning, registration, and merging;
• archiving and retrieval: e.g. classification of data, storage of 3D content and metadata,
and retrieval of data from a repository;
• virtual presentation: e.g. data healing, level-of-detail-driven simplification, association
of material properties, texturing, and viewing;
• virtual reconstruction and restoration: e.g. composition/assembling of parts, data
conversion, modification/deformation of models, and morphing between two different
• prototyping and replication: e.g. fairing, generation of a uniform polygonal mesh, for
instance, in the STL format;
• deformation monitoring: comparison of geometric deviation of artefacts or buildings
Possible applications of the described ideal system can include the following:
Acquisition and Reconstruction. The system will provide a set of functionalities that allow the
generation of a digital model from a physical object (e.g. the front of a building, a fountain, a
statue, etc.). The digital model may be further processed depending on the application task:
visualization, simplification at different levels of detail, deformation, prototype generation, etc.
For instance, using a laser scanner to acquire the façade of a building, or a white-light
pattern projection scanner to scan 3D artefacts at close range, produces a huge amount of
data, and usually more than one scanning session is necessary to acquire the whole façade or
artefact before an accurate digital model can be produced. It is important to choose the
scanner device best suited to the specific scanning session, depending on the physical object.
Each scanning session produces a dataset (point cloud), and different datasets have to be
registered and merged in order to produce the whole digital raw data from which the final
digital model is obtained. Several post processing tools can be applied to the digital model,
depending on the application task (e.g., visualization, prototype generation, validation and
Smart Archiving and Retrieval. Once the digital model of an object is available, it should be
classified and put into a repository to be made available to the community. The classification
process can be supported by smart tools that automatically encode geometrical properties
(e.g., this ancient column has a specific size, it is cylindrical, its average diameter is…) and
structural properties (e.g., this amphora has three handles; this façade of a temple has four
columns). Moreover, semantic information (e.g., this is a piece of an amphora and the material
is pottery) and environmental information (e.g., this fountain is located in a specific square, it
has been acquired using a specific device through a specific methodology) can also be encoded
in order to enrich the cultural and scientific value of the digital content. All this information
improves the quality of a later retrieval: e.g., I am looking for a statue whose arms are
missing (structural information); I am looking for a marble fountain (semantic information)
whose maximum height is three meters (geometrical information).
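The kind of retrieval described above can be sketched as a query that combines semantic, structural and geometrical criteria. This is only a minimal illustration: the record fields (kind, material, height_m, missing_parts), the find function and the sample repository are hypothetical and do not correspond to any existing repository schema.

```python
from dataclasses import dataclass, field


@dataclass
class ArtefactRecord:
    """Hypothetical repository entry; every field name is illustrative."""
    name: str
    kind: str                 # semantic: what the object is
    material: str             # semantic: what it is made of
    height_m: float           # geometrical: maximum height in metres
    missing_parts: set = field(default_factory=set)  # structural
    location: str = ""        # environmental


def find(records, kind=None, material=None, max_height_m=None, missing=None):
    """Return the records that satisfy every criterion that is not None."""
    hits = []
    for r in records:
        if kind is not None and r.kind != kind:
            continue
        if material is not None and r.material != material:
            continue
        if max_height_m is not None and r.height_m > max_height_m:
            continue
        if missing is not None and missing not in r.missing_parts:
            continue
        hits.append(r)
    return hits


# Invented sample data, for illustration only.
repository = [
    ArtefactRecord("Fountain A", "fountain", "marble", 2.4, location="main square"),
    ArtefactRecord("Fountain B", "fountain", "bronze", 1.8),
    ArtefactRecord("Fountain C", "fountain", "marble", 4.1),
    ArtefactRecord("Statue D", "statue", "marble", 2.0, missing_parts={"arms"}),
]

# "A marble fountain whose maximum height is three metres."
marble_fountains = find(repository, kind="fountain", material="marble", max_height_m=3.0)
# "A statue whose arms are missing."
armless_statues = find(repository, kind="statue", missing="arms")
```

In a real system the geometrical and structural fields would be filled in by the automatic analysis tools mentioned above rather than typed in by hand.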
Virtual Restoration of 3D artefacts. The framework could provide all the necessary scientific
resources for (semi-) automatically completing the missing parts in a statue or an ancient
building based on the selection of semantically similar parts retrieved from a 3D repository. In
this case, for instance, an analysis based on the Virtual Human domain expertise could capture
information on the body landmarks of the Venus of Milos. An arm could be selected elsewhere,
and its measures could be adapted to fit the missing arm of the statue. A process could
produce the fitting arm, and another process could make the actual merging. This could be
done for scientific studies but also for training and entertainment (“Lend a hand to Venus!”).
Another application, once a critical mass of CH content is achieved, could be to search for
fitting parts and fragments of historical artefacts, which are often distributed in locations
all over the world.
Virtual Reconstruction. Another application scenario could be the reconstruction of historical
artefacts and whole historical environments. Virtual reconstruction is based on information
sources like historical texts, drawings and other 2D information, and the comparison with
historically similar artefacts or buildings, which could be assessed applying semantically driven
queries for historical information available in the repository. Besides, existing 3D digital models
and shape processing tools for modelling would enable economically more feasible approaches
to such reconstructions.
Deformation Monitoring. Another application scenario is the monitoring of the erosion of
statues and buildings or the deformation of artefacts over time. Such deformation takes place,
for instance, because of changes in climatic conditions. Canvas and panel paintings are
examples of artefacts that are sensitive to changes in air humidity. The shape processing tools
would include the geometric comparison of two digitised models of an artefact acquired at
different times, in order to measure the deviation between them.
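As a rough illustration of such a comparison, the sketch below measures, for each point of a newer scan, the distance to the closest point of an older scan. It assumes that both scans are point sets already registered in the same coordinate frame. The function names and the four-point example are hypothetical, and the brute-force search is only suitable for small inputs.

```python
import math


def nearest_distance(p, cloud):
    """Distance from point p to the closest point of the cloud (brute force)."""
    return min(math.dist(p, q) for q in cloud)


def deviation_report(scan_then, scan_now):
    """Summarise how far the newer scan has moved away from the older one.

    Both scans are lists of (x, y, z) tuples assumed to be registered
    in the same coordinate frame.
    """
    distances = [nearest_distance(p, scan_then) for p in scan_now]
    return {"max": max(distances), "mean": sum(distances) / len(distances)}


# Invented data: a flat square patch in which one corner has lifted by 0.2 units.
scan_then = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
scan_now = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.2)]

report = deviation_report(scan_then, scan_now)
print(report)
```

The measure is one-sided (from the newer scan to the older one). A symmetric variant would also compute the distances in the opposite direction.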
4.7.6 Robust geometry processing: practical benefits across the AWGs
Although the strong impact of robust geometric processing of 3D data has been mentioned several
times in this document, here we motivate its practical importance for the applications in more
detail by describing the context in terms of acquired and generated geometric data and
state-of-the-art methods for digital geometry processing.
Geometric data are most commonly generated by modelling, by measurements or through
automated processes. The abundance of data is explained by the recent considerable advances
in the modelling paradigms, in the acquisition technologies, and in the variety of automatic
conversion methods. In addition, numerous algorithms along the geometry processing pipeline
generate new, processed geometric data.
Measurement data range from point sets to depth images and contours. As we have seen, they
are acquired with an increasing variety of acquisition technologies, whose evolution is
characterized by a shift from contact to contact-free sensors and from short to long range
sensing, culminating with satellite images. In many cases the acquired data are “raw” in the
sense of being sparse, irregularly sampled, and riddled with uncertainty (noise and outliers).
Therefore, despite the expectation that technological advances should improve quality,
geometric datasets are increasingly unfit for direct processing. This trend is explained by a
drastic change of scale in geometric datasets: projects such as Google Earth, geophysical
measurements, or climate analysis involve measurements from satellite images or seismologic
sensors - all of which contain a significant number of outliers. In addition, geometric data are
increasingly heterogeneous due to the combination of distinct acquisition systems, each of
them acquiring a certain type of feature and level of detail. Examples include the measurement
of digital cities from satellites, planes and by pedestrians. Other examples in computer-aided
medicine include measurements of non-parallel slices, 3D volumes and diffusion tensors. Data
are thus not only heterogeneous, but may also be redundant. Although beneficial from the
sampling point of view, redundancy hampers data registration. Furthermore, we are observing
a trend brought on by the speed of technological progress: while many practitioners use
high-end acquisition systems, an increasing number of them turn to consumer-level acquisition
devices such as low-cost free-hand medical devices or digital cameras, hoping to replace an
accurate but expensive acquisition by a series of low-cost acquisitions. Another evolution
consists in dealing with community data: acquisition of our physical world can be achieved by
exploiting the massive data sets available online.
Automatically generated data appear throughout the geometry processing pipeline every time
a conversion occurs or the geometry is altered. Examples of generated data are surface
meshes generated by marching cubes from 3D medical images or by meshing parametric
surfaces from CAD models. As the input data and the meshing algorithms are imperfect, the
output meshes may contain spurious defects such as gaps and self-intersections. This type of
raw meshes, referred to as polygon soups, may severely hamper the robustness of geometry
processing algorithms. In general processed data are riddled with imperfections due to the lack
of guarantees provided by algorithms used along the processing pipeline. Ironically, many
algorithms do not even guarantee for their output the very properties they require of their
input. Since geometric data are increasingly heterogeneous as mentioned above, they require
more conversions and processing, and are thus even more prone to contain flaws.
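One of the defects mentioned above, gaps in a triangle mesh, can be detected with a simple edge count. The sketch below assumes an indexed triangle list. The function name edge_defects is hypothetical, and the check covers only boundary and non-manifold edges, not self-intersections.

```python
from collections import Counter


def edge_defects(triangles):
    """Classify the edges of an indexed triangle list.

    triangles: list of (i, j, k) vertex-index triples.
    Returns (boundary, non_manifold): edges used by exactly one triangle
    (they border a gap or an open boundary) and edges shared by more
    than two triangles.
    """
    counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1  # orientation-independent key
    boundary = sorted(e for e, n in counts.items() if n == 1)
    non_manifold = sorted(e for e, n in counts.items() if n > 2)
    return boundary, non_manifold


# A closed tetrahedron: every edge is shared by exactly two faces.
closed = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (0, 2, 3)]
# The same surface with one face missing, i.e. a gap.
with_gap = closed[:3]

print(edge_defects(closed))
print(edge_defects(with_gap))
```

An algorithm that requires a closed manifold surface as input could run such a check first, instead of failing unpredictably on a polygon soup.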
Methods. A major research effort in geometry processing in recent years has been to elaborate
upon mathematical and algorithmic foundations for analysing and processing complex shapes
through so-called discrete differential geometric operators. Significant theoretical and
numerical advances have been made to show that these operators mimic their smooth
counterpart when applied to discrete representations of 3D shapes. The main results are
discrete equivalents of basic notions and methods of differential geometry, such as curvature
and shape fairing of polyhedral surfaces. Another important research direction has been to
elaborate upon methods for converting shapes from one representation to another. In
particular, the issue of shape acquisition and reconstruction has stimulated a considerable
number of contributions. In an AIM@SHAPE survey we have listed more than 500 publications
on this topic, revealing the lack of a unified solution. Specifically, it took 15 years to invent a
provably correct reconstruction approach based on Voronoi filtering together with the first
ingredients of a sampling theory for smooth shapes of arbitrary topology. As these foundations
are valid only for input measurement data that are both dense and noise-free, they are of
little practical use to practitioners, who increasingly have to deal with raw data.
Consequently, a series of heuristics was devised to deal with noisy data sets, but, for example,
no well-principled approach can deal simultaneously with piecewise-smooth reconstruction,
outliers and heterogeneous inputs.
Targeted problem. The initial grand promise of digital geometry processing (DGP) was to
achieve what had been done in digital signal processing (DSP) but for the very special “signals”
that shapes represent, with highly distinctive properties such as topology, dimensionality, and
lack of global parameterisation bringing a wealth of challenges. In many ways, DGP has been
successful in the sense that a variety of methods have been developed to offer a vast array of
geometry editing tools. However, digital geometry processing has yet to provide the type
of robustness to which digital signal processing owes its success: to have a tangible impact at
the scientific, technological and societal levels, DGP must offer a set of robust and reliable
methods (as currently available for sounds or images) to practitioners. The level of robustness
sought after (resilience to uncertainty and heterogeneity) constitutes an enduring scientific
challenge, even more difficult than for ordinary signals due to the variety of geometric data.
While creative and exploratory, the resulting algorithms of the first wave of DGP research are
too brittle and too input-dependent to offer solid foundations.
To fully realize the potential of geometry processing, one must tenaciously address the most
enduring and fundamental problems, which hamper robustness to imperfect and
heterogeneous input. One promising approach consists of formalizing the a priori knowledge
both about the measured shape and about the acquisition system such that this knowledge is
exploited as early as possible during the acquisition phase.
Regarding the generality of heterogeneous data, the current approach consists of choosing the
data structure that is best suited to the queries required by the algorithms at hand. Turning
the problem around, we propose instead to adapt the algorithm queries to the geometric data,
and to isolate a minimal set of so-called oracles required by the algorithms. In addition, these
oracles may provide access not directly to the input data but instead to a more abstract class
of objects such as compact sets equipped with a metric. This way the algorithm gets
access to the sensed data only through an error metric which can be made resilient to
imperfect data once the metric is enriched with the knowledge (formalized with semantics) of
what the sensed shape is. As these oracles are the only interface through which the input
geometry is known (or sensed), the algorithm becomes independent of the representation of
the input geometry: generality is gained and heterogeneous data can be handled gracefully.
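The oracle idea can be illustrated with a toy example in which an algorithm locates a point on the boundary of a shape while knowing the shape only through a single query: is this point inside? The sketch is an assumption-laden illustration, not the method proposed here. The names sphere_oracle, box_oracle and boundary_point are hypothetical, and bisection along a segment stands in for whatever queries a real algorithm would need.

```python
import math


def sphere_oracle(radius):
    """Inside test for a sphere centred at the origin (analytic description)."""
    def inside(p):
        return math.dist(p, (0.0, 0.0, 0.0)) <= radius
    return inside


def box_oracle(half_size):
    """Inside test for an axis-aligned cube centred at the origin."""
    def inside(p):
        return all(abs(c) <= half_size for c in p)
    return inside


def boundary_point(inside, a, b, tol=1e-6):
    """Bisect the segment from a (inside) to b (outside) until it is shorter than tol.

    The algorithm never sees the shape representation; it only calls the oracle.
    """
    if not inside(a) or inside(b):
        raise ValueError("a must be inside the shape and b outside")
    while math.dist(a, b) > tol:
        mid = tuple((x + y) / 2.0 for x, y in zip(a, b))
        if inside(mid):
            a = mid
        else:
            b = mid
    return tuple((x + y) / 2.0 for x, y in zip(a, b))


origin, far = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0)
on_sphere = boundary_point(sphere_oracle(1.0), origin, far)
on_box = boundary_point(box_oracle(0.5), origin, far)
print(on_sphere, on_box)
```

The same boundary_point function works unchanged for both shapes. Supporting a further representation, for instance a voxel grid or a filtered point set, would only require another oracle.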
More concretely, for a scanner manufacturer this may translate into a sensor that is not
data-centric but query-centric, where each query can be specialized to the type of shape of
interest and enriched with the semantics of the sensed shape. We can think of, e.g., a free-hand ultrasound
medical device, which provides answers to geometric queries about a specific organ such as
non-parallel slices with topology and geometric regularity constraints specific to the said organ.
In addition, a formalization of the knowledge about the acquisition process itself can be
instrumental to provide robustness to noise, outliers and more generally to defect-laden data.
In the final FOCUS K3D Conference on Semantic 3D media and content we had the
opportunity to discuss the grand challenges and open issues of this road map with many
experts of the applications and we have been able to compare our insight on the future
research agenda with the perspectives of both academia and industry.
The experts in Bioinformatics confirmed that knowledge can be related either to the shape
or to the function/structure but the first one is useless without the knowledge of the
application domain. Therefore, the general target of semantics has to be contextualised and
the biggest challenge is relating the 3D shape to the biological context. Annotation is clearly
important but in this domain it is not obvious how to represent such knowledge and relations.
In addition, the problem of preservation of the annotation is also delicate because it depends
on the single case: even performing a minor change in the molecule geometry may trigger a
major event in terms of annotations. For example, in the active sites of proteins you may have
a certain geometry of residues, which carry out a specific reaction; any slight change can either
stop that reaction from happening or make it happen. The geometry in the active site is thus
extremely important, but it is not clear how these situations can be represented. The natural
conclusion is in line with ours: the big challenge, which will have a very high impact, is
efficiently modelling such biological knowledge and the relations between data. Until parameterised
models able to include knowledge are conceived, annotation will not be possible.
For Gaming and Simulation applications it has been noticed that often research is
technology-driven while it should be more user-driven. Semantic 3D media can be exploited in
learning experiences and semantics in general could be beneficial for learning models,
personalisation, and user-environment interactions. There is a lot of potential in how to model
towards the user, towards the learners; in other words, there is a real benefit of Knowledge
Technology in the assessment of use/learning models and their capabilities. Interaction
becomes crucial, where the interaction is between the user and the technology, but is also the
social interaction between users. An important issue in this perspective could be how to
visualise (textual, visual) datasets to support interaction. The social and economic benefit is in
how this interaction between the users and the environment could be accelerated by the
technology. The future of semantic 3D technology in Gaming applications could probably be in
the direction of providing very layered approaches to knowledge acquisition whilst being within
a very massive kind of environment. For instance, the current attempts of geo-tagging the 3D
environment, while also linking to text and objects coming from a whole range of different data,
go in this direction.
The case is different for the mechanical industry and telecommunications (CAD/CAE and
Virtual Product Modelling), where 3D models and semantics have a long successful tradition. In
these sectors engineering drawings are fundamental to transfer a message in a community
that shares the same culture: they contain a lot of implicit semantics which engineers are able
to interpret. Hence, a big challenge is the possibility of transferring the implicit semantics
included in the drawings and this would result in a huge economic value since a sector like
mechanical engineering does not evolve fast and the life of design and architecture is very long.
In the automotive sector, the design of a new car implies the reuse of 90% of old components.
As a consequence, there is an enormous value in drawings and there is a lot of work to be
done to extract the semantics from engineering drawings, where the geometry is just one of
the aspects to consider. Another aspect is that CAD modelling systems cannot evolve since
they are based on nominal shapes. What makes the difference is not the precision of the model
but the tolerance the designer gives to it, since tolerances are connected to costs and
functionality. This kind of information is not included in the idea of nominal geometry, and then
it appears to be evident that deriving symbolic representations is a huge challenge with very
Embedding semantics in CAD models automatically such that knowledge could be reused in
downstream phases is perceived as a challenge even in current scenarios. In fact, the way to
include semantics is now up to the designer, thus it is difficult to control and reuse in the
process. Although such an issue could be seen as a research objective for the near future, the
annotation of past projects, even dating back to several years ago, appears unfeasible in
practice. Moreover, considering the diversity of product modelling phases, it is clear that
semantics not only has to be preserved during the pipeline but also has to be adapted when
the geometry changes; for example, during model simplification and idealisation for simulation,
it would be valuable to adapt the FEM semantics to the new simplified/idealised model.
The benefits of an effective semantic search also emerged during the conference,
considering the large amount of data already available now. It currently happens that some
useful information is deployed and available in the system, but there is no standard way to
access this information. Knowledge technologies could offer such standards so that generic
“semantic” modules are plugged together (i.e., CAD systems cooperate with knowledge bases)
to provide the same kind of functionality that is in use today but saving a lot of time. However,
the gap between the academic prototypes and real application systems is still perceived as a
strong practical limitation by industry. In addition, software vendors have no real interest in
investing in new tools unless customers explicitly ask for them. Only skilled customers, i.e.
industrial companies, aware of new technologies can push software vendors to advance their
The CAD community includes very dissimilar profiles, which require ad hoc solutions. For
instance, the styling department of a company works differently and uses different tools
because they follow a completely different process. Technical people usually design from
technical constraints and try to find the best compromise; conversely, people creating shapes
usually start from an idea and then develop a pleasing shape. Procedural shapes, as
presented during the conference, can support engineers rather than stylists: engineers have to
manage the design constraints and can do so more intuitively with such a methodology. Clearly,
creativity cannot be separated from the producibility of the object. The high economic value of
creative work in today’s market has been highlighted; such work needs freedom to evolve and consequently
a strong interaction in the modelling system between the stylists and the product they are
Cultural Heritage is also a very heterogeneous field, and here too semantics is strictly
related to the context: for instance, the information necessary for a curator operating in
archaeology is different from that needed by a curator of a library. In CH, 3D models appear
in many tasks, such as representation, documentation, video, educational material, but it is
important to notice that 3D cannot be considered as the solution but more properly as an
evolution of the traditional media to be integrated. The benefit of the FOCUS K3D approach is
considered valuable especially for the “scientific” cultural heritage community, which aims at
studying, cataloguing and documenting the objects. A big issue is related to standards for data
models, and this is also affected by multilingualism because in archaeology and architecture
terminology is very close to the knowledge structure, especially when used to describe shapes.
Since the domain does not share the same vocabulary, semantic representations should take
care of this aspect and, more generally, of the context of use. Another challenge from the
FOCUS road map, which appeared important in the field, was the integration of results coming
from semantic-based classification and geometry-based classification. In this perspective,
shape may be regarded as the common denominator between different knowledge universes.
The vision of the Semantic Web community, also represented at the conference, is slightly
different from the one of FOCUS K3D. Semantics is not generally used in the sense of intent
and knowledge but in the sense of description of the relations between units, where the
relationships are described so that there are mechanisms/algorithms to derive many types of
inference (ontological description). The two definitions do not diverge but focus on different
aspects: the Semantic Web is primarily a web of data, so effectively including 3D media as
part of such data requires conforming to the development of the web. In more practical terms,
whichever semantics one intends to formalise (geometrical or contextual), the
annotation should make use of HTTP URIs to add metadata to objects and parts of objects,
because this makes it possible to identify a resource unambiguously through a shared syntax. In
addition, it is profitable to use public vocabularies when possible, to make our own 3D data
publicly available, and to link them with any relevant resource accessible on the net, since this
opens up the new application facilities available on the web for other kinds of data.
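A minimal sketch of URI-based annotation follows. Statements are stored as plain subject, predicate, object triples rather than through a dedicated RDF library. The http://example.org/models/ namespace, the amphora42 resource and the use of a fragment identifier to name a part are assumptions made for illustration. The property names come from the public Dublin Core terms vocabulary.

```python
# Hypothetical namespace for our own 3D models (an assumption).
BASE = "http://example.org/models/"
# Dublin Core terms, a widely used public metadata vocabulary.
DCT = "http://purl.org/dc/terms/"

amphora = BASE + "amphora42"
handle = BASE + "amphora42#handle1"  # a part of the object gets its own URI

# Each statement is a (subject, predicate, object) triple.
triples = [
    (amphora, DCT + "title", "Amphora with three handles"),
    (amphora, DCT + "medium", "pottery"),
    (handle, DCT + "isPartOf", amphora),
    (handle, DCT + "title", "Left handle"),
]


def objects(triples, subject, predicate):
    """All values stated for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects(triples, predicate, obj):
    """All resources that point to obj through the given predicate."""
    return [s for s, p, o in triples if p == predicate and o == obj]


parts = subjects(triples, DCT + "isPartOf", amphora)
material = objects(triples, amphora, DCT + "medium")
print(parts, material)
```

Because both the object and its part are identified by URIs, any third party can add further statements about them without access to the original repository.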
Since the Semantic Web focuses mainly on relations between resources, the future
challenges addressed by the community are related to reasoning about the data. One
aspect to tackle is the introduction of uncertainty into the picture, and hence the development
of fuzzy description logic algorithms. In fact, the different inference mechanisms used today are
all based on formal, binary logic. There are few approaches that try to solve this problem by
adopting probabilistic methods in description logic. Research is still far from a solution: the
mechanisms have yet to be understood and must then reach a state of equilibrium before being
standardised. A second issue, which has been raised in the past few years and is relevant in the
context of 3D semantic media, is the inability of description logic to find partial solutions
to a query, since description-logic-based inference tries to find all the solutions of a search.
At web scale, where the amount of data is now massive, this approach no longer works:
it is preferable to retrieve partial data in a few seconds instead of all data (or even
failure) in days.
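The preference for partial answers within a time budget can be illustrated independently of any reasoning engine. The sketch below is not a description logic reasoner. It only shows the "anytime" behaviour on a plain sequence of candidates, and the function name answers_within and its parameters are hypothetical.

```python
import time


def answers_within(candidates, matches, budget_s):
    """Collect matching candidates until the time budget is exhausted.

    Returns (hits, complete). complete is False when the search was cut
    short, so the caller knows that the result may be partial.
    """
    deadline = time.monotonic() + budget_s
    hits = []
    for c in candidates:
        if time.monotonic() >= deadline:
            return hits, False
        if matches(c):
            hits.append(c)
    return hits, True


def is_even(n):
    return n % 2 == 0


# A generous budget lets the search finish and report a complete result.
full, full_complete = answers_within(range(10), is_even, budget_s=5.0)
# A zero budget stops immediately and flags the (empty) result as partial.
partial, partial_complete = answers_within(range(10), is_even, budget_s=0.0)
print(full, full_complete, partial, partial_complete)
```

Returning the completeness flag together with the hits lets an application show early results at once and decide later whether a longer search is worthwhile.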
Another difference between the professional users addressed by FOCUS K3D and the variety
of possible users of the web is related to the freedom in documenting data. On the web,
control over the annotation becomes crucial to an effective retrieval and reuse of very
specialised data. Conversely, the web evolves in an organic way, and consequently it is
fundamental to leave people free to find new ways to exploit relationships among data, which
will produce new semantic data we did not even think of before. It was suggested to leave
room for the social mechanism in the documentation of 3D data, and to learn from the way it
happened on the web, because the additional value in the retrieval of results lies in the
combination of results coming from different kinds of annotation. Actually this is one of the
approaches we mentioned to tackle the massive annotation issue: in fact, the way the web
evolves today is extremely interesting beyond the technology and is something new for us,
as classical scientists, since we were not trained in such an approach.
Summing up, we can conclude that it was confirmed by the audience that the grand
challenges proposed by FOCUS K3D are perceived as the major issues of the future of
semantic 3D media. Effective ways to extract and represent knowledge for an effective
understanding and interaction with the model are necessary together with a standardisation of
the terminology and the documentation itself. Effective ways to visualise data and
metadata also have to be investigated: the best way to categorise non-textual data may not be
the text that we are used to now, as there could be better representations. Accomplishing these
goals would benefit the sharing, the transfer, the retrieval and reuse of semantic data in the
different contexts. Due to the explosion of data of different nature, all the application domains
emphasised the urgency of handling a massive amount of heterogeneous and uncertain data,
which indeed proves the high potential of such research themes in terms of technological and
economic impact on different user communities.
(Abaci et al. 2005) T. Abaci, J. Ciger and D. Thalmann, Planning with Smart Objects.
International Conferences in Central Europe on Computer Graphics, Visualisation and
Computer Vision, 2005.
(AhDa2008) L. von Ahn and L. Dabbish, Designing Games With A Purpose, Communications of
the ACM, 51(8), pp. 58-67, 2008.
(AIM@SHAPE) AIM@SHAPE: Advanced and Innovative Models And Tools for the development
of Semantic-based systems for Handling, Acquiring, and Processing knowledge Embedded
in multidimensional digital objects, European Network of Excellence, Key Action: 126.96.36.199
Semantic-based knowledge systems, VI Framework, Contract IST 506766, URL:
(Attene et al. 2009) M. Attene, F. Robbiano, M. Spagnuolo, B. Falcidieno, Characterisation of
3D Shape Parts for Semantic Annotation. Computer-Aided Design, Vol. 41, No. 10, pp.
(Blanz et al. 99) V. Blanz, M.J. Tarr and H.H. Bülthoff, What object attributes determine
canonical views? Perception 28, pp. 575–599, 1999.
(Bouvier et al. 2009) B. Bouvier, R. Grunberg, M. Nilges, and F. Cazals, Shelling the Voronoi
interface of protein-protein complexes reveals patterns of residue conservation, dynamics
and composition. Proteins: structure, function, and bioinformatics, 76 (3), 2009.
(Bustos et al. 2007) B. Bustos, D. Keim, D. Saupe, T. Schreck, Content-based 3D object
retrieval, IEEE Computer Graphics and Applications (CG&A), Vol. 27, No. 4, 2007.
(Byrne et al. 2008) D. Byrne, A.R. Doherty, C.G.M. Snoek, G.F. Jones, A.F. Smeaton,
Validating the Detection of Everyday Concepts in Visual Lifelogs. SAMT 2008 – 3rd
International Conference on Semantic and Digital Media Technologies, Koblenz, Germany,
3-5 December 2008.
(Charbonnier et al. 2007) C. Charbonnier, B. Gilles, N. Magnenat-Thalmann, A Semantic-
Driven Clinical Examination Platform. In Surgetica'2007, Computer-Aided Medical
Interventions: Tools and Applications, Chambéry, France, pp. 183-189, September 2007.
(Charbonnier et al. 2009) C. Charbonnier, L. Assassi, P. Volino, N. Magnenat-Thalmann, Motion
Study of the Hip Joint in Extreme Postures. Vis Comput, Springer-Verlag, Vol. 25, No. 9, pp.
(DeFloriani 2009) L. De Floriani, Computing and Visualizing a Graph-Based Decomposition for
Non-manifold Shapes. LNCS Graph-Based Representations in Pattern Recognition, 2009.
(Euzenat2002) J. Euzenat, Research Challenges and Perspectives of the Semantic Web. IEEE
Intelligent Systems, vol. 17, no. 5, pp. 86-88, Sep./Oct. 2002.
(Falcidieno et al. 2004) B. Falcidieno, M. Spagnuolo, P. Alliez, E. Quak, E. Vavalis, C. Houstis,
Towards the semantics of digital shapes: the AIM@SHAPE approach. Proc. of the European
Workshop on the Integration of Knowledge, Semantics and Digital Media Technology
(EWIMT2004), P. Hobson, E. Izquierdo, I. Kompatsiaris and N.E.O'Connor Eds., pp. 1-4,
(Fut09) Timeline for the Future: Potential Developments and Likely Impacts. The Futurist,
Volume 43, No.2, p: 33-38, March-April 2009, ISSN 0016-3317.
(Gilles et al. 2004) B. Gilles, R. Perrin, N. Magnenat-Thalmann, J-P. Vallee, Bones motion
analysis from dynamic MRI: acquisition and tracking. Proc. of MICCAI'04, Vol. 2, pp. 942-
(GoFu2009) A. Golovinskiy, T. Funkhouser, Consistent Segmentation of 3D Models. Computers
& Graphics, IEEE SMI 2009 proceedings, (33)3, pp. 262-269, June 2009.
(Golshani06) F. Golshani, Multimedia and Reality. IEEE Multimedia, Vol. 13, N. 1, pp. 96-96,
(Guharoy et al. 2010) M. Guharoy, J. Janin, C. Robert, Biological macromolecules: from static
structure to flexible partner. Semantic 3D Media and Content, FOCUS K3D conference, 2010.
(Gutierrez 2005) M. Gutierrez, Semantics-based representation of virtual environments. Int. J.
of Computer Applications in Technology, Vol. 23, 2005.
(HaFe2007) S. Havemann, D.W. Fellner, Seven Research Challenges of Generalised 3D
Documents. IEEE Computer Graphics and Applications, vol. 27, no. 3, pp. 70–76, 2007.
(Havemann et al. 2008) S. Havemann, V. Settgast, R. Berndt, O. Eide, D. Fellner, The Arrigo
Showcase Reloaded – towards a sustainable link between 3D and semantics. VAST:
International Symposium on Virtual Reality, Archaeology and Intelligent Cultural Heritage
proceedings, pp 125-132, 2008.
(Heitbrink 2008) M. Heitbrink, Multimodality (MM) imaging: Enhancement through interlacing.
Vision paper commissioned by the boards of the NVvR and NVNG, Eur. J. Nucl. Med. Mol.
Imaging, 35:pp. 1221-1229, 2008.
(Höhne 1995) K.H. Höhne, A new representation of knowledge concerning human anatomy
and function. Nature medicine. 1995.
(ImMo2005) C. Imielinska, P. Molholt, Incorporating 3D virtual anatomy into the medical
curriculum. Communications of the ACM, v.48 n.2, 2005.
(KaTh1999) M. Kallmann, D. Thalmann, Direct 3D interaction with Smart Objects. VRST'99.
Proceedings of the ACM Symposium on Virtual Reality Software and Technology, 1999.
(Koller et al. 2004) D. Koller, M. Turitzin, M. Levoy, M. Tarini, G. Croccia, P. Cignoni, R.
Scopigno, Protected interactive 3D graphics via remote rendering. ACM Trans. Graph. 23, 3,
(Koller et al. 2009) D. Koller, B. Frischer, G. Humphreys, Research challenges for digital
archives of 3D cultural heritage models. J. Comput. Cult. Herit. 2, 3, pp. 1-17, Dec. 2009.
(Lafortune et al. 1992) M.A. Lafortune, P.R. Cavanagh, H.J. Sommer, A. Kalenak, Three
dimensional kinematics of the human knee during walking. J Biomech, 25(4):pp. 347-357,
(Lipinski et al. 2007) C.A. Lipinski, F. Lombardo, B.W. Dominy, P.J. Feeney, Experimental and
computational approaches to estimate solubility and permeability in drug discovery and
development settings. Adv. Drug Del. Rev. 23: pp. 3–25, 2007.
(Lou et al. 2009) R. Lou, F. Giannini, J.P. Pernot, A. Mikchevititch, B. Falcidieno, P. Veron, R.
Marc, Towards CAD-less finite element analysis using group boundaries for enriched
meshes manipulation, Proceedings of the ASME 2009 International Design Engineering
Technical Conferences & Computers and Information in Engineering Conference IDETC/CIE,
San Diego, CA, US, August 30 – September 2, 2009.
(Mortara et al. 2004) M. Mortara, G. Patané, M. Spagnuolo, B. Falcidieno, J. Rossignac,
Plumber: a method for a multi-scale decomposition of 3D shapes into tubular primitives
and bodies. Proceedings of Solid Modeling and Applications (Poster Session), pp. 339–44,
(MoSp09) M. Mortara, M. Spagnuolo, Semantics-driven best view of 3D shapes. Computers &
Graphics Volume 33, Issue 3, pp. 280-290, June 2009.
(Nestlea 2006) U. Nestlea, Practical integration of [18F]-FDG-PET and PET-CT in the planning
of radiotherapy for non-small cell lung cancer (NSCLC):The technical basis, ICRU-target
volumes, problems, perspectives. Radiotherapy and Oncology, Volume 81, Issue 2, pp.
(Ogiela2009) M. R. Ogiela, Picture grammars in classification and semantic interpretation of 3D
coronary vessels visualisations. Opto-Electronics Review, 2009.
(PaFi2005) C. Paris and M. Fiedler Eds., Protection of stored data by system level encryption
schemes. EuroNGI Deliverable D.WP.JRA.6.3.5, 2005.
(Rochelle 2004) L. Rochelle, Morphology and Lexical Semantics. Cambridge University Press,
(Rosset 2006) A. Rosset, Navigating the Fifth Dimension: Innovative Interface for
Multidimensional Multimodality Image Navigation. Radiographics, 26(1): pp. 299-308, 2006.
(Smeulders00) A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain, Content-Based
Image Retrieval at the End of the Early Years, IEEE Transactions on Pattern Analysis and
Machine Intelligence, Vol. 22, N. 12, pp. 1349-1380, 2000.
(SpFa2007) M. Spagnuolo, B. Falcidieno, The Role of Ontologies for 3D Media Applications.
Semantic Multimedia and Ontologies, Part III, Chapter 7, pp. 185 - 205. Y. Kompatsiaris, P.
Hobson (eds.), London Springer, 2007.
(SpFa2009) M. Spagnuolo, B. Falcidieno, 3D Media and the Semantic Web, IEEE Intelligent
Systems, Ed. S. Staab, pp. 1-8, March/April 2009.
(Spitzer et al. 1996) V. Spitzer, M.J. Ackerman, A.L. Scherzinger, D. Whitlock, The Visible
Human Male: A technical report, JAMIA 3, 2, 1996.
(TaVe2008) J.W.H. Tangelder, R.C. Veltkamp, A survey of content based 3D shape retrieval
methods. Multimedia Tools and Applications, volume 39, 441-471, 2008.
(Theodoridou et al. 2008) M. Theodoridou, Y. Tzitzikas, M. Doerr, Y. Marketakis, V.
Melessanakis, Modeling and Querying Provenance using CIDOC CRM.
http://www.casparpreserves.eu/Members/metaware/Papers, December 2008
(Verykios et al. 2004) V.S. Verykios, E. Bertino, I.N. Fovino, L.P. Provenza, Y. Saygin, Y.
Theodoridis, State-of-the-Art in privacy preserving data mining. ACM SIGMOD Rec. 3, 1, pp.