Are Data Models Superfluous Nov2003
This article advocates that information storage requirements should not be expressed in the form of data models or conceptual schemas, but database structures should allow for any expression in a general purpose language, whereas implementation constraints should be expressed as constraints on the use of the general purpose language.


Gellish
A standard data and knowledge representation language and ontology

Are Data Models becoming Superfluous?

by Ir. Andries van Renssen
Shell Global Solutions International
Andries.vanRenssen@shell.com

Abstract

Data storage and data communication lack a common standard universal data model as well as a common data language and knowledge base with a taxonomy of concepts and a grammar for data exchange messages. This article presents a solution to this problem in the form of the new Gellish language and knowledge base, as an extension of the standard data models and ontology of two new ISO standards. The article presents Gellish as a language for neutral data exchange between systems that can replace data models and that provides an extendable ontology with standard reference data for customization and harmonization of systems. The definition of Gellish includes the public domain (“open data”) Gellish knowledge base with definitions of a large number of concepts and product models. The article illustrates that a single Gellish Table in a database or data exchange file is sufficient to express a wide range of kinds of facts about classes as well as facts about individual objects.

Keywords: knowledge representation, data exchange, language, data models, standards, ontology, semantic web, knowledge base, classification system

Knowledge versus Data Models 1 13/04/2010
Introduction

Currently, each software system stores its data using its own data model and usually communicates with other systems through a dedicated interface data structure, which means that it applies a dedicated interface data model. The large variety of data models makes data exchange between systems costly, because the data must be converted from the semantics of one data model to those of the other. This demonstrates the urgent need for widely applicable common standard data models.

Often systems can be ‘customized’ by adding ‘reference data’ as instances, such as the definition of equipment types, document types, activity types, property types, pick lists, etc. However, reference data usually differ per implementation, even when the database structures are equal, as is the case with several implementations of the same system, such as a CAD, CAE, PDM, PLM, ERP or CRM system. The consequence is that data in those implementations still cannot be compared, integrated or exchanged without costly data conversion processes. This illustrates the urgent need for a common dictionary, classification system or taxonomy of reference data, because there is currently no standard user data language.

In the current systems there is a separation between the world of data models and the world of instances. Data models are developed by IT specialists (data modelers) who document them using either proprietary tools or a standard data modeling language, such as EXPRESS (ISO 10303-11) or UML, languages that are designed specifically to define data models. Once a data model is defined in such a language, the data model acts as another language in which the reference data as well as the user data have to be expressed. The use of two different languages, one for the model and one for the user data, illustrates the barrier between the two worlds.
It is as if the definition of the English language were expressed in Chinese. On top of this, each programmer and each reference data producer is free to define their own terminology using those data definition languages! The result of the current state of the art is that data storage is done in a Babylonian mix of data models and reference data ‘languages’, with the consequence that exchange of data between systems is impossible, except where dedicated bilateral translators are created not only for each pair of data models, but also for the data content ‘languages’.

The current situation is sketched by Smith and Welty (2001) as follows: “Out of the apparent chaos, some coherence is beginning to emerge. Gradually, computer scientists are beginning to recognize that the provision, once for all, of a common, robust reference ontology – a shared taxonomy of entities – might provide significant advantages over the ad-hoc, case-by-case methods previously used”.

Several attempts have been made to develop an ‘upper ontology’, such as SUMO by Niles and Pease (2001), the IEEE Standard Upper Ontology, SUO (2001), the Cyc ontology, Lenat (1995) and GOL, Degen et al (2001). However, none of them integrates the upper level ontology with a lower level ontology of reference data. In other words, they do not integrate a generic data model with reference data and a language for the description of knowledge and of individual objects and processes.

This article presents a solution to the above-mentioned issues in the form of the Gellish language. Gellish satisfies the criteria for proper ontologies as expressed by Degen et al (2001 par 6.1), but is not limited to an upper ontology. It includes and extends concept definitions that also appear in other sources such as ISO standards and IEC standards, and knowledge stemming from industry standards and proprietary sources. It is extendable just as any natural language.
Its taxonomy and knowledge base use unique identifiers for concepts, thus allowing for synonyms and multiple names in various languages. The latter enables the expression of propositions about facts in one natural language and automatic translation and presentation in any other natural language.

Gellish eliminates the traditional barrier between the data model definitions of classes and the data instances. The Gellish language demonstrates that this barrier is not necessary and that there are clear advantages when class definitions, reference data and user data are expressed in one and the same language.

Standard data models, ontologies and reference data

There are several developments of standard lower level ontologies and reference data libraries, stimulated among others by requirements of the e-commerce ‘market places’ and the developments around The Semantic Web promoted by Lee et al (2000) and the Web Ontology Language OWL. Examples are the UNSPSC code (http://www.unspsc.org/), Ecl@ss (http://www.eclass.de/), Trade Ranger (http://www.trade-ranger.com/EN/Pages/ContentStandards.asp), etc. These standards have their value mainly in the standardization of terminology, but do not provide a standard language or a standard data model for general use. Their semantic expression power is limited, because they apply only a few relation types and lack integration with a rich upper ontology.

There have also been several attempts to develop standard data models for data exchange or for data storage. Some of them are proprietary, but others are in the public domain. Those standard data models are defined independent of a particular system, and are therefore called ‘neutral’. They are usually developed for a particular application domain instead of being limited to a particular system.
Examples of standard data models are the STEP family of standards in ISO 10303, such as a graphics data model AP203, a data model for the automotive industry (AP214), one for piping systems (AP227), one under development for the defense industry (AP239, PLCS), etc. The integration of all those data models into one overall data model is not yet fully achieved. Although the scopes of these valuable standard data models are wide, they are still limited to particular application areas and do not yet provide a general ‘common language’.

A further step towards a data model with a generic scope was the development and publication of the Epistle Core Data Model (2001), in which development the author of this article participated. From that, two new ISO standards were derived, ISO 15926-2 and its counterpart within the STEP family (AP221). Although these generic data models stem from the process industries, they have the generic nature of an upper level ontology, which makes them applicable in other application domains as well.

To become practically applicable in a particular application domain, these generic data models need a standard ‘reference data library’ or lower level ontology, in order to add standard definitions of application domain specific concepts and to specialize the generic data model. The author coordinated the development of such a standard reference data library, called STEPlib. This is a main source for the common standard library ISO 15926-4. Then it was discovered that the top of the specialization hierarchy of standard data in the library coincided with the entities, attributes and relations in the generic data model. This led to the inclusion of the data model in the library. In other words, the upper level ontology was combined with a lower level ontology.

The insight that information should be contained in relations and not in objects led to the birth of the Gellish language, which is based on standard relation types, expressed by natural language ‘phrases’.
The Gellish language and ontology

Gellish is a public domain standard data and knowledge representation language and ontology that is defined in STEPlib. It does not have the barrier between the user data and the IT data model data. It contains the concepts of the above-mentioned generic data models and integrates and extends them with standard reference data and a knowledge base with product and process models. The ontology also includes the definition of a large number of standard fact types (or relation types) that define the grammar of the Gellish language. It contains the definition of over 20,000 concepts arranged in a specialization hierarchy of classes. These concepts can be interpreted as entity types, attribute types and relationship types or as a classification system or taxonomy. This makes Gellish equivalent to a very large data model. In addition, STEPlib contains a large number of relations between the concepts. They define the content of the knowledge base of product models and process models.

Gellish is not object oriented, but fact oriented. The basic Gellish object is therefore a fact. Each (atomic) fact is expressed as a relation between (two) objects. For example, fact 1 is expressed by a particular relation between objects with unique identifiers (UID’s) 100 and 101. This expression (1, 100, 101) illustrates the structure of each basic Gellish expression. Gellish requires that both the objects and the fact are classified explicitly by standard classes, including standard relation types. The standard classes are predefined in the Gellish ontology. In addition, objects may have a name. This enables software to interpret the expression correctly.

Gellish and the above-mentioned ISO standards are both based on the understanding that there appears to exist a limited set of application independent standard relation types that are sufficient to model all kinds of products and processes.
Gellish standardizes these relation types. The relation types also define the role types that the related objects play in the relations with each other. The variety and extendibility of standard relation types define the semantic expression capabilities of Gellish. A large part of the Gellish relation types is defined in the ISO standards and an extended set is defined in the TOP part of the Gellish language definition (STEPlib).

A standard implementation of Gellish is defined as a Gellish Table. In a Gellish Table the basic Gellish expression becomes:

Left hand   Left hand    Fact  Relation  Relation type  Right hand  Right hand
object UID  object name  UID   type UID  name           object UID  object name
100         thing-1      1     2850      is related to  101         thing-2

In a Gellish Table one (atomic) fact is represented by one record, being a relation between two object UID’s, together with the names of the objects and the classification of the fact. The classification of the objects is done via separate classification facts in additional records.

Some examples of facts from a particular application domain, which illustrate the use of standard Gellish relation types, are:

Left hand   Left hand      Fact  Relation  Relation type           Right hand  Right hand       Scale
object UID  object name    UID   type UID  name                    object UID  object name
130091      diesel engine  2     1146      is a specialization of  130108      engine
104         M-1            3     1225      is classified as a      130091      diesel engine
130802      cylinder       4     1146      is a specialization of  730063      artifact
107         C-1            5     1225      is classified as a      130802      cylinder
107         C-1            6     1190      is part of              104         M-1
107         C-1            7     1727      has aspect              108         volume of C-1
108         volume of C-1  8     1225      is classified as a      550140      internal volume
108         volume of C-1  9     2044      is quantified as        922235      1800             cm3
104         M-1            10    4760      is subject of           110         order-1

Note: for human readability, the relation type UID is omitted from the tables below.

The above table illustrates:
- Standard Gellish relation types, which classify the facts and determine the expression capabilities and semantics of Gellish.
- Examples from the large number of standard object types that are predefined in Gellish, for example: engine, diesel engine, cylinder, artifact, internal volume, 1800 and cm3.
- The way in which new object types can be added, as in facts 2 and 4. Diesel engine and cylinder already exist in Gellish, but if they had not existed, they could have been added in this way.
- It is possible in Gellish to express facts, such as the volume of C-1, without the need for such a fact to be pre-modeled in the data model. Such a fact type can nevertheless be defined in Gellish, after which this particular instance can be verified against that definition. It can also be defined to be obligatory in a particular context, after which the instances can be validated on completeness and compliance.
- One table is suitable to express many kinds of facts.

Note: The table above presents just an example of some of the capabilities of Gellish. For example, Gellish also allows expressing in which language the facts are expressed, whether the objects are real or imaginary, what the communicative intent is, who the author of a proposition is and who the addressee is, etc.

Storage and exchange of data as well as semantics in Gellish

In this section I will describe how knowledge, data and semantics are represented in Gellish.
The generic nature of Gellish allows expressing any complex network of facts. For example it allows expressing that:
- physical objects (of any kind) have properties (of any kind),
- properties have values,
- physical objects have parts,
- physical objects participate in activities or processes in particular roles,
- etc.

But for clarity I will use a specific example, being the fact that:
- a particular pump (‘P-1’) is pumping a particular stream (‘S-1’).
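As a rough illustration of how one table can hold facts of several kinds (classification, composition, possession of an aspect), the sketch below stores rows from the earlier M-1 / C-1 example as plain tuples and filters them by relation type UID. The `Fact` tuple, its field names and the helper `facts_about` are hypothetical names chosen for this sketch; they are not part of any Gellish software.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One row of a Gellish-style table (field names are illustrative)."""
    left_uid: int
    left_name: str
    fact_uid: int
    rel_type_uid: int
    rel_type_name: str
    right_uid: int
    right_name: str


# Rows copied from the example table about engine M-1 and cylinder C-1.
TABLE = [
    Fact(104, "M-1", 3, 1225, "is classified as a", 130091, "diesel engine"),
    Fact(107, "C-1", 5, 1225, "is classified as a", 130802, "cylinder"),
    Fact(107, "C-1", 6, 1190, "is part of", 104, "M-1"),
    Fact(107, "C-1", 7, 1727, "has aspect", 108, "volume of C-1"),
    Fact(108, "volume of C-1", 8, 1225, "is classified as a", 550140, "internal volume"),
]


def facts_about(table, uid, rel_type_uid=None):
    """Return the rows whose left hand object is `uid`, optionally of one relation type."""
    return [
        f for f in table
        if f.left_uid == uid and (rel_type_uid is None or f.rel_type_uid == rel_type_uid)
    ]


# Different kinds of questions are answered from the same table.
parts_of_m1 = [f.left_name for f in TABLE if f.rel_type_uid == 1190 and f.right_uid == 104]
c1_class = facts_about(TABLE, 107, 1225)[0].right_name
print(parts_of_m1, c1_class)
```

The point of the sketch is only that no table per entity type is needed: the relation type column carries the distinction that separate tables would carry in a conventional design.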
In a conventional database it is required to declare some entity types and attribute types that define the semantics in the form of a data model. In case of the example, the data model could for example consist of the entity types ‘pump’, ‘process’ and ‘stream’, each with some attributes.

In Gellish, the concepts ‘pump’, ‘process’ and ‘stream’ are not entity types, but they are concepts that are defined via facts that are expressed as relations in a generic knowledge base. The knowledge base has a structure that only ‘knows’ the minimum number of ‘basic semantic axioms’ and contains the definition of a large number of concepts. The minimum set of ‘basic semantic axioms’ comprises the fundamental ontological concepts of Gellish that should be known and understood and which are sufficient for the definition of additional semantic concepts.

For the definition of a new concept it is required to define a coherent set of elementary facts, expressed as relations between the new concept and the existing concepts. In other words, each new concept requires the creation of a structure as presented in figure 1.

[Figure 1 diagram: object-1, role-1, relation-1, role-2 and object-2, connected by ‘plays’, ‘played by’ and ‘requires’ links, each classified (‘is a’) by a kind of thing.]

Figure 1, Basic semantic concepts

The minimum set of ‘basic semantic concepts’ that are the axioms of Gellish and whose meaning should be understood is:
- anything
- role
- relation / relations
- plays role
- requires role
- is / is a (is classified as a)
- individual thing / individual things
- kind of thing / kind of things
- single thing / plural thing

The structure of figure 1 holds for facts about classes as well as facts about individual objects (instances) or relations, but also for single objects as well as for plural objects. In other words, object-1 and object-2 in figure 1 can be either a single or plural individual object, relation or class.

The lines in the top left corners of the boxes indicate that the structure is a typical instance.
Any other ‘atomic fact’ is expressed as such a structure. In other words, any atomic fact is expressed as an ‘atomic relation’ between two or more ‘objects’ and by the classification of the ‘objects’, the ‘roles’ and the ‘relation’. This implies that an atomic fact is expressed by a structure of nine (9) relations, formed by the blue boxes in figure 2 (note that 4 of the 5 boxes appear twice in an atomic fact).

For example the fact that impeller O1 is part of centrifugal pump O2 is expressed in Gellish by the following 4 elementary relations:
- O1 plays role R1
- R1 is required by C1
- C1 requires role R2
- R2 is played by O2

These 4 relations relate 5 objects. To interpret them correctly the following 5 additional classification relations are required:
- O1 is classified as an impeller
- R1 is classified as a part
- C1 is classified as a composition relation (“is part of”)
- R2 is classified as a whole
- O2 is classified as a centrifugal pump

In practical implementations it appears that the explicit identification of the roles and their classification can be neglected, because they follow from the classification of the relation and the definition of the relation type. Therefore the above relations are usually summarized in 3 Gellish atomic expressions as follows:
- O1 is classified as an impeller
- O1 is part of O2
- O2 is classified as a centrifugal pump

From this example it can be seen that the 5 kinds of things with which the 5 objects are classified need to be present in or added to the semantics of the Gellish knowledge base in order to ensure that the fact can be interpreted correctly.

The awareness that a knowledge base of predefined concepts is required for a correct interpretation of Gellish expressions resulted in the development of the top-down hierarchical definition of the Gellish knowledge base of concepts, including also relation types, as available in STEPlib.
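The relationship between the summarized form and the full structure of nine relations can be sketched in code. The following is a minimal illustration, assuming that the role types of a relation type can be looked up from its definition; the `ROLE_TYPES` dictionary, the function `expand` and the generated identifiers `R1`, `C1` and `R2` are assumptions made for this sketch only.

```python
# Role types implied by a relation type. Only the composition relation
# from the impeller example is listed; the dictionary is illustrative.
ROLE_TYPES = {"is part of": ("part", "whole")}


def expand(left, left_class, relation_type, right, right_class):
    """Expand one summarized expression into 4 elementary and 5 classification relations."""
    role1_type, role2_type = ROLE_TYPES[relation_type]
    elementary = [
        (left, "plays role", "R1"),
        ("R1", "is required by", "C1"),
        ("C1", "requires role", "R2"),
        ("R2", "is played by", right),
    ]
    classification = [
        (left, "is classified as a", left_class),
        ("R1", "is classified as a", role1_type),
        ("C1", "is classified as a", relation_type),
        ("R2", "is classified as a", role2_type),
        (right, "is classified as a", right_class),
    ]
    return elementary + classification


relations = expand("O1", "impeller", "is part of", "O2", "centrifugal pump")
for relation in relations:
    print(*relation)
```

Because the role types follow from the relation type alone, the expansion needs no information beyond what the three summarized expressions already carry, which is why the roles can be left implicit in practice.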
Knowledge representation: relations between classes

Any fact type that extends the semantics is expressed as a relation between kinds of things. For example, assume that the concept ‘centrifugal pump’ needs to be added. Then the following two atomic relations define that concept:
1. A specialization relation that defines that: centrifugal pump is a specialization of pump
2. A relation that defines that a centrifugal pump by definition uses the centrifugal principle: centrifugal pump has by definition as aspect centrifugal.

These relations build respectively on the definition of the concept ‘pump’ and ‘centrifugal’.
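A small sketch of how such a definition might be recorded, under the assumption that a new concept is accepted only when every concept it refers to is already known. The `concepts` set, the `facts` list and the function `add_concept` are hypothetical and serve only to illustrate the two defining relations above.

```python
# Concepts assumed to be present already; 'pump' and 'centrifugal' are the
# two concepts the definition in the text builds on.
concepts = {"anything", "pump", "centrifugal"}
facts = []


def add_concept(name, supertype, defining_facts=()):
    """Add a concept through a specialization relation plus optional defining relations."""
    referenced = [supertype] + [other for _, other in defining_facts]
    for concept in referenced:
        if concept not in concepts:
            raise ValueError(f"unknown concept: {concept}")
    concepts.add(name)
    facts.append((name, "is a specialization of", supertype))
    for relation_type, other in defining_facts:
        facts.append((name, relation_type, other))


add_concept(
    "centrifugal pump",
    "pump",
    [("has by definition as aspect", "centrifugal")],
)
print(facts)
```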
Interpretation of expressions

In current database technology the semantic interpretation of an expression is done via the fact that any object is implicitly classified by being an ‘instance’ of an entity of which the semantics are defined. For example, assume that P1 is an instance of an attribute called ‘name’ of an entity called ‘pump’. This probably means that P1 is the name of a thing that is classified as a pump, although this meaning comprises two facts that are usually not defined in a computer interpretable way. It should be noted that if there are no other attributes, this data structure does not allow the classification of P1 as a centrifugal pump.

In Gellish all semantics is made explicit by the creation of explicit classification relations between the elements in the expression and the Gellish concepts (classes of objects, including relations). This replaces the instantiation relations and eliminates the need to define a data model with entities and attributes, such as the entity ‘pump’ and the attribute ‘name’. This is illustrated in figure 2.

[Figure 2 diagram: the individual objects ‘P-101’ (111), ‘pumping S-1’ (112) and ‘S-1’ (113) and the relations ‘is performer of pumping S-1’ (11) and ‘is subject in pumping S-1’ (12) are each linked by an ‘is classified as a’ relation (among them facts 13, 14 and 15) to a concept in the green shaded area, which represents the Gellish ontology (STEPlib): pump (130206), is performer of, liquid stream (730083), is subject in, pumping (192512).]

Figure 2, Linking a Gellish expression to Gellish concepts through classification

Figure 2 illustrates the expression: “P-101 is pumping S-1” (in dark yellow). The ‘pumping S-1’ process is an interaction between the fluid S-1 and the pump P-101. The pump has the role of performer and the liquid has the role of subject in the pumping process.
The blue boxes in the green shaded area represent the Gellish concepts, being instances in the Gellish knowledge base, STEPlib. The explicit classification relations with the concepts in those blue boxes provide the semantics for the interpretation of the expression. In a Gellish Table this becomes:

Left hand   Left hand    Fact  Relation type       Right hand  Right hand
object UID  object name  UID   name                object UID  object name
111         P-101        11    is performer of     112         pumping S-1
113         S-1          12    is subject in       112         pumping S-1
111         P-101        13    is classified as a  130206      pump
112         pumping S-1  14    is classified as a  192512      pumping
113         S-1          15    is classified as a  730083      liquid stream

Such a set of rows in a Gellish Table can be exchanged between Gellish enabled software packages in any kind of table, such as an MS-Access database table, an Oracle or DB2 table, an XLS spreadsheet, an XML file (e.g. according to ISO 10303-28) or in STEP physical file format (ISO 10303-21). Further details are described in ref. 1.

Note that the shaded light yellow boxes all have the same name: “is classified as a”. However, they are different individual classification relations. Each of those relations has a unique identifier (13, 14 and 15). The name in the shaded box indicates that each is (implicitly) “conceptualized” to be a classification relation. In other words, each of them is an “is classified as a” relation.

For a correct interpretation of the Gellish concepts they need to be defined in a computer interpretable way. This is done via specialization/generalization relations as is illustrated in figure 3. These specialization relations form one hierarchical network terminating at the top, called ‘anything’. This generic top supports the wide applicability of Gellish, as any missing concept can be added to Gellish as a subtype of an existing concept.
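To make the role of the classification rows concrete, the sketch below takes the five table rows for “P-101 is pumping S-1” (facts 11 to 15) and uses the classification facts to annotate the two remaining facts with the kinds of the objects involved. The tuple layout and the function names `class_of` and `interpret` are assumptions of this sketch, not a prescribed Gellish interface.

```python
# Rows of the Gellish Table for "P-101 is pumping S-1":
# (left UID, left name, fact UID, relation type name, right UID, right name)
ROWS = [
    (111, "P-101", 11, "is performer of", 112, "pumping S-1"),
    (113, "S-1", 12, "is subject in", 112, "pumping S-1"),
    (111, "P-101", 13, "is classified as a", 130206, "pump"),
    (112, "pumping S-1", 14, "is classified as a", 192512, "pumping"),
    (113, "S-1", 15, "is classified as a", 730083, "liquid stream"),
]
CLASSIFICATION = "is classified as a"


def class_of(rows, uid):
    """Look up the kind of an individual object via its classification fact."""
    for left_uid, _, _, relation, _, right_name in rows:
        if left_uid == uid and relation == CLASSIFICATION:
            return right_name
    return None


def interpret(rows):
    """Render every non-classification fact with the kinds of both objects."""
    sentences = []
    for left_uid, left_name, _, relation, right_uid, right_name in rows:
        if relation != CLASSIFICATION:
            sentences.append(
                f"{left_name} ({class_of(rows, left_uid)}) {relation} "
                f"{right_name} ({class_of(rows, right_uid)})"
            )
    return sentences


for sentence in interpret(ROWS):
    print(sentence)
```

Nothing in the code knows what a pump or a stream is; the interpretation comes entirely from rows 13 to 15, which is the point the explicit classification relations make.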
[Figure 3 diagram: within the green area (the Gellish ontology), the concepts pump, ‘is performer of’, liquid stream, ‘is subject in’ and pumping are connected through ‘is a specialization of’ relations (subtype to supertype) to physical object, relation and activity, and from there to the top concept ‘anything’. The individual objects ‘P-101’, ‘S-1’ and ‘pumping S-1’ and the relations ‘performer of pumping S-1’ and ‘subject in pumping S-1’ are linked to the concepts through ‘is classified as a’ relations.]

Figure 3, Definition of Gellish concepts in a specialization hierarchy

In practice there are several intermediate levels of specialization between e.g. ‘pump’ and ‘physical object’ and ‘anything’, etc. Furthermore there are classes of physical objects defined as subtypes of ‘physical object’. These can be extended by specializations, such as standard components (e.g. from ASME, BSI or DIN standards) and also specializations such as manufacturer catalogue items (e.g. manufacturer models and types).

Figure 3 contains eight facts expressed as eight “is a specialization of” relations, each of which is a separate relation between classes. Similarly to what is described above about the “is classified as a” relation, this illustrates that the term ‘is a specialization of’ is not the name of each of those relations, but a name of the Gellish concept (the class) that is the conceptualization of those relations.

The knowledge about the meaning of the concepts pump, ‘is performer of’, liquid stream, ‘is subject in’ and pumping is defined in the Gellish ontology STEPlib. Some of that is illustrated in the following facts, which include some intermediate facts not shown in figure 3 (the UID’s and names are taken from STEPlib, except for the UID’s of the facts):

Left hand   Left hand        Fact  Relation type           Right hand  Right hand
object UID  object name      UID   name                    object UID  object name
130206      pump             16    is a specialization of  730044      physical object
4761        is performer of  17    is a specialization of  4767        is involved in
4761        is performer of  18    requires as role-1 a    640020      performer
730044      physical object  19    can have as role as a   640020      performer
4761        is performer of  20    requires as role-2 a    4773        involver
730083      liquid stream    21    is a specialization of  730045      stream
4760        is subject in    22    is a specialization of  4767        is involved in
192512      pumping          23    is a specialization of  190168      process

This knowledge is inherited from higher concepts in the hierarchy to lower level concepts. If an individual object is classified to be of such a class, then the knowledge is applicable to the individual object as a constraint for the specific aspects of the individual object.

Experiences and applications

Gellish is applied to express
- information about individual objects,
- knowledge about kinds of objects,
- requirements for data and documents in particular contexts about individual objects and about kinds of objects.

These three applications are related to each other, as is illustrated in Figure 4.
[Figure 4 diagram, Product / Requirements / Knowledge models: three columns headed Product Model (‘has’ / ‘is’), Requirements Model (‘shall have a’ / ‘shall be a’, in the context of a ...) and Knowledge Model (‘can have a’ / ‘can be a’). Labels visible in the diagram: Dongting, SGP, Coal gasification facility, U-1300, K-1301 system, K-1301, LubOil-100, SHELLlib, DEP xxx, STEPlib, compressor, luboil system, capacity, ‘shall comply with’, ‘is classified as a’. Copyright: Shell Global Solutions International B.V.]

Figure 4, Three types of Gellish Models

The left hand side of Figure 4 represents a Product Model that illustrates a Gellish model of a process plant (the thick black lines represent composition relations). The relation types in a product model generally start with ‘is’ or ‘has’. For example, K-1301 system is part of U-1300 and K-1301 is classified as a compressor.

The right hand Knowledge Model illustrates the content of the STEPlib knowledge base. The relation types in a knowledge model generally start with ‘can be a’ or ‘can have a’. For example, a compressor can have a capacity and a lubrication oil system can be part of a compressor.

The middle part of Figure 4 illustrates a proprietary Requirements Model that expresses which data has to be present in a particular context. The relation types in a requirements model generally start with ‘shall be a’ or ‘shall have a’. For example, we developed requirements models that express that in the context of ‘handover’ of data from design to operations a compressor shall have a capacity and, in the same context, a compressor shall be compliant with design guide xx. This is expressed in Gellish as follows:

130069      compressor  24  shall have a             551564   capacity
130069      compressor  25  shall be compliant with  5490386  DEP 31….

When data about a compressor is handed over, this Gellish specification makes it possible to do an automated verification of the completeness of that data, driven by the requirements model. This is illustrated in figure 5.
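A completeness check driven by a requirements model could look roughly like the sketch below. Only the requirement “compressor shall have a capacity” and the classification of K-1301 as a compressor come from the text; the subtype ‘centrifugal compressor’, the object `K-9`, the recorded aspects and the function names are hypothetical and added only to show how a requirement is inherited through the specialization hierarchy.

```python
# Requirement taken from the text: a compressor shall have a capacity.
REQUIREMENTS = [("compressor", "capacity")]

# Hypothetical specialization fact (subtype -> supertype), for illustration only.
SPECIALIZATION = {"centrifugal compressor": "compressor"}


def kinds_of(kind):
    """Return the kind followed by all of its supertypes."""
    chain = [kind]
    while chain[-1] in SPECIALIZATION:
        chain.append(SPECIALIZATION[chain[-1]])
    return chain


def missing_aspects(obj, classification, aspects):
    """List the required aspects that are not recorded for an individual object."""
    kinds = kinds_of(classification[obj])
    present = aspects.get(obj, set())
    return [
        aspect for kind, aspect in REQUIREMENTS
        if kind in kinds and aspect not in present
    ]


# K-1301 is classified as a compressor in the text; K-9 is a made-up object.
classification = {"K-1301": "compressor", "K-9": "centrifugal compressor"}
aspects = {"K-1301": {"capacity"}}  # hypothetical handed-over data

report = {obj: missing_aspects(obj, classification, aspects) for obj in classification}
print(report)
```

In this sketch `K-9` fails the check even though no requirement mentions its own class, because the requirement on ‘compressor’ applies to every subtype.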
Figure 5, Automated verification of a design against a requirements model

The right hand side of figure 5 illustrates the content of the SHELLlib knowledge base, which is a proprietary extension of STEPlib and also uses Gellish. It illustrates how the knowledge in STEPlib and SHELLlib is inherited via the specialization hierarchy. Although P-101 is classified as a centrifugal pump, a requirement that is defined for a pump in general can automatically be made applicable to P-101, because of the inheritance defined via the specialization hierarchy.

The specialization hierarchy also enables intelligent queries. For example, search engines can perform intelligent searches on subtypes of keywords: a document that is recorded as containing information about a line shaft pump can also be found when documents about ‘centrifugal pump’ are searched for, and a query on ‘pump’ can also find P-101, being classified as a centrifugal pump.

An example of a commercial application of Gellish is a Gellish Browser developed by Mi2. The browser can read (and write) data expressed in the Gellish language and is able to present any knowledge about classes of objects and any data about individual objects. It was expected that implementation of Gellish would have serious performance issues. Therefore the Browser was loaded with over 60,000 facts, originating from different systems, but all expressed in a Gellish Table. These facts included the Gellish knowledge base, extended with a Shell proprietary standards database, data about documents, a materials catalogue, an equipment list and material balances of the design of a process plant. It appears to have an excellent performance.
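The subtype-aware searches mentioned above can be sketched as a transitive walk over specialization facts. The two specialization facts follow from the text (a line shaft pump is treated as a kind of centrifugal pump, which is a specialization of pump); the document name `DOC-1` and the function names are hypothetical.

```python
# (subtype, supertype) facts, as implied by the examples in the text.
SPECIALIZATIONS = [
    ("centrifugal pump", "pump"),
    ("line shaft pump", "centrifugal pump"),
]

# P-101 is classified as a centrifugal pump (from the text).
CLASSIFIED = [("P-101", "centrifugal pump")]
# Hypothetical document recorded as containing information about a line shaft pump.
ABOUT = [("DOC-1", "line shaft pump")]


def subtypes_of(kind):
    """Return the kind together with all of its direct and indirect subtypes."""
    found = {kind}
    frontier = [kind]
    while frontier:
        current = frontier.pop()
        for subtype, supertype in SPECIALIZATIONS:
            if supertype == current and subtype not in found:
                found.add(subtype)
                frontier.append(subtype)
    return found


def search(kind, tagged):
    """Find the names tagged with the kind or with any of its subtypes."""
    kinds = subtypes_of(kind)
    return sorted(name for name, tag in tagged if tag in kinds)


print(search("pump", CLASSIFIED))
print(search("centrifugal pump", ABOUT))
```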
We also customized an implementation of the Eigner PLM product lifecycle management system and loaded the same data into that system. This system also performed well.

We are currently working on the customization of existing systems so that they can export data in a Gellish Table. The Browser can then be used to view data from various systems, and data can be imported and integrated with other data in the Eigner PLM system.

We intend to use a Gellish Table as, among other things, a data exchange language for the hand-over of design data between engineering contractors and plant owners and for data about catalogue items and items delivered by suppliers.

Further work will explore the use of Gellish for the exchange of messages by intelligent Agent software acting as nodes in the Semantic Web, for example business communication messages about transactions in E-procurement.

Conclusions

The above illustrates that the current practice of defining data models separately from reference data and user data is unnecessary. Integration of data model concepts with reference data and user data in one consistent language can provide a single common standard language for data storage and exchange that can significantly reduce development costs and simplify data communication.

Common use of the small data model of figure 2, together with common use of the Gellish ontology, makes it possible to express and interpret a very wide scope of types of facts. This is possible because the explicit classification relations provide interpretation rules for expressions in which the relation types as well as the object types are defined in Gellish. It is only required that the concepts are defined in the Gellish knowledge base and that they are referred to as in the basic structure, using the ‘basic semantic axioms’ mentioned above.
The above illustrates that:
- A common standard knowledge base of concepts and relations between concepts can replace many data models.
- The solution based on the Gellish knowledge base of concepts is more flexible than fixed data models, and it makes it easier to add semantics to the database.
- The Gellish knowledge base of concepts provides an application independent language with a semantic basis that is equivalent to a very large data model. If sufficient concepts of an application domain are present or added, then data models for such an application domain can become superfluous.
- The Gellish knowledge base, using the inheritance capabilities of the specialization hierarchy, provides extendable product models for many types of objects.
- The implementations have shown that a Gellish knowledge base can be implemented with good performance.
- The implementations have shown that neutral format data exchange using a Gellish Table is a feasible solution.

As Gellish is in the public domain, proposals for extensions of the Gellish language are invited.

References

1. Andries van Renssen, “The Gellish Table and its Formats”. A definition of the Gellish Table and its implementation syntax for Gellish messages. www.steplib.com.
2. Andries van Renssen, “Guide on STEPlib”. This guide describes how STEPlib is defined and how to extend the Gellish language and knowledge base. www.steplib.com.
3. STEPlib, the Gellish knowledge base. This is a set of Gellish Tables (available in Excel and in MS Access). The upper level ontology part is documented in the TOPini part. www.steplib.com.
4. Tim Berners-Lee, James Hendler and Ora Lassila, “The Semantic Web”, Scientific American, May 2001. http://www.sciam.com/2001/0501issue/0501berners-lee.html
5. OWL, Web Ontology Language Overview. http://www.w3.org/TR/owl-features/
6. Ian Niles and Adam Pease (2001), “Towards a Standard Upper Ontology”, in: Formal Ontology in Information Systems, ISBN 1-58113-377-4.
7. SUO (2001), The IEEE Standard Upper Ontology website. http://suo.ieee.org
8. Lenat, D. (1995), “Cyc: A Large-Scale Investment in Knowledge Infrastructure”, Communications of the ACM, 38, no. 11 (November 1995).
9. Wolfgang Degen, Barbara Heller, Heinrich Herre and Barry Smith (2001), “GOL: A General Ontological Language”, in: Formal Ontology in Information Systems, ISBN 1-58113-377-4.
10. The Epistle Core Data Model (2001). http://www.btinternet.com/~chris.angus/epistle/specifications/ecm/ecm_400.html