Gellish English

    From data models to a modeling language


                         by
             dr. ir. Andries van Renssen




The problem
Gellish English is a formal language for an unambiguous description of reality. It is based on natural
language, is defined in a formal ontology1, and is practically applicable, at least for technical
artefacts.2 It is suitable to express and exchange information in the form of electronic data in a
structure that is system and natural language independent.
Gellish3 is designed to solve the problem that it is currently hardly possible to exchange information
between computers and to integrate information that comes from different sources without the creation
of costly data conversions and interface software. This has two main causes:
       1. Software developers are trained to create a new data structure for each new application, so that
          all those data structures are different.
       2. Software users are not accustomed to using a standard electronic dictionary for the data that is
          the content of those data structures, so that in various systems different names are used for the
          same things.
Together these cause a ‘Babylonian confusion of tongues’ in information integration and in
information exchange between computer systems. This means that there is no common language for
communication with and between computer applications. This hampers the unambiguously interpretable
storage, integration and retrieval of information and often requires the development of costly data
conversion procedures. A solution to this problem is therefore of considerable social and economic
importance.

The solution
Gellish provides a solution to the root of this problem by being a language that can be used for an
unambiguous and computer interpretable description of reality and imagination. The language is a
formal subset of natural languages with variants per natural language, such as Gellish English, Gellish
Nederlands, etc. The language provides an extensible and generally applicable data structure that
potentially eliminates the need to develop ad hoc data structures for many applications. Gellish is
defined in such a way that expressions in one language variant can automatically be translated into any
other language variant for which a Gellish dictionary is available. Gellish is system independent and is
both human and computer readable. Gellish might also provide a contribution to the technology of
computerized natural language processing.
The Gellish language is developed through an analysis of commonalities and limitations of data
models and of many issues in data modelling, in combination with a study of ontologies and generic
semantics of languages.

Development stages
The following highlights illustrate the development process that resulted in the creation of Gellish.
Generic data modelling
The first stage was the development of the generic data modelling method. Generic data models are
generalizations of conventional data models. They define standardized general relation types, which
include the definition of the kinds of things that may be related by such a relation type. This can be
regarded as the definition of a formalized version of a natural language. For example, a generic data

1
 An ontology is the result of research on the nature and properties of things. An ontology can be presented as a
graphical model, as is illustrated by the figures in this thesis.
2
    Artifacts are products that are made by human beings.
3
  Gellish is originally derived from ‘Generic Engineering Language’, however it is further developed into a
language that is also applicable outside the engineering discipline.




model may define a 'classification relation', being a binary relation between an individual thing and a
kind of thing (a class) that can be used to specify of what kind any object is. It may also define for
example a 'part-whole relation', being a binary relation between two things, one with the role of part,
the other with the role of whole, which is so generally applicable that it can be used to specify for any
object that it is a part of another object. The generic modeling method allows for the definition of an
extensible dictionary of classes. A generic modeling method with only those two relation types already
enables the classification of any individual thing and also enables the specification of part-whole relations
between any two individual objects. By standardization of an extensible list of relation types, a generic
data model enables the expression of an unlimited number of kinds of facts and will approach the
capabilities of natural languages. Conventional data models, on the other hand, have a fixed and
limited domain scope, because the instantiation (usage) of such a model only allows expressions of
kinds of facts that are predefined in the model.
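
The two relation types described above are already enough for a small fact store. The following Python sketch is illustrative only: the names Fact, FactStore, add and related are assumptions made for this example and are not part of Gellish or of any standard. It shows that a new kind of fact is added as data, without changing the structure that holds the facts.

```python
from dataclasses import dataclass, field

# Relation type names are taken from the text; all other names are hypothetical.
CLASSIFICATION = "is classified as a"
PART_WHOLE = "is a part of"


@dataclass(frozen=True)
class Fact:
    """One binary relation: left object, relation type, right object."""
    left: str
    relation_type: str
    right: str


@dataclass
class FactStore:
    """Holds facts of any kind; a new kind of fact needs no schema change."""
    facts: list = field(default_factory=list)

    def add(self, left, relation_type, right):
        self.facts.append(Fact(left, relation_type, right))

    def related(self, left, relation_type):
        """Return the right-hand objects related to `left` by `relation_type`."""
        return [f.right for f in self.facts
                if f.left == left and f.relation_type == relation_type]


store = FactStore()
store.add("W1", CLASSIFICATION, "wheel")   # individual thing -> kind of thing
store.add("C1", CLASSIFICATION, "car")
store.add("W1", PART_WHOLE, "C1")          # part -> whole

kinds_of_w1 = store.related("W1", CLASSIFICATION)
wholes_of_w1 = store.related("W1", PART_WHOLE)
```

A conventional data model would instead need a dedicated entity type per kind of thing, so its scope is fixed when the model is designed.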

Generic data models are developed as an approach to solve some shortcomings of conventional data
models. For example, different modelers usually produce different conventional data models of the
same domain. This can lead to difficulty in bringing the models of different people together and is an
obstacle for data exchange and data integration. Invariably, however, this difference is attributable to
different levels of abstraction in the models and differences in the kinds of facts that can be
instantiated (the semantic expression capabilities of the models). The modelers need to communicate
and agree on certain elements which are to be rendered more concretely, in order to make the
differences less significant.

Generic data modeling has the following characteristics:

    •   A generic data model consists of generic entity types, such as 'individual thing', 'class',
        'relationship', and possibly a number of their subtypes.
    •   Every individual thing is an instance of a generic entity called 'individual thing' or one of its
        subtypes.
    •   Every individual thing is explicitly classified by a kind of thing ('class' or ‘concept’) using an
        explicit classification relationship.
    •   The classes used for that classification are separately defined as standard instances of the
        entity 'class' or one of its subtypes, such as 'class of relationship'. These standard classes are
        usually called 'reference data'. This means that domain specific knowledge is captured in those
        standard instances and not as entity types. For example, concepts such as car, wheel, building,
        ship, and also temperature, length, etc. are standard instances. But also standard types of
        relationship, such as 'is composed of' and 'is involved in' can be defined as standard instances.

This way of modeling allows the addition of standard classes and standard relation types as data
(instances), which makes the data model flexible and prevents data model changes when the scope of
the application changes.

A generic data model obeys the following rules:

    1. Attributes are treated as entities that are related by relationships to other entities.
    2. Entity types are represented, and are named after, the underlying nature of a thing, not the role
       it plays in a particular context.
    3. Entities have a local identifier within a database or exchange file. These should be artificial
       and managed to be unique. Relationships are not used as part of the local identifier of the
       entity but have their own unique identifier.
    4. Activities, relationships and events are represented as entities (not attributes).
    5. Entity types are part of a sub-type/super-type hierarchy of entity types, in order to define a
       universal context for the model. As types of relationships are also entity types, they are also
       arranged in a sub-type/super-type hierarchy of types of relationship.







    6. Types of relationships are defined on a high (generic) level, being the highest level where the
       type of relationship is still valid. For example, a composition relationship (indicated for
       example by the phrase: 'is composed of') is defined as a relationship between an 'individual
       thing' and another 'individual thing' (and not just between e.g. an order and an order line). This
       generic level means that the type of relation may in principle be applied between any
       individual thing and any other individual thing. Additional constraints are defined in the
       'reference data', being standard instances of relationships between kinds of things.
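
Rules 1, 3 and 4 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed implementation: the names Entity, Relationship, new_entity and relate, and the entity name 'capacity of P-101', are hypothetical. Attributes are modelled as entities, every entity gets an artificial local identifier, and a relationship is itself an identified thing whose type is also an entity.

```python
from dataclasses import dataclass
from itertools import count

# Artificial identifiers, unique within this (in-memory) store (rule 3).
_next_uid = count(1)


@dataclass(frozen=True)
class Entity:
    uid: int
    name: str


@dataclass(frozen=True)
class Relationship:
    uid: int               # a relationship has its own identifier (rule 3)
    left: Entity
    relation_type: Entity  # relation types are entities as well (rules 4 and 5)
    right: Entity


def new_entity(name):
    return Entity(next(_next_uid), name)


def relate(left, relation_type, right):
    return Relationship(next(_next_uid), left, relation_type, right)


has_aspect = new_entity("has aspect")
pump = new_entity("P-101")
capacity = new_entity("capacity of P-101")  # the attribute is an entity (rule 1)

rel = relate(pump, has_aspect, capacity)
all_uids = {has_aspect.uid, pump.uid, capacity.uid, rel.uid}
```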

Generic data models
The second stage was the development of a generic data model. The generic data model increased the
scope of conventional data models to a general applicability so that the generic data model can replace
many different conventional data models.
The difference between conventional data models and a generic data model is illustrated in
Figure 1.

[Figure 1: diagram not reproduced. Left panel, conventional data model: entity type 'pump' with
attribute types 'pump.name' and 'pump.capacity'; instance #1 with attributes 'P-101' and '5 dm3/s'.
Right panel, generic data model: generic entity types 'class of individual thing', 'individual thing',
'class of aspect' and 'aspect', related by the generic entity types 'is classified as a', 'can have as
aspect a' and 'has as aspect'; instances 'pump', 'P-101', 'capacity' and '5 dm3/s'.]

                          Figure 1, Conventional versus generic data models







Generic data models include among others an explicit classification relation (the ‘is classified as a’
relations in Figure 1) between two different entity types. That explicit classification relation in principle
replaces the implied instantiation relation between the attribute instances and the attribute types in
conventional data modelling. The entity types in a generic data model have no attributes, because
attribute types are replaced by entity types whereas implicit and explicit relations between attribute
types (which relations are unqualified in conventional data models) are replaced by explicitly qualified
relations. This resulted in a generic data model with a taxonomy of relation types. The generic data
model thus contained a strict specialization hierarchy (subtype/supertype hierarchy) of all concepts,
ending with the most generic concept ‘thing’ or ‘anything’. The resulting generic data model was
standardised in two variants in ISO standards: ISO 10303-221 and ISO 15926-2. The first one
complies with the conventions of the STEP family of standards (ISO 10303), the latter does not obey
those conventions.
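
The contrast of Figure 1 can be sketched in Python as follows. This is an illustration under assumptions, not the standardised data model: the class Pump and the function aspects_of are hypothetical names, and the facts are written as plain (left object, relation type, right object) tuples using the names that appear in Figure 1.

```python
from dataclasses import dataclass


# Conventional model: the kind 'pump' and its attribute types are fixed in the schema.
@dataclass
class Pump:
    name: str
    capacity: str


conventional = Pump(name="P-101", capacity="5 dm3/s")

# Generic model: the same content as explicitly qualified relations.
generic = [
    ("pump", "can have as aspect a", "capacity"),   # knowledge about kinds
    ("P-101", "is classified as a", "pump"),        # explicit classification
    ("P-101", "has as aspect", "5 dm3/s"),          # fact about an individual
    ("5 dm3/s", "is classified as a", "capacity"),
]


def aspects_of(facts, thing):
    """Map the kind of each aspect of `thing` to the aspect that it possesses."""
    kind_of = {left: right for left, rel, right in facts
               if rel == "is classified as a"}
    return {kind_of[right]: right for left, rel, right in facts
            if left == thing and rel == "has as aspect"}


aspects = aspects_of(generic, "P-101")
```

In the generic variant, a further kind of aspect is added as extra tuples, whereas the conventional variant needs a new attribute type in the class definition.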
Smart Dictionary
The third stage was the development of an electronic smart dictionary. It has the form of an ontology
that accompanied the generic data model. It is based on a taxonomy of concepts, to capture application
domain knowledge about the definition of concepts in a flexible and extensible way. The author of this
document coordinated the work of many discipline engineers, organized in ‘peer groups’ to come to
agreement about definition models of the concepts in their domain. The result was expressed as a
database of definition models of concepts that are mutually related, whereas domain knowledge was
also captured and expressed as relations between the defined concepts. The concepts and knowledge
were originally defined to be instances of the entity types of the generic data model.
Gellish database
The next phase was the discovery that all instances of the generic data model could be expressed in a
single table. Therefore a universal and generally applicable Gellish Database was defined that consists
of one or more identical Gellish Database tables. An example of basic columns in a Gellish Database
table is presented in the following tables.
    101          3                              201          7
    Left hand    Relation type                  Right hand   UoM
    object name  name                           object name
    car          is a subtype of                vehicle
    wheel        is a subtype of                artifact
    wheel        can be a part of a             car
    W1           is classified as a             wheel
    C1           is classified as a             car
    W1           is a part of                   C1
    W1           has aspect                     D1
    D1           is classified as a             diameter
    D1           can be quantified on scale as  50           cm







54             2         101        1     60        3                              15         201         7
Language of    Left hand Left hand  Fact  Relation  Relation type name             Right hand Right hand  UoM
left hand      object    object     id    type id                                  object     object
object         UID       name                                                      UID        name
English        670024    car        1     1146      is a subtype of                670122     vehicle
English        130679    wheel      2     1146      is a subtype of                730063     artifact
English        130679    wheel      3     1191      can be a part of a             670024     car
multi-lingual  10        W1         4     1225      is classified as a             130679     wheel
multi-lingual  11        C1         5     1225      is classified as a             670024     car
multi-lingual  10        W1         6     1190      is a part of                   11         C1
multi-lingual  10        W1         7     1727      has aspect                     12         D1
multi-lingual  12        D1         8     1225      is classified as a             550188     diameter
multi-lingual  12        D1         9     5279      can be quantified on scale as  920303     50          cm

                                   Figure 2, Example of a Gellish Table
  The information that is recorded in the above tables could be presented in free form Gellish
  English for example as:
  -    a car is a kind of vehicle and a wheel is a kind of artefact, whereas a wheel can be a part of a
       car.
  -    wheel W1 is a part of car C1, whereas the diameter D1 of wheel W1 is 50 cm.
  Further research is required to investigate the applicability of this free form Gellish. Current
  Gellish is limited to a representation of Gellish expressions in tabular form.
  The first of the above tables contains the part of a Gellish Database table that is human readable.
  Each line in that table is the expression of a fact. The second table is an illustration of a more
  extended version of the same Gellish Database table, in which also the unique identifiers are
  presented as well as the indication of the language in which the facts are expressed. The unique
  identifiers are natural language independent. This means that the facts are recorded in a natural
  language independent way, so that it becomes possible that, by using a Gellish dictionary, a
  computer can generate and present the same table also in other languages. This is illustrated when
  the above Gellish Database table is compared with the same tables expressed in another language,
  such as in Gellish Nederlands (Dutch).
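
The language independence of the UIDs can be sketched as follows. The UIDs and English names are those of Figure 2; the Dutch names and phrases are illustrative assumptions made for this example and are not taken from the Gellish Nederlands dictionary. The names FACTS, NAMES and render are likewise hypothetical.

```python
# Three facts of Figure 2, reduced to (left UID, relation type UID, right UID).
FACTS = [
    (670024, 1146, 670122),   # car is a subtype of vehicle
    (130679, 1146, 730063),   # wheel is a subtype of artifact
    (130679, 1191, 670024),   # wheel can be a part of a car
]

# One dictionary per language, keyed by the language-independent UID.
# The "Nederlands" entries are assumed for illustration only.
NAMES = {
    "English": {
        670024: "car", 670122: "vehicle", 130679: "wheel", 730063: "artifact",
        1146: "is a subtype of", 1191: "can be a part of a",
    },
    "Nederlands": {
        670024: "auto", 670122: "voertuig", 130679: "wiel", 730063: "artefact",
        1146: "is een subtype van", 1191: "kan een deel zijn van een",
    },
}


def render(facts, language):
    """Present UID-based facts with the names of one language."""
    names = NAMES[language]
    return [" ".join(names[uid] for uid in fact) for fact in facts]


english = render(FACTS, "English")
dutch = render(FACTS, "Nederlands")
```

The facts themselves are stored once; only the dictionary lookup differs per language.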
  A Gellish Database table is an implementation method (syntax for the Gellish language) that is
  suitable to express any kind of facts. For example:
  -    To store the definitions of concepts.
       – See lines 1 and 2 in the above table, with specialization relations that define domain concepts
       (although details of those definitions on those lines are not shown).
  -    To express knowledge as relations between concepts.
       – See line 3, which contains an example of a relation between kinds of things.
  -    To store information about individual things.
       This regards facts and data (instances), expressed as relations between individual things.
       – See lines 6 and 7.
  -    To express the classification of things as relations between individual things and concepts
       (kinds) as well as other kinds of relations between individual things and concepts.
       – See lines 4, 5, 8 and 9.
  -    It appeared that a single table is suitable to express any kind of fact, including the definition of
       the language itself.
       The Gellish Table represents a structured form of a subset of natural language grammar. Its





       core consists of concepts (represented in the above table by the left hand objects and right
       hand objects) and Gellish phrases (represented in the above table by the relation types).
Elimination of Meta language
In a next phase it was discovered that the generic entity types in the data model (being the
meta-language concepts) are identical to the higher-level concepts in the discipline ontology. This is
illustrated in Figure 3.
[Figure 3: diagram not reproduced. It equates the generic entity type 'physical_object' (a
meta-language concept) with the concept 'physical object' (a user-language concept). The concept
'physical object' is an instance of the generic entity type 'class_of_physical_object', and 'pump'
is a subtype of 'physical object'.]

                         Figure 3, Identity between entity type and instance
   Figure 3 illustrates that, for example, the generic entity type ‘physical_object’ in the data model is a
   meta-language concept that appeared to be the same thing as the concept ‘physical object’ in the
   taxonomy. Furthermore, that concept ‘physical object’ is a supertype of all classes of physical
   objects in the Gellish Database, whereas that concept and each subtype is also an instance of the
   entity type ‘class_of_physical_object’. Because of that identity between an instance and an entity
   type, the meta-language concepts, being the generic data model concepts (entity types) were added
   to the ontology as its upper ontology and the data model was eliminated or reduced to a
   ‘bootstrapping’ mini data model. Furthermore, the data model was replaced by a Gellish Database
   table in which the definition of the upper ontology was included.
   The data model entity types that were added to the ontology included also the relation types and
   the roles of various kinds that are required by relations and that are played by the related objects.
   With the addition of the relation types and role types, a corresponding consistent specialization
   hierarchy or taxonomy of relations and roles was created. The result is that Gellish eliminates the
   conventional distinction between application language (user data) and meta-languages (data
   models). The concepts that occur in meta-meta-languages in which data models are usually written
   (such as EXPRESS, UML, XML Schema or OWL) could be included in the Gellish taxonomy/ontology.
   In contrast with the conventional distinction in meta-levels of languages, Gellish integrates all
   concepts that appear in those three levels in one language. This is illustrated in Figure 4.







[Figure 4: diagram not reproduced. Left panel, meta language hierarchy, in three levels:
meta-meta-languages (modelling languages such as UML, EXPRESS, XML Schema and OWL, with concepts
such as 'entity', 'subtype' and 'relation'); meta-languages (data models written in a modelling
language, e.g. the ISO 15926-2, ISO 10303-221 and ISO 10303-227 data models, with concepts such as
'physical object', 'aspect', 'pipe', 'diameter' and 'has a'); and language usage (product models
written as instances of a data model, e.g. p-1, d-1 and r-1). Right panel, Gellish English: the same
concepts in one hierarchy under 'anything' ('concept', 'individual thing', 'relation', 'physical
object', 'aspect', 'pipe', 'diameter', 'has a', p-1, d-1, r-1), connected by 'is a subtype of', and
covering both Gellish usage for language definition and Gellish usage for product description.]

                                          Figure 4, Gellish as one integrated language
The left hand part of Figure 4 illustrates the distinction between concepts from various languages
(product models or user languages), data models or meta-languages and modelling languages or meta-
meta-languages. The right hand part illustrates that all the concepts from the various levels are
integrated in a single Gellish language.
Generic software
In a following stage generally applicable software4 was developed that can import, store and export
data in Gellish Databases and that can verify the semantic correctness of the content. That software
proved that it is possible to read the upper ontology knowledge that is contained in a Gellish Database
and subsequently to use that knowledge to interpret a discipline ontology, requirements models,
etc. Furthermore, that software was also able to interpret Gellish Database tables with facility
information models that contain information about individual products and occurrences, to verify their
correctness and to display those models.
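
One kind of semantic verification can be sketched as follows. This is a minimal illustration and does not describe how the software mentioned above actually works: the function names are hypothetical, and the rule that every 'is a part of' fact must be supported by a 'can be a part of a' fact between the kinds involved (or their supertypes) is an assumption made for the example. The facts are those of the first Gellish Database table above.

```python
FACTS = [
    ("car", "is a subtype of", "vehicle"),
    ("wheel", "is a subtype of", "artifact"),
    ("wheel", "can be a part of a", "car"),
    ("W1", "is classified as a", "wheel"),
    ("C1", "is classified as a", "car"),
    ("W1", "is a part of", "C1"),
]


def select(facts, relation_type):
    """Return the (left, right) pairs of all facts of one relation type."""
    return [(left, right) for left, rel, right in facts if rel == relation_type]


def kinds_of(facts, thing):
    """Kinds that classify `thing`, together with all their supertypes."""
    supertypes = {}
    for sub, sup in select(facts, "is a subtype of"):
        supertypes.setdefault(sub, set()).add(sup)
    pending = [kind for t, kind in select(facts, "is classified as a") if t == thing]
    found = set()
    while pending:
        kind = pending.pop()
        if kind not in found:
            found.add(kind)
            pending.extend(supertypes.get(kind, ()))
    return found


def unsupported_part_facts(facts):
    """Return 'is a part of' facts that no 'can be a part of a' fact allows."""
    allowed = set(select(facts, "can be a part of a"))
    problems = []
    for part, whole in select(facts, "is a part of"):
        pairs = {(p, w) for p in kinds_of(facts, part) for w in kinds_of(facts, whole)}
        if not pairs & allowed:
            problems.append((part, whole))
    return problems


ok = unsupported_part_facts(FACTS)
bad = unsupported_part_facts(FACTS + [("C1", "is a part of", "W1")])
```

The knowledge that drives the check is read from the same table as the facts that are checked, which is the point made in the text.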
Language simplification
The next step was the discovery that many entity types that were converted into upper ontology
concepts were semantically superfluous artefacts that could be removed from the ontology without a
loss of semantic expressiveness. This process is illustrated in Figure 5.




4
    The STEPlib Browser/Editor, which is actually a general Gellish Browser/Editor.






[Figure 5: diagram not reproduced. Definition of relation types before the elimination of ‘class of
physical object’: ‘is a part of’ relates ‘physical object’ to itself and is a specialization of
‘relation between instances of physical object’; ‘can be a part of a’ relates ‘class of physical
object’ to itself and is a specialization of ‘relation between instances of class of physical
object’. Definition of relation types after the elimination: both relation types relate ‘physical
object’ (or one of its subtypes) to itself; ‘is a part of’ is a specialization of ‘relation between
individual physical objects (classified by a subtype of physical object)’ and ‘can be a part of a’
is a specialization of ‘relation between kinds of physical object (subtypes of physical object)’.]

                               Figure 5, Elimination of superfluous concepts
For example, the concept ‘class of physical object’ in the left hand part of Figure 5 was originally a generic
entity type, intended to contain (or to collect and/or classify) instances of classes of physical objects,
such as ‘car’ and ‘wheel’. Such instances of concepts are used to define the semantics of possibilities
for relations between members of classes. Once the relation types, such as ‘can be a part of a’, are
defined in the Gellish language, they can be used to classify relations between instances of ‘class of
physical object’. For example, the ‘can be a part of a’ relation can be used to express the knowledge
that ‘a wheel can be a part of a car’ in a computer interpretable way.
However, it was discovered that it is also possible and even better to define that same ‘can be a part of
a’ relation type as a relation between the concept ‘physical object’ and itself, as is indicated in the
right hand part of Figure 5, in which case the meaning of the relation type is defined as a relation type that
expresses that an individual thing that is classified as a ‘physical object’ or one of its subtypes can be a
part of another individual thing that is also classified as a ‘physical object’ or one of its subtypes.
This discovery made the artificial concepts, such as ‘class of physical object’, superfluous and
therefore concepts of that kind were removed from the ontology.
Furthermore, it appeared that semantically unnecessary subtypes could be eliminated from the
ontology, especially as they do not appear in natural languages either. For example, the relation type
‘has property’ could be replaced by the more general relation type ‘has aspect’, because such a
subtype duplicates the semantics that is already contained in the fact that by definition the property
that is possessed already will be classified as a subtype of ‘property’ and because the concept
‘property’ is defined as a subtype of ‘aspect’.
On the other hand additional subtypes of relation types appeared to be required to be added to the
ontology to capture the precise semantics of kinds of facts that appeared to be present in the various
application domains.
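
The replacement of ‘has property’ by ‘has aspect’ described above can be sketched as follows. The sketch is illustrative only: the function names are hypothetical, and the fact that ‘diameter’ is a subtype of ‘property’ is an assumption made for this example. It shows that the information carried by a ‘has property’ relation can be derived from a ‘has aspect’ relation together with the classification of the aspect.

```python
FACTS = [
    ("property", "is a subtype of", "aspect"),
    ("diameter", "is a subtype of", "property"),   # assumed for this example
    ("W1", "has aspect", "D1"),
    ("D1", "is classified as a", "diameter"),
]


def is_kind_of(facts, kind, ancestor):
    """True if `kind` is `ancestor` or a direct or indirect subtype of it.

    Assumes that the subtype hierarchy contains no cycles.
    """
    if kind == ancestor:
        return True
    return any(is_kind_of(facts, sup, ancestor)
               for sub, rel, sup in facts
               if rel == "is a subtype of" and sub == kind)


def properties_of(facts, thing):
    """Aspects of `thing` whose kind is 'property' or one of its subtypes."""
    kind_of = {left: right for left, rel, right in facts
               if rel == "is classified as a"}
    return [aspect for left, rel, aspect in facts
            if left == thing and rel == "has aspect"
            and aspect in kind_of
            and is_kind_of(facts, kind_of[aspect], "property")]


props = properties_of(FACTS, "W1")
```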

Formalised subset of natural languages
During the above developments the resulting ontology was aligned with the concepts that resulted
from an analysis of various ontologies and of natural language.
Natural languages are probably not designed by human beings, but we generally take their existence
for granted and mainly analyse their structure and the underlying concepts. Such research reveals that
the various languages seem to be using common semantic concepts (Wierzbicka, 1996). This research
led to the conclusion that that common semantics is formed by concepts and relationships of a limited
number of kinds between concepts, whereas for those concepts and kinds of relationships different
terms are used in different cultures and languages. Gellish is a language that is built on the elements






from those semantic commonalities between languages. The following common elements are captured
in the Gellish language:
•   Concepts.
    When human beings communicate, they seem to use the same concepts, irrespective of the
    language they use, although they refer to those concepts by different names in the various
    languages. Therefore, Gellish makes an explicit distinction between the language independent
    concepts and the terms or phrases with which the concepts are referred to in different contexts or
    language communities. Each concept is referred to in Gellish by a unique identifier (UID) that is
    independent of any natural language. Furthermore, the Gellish language refers to those concepts
    also through terms and phrases from those natural languages. This is done through the use of a
    symbol or string or pattern of symbols, either written or spoken in the various applied languages.
    Therefore, Gellish includes a Dutch dictionary, an English dictionary, etc., each of which is
    structured as a taxonomy of concepts.
•   A basic semantic structure.
    There seems to be a basic semantic structure of natural languages, which is a commonality that is
    independent of the natural languages. Such a structure is presented in Figure 6.

    [Figure 6: diagram in which 'anything' (object-1, object-2) plays a 'role' (role-1, role-2) that a
    'relationship' (relation-1) requires; each of these is classified as a 'kind', and kinds are related
    to each other by 'is a specialization of a' (subtype/supertype).]

                                    Figure 6, Basic semantic structure
    That structure is identified and captured by its inclusion as the basic semantic structure of Gellish
    for the expression of any fact. In that structure, facts are expressed as relations between things,
    whereas the relations require two or more roles and only things of a particular kind can play roles
    of the required kinds. Common facts, or pieces of knowledge, are expressed as (common) relations
    between concepts; or expressed more precisely: common relations that conceptually require roles
    of a kind, which roles can be played by members of the related concepts.
•   Facts and relations.
    A fact is: that which is the case, independent of language. The concept ‘fact’ is a concept that can
    be used to classify things as ‘being the case’.
    Facts seem to be expressed in languages as relations between things. Therefore that is also the case
    in Gellish. Gellish is defined to a large extent by the identification of kinds of relations that are
    independent of language and that can be used to classify expressions of facts of corresponding
    kinds. An analysis of the kinds of relations that are used in ontology, in physics, in engineering
    and in business processes, revealed that there is a limited number of relation types with which
    ontologies, technical artefacts and other objects and their behaviour or occurrences can be
    described in a computer interpretable way. The identification of those relation types resulted in a
    taxonomy of kinds of relations (or relation types) that is included in the definition of Gellish.
    Those relation types are also referred to by language independent unique identifiers (UIDs) and
    per natural language they are referred to by different ‘phrases’ (partial sentences). The semantics
    of expressions in Gellish is completed by the use of the concepts and relation types for the
    classification of individual things and relations.
    Many relation types are binary, but occurrences and correlations are examples of n-ary relations in
    which a number of things are involved. Each of those involvements can be expressed as a binary
    elementary relation. This implies that those higher order relations can be expressed in Gellish as a
    collection of n binary elementary involvement relations. In each involvement relation, the
    particular role of the involved thing can be made explicit by using a specific subtype of the
    elementary involvement relation.
•   Taxonomy of concepts.
    A subtype/supertype hierarchy or taxonomy of common concepts was developed, which also
    includes the relation types. It appeared that a high degree of agreement could be achieved between
    domain experts from various languages and countries, about that taxonomy and about the
    definitions. This resulted in a Gellish English Dictionary / Taxonomy of concepts (also including
    relation types) with a number of translations of terms and phrases. That dictionary / taxonomy can
    be extended as and when required with new concepts that can be kept proprietary or can be
    proposed for addition to the standard Gellish language definition.
•   Ontology.
    The above-mentioned elements are integrated in a coherent hierarchical network, which includes
    the definition of the Gellish language. By inclusion of additional knowledge a knowledge base
    was developed which completed the ontology.
The definition of the Gellish language is published as ‘open source’ data and is publicly available and
can be downloaded on the basis of an open source license.
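
The elements listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not the published Gellish definition: the UIDs, the Dutch phrase and the names `TERMS`, `Fact`, `render` and `decompose` are all assumptions made for this example. It shows a fact stored as a language independent relation between UIDs, its rendering in two language variants, and the decomposition of an n-ary relation into n binary involvement relations.

```python
from __future__ import annotations

from dataclasses import dataclass

# All UIDs and the Dutch phrase are invented for this illustration;
# they are NOT taken from the published Gellish dictionary.
TERMS = {
    "en": {101: "P-101", 201: "pump", 901: "is classified as a"},
    "nl": {101: "P-101", 201: "pomp", 901: "is geclassificeerd als een"},
}


@dataclass(frozen=True)
class Fact:
    """A binary relation between two things, all referenced by UID."""
    left_uid: int
    relation_uid: int
    right_uid: int


def render(fact: Fact, language: str) -> str:
    """Express a language independent fact with the terms of one language."""
    names = TERMS[language]
    uids = (fact.left_uid, fact.relation_uid, fact.right_uid)
    return " ".join(names[uid] for uid in uids)


def decompose(occurrence_uid: int, involved: dict[int, int]) -> list[Fact]:
    """Turn an n-ary relation into n binary involvement facts.

    `involved` maps the UID of each involved thing to the UID of the
    (hypothetical) involvement relation subtype that names its role.
    """
    return [Fact(thing, relation, occurrence_uid)
            for thing, relation in involved.items()]


fact = Fact(101, 901, 201)
print(render(fact, "en"))  # P-101 is classified as a pump
print(render(fact, "nl"))  # P-101 is geclassificeerd als een pomp
```

Because the fact itself holds only UIDs, the same stored data can be presented in any language for which a table of terms is available.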

Applications of the Gellish language
Possible applications of the Gellish language include, but are not limited to:
•   The use of the Gellish dictionary and knowledge base as a basis for a company specific data
    dictionary or knowledge base.
    Such a knowledge base can be used, for example, as an information source in design systems or as
    a reference data set for the harmonization of the content of various systems, for instance when
    various systems have to be replaced by fewer systems that may or may not be of the same type. It
    is also possible to use the Gellish dictionary and knowledge base as a basis for an intelligent
    electronic dictionary or encyclopaedia. The knowledge base can be extended when additional
    public domain knowledge or proprietary knowledge is expressed in Gellish and added to the
    existing Gellish knowledge base.
•   The development of system independent computer interpretable product models.
    This implies that design information about individual products or product types is recorded as
    product models that are expressed in Gellish, for example design information about parts and
    assemblies, such as equipment, tools and structures, roads, buildings, ships, cars, airplanes,
    facilities, etc. This enables parts of product models, or complete product models, to be exchanged
    between various parties in a system independent way. Furthermore, it becomes simpler to combine
    several product models and related documents into one integrated overall product model, even if
    the contributions stem from different source systems. Because the use of Gellish implies the use
    of standardised concepts, the consistency of the data is increased and it becomes simpler to use
    computer software to support the verification of the quality of the data, so that a significant
    quality increase can be achieved. Furthermore, it becomes simpler to develop generic applications
    that can retrieve, compare and report information about different or complex installations, such
    as a comparison of performance data of equipment on different sites that is stored in systems with
    different data structures.
•   The description of behaviour of products, persons and organizations.
    This means that processes and occurrences are described in Gellish, including the description of
    mechanical, as well as physical, chemical and control processes and the roles that people and
    organizations play in those processes. Such descriptions are easier to maintain, to exchange
    between systems and to search. In addition, it becomes easier to integrate process
    descriptions with product descriptions.
•   The expression of general and particular requirements in product catalogues.
    This means that specifications for standardised products are described in Gellish, for example
    requirements that are expressed in standard specifications as published by standardization
    institutes, but also specifications of types of products, such as product types that are described
    in product catalogues of suppliers or in buyer specifications. Such specifications enable software
    applications to assist in the mutual comparison of product types or in the selection of products on
    the basis of specifications, even if those descriptions stem from different sources. This provides
    suppliers of products for e-business with a way to record product information that is unambiguous,
    neutral and computer interpretable.
•   The description of procedures and business processes.
    The expression of such information in Gellish simplifies the maintenance of the business process
    descriptions and enables their integration and comparison with other process descriptions,
    especially when the process descriptions make use of a systematic methodology, such as the
    DEMO methodology. It also becomes possible to develop software ‘agents’ that are controlled by
    the knowledge that is contained in those process descriptions, so that they can automatically react
    to incoming messages.
•   The description of information in Gellish about real individual things and occurrences, such as
    measurements and observations.
    Such system independent recording simplifies, for example, the integration of time dependent
    observations with product models that describe the observed objects and increases the clarity
    of the definitions of the measured variables.
•   Improvement of the accuracy of the response of search engines on the Internet or on other
    document repositories. This can be achieved by using the relations between the concepts (keywords
    or key-facts) that are contained in the taxonomy / ontology of Gellish. This knowledge can be used
    during the recording of information as well as during the retrieval process, through improved
    interpretation of the queries.
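
As an illustration of the last application, the following Python sketch expands a search keyword with its subtypes before matching documents. The taxonomy fragment, the document titles and the function names `subtypes_of` and `search` are hypothetical and chosen only for this example; a real application would read the subtype/supertype relations from the Gellish dictionary.

```python
from __future__ import annotations

# Hypothetical fragment of a subtype/supertype taxonomy: subtype -> supertype.
SUPERTYPE_OF = {
    "centrifugal pump": "pump",
    "reciprocating pump": "pump",
    "pump": "rotating equipment",
    "compressor": "rotating equipment",
}


def subtypes_of(concept: str) -> set[str]:
    """Return the concept together with all its direct and indirect subtypes."""
    found = {concept}
    changed = True
    while changed:  # repeat until no new subtype is discovered
        changed = False
        for subtype, supertype in SUPERTYPE_OF.items():
            if supertype in found and subtype not in found:
                found.add(subtype)
                changed = True
    return found


def search(documents: list[str], keyword: str) -> list[str]:
    """Return the documents that mention the keyword or any of its subtypes."""
    terms = subtypes_of(keyword)
    return [doc for doc in documents
            if any(term in doc.lower() for term in terms)]


docs = [
    "Centrifugal pump P-101 data sheet",
    "Compressor K-201 manual",
    "Pipe specification",
]
print(search(docs, "rotating equipment"))
```

A query for the general concept then also retrieves documents that only mention one of its specializations, which a plain keyword match would miss.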
Various examples of applications of Gellish English are included in the book ‘Gellish, development
and application of a generic, extensible ontological language’
(for a paper version see http://www.iospress.nl/loadtop/load.php?isbn=9789040725975 and for the
electronic PDF version see http://repository.tudelft.nl/file/313741/306185 ).
For example:
•   A part of the knowledge, requirements and design of a lubrication oil system for a compressor.
•   The specification of a catalogue item from a product catalogue.
•   A part of a generic business process for communication about business transactions.
Other examples, together with the definition of the semantics of the upper ontology, are provided in
the table ‘Gellish English’ of the ‘upper ontology’ part of Gellish (see the TOPini file with the
Gellish Upper Ontology on https://sourceforge.net/projects/Gellish).




From Multiple Data Models To One Modeling Language V1

  • 1. Gellish English From data models to a modeling language by dr. ir. Andries van Renssen I
  • 2.
  • 3. The problem Gellish English is a formal language for an unambiguous description of reality. It is based on natural language, is defined in a formal ontology1, and is practically applicable, at least for technical artefacts.2 It is suitable to express and exchange information in the form of electronic data in a structure that is system and natural language independent. Gellish3 is designed to solve the problem that it is currently hardly possible to exchange information between computers and to integrate information that comes from different sources without the creation of costly data conversions and interface software. This has mainly two causes: 1. Software developers are trained to create a new data structure for each new application, so that all those data structures are different. 2. Software users are used not to use a standard electronic dictionary for the data that is the content of those data structures, so that in various systems different names are used for the same things. Together these cause a ‘Babylonian confusion of tongues’ in information integration and in information exchange between computer systems. This means that there is no common language for communication with and between computer applications. This hampers the unambiguous interpretable storage, integration and retrieval of information and often requires the development of costly data conversion procedures. A solution to this problem is therefore of considerable social and economic importance. The solution Gellish provides a solution to the root of this problem by being a language that can be used for an unambiguous and computer interpretable description of reality and imagination. The language is a formal subset of natural languages with variants per natural language, such as Gellish English, Gellish Nederlands, etc. 
The language provides an extensible and generally applicable data structure that potentially eliminates the need to develop ad hoc data structures for many applications. Gellish is defined in such a way that expressions in one language variant can automatically be translated in any other language variant for which a Gellish dictionary is available. Gellish is system independent and is both human and computer readable. Gellish might also provide a contribution to the technology of computerized natural language processing. The Gellish language is developed through an analysis of commonalities and limitations of data models and of many issues in data modelling, in combination with a study of ontologies and generic semantics of languages. Development stages The following highlights illustrate the development process that resulted in the creation of Gellish. Generic data modelling The first stage was the development of the generic data modelling method. Generic data models are generalizations of conventional data models. They define standardized general relation types, which include the definition of the kinds of things that may be related by such a relation type. This can be regarded as the definition of a formalized version of a natural language. For example, a generic data 1 An ontology is the result of research on the nature and properties of things. An ontology can be presented as a graphical model, as is illustrated by the figures in this thesis. 2 Artifacts are products that are made by human beings. 3 Gellish is originally derived from ‘Generic Engineering Language’, however it is further developed into a language that is also applicable outside the engineering discipline. 1
  • 4. From Data Models to a Modeling Language model may define a 'classification relation', being a binary relation between an individual thing and a kind of thing (a class) that can be used to specify of what kind any object is. It may also define for example a 'part-whole relation', being a binary relation between two things, one with the role of part, the other with the role of whole, which is so generally applicable that it can be used to specify for any object that it is a part of another object. The generic modeling method allows for the definition of an extensible dictionary of classes. A generic modeling method with only those two relation types already enables the classification of any individual thing and also enables to specify part-whole relations between any two individual objects. By standardization of an extensible list of relation types, a generic data model enables the expression of an unlimited number of kinds of facts and will approach the capabilities of natural languages. Conventional data models, on the other hand, have a fixed and limited domain scope, because the instantiation (usage) of such a model only allows expressions of kinds of facts that are predefined in the model. Generic data models are developed as an approach to solve some shortcomings of conventional data models. For example, different modelers usually produce different conventional data models of the same domain. This can lead to difficulty in bringing the models of different people together and is an obstacle for data exchange and data integration. Invariably, however, this difference is attributable to different levels of abstraction in the models and differences in the kinds of facts that can be instantiated (the semantic expression capabilities of the models). The modelers need to communicate and agree on certain elements which are to be rendered more concretely, in order to make the differences less significant. 
Generic data modeling has the following characteristics: • A generic data model consist of generic entity types, such as 'individual thing', 'class', 'relationship', and possibly a number of their subtypes. • Every individual thing is an instance of a generic entity called 'individual thing' or one of its subtypes. • Every individual thing is explicitly classified by a kind of thing ('class' or ‘concept’) using an explicit classification relationship. • The classes used for that classification are separately defined as standard instances of the entity 'class' or one of its subtypes, such as 'class of relationship'. These standard classes are usually called 'reference data'. This means that domain specific knowledge is captured in those standard instances and not as entity types. For example, concepts such as car, wheel, building, ship, and also temperature, length, etc. are standard instances. But also standard types of relationship, such as 'is composed of' and 'is involved in' can be defined as standard instances. This way of modeling allows the addition of standard classes and standard relation types as data (instances), which makes the data model flexible and prevents data model changes when the scope of the application changes. A generic data model obeys the following rules: 1. Attributes are treated as entities that are related by relationships to other entities. 2. Entity types are represented, and are named after, the underlying nature of a thing, not the role it plays in a particular context. 3. Entities have a local identifier within a database or exchange file. These should be artificial and managed to be unique. Relationships are not used as part of the local identifier of the entity but have their own unique identifier. 4. Activities, relationships and events are represented as entities (not attributes). 5. Entity types are part of a sub-type/super-type hierarchy of entity types, in order to define a universal context for the model. 
As types of relationships are also entity types, they are also arranged in a sub-type/super-type hierarchy of types of relationship.
  • 5.
       6. Types of relationships are defined on a high (generic) level, being the highest level where the type of relationship is still valid. For example, a composition relationship (indicated for example by the phrase: 'is composed of') is defined as a relationship between an 'individual thing' and another 'individual thing' (and not just between e.g. an order and an order line). This generic level means that the type of relation may in principle be applied between any individual thing and any other individual thing. Additional constraints are defined in the 'reference data', being standard instances of relationships between kinds of things.

Generic data models
The second stage was the development of a generic data model. The generic data model increased the scope of conventional data models to a general applicability, so that the generic data model can replace many different conventional data models. The difference between conventional data models and a generic data model is illustrated in Figure 1.

[Figure 1 shows two diagrams. Conventional data model: the entity type 'pump' with the attribute types 'pump.name' and 'pump.capacity', and an instance #1 with the attributes P-101 and 5 dm3/s. Generic data model: the generic entity types 'class of individual thing', 'individual thing', 'class of aspect' and 'aspect', connected by the relations 'is classified as a', 'can have as aspect a' and 'has as aspect', with the instances pump, P-101, capacity and 5 dm3/s.]
Figure 1, Conventional versus generic data models
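The contrast in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: the class names `ConventionalPump`, `Thing`, `Relationship` and `GenericModel` are invented for the example, the four pump facts are one reading of the flattened figure, and the object `Unit-1` is hypothetical.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class ConventionalPump:
    """Conventional model: 'pump' and its attributes are fixed in the schema."""
    name: str
    capacity: str


@dataclass(frozen=True)
class Thing:
    """Generic entity: classes and individual things are both plain data."""
    uid: int
    name: str


@dataclass(frozen=True)
class Relationship:
    """Generic relationship that carries its own artificial identifier (rule 3)."""
    uid: int
    left: Thing
    relation_type: str
    right: Thing


class GenericModel:
    def __init__(self):
        self._next_uid = count(1)
        self.things = {}
        self.relationships = []

    def thing(self, name):
        # Create the thing on first use, so new kinds never need a schema change.
        if name not in self.things:
            self.things[name] = Thing(next(self._next_uid), name)
        return self.things[name]

    def relate(self, left, relation_type, right):
        rel = Relationship(next(self._next_uid), self.thing(left),
                           relation_type, self.thing(right))
        self.relationships.append(rel)
        return rel

    def related(self, left, relation_type):
        return [r.right.name for r in self.relationships
                if r.left.name == left and r.relation_type == relation_type]


conventional = ConventionalPump(name="P-101", capacity="5 dm3/s")

generic = GenericModel()
generic.relate("pump", "can have as aspect a", "capacity")
generic.relate("P-101", "is classified as a", "pump")
generic.relate("P-101", "has as aspect", "5 dm3/s")
generic.relate("5 dm3/s", "is classified as a", "capacity")
# Widening the scope only adds data ('Unit-1' is a hypothetical object).
generic.relate("P-101", "is a part of", "Unit-1")
```

In the conventional variant, recording that P-101 is a part of something would require a new attribute or class; in the generic variant it is one more row of data.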
  • 6. Generic data models include, among others, an explicit classification relation (the ‘is classified as a’ relations in Figure 1) between two different entity types. That explicit classification relation in principle replaces the implied instantiation relation between the attribute instances and the attribute types in conventional data modelling. The entity types in a generic data model have no attributes, because attribute types are replaced by entity types, whereas implicit and explicit relations between attribute types (which relations are unqualified in conventional data models) are replaced by explicitly qualified relations. This resulted in a generic data model with a taxonomy of relation types. The generic data model thus contained a strict specialization hierarchy (subtype/supertype hierarchy) of all concepts, ending with the most generic concept ‘thing’ or ‘anything’. The resulting generic data model was standardised in two variants in ISO standards: ISO 10303-221 and ISO 15926-2. The first one complies with the conventions of the STEP family of standards (ISO 10303); the latter does not obey those conventions.

Smart Dictionary
The third stage was the development of an electronic smart dictionary. It has the form of an ontology that accompanied the generic data model. It is based on a taxonomy of concepts, to capture application domain knowledge about the definition of concepts in a flexible and extensible way. The author of this document coordinated the work of many discipline engineers, organized in ‘peer groups’, to come to agreement about definition models of the concepts in their domain. The result was expressed as a database of definition models of concepts that are mutually related, whereas domain knowledge was also captured and expressed as relations between the defined concepts. The concepts and knowledge were originally defined to be instances of the entity types of the generic data model.
Gellish database
The next phase was the discovery that all instances of the generic data model could be expressed in a single table. Therefore a universal and generally applicable Gellish Database was defined that consists of one or more identical Gellish Database tables. An example of basic columns in a Gellish Database table is presented in the following tables.

101                     3                               201                      7
Left hand object name   Relation type name              Right hand object name   UoM
car                     is a subtype of                 vehicle
wheel                   is a subtype of                 artifact
wheel                   can be a part of a              car
W1                      is classified as a              wheel
C1                      is classified as a              car
W1                      is a part of                    C1
W1                      has aspect                      D1
D1                      is classified as a              diameter
D1                      can be quantified on scale as   50                       cm
  • 7.
(LH = left hand object, RH = right hand object)

54                  2        101       1         60                 3                               15       201        7
Language of LH      LH UID   LH name   Fact id   Relation type id   Relation type name              RH UID   RH name    UoM
English             670024   car       1         1146               is a subtype of                 670122   vehicle
English             130679   wheel     2         1146               is a subtype of                 730063   artifact
English             130679   wheel     3         1191               can be a part of a              670024   car
multi-lingual       10       W1        4         1225               is classified as a              130679   wheel
multi-lingual       11       C1        5         1225               is classified as a              670024   car
multi-lingual       10       W1        6         1190               is a part of                    11       C1
multi-lingual       10       W1        7         1727               has aspect                      12       D1
multi-lingual       12       D1        8         1225               is classified as a              550188   diameter
multi-lingual       12       D1        9         5279               can be quantified on scale as   920303   50         cm
Figure 2, Example of a Gellish Table

The information that is recorded in the above tables could be presented in free form Gellish English, for example as:
- a car is a kind of vehicle and a wheel is a kind of artefact, whereas a wheel can be a part of a car.
- wheel W1 is a part of car C1, whereas the diameter D1 of wheel W1 is 50 cm.
Further research is required to investigate the applicability of this free form Gellish. Current Gellish is limited to a representation of Gellish expressions in tabular form.

The first of the above tables contains the part of a Gellish Database table that is human readable. Each line in that table is the expression of a fact. The second table is an illustration of a more extended version of the same Gellish Database table, in which the unique identifiers are also presented, as well as the indication of the language in which the facts are expressed. The unique identifiers are natural language independent. This means that the facts are recorded in a natural language independent way, so that, by using a Gellish dictionary, a computer can generate and present the same table in other languages as well.
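The language independence of the facts in Figure 2 can be sketched as follows. The UIDs and English terms are copied from the table above; the `Fact` tuple, the `NAMES` dictionaries and the `render` function are assumptions made for this example, and the Dutch terms are illustrative translations that are not taken from the Gellish Nederlands dictionary.

```python
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    """One line of a Gellish Database table, reduced to identifiers."""
    fact_id: int
    left_uid: int
    relation_uid: int
    right_uid: int
    uom: Optional[str] = None


FACTS = [
    Fact(1, 670024, 1146, 670122),
    Fact(2, 130679, 1146, 730063),
    Fact(3, 130679, 1191, 670024),
    Fact(4, 10, 1225, 130679),
    Fact(5, 11, 1225, 670024),
    Fact(6, 10, 1190, 11),
    Fact(7, 10, 1727, 12),
    Fact(8, 12, 1225, 550188),
    Fact(9, 12, 5279, 920303, "cm"),
]

NAMES = {
    "English": {
        670024: "car", 670122: "vehicle", 130679: "wheel",
        730063: "artifact", 550188: "diameter",
        1146: "is a subtype of", 1191: "can be a part of a",
        1225: "is classified as a", 1190: "is a part of",
        1727: "has aspect", 5279: "can be quantified on scale as",
    },
    # Illustrative translations only (assumption, not the official dictionary).
    "Nederlands": {
        670024: "auto", 670122: "voertuig", 130679: "wiel",
        730063: "artefact", 550188: "diameter",
        1146: "is een subtype van", 1191: "kan een deel zijn van een",
        1225: "is geclassificeerd als een", 1190: "is een deel van",
        1727: "heeft aspect", 5279: "kan worden gekwantificeerd op schaal als",
    },
    # Names of individual things and numbers do not depend on the language.
    "multi-lingual": {10: "W1", 11: "C1", 12: "D1", 920303: "50"},
}


def name_of(uid, language):
    """Look up a term in the requested language, else the multi-lingual name."""
    return NAMES[language].get(uid) or NAMES["multi-lingual"][uid]


def render(fact, language):
    """Present a stored fact as a phrase in the requested language."""
    parts = [name_of(fact.left_uid, language),
             name_of(fact.relation_uid, language),
             name_of(fact.right_uid, language)]
    if fact.uom:
        parts.append(fact.uom)
    return " ".join(parts)


for fact in FACTS:
    print(render(fact, "English"), "|", render(fact, "Nederlands"))
```

Only the dictionaries differ per language; the list of facts is stored once, which is the point made in the paragraph above.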
The language independence of the recorded facts is illustrated when the above Gellish Database table is compared with the same table expressed in another language, such as in Gellish Nederlands (Dutch).

A Gellish Database table is an implementation method (syntax for the Gellish language) that is suitable to express any kind of facts. For example:
- To store the definitions of concepts. See lines 1 and 2 in the above table, with specialization relations that define domain concepts (although details of those definitions on those lines are not shown).
- To express knowledge as relations between concepts. See line 3, which contains an example of a relation between kinds of things.
- To store information about individual things. This regards facts and data (instances), expressed as relations between individual things. See lines 6 and 7.
- To express the classification of things as relations between individual things and concepts (kinds), as well as other kinds of relations between individual things and concepts. See lines 4, 5, 8 and 9.
It appeared that a single table is suitable to express any kind of fact, including the definition of the language itself. The Gellish Table represents a structured form of a subset of natural language grammar. Its
  • 8. core consists of concepts (represented in the above table by the left hand objects and right hand objects) and Gellish phrases (represented in the above table by the relation types).

Elimination of Meta language
In a next phase it was discovered that the generic entity types in the data model (being the meta-language concepts) are identical to the higher-level concepts in the discipline ontology. This is illustrated in Figure 3.

[Figure 3 shows the generic entity types ‘class_of_physical_object’ and ‘physical_object’ as meta-language concepts. Below them, as user-language concepts, ‘physical object’ appears as an instance of ‘class_of_physical_object’, with ‘pump’ as a subtype of ‘physical object’. An equals sign marks the entity type ‘physical_object’ and the instance ‘physical object’ as identical.]
Figure 3, Identity between entity type and instance

Figure 3 illustrates that, for example, the generic entity type ‘physical_object’ in the data model is a meta-language concept that appeared to be the same thing as the concept ‘physical object’ in the taxonomy. Furthermore, that concept ‘physical object’ is a supertype of all classes of physical objects in the Gellish Database, whereas that concept and each subtype is also an instance of the entity type ‘class_of_physical_object’.
Because of that identity between an instance and an entity type, the meta-language concepts, being the generic data model concepts (entity types), were added to the ontology as its upper ontology, and the data model was eliminated or reduced to a ‘bootstrapping’ mini data model. Furthermore, the data model was replaced by a Gellish Database table in which the definition of the upper ontology was included. The data model entity types that were added to the ontology also included the relation types and the roles of various kinds that are required by relations and that are played by the related objects. With the addition of the relation types and role types, a corresponding consistent specialization hierarchy or taxonomy of relations and roles was created.
The result is that Gellish eliminates the conventional distinction between application language (user data) and meta-languages (data models). The concepts that occur in meta-meta-languages in which data models are usually written (such as EXPRESS, UML, XML Schema or OWL) could be included in the Gellish taxonomy/ontology. In contrast with the conventional distinction in meta-levels of languages, Gellish integrates all concepts that appear in those three levels in one language. This is illustrated in Figure 4.
  • 9.
[Figure 4 shows two diagrams. Left, the meta language hierarchy: meta-meta-languages (modelling languages: UML, EXPRESS, XML Schema, OWL); meta-languages (data models written in a modelling language, e.g. the ISO 15926-2, ISO 10303-221 and ISO 10303-227 data models); and user languages (product models written as instances of a data model). Right, Gellish English: the concepts anything, concept, individual thing, relation, physical object, aspect, pipe and diameter, together with the instances p-1, r-1 and d-1, all expressed in the one Gellish language.]
Figure 4, Gellish as one integrated language

The left hand part of Figure 4 illustrates the distinction between concepts from various languages (product models or user languages), data models or meta-languages, and modelling languages or meta-meta-languages. The right hand part illustrates that all the concepts from the various levels are integrated in a single Gellish language.

Generic software
In a following stage generally applicable software [4] was developed that can import, store and export data in Gellish Databases and that can verify the semantic correctness of the content. That software proved that it is possible to read the upper ontology knowledge that is contained in a Gellish Database and subsequently to use that knowledge to interpret a discipline ontology, requirements models, etc. Furthermore, that software was also able to interpret Gellish Database tables with facility information models that contain information about individual products and occurrences, to verify their correctness and to display those models.

Language simplification
The next step was the discovery that many entity types that were converted into upper ontology concepts were semantically superfluous artefacts that could be removed from the ontology without a loss of semantic expressiveness. This process is illustrated in Figure 5.

[4] The STEPlib Browser/Editor, which is actually a general Gellish Browser/Editor.
  • 10.
[Figure 5 shows the definition of relation types before and after the elimination of ‘class of physical object’. Before: ‘is a part of’ is a relation between instances of ‘physical object’, and ‘can be a part of a’ is a relation between instances of ‘class of physical object’. After: ‘is a part of’ is a relation between individual physical objects (classified by a subtype of physical object), and ‘can be a part of a’ is a relation between kinds of physical object (subtypes of physical object); both are defined on ‘physical object’ (or one of its subtypes).]
Figure 5, Elimination of superfluous concepts

For example, the concept ‘class of physical object’ in the left hand part of Figure 5 was originally a generic entity type, intended to contain (or to collect and/or classify) instances of classes of physical objects, such as ‘car’ and ‘wheel’. Such instances of concepts are used to define the semantics of possibilities for relations between members of classes. Once the relation types, such as ‘can be a part of a’, are defined in the Gellish language, they can be used to classify relations between instances of ‘class of physical object’. For example, the ‘can be a part of a’ relation can be used to express the knowledge that ‘a wheel can be a part of a car’ in a computer interpretable way. However, it was discovered that it is also possible, and even better, to define that same ‘can be a part of a’ relation type as a relation between the concept ‘physical object’ and itself, as is indicated in the right hand part of Figure 5. In that case the meaning of the relation type is defined as a relation type that expresses that an individual thing that is classified as a ‘physical object’ or one of its subtypes can be a part of another individual thing that is also classified as a ‘physical object’ or one of its subtypes.
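How knowledge expressed with ‘can be a part of a’ could be used to check a fact about individual things can be sketched as below. This is a minimal illustration and not the verification logic of the Gellish software: the function names are invented, each kind is given a single supertype for simplicity, the links from ‘vehicle’ and ‘artifact’ to ‘physical object’ are assumed, and ‘spare wheel’ and W2 are hypothetical additions.

```python
# Specialization hierarchy: kind -> its supertype (single supertype assumed).
SUBTYPE_OF = {
    "car": "vehicle",
    "wheel": "artifact",
    "vehicle": "physical object",   # assumed link
    "artifact": "physical object",  # assumed link
    "spare wheel": "wheel",         # hypothetical kind
}

# Classification of individual things by kinds.
CLASSIFIED_AS = {"W1": "wheel", "C1": "car", "W2": "spare wheel"}

# Knowledge: 'a wheel can be a part of a car'.
CAN_BE_PART_OF = {("wheel", "car")}


def kinds_of(kind):
    """Yield the kind itself, then each supertype, most specific first."""
    while kind is not None:
        yield kind
        kind = SUBTYPE_OF.get(kind)


def part_whole_is_allowed(part, whole):
    """Check 'part is a part of whole' against the 'can be a part of a' facts.

    The fact is accepted when the kind of the part (or a supertype of it)
    can be a part of the kind of the whole (or a supertype of it).
    """
    part_kinds = set(kinds_of(CLASSIFIED_AS[part]))
    whole_kinds = set(kinds_of(CLASSIFIED_AS[whole]))
    return any((p, w) in CAN_BE_PART_OF
               for p in part_kinds for w in whole_kinds)


print(part_whole_is_allowed("W1", "C1"))
print(part_whole_is_allowed("W2", "C1"))
print(part_whole_is_allowed("C1", "W1"))
```

The check needs no concept such as ‘class of physical object’: the classification relation and the subtype hierarchy carry all the information that is used.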
The discovery that such relation types can be defined on the concept ‘physical object’ itself made the artificial concepts, such as ‘class of physical object’, superfluous, and therefore concepts of that kind were removed from the ontology.
Furthermore, it appeared that semantically unnecessary subtypes could be eliminated from the ontology, especially as they do not appear in natural languages either. For example, the relation type ‘has property’ could be replaced by the more general relation type ‘has aspect’, because such a subtype duplicates the semantics that is already contained in the fact that by definition the property that is possessed will already be classified as a subtype of ‘property’, and because the concept ‘property’ is defined as a subtype of ‘aspect’. On the other hand, additional subtypes of relation types appeared to be required in the ontology to capture the precise semantics of kinds of facts that appeared to be present in the various application domains.

Formalised subset of natural languages
During the above developments the resulting ontology was aligned with the concepts that resulted from an analysis of various ontologies and of natural language. Natural languages are probably not designed by human beings, but we generally take their existence for granted and mainly analyse their structure and the underlying concepts. Such research reveals that the various languages seem to be using common semantic concepts (Wierzbicka, 1996). This research led to the conclusion that that common semantics is formed by concepts and by relationships of a limited number of kinds between concepts, whereas for those concepts and kinds of relationships different terms are used in different cultures and languages. Gellish is a language that is built on the elements
  • 11. from those semantic commonalities between languages. The following common elements are captured in the Gellish language:
• Concepts. When human beings communicate, they seem to use the same concepts, irrespective of the language they use, although they refer to those concepts by different names in the various languages. Therefore, Gellish makes an explicit distinction between the language independent concepts and the terms or phrases with which the concepts are referred to in different contexts or language communities. Each concept is referred to in Gellish by a unique identifier (UID) that is independent of any natural language. Furthermore, the Gellish language refers to those concepts also through terms and phrases from those natural languages. This is done through the use of a symbol or string or pattern of symbols, either written or spoken in the various applied languages. Therefore, Gellish includes a Dutch dictionary, an English dictionary, etc., which dictionaries are structured as a taxonomy of concepts.
• A basic semantic structure. There seems to be a basic semantic structure of natural languages, which is a commonality that is independent of the natural languages. Such a structure is presented in Figure 6.

[Figure 6 shows ‘something’ (anything) that ‘has role’ a ‘role (of something in relationship)’, which ‘is role in’ a ‘relationship’. Each of these ‘is classified as a’ kind, and kinds are arranged as subtype and supertype by ‘is a specialization of’. The example below the diagram reads: object-1 plays role-1 in relation-1, and relation-1 requires role-2, which is played by object-2.]
Figure 6, Basic semantic structure

That structure is identified and captured by its inclusion as the basic semantic structure of Gellish for the expression of any fact.
In that structure, facts are expressed as relations between things, whereas the relations require two or more roles and only things of a particular kind can play roles of the required kinds. Common facts, or pieces of knowledge, are expressed as (common) relations between concepts; or expressed more precisely: common relations that conceptually require roles of a kind, which roles can be played by members of the related concepts.
• Facts and relations. A fact is: that which is the case, independent of language. The concept ‘fact’ is a concept that can be used to classify things as ‘being the case’. Facts seem to be expressed in languages as relations between things. Therefore that is also the case in Gellish. Gellish is defined to a large extent by the identification of kinds of relations that are independent of language and that can be used to classify expressions of facts of corresponding kinds. An analysis of the kinds of relations that are used in ontology, in physics, in engineering and in business processes revealed that there is a limited number of relation types with which ontologies, technical artefacts and other objects and their behaviour or occurrences can be described in a computer interpretable way. The identification of those relation types resulted in a taxonomy of kinds of relations (or relation types) that is included in the definition of Gellish. Those relation types are also referred to by language independent unique identifiers (UIDs) and per natural language they are referred to by different ‘phrases’ (partial sentences). The semantics
  • 12. of expressions in Gellish is completed by the use of the concepts and relation types for the classification of individual things and relations. Many relation types are binary, but occurrences and correlations are examples of n-ary relations in which a number of things are involved. Each of those involvements can be expressed as a binary elementary relation. This implies that those higher order relations can be expressed in Gellish as a collection of n binary elementary involvement relations. In each involvement relation, the particular role of the involved thing can be made explicit by using a specific subtype of the elementary involvement relation.
• Taxonomy of concepts. A subtype/supertype hierarchy or taxonomy of common concepts was developed, which also includes the relation types. It appeared that a high degree of agreement could be achieved between domain experts from various languages and countries about that taxonomy and about the definitions. This resulted in a Gellish English Dictionary / Taxonomy of concepts (also including relation types) with a number of translations of terms and phrases. That dictionary / taxonomy can be extended as and when required with new concepts that can be kept proprietary or can be proposed for addition to the standard Gellish language definition.
• Ontology. The above-mentioned elements are integrated in a coherent hierarchical network, which includes the definition of the Gellish language. By inclusion of additional knowledge, a knowledge base was developed which completed the ontology.
The definition of the Gellish language is published as ‘open source’ data, is publicly available and can be downloaded on the basis of an open source license.
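The decomposition of an n-ary relation into n binary involvement relations, described under ‘Facts and relations’ above, can be sketched as follows. The phrase ‘is involved in’ is mentioned earlier in this document as a standard type of relationship; the three subtype phrases, the occurrence ‘Repair-1’ and its participants are hypothetical names chosen for this example, not entries from the Gellish dictionary.

```python
GENERIC_INVOLVEMENT = "is involved in"

# Hypothetical subtypes that make the role of the involved thing explicit.
INVOLVEMENT_SUBTYPES = {
    "is performer of": GENERIC_INVOLVEMENT,
    "is subject in": GENERIC_INVOLVEMENT,
    "is tool in": GENERIC_INVOLVEMENT,
}


def decompose(occurrence, participants):
    """Express an n-ary occurrence as n binary (thing, phrase, occurrence) facts."""
    facts = []
    for thing, phrase in participants:
        is_known = (phrase == GENERIC_INVOLVEMENT
                    or INVOLVEMENT_SUBTYPES.get(phrase) == GENERIC_INVOLVEMENT)
        if not is_known:
            raise ValueError(f"not an involvement relation: {phrase!r}")
        facts.append((thing, phrase, occurrence))
    return facts


def participants_of(occurrence, facts):
    """Collect the involved things of one occurrence with their relation phrase."""
    return {thing: phrase for thing, phrase, occ in facts if occ == occurrence}


facts = decompose("Repair-1", [
    ("M1", "is performer of"),
    ("P-101", "is subject in"),
    ("Wrench-7", "is tool in"),
])
for fact in facts:
    print(" ".join(fact))
```

Each binary fact fits the left hand object, relation type and right hand object columns of a Gellish Database table, so an occurrence with any number of participants needs no additional table structure.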
Applications of the Gellish language
Possible applications of the Gellish language include, but are not limited to:
• The use of the Gellish dictionary and knowledge base as a basis for a company specific data dictionary or knowledge base. Such a knowledge base can be used for example as an information source in design systems or as a reference data set for the harmonization of the content of various systems, for example when various systems have to be replaced by fewer systems that may or may not be of the same type. It is also possible that the Gellish dictionary and knowledge base is used as a basis for an intelligent electronic dictionary or encyclopaedia. The knowledge base can be extended when additional public domain knowledge or proprietary knowledge is expressed in Gellish and is added to the existing Gellish knowledge base.
• The development of system independent computer interpretable product models. This implies that design information about individual products or product types is recorded as product models that are expressed in Gellish, for example design information about parts and assemblies, such as equipment, tools and structures, roads, buildings, ships, cars, airplanes, facilities, etc. This enables parts of product models or complete product models to be exchanged in a system independent way between various parties. Furthermore, it becomes simpler to combine several product models and related documents into one integrated overall product model, even if the contributions stem from different source systems. The fact that the use of Gellish implies the use of standardised concepts means that the consistency of the data is increased and that it becomes simpler to use computer software to support the verification of the quality of the data. This means that a significant quality increase can be achieved.
Furthermore, it becomes simpler to develop generic applications that make it possible to retrieve, compare and report information about different or complex installations, such as a comparison of performance data of equipment on different sites that is stored in systems with different data structures.
• The description of behaviour of products, persons and organizations. This means that processes and occurrences are described in Gellish, including the description of mechanical, as well as physical, chemical and control processes and the roles that people and
  • 13. organizations play in those processes. Such descriptions are easier to maintain, to exchange between systems and to search. In addition, it becomes easier to integrate process descriptions with product descriptions.
• The expression of general and particular requirements in product catalogues. This means that specifications for standardised products are described in Gellish, for example requirements that are expressed in standard specifications as published by standardization institutes, but also specifications of types of products, such as product types that are described in product catalogues of suppliers or in buyer specifications. Such specifications enable software applications to assist in the mutual comparison of product types or in the selection of products on the basis of specifications, even if those descriptions stem from different sources. This provides suppliers of products for e-business with a way to record product information that is unambiguous, neutral and computer interpretable.
• The description of procedures and business processes. The expression of such information in Gellish simplifies the maintenance of the business process descriptions and enables their integration and comparison with other process descriptions, especially when the process descriptions make use of a systematic methodology, such as the DEMO methodology. It also becomes possible to develop software ‘agents’ that are controlled by the knowledge that is contained in those process descriptions, so that they can automatically react to incoming messages.
• The description of information in Gellish about real individual things and occurrences, such as measurements and observations. Such system independent recording simplifies for example the integration of time dependent observations with product models that describe the observed objects and increases the clarity about the definitions of the measured variables.
• Improvement of the accuracy of the response of search engines on the Internet or other document repositories. This can be achieved by using the relations between the concepts (keywords or key-facts) that are contained in the taxonomy / ontology of Gellish. This knowledge can be used during recording of information as well as during the retrieval process through improved interpretation of the queries.

Various examples of applications of Gellish English are included in the book ‘Gellish, development and application of a generic, extensible ontological language’ (see for a paper version: http://www.iospress.nl/loadtop/load.php?isbn=9789040725975 or its electronic pdf version: http://repository.tudelft.nl/file/313741/306185 ). For example:
• A part of the knowledge, requirements and design of a lubrication oil system for a compressor.
• The specification of a catalogue item from a product catalogue.
• A part of a generic business process for communication about business transactions.
Other examples, together with the definition of the semantics of the upper ontology, are provided in the table ‘Gellish English’ of the ‘upper ontology’ part of Gellish. (See the TOPini file with the Gellish Upper Ontology on https://sourceforge.net/projects/Gellish).