CS7620-A Case-Based Reasoning, December 09, 2009

CBArch Report
Urjit Bhatia, Andres Cavieres, Preetam Joshi, Radhika Shivapurkar

1 General Problem

The general, long-term goal of this project is to build a tool that assists architects and engineers in the early phase of the building design process known as conceptual design. Such a tool is potentially relevant because the most important decisions about the environmental impact, life-cycle performance and operating costs of a building are taken in this phase. The hypothesis of our work is that bringing in knowledge and expertise from good examples and relevant cases (Figure 1) can contribute significantly to decision-making at the conceptual level. The energy consumption of buildings is just one example of how early decisions on building shape, orientation, construction materials and mechanical systems can have a positive impact if they are assessed correctly from the beginning.

Figure 1 Relationship between the amount of knowledge available in each design phase and design freedom, according to Fabrycky (Fabrycky1991). Early design infusion is supposed to reduce design freedom (middle figure). On the right side, the image shows the relation between levels of design effort and the distribution of added value. Curve 1 is the ability to affect costs and functional capabilities; Curve 2 is the cost of design changes; Curve 3 is the traditional distribution of design effort; Curve 4 is the distribution of design effort enhanced by BIM (a CAD system) (Patrick MacLeamy). Knowledgeable effort in the early phases clearly makes more sense.

1.1 Specific Problem

The specific proposal is to use Case-Based Reasoning to support the exploration of initial building configurations, that is, the conceptual design phase.
Conceptual design essentially focuses on the definition of basic 3D models that represent the overall shape, size and orientation of a building, and how its main activities relate to each other and are distributed within a physical terrain (Figure 2). At small building scales this exercise is not complex enough to justify help from a CBR system, but for big commercial buildings such as office buildings, shopping malls, laboratories, schools or hospitals, the assessment of relevant examples can provide valuable guidance during early design.

Georgia Institute of Technology

Figure 2 Example of conceptual design building models.

Some knowledge and guidance can be provided by parametric models, which are smart geometric representations driven by user-defined rules. In this way non-geometric properties of a physical object can guide the behavior of a parametric model, while other properties get updated accordingly. For instance, a "column" can update its diameter and/or its material properties if the loads coming from the upper floor increase, a building's height can be defined as a ratio of its width, its width can be defined as a ratio of the terrain length, and so on. Libraries of such parametric components can be created from class definitions that embed general rules like these. During design, the classes are instantiated with specific values and then optimized (adapted) as needed.

From a CBR perspective, parametric models can be considered similar to cases, with explicitly defined mechanisms for adaptive behavior. What parametric CAD systems lack, however, is an efficient mechanism to select and retrieve the best model cases from libraries. Here Case-Based Reasoning may offer an important contribution by complementing the adaptation capabilities of a parametric system with retrieval techniques.

A third important component is the integration of an ontology of parametric models. Since many new forms of parametric models can be created during design, the retainment step requires a proper classification of the new types so that the parametric case base grows in a consistent and organized manner. Furthermore, the ontology provides a semantic layer that supports additional automation of design tasks. Among these we foresee reasoning capabilities that recommend unexpected alternatives to the user based on some initial query.
In this scenario the integration of the ontology with a CBR system is intended to add a conceptual classification of instances, providing the new level of abstraction needed to explore a wider scope of valid options.

1.2 Context of the Study

A design problem can be tackled from many different perspectives, depending on the interests, goals and constraints of the design team, and a parametric representation can be implemented according to the same conditions. To build our system integration we focused on the conceptual design of large commercial buildings and the evaluation of their potential energy consumption. This problem domain is well known, and extensive datasets are publicly available to support the retrieval stage. The building energy consumption dataset is available from the website of the Energy Information Administration (EIA), the government agency responsible for generating official energy statistics. The specific datasets chosen are the CBECS public use microdata files 1 and 15. Each file contains 5210 records. These
records represent information voluntarily provided in 2003 by building owners from 50 states and the District of Columbia. The main link between our parametric design system and the datasets is the definition of building shapes as an enumeration of basic standard types. The parametric instantiation of these types, according to a series of other building properties extracted from the dataset, provides the starting point for our CBR framework. The standard shape types are:

• Squared
• Squared with patio
• Rectangular
• "I" shape
• "L" shape
• "T" shape
• "E" shape
• "F" shape
• "H" shape
• "X" shape

2 Proposed Solution and Methodology

The traditional CBR cycle is depicted in Figure 3. In this model the data structures of retrieved and adapted cases are the same, so only one repository is needed. In our system, however, we have two different sets of cases, each with its own representation. To design buildings that follow the energy consumption patterns of real-world instances, our system starts the cycle from the master dataset of existing buildings mentioned above, but instantiation and adaptation use the parametric format.

Figure 3 CBR traditional cycle: Retrieve, reuse, revise and retain steps.

For this reason we adopted a second repository for parametric cases. While the first case base is fairly large, the second one is small, containing eight or nine initial templates. The main idea is that this small parametric case base must grow and be managed efficiently. Figure 4 shows our proposed framework.

Figure 4 Proposed system framework. Each main step has a simple or a refined algorithm implementation (+ or -). The first phase focuses on system integration with simple algorithms.

The proposed framework has the same four steps as CBR, but adds two different case repositories, the ontology and an ontology reasoner. The cycle starts with the architect's query. A building is requested through a feature vector of 12 components, of which geographic location, building activities, intended shape and size are the most important. The user also provides the geometry of the site in which the new building has to be designed, along with a main orientation vector. The best match is retrieved from the building case base and instantiated in the CAD environment as a parametric model that contains all the information from the retrieved case.

The next step is adaptation (reuse). For this step we initially considered both automatic and user-driven adaptation. Currently only automatic adaptation is implemented, at the topological level. In this step a simple heuristic search algorithm generates a valid topologic adaptation to be evaluated at the revise stage. Evaluation is intended to be twofold. A user evaluation is expected to assess the result in terms of aesthetics, volumetric distribution, functionality, etc. The second evaluation estimates the energy consumption of the design model by checking its performance against the real-world dataset. Currently this evaluation is not implemented. After evaluation the retainment step is triggered by the user.
In this step the ontology reasoner classifies the adapted model according to pre-existing types. If the adapted model is an instance of an existing type, it may be stored as such. Otherwise the reasoner infers a new type and creates a new concept in the ontology. The adapted model then becomes the first instance of this new type, available for further
retrieval. The goal of the proposal is that whenever the designer asks for a case in future iterations, the system looks for a best match in the real-world database but also suggests related shape types created in the past, increasing the scope of valid design alternatives. Each step of the framework is considered to have a simple (naïve) implementation and a more advanced implementation (+ or - sign). Due to time constraints the team decided to focus first on system integration, using just simple algorithms for each aspect. Further work will explore more advanced options.

2.1 Tasks led by each team member

• RETRIEVE (Urjit): {Building Database, Fish and Shrink and Knn Retrieval}
• REUSE (Andres): {Domain knowledge, System Integration, Adaptation}
• REVISE (Preetam): {Ontology definition, Reasoner Integration, Knowledge Extraction}
• RETAIN (Radhika): {Knowledge Extraction}

3 Retrieval

Retrieval is the prime phase of a CBR cycle. The work on retrieval depends on many factors, such as the knowledge engineering and the dataset. The following sections give an overview of how retrieval is handled in the CBArch system.

3.1 Raw Input Data

The data set is a very broad set of building features: "The 2003 Commercial Buildings Energy Consumption Survey (CBECS) building characteristics and consumption and expenditures public use files". It is available from the EIA website and contains information on about 5200 buildings. The data set had to be pruned and filtered to remove missing values. Some values were interpolated and added to the database to make the data consistent. Not all features of the dataset were part of the input feature set, so we filtered those columns out and created another "feature vector" containing only the features that we needed.
This gave us the advantage, from a database point of view, of being able to work on a smaller database table and make the query process faster.
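
The column filtering described in Section 3.1 can be sketched as follows. This is a minimal illustration in Python, not the project's C# code: the column names in `FEATURES` are hypothetical stand-ins (the report does not list the raw CBECS column codes), and records with missing values are simply dropped here, whereas the report also interpolated some values.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical feature names; the real CBECS column codes are not given in the report.
FEATURES = ["building_shape", "square_feet", "census_division", "number_of_floors"]


def to_feature_vector(record: Dict[str, object]) -> Optional[Dict[str, object]]:
    """Keep only the selected features; return None if any of them is missing."""
    vector = {name: record.get(name) for name in FEATURES}
    if any(value is None for value in vector.values()):
        return None
    return vector


def build_feature_table(records: Iterable[Dict[str, object]]) -> List[Dict[str, object]]:
    """Project every record onto FEATURES and drop incomplete records."""
    vectors = (to_feature_vector(record) for record in records)
    return [vector for vector in vectors if vector is not None]


if __name__ == "__main__":
    raw = [
        {"building_shape": "L", "square_feet": 12000, "census_division": 5,
         "number_of_floors": 2, "unused_column": "x"},
        {"building_shape": "H", "square_feet": None, "census_division": 3,
         "number_of_floors": 4},
    ]
    print(build_feature_table(raw))
```

Working on the projected table keeps each in-memory case small, which is the same motivation the report gives for the smaller database table.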

Figure 5 Some of the important features from the dataset.

3.2 Retrieval System Framework

The retrieval system is built around MySQL 5.0 and C#. NHibernate, a relatively new persistence manager, was used for Object Relational Mapping (ORM). Communication between the database and the algorithm is limited to a one-time load of the requested feature set from the database into the algorithm's working memory. Each record is represented as an object-mapped model called an "Entity". These are the POCOs (Plain Old CLR Objects) that are mapped to the database. Some external system references complete the framework and provide important services.

Figure 6 Assembly References.

3.3 The Retrieval Algorithm

Two similarity-based algorithms are used in this system: Fish and Shrink, and Knn. The need for two algorithms arose from the performance problems of the Fish and Shrink algorithm. The basic life cycle of the retrieval algorithm is shown in Figure 7.

3.3.1 Fish and Shrink

The Fish and Shrink algorithm works in two phases. First it calculates the similarity among the cases themselves, and then it runs a match-and-promote phase that checks the similarity of the cases with the query. The central idea is that a case similar to the query can guide us to other candidate cases, since those cases are themselves similar to this target case.

Figure 7 Algorithm Lifecycle. This generic lifecycle is maintained by both algorithms.

Data structure

The class Node contains an "Aspect Hash" of type "Aspect". Each "Aspect" contains a list of Neighbor objects, which point to the position of a Node in the working memory Map.

Figure 8 Relation between structures.
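
The Node, Aspect and Neighbor relationship described above can be sketched roughly as below. This is an illustrative Python rendering rather than the project's C# classes: the field names, the `distance` stored on each Neighbor, and the `link` helper are assumptions added for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Neighbor:
    node_index: int   # position of the neighbouring Node in the working-memory map
    distance: float   # assumed: precomputed case-to-case distance for this aspect


@dataclass
class Aspect:
    name: str
    neighbors: List[Neighbor] = field(default_factory=list)


@dataclass
class Node:
    case_id: int
    # The "Aspect Hash": one Aspect per compared feature, keyed by aspect name.
    aspects: Dict[str, Aspect] = field(default_factory=dict)

    def link(self, aspect_name: str, node_index: int, distance: float) -> None:
        """Record that another node is a neighbour of this one under an aspect."""
        aspect = self.aspects.setdefault(aspect_name, Aspect(aspect_name))
        aspect.neighbors.append(Neighbor(node_index, distance))


if __name__ == "__main__":
    # Working memory: a map from position to Node.
    memory = {0: Node(case_id=101), 1: Node(case_id=102)}
    memory[0].link("NumberOfFloors", node_index=1, distance=0.25)
    for neighbor in memory[0].aspects["NumberOfFloors"].neighbors:
        print(memory[neighbor.node_index].case_id, neighbor.distance)
```

Storing positions instead of object references keeps each Neighbor small and lets both retrieval algorithms share the same working-memory map.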

Figure 9 Class diagram for data structure.

3.3.2 Knn Retrieval

Knn retrieval is based on the same underlying data structure as shown above. It calculates the similarity of the query with the existing cases in memory in a "just-in-time" fashion. Knn gives quick results compared to Fish and Shrink.

3.3.3 Similarity Measurement Heuristics

Several important similarity measurements and heuristics are used for retrieval. They define how the algorithms compare two cases in the case base. For example, the similarity matrix shown below gives a heuristic for matching the census divisions and calculating the similarity contribution of this feature over the range 0 to 1. Other heuristics include:

1. Exponential Scaling: used for features like NumberOfFloors. A building with 2 floors is very different from a building with 5 floors, but another building with 20 floors is not as different from one
that has 27-30 floors. Thus, as the base value of the measurement (number of floors) increases, the significance of the gap decreases.

2. Magnification (Linear Scaling): used for filtering features like TotalWeeklyOperatingHours. This gives us a way to filter purely numerical values. Mathematically, it can be modelled as:

   x = (a - b) / (a + b)
   m = x / f, where x < f and f is the magnification factor

   Cases with x >= f are considered too far away from the test case.

3. Direct Testing: for truth values or fixed-valued functions over the feature vectors, direct testing was used. If the values match, the similarity is positive; otherwise it is zero.

3.3.4 Issues and Challenges

The task of retrieval presented us with several challenges, including the choice of a good platform, integration with the other modules of the system, and performance. During the initial phases of the retrieval implementation, the time taken was a multiple of the current figure. The optimization was done using micro-timers embedded in the code and third-party performance evaluation tools such as the ANTS memory profiler and the EQUATEC memory profiler. These tools helped to identify the causes of lag. We found that some loops, like ForEach, and data structures, like ArrayLists, were performing slowly, and remedied this by using cruder but faster constructs. Another approach was to make much of the decision process inline. This sacrificed code modularity but improved performance significantly.

Another issue was the way we evaluated and interpreted the Fish and Shrink algorithm. The text supporting this algorithm is not very expressive, and alternate sources do not agree on some of the finer details, such as updating the testDistances. This penalized our work on the retrieval algorithm.
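
Returning to the heuristics of Section 3.3.3, the sketch below shows one way the three measures and a simple Knn ranking could fit together. It is an illustration under stated assumptions, not the project's implementation: the report gives no formula for exponential scaling, so `exponential_similarity` uses an assumed form (an exponential of the gap relative to the mean); `magnified_similarity` takes the absolute difference and converts the magnification m into a similarity as 1 - m, both of which are assumptions; and the unweighted average in `knn` is likewise assumed.

```python
import heapq
import math
from typing import Callable, Dict, List

Measure = Callable[[float, float], float]


def exponential_similarity(a: float, b: float) -> float:
    """Assumed form: the same gap matters less as the base values grow."""
    mean = (a + b) / 2.0
    if mean <= 0:
        return 1.0 if a == b else 0.0
    return math.exp(-abs(a - b) / mean)


def magnified_similarity(a: float, b: float, f: float) -> float:
    """x = |a - b| / (a + b); m = x / f for x < f; otherwise too far away."""
    if a + b == 0:
        return 1.0  # both values are zero (non-negative inputs assumed)
    x = abs(a - b) / (a + b)
    if x >= f:
        return 0.0
    return 1.0 - x / f  # assumed conversion from magnification to similarity


def direct_similarity(a: object, b: object) -> float:
    """Direct testing: positive on an exact match, zero otherwise."""
    return 1.0 if a == b else 0.0


def knn(query: Dict, cases: List[Dict], k: int, measures: Dict[str, Measure]) -> List[Dict]:
    """Rank cases by the unweighted mean of per-feature similarities."""
    def score(case: Dict) -> float:
        total = sum(measure(query[name], case[name]) for name, measure in measures.items())
        return total / len(measures)
    return heapq.nlargest(k, cases, key=score)


if __name__ == "__main__":
    measures = {
        "NumberOfFloors": exponential_similarity,
        "TotalWeeklyOperatingHours": lambda a, b: magnified_similarity(a, b, f=0.5),
        "Open24Hours": direct_similarity,
    }
    query = {"NumberOfFloors": 20, "TotalWeeklyOperatingHours": 60, "Open24Hours": False}
    cases = [
        {"id": 1, "NumberOfFloors": 27, "TotalWeeklyOperatingHours": 55, "Open24Hours": False},
        {"id": 2, "NumberOfFloors": 2, "TotalWeeklyOperatingHours": 168, "Open24Hours": True},
    ]
    print([case["id"] for case in knn(query, cases, k=1, measures=measures)])
```

Because each case is scored against the query only when a query arrives, this matches the "just-in-time" behaviour the report attributes to Knn.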

Our time and effort estimate for Fish and Shrink proved wrong, which forced us to implement the Knn algorithm. The issue identified was that Fish and Shrink filtered out about 90% of the cases, which seems fine overall, but given the large size of the case base we were targeting around 98%. We had planned to use Knn to rank and filter the cases returned by Fish and Shrink, in a hybrid, ensemble-like approach. Fish and Shrink was able to return around 300 cases out of nearly 5000, so we could have ranked them again and presented the best k cases, but time constraints prevented that implementation.

4 Reuse

Adaptation is performed once a best match or a list of best matches is retrieved. The case properties of interest at the conceptual design level can vary with each problem, particular goals or specific business practices. The initial set of building properties selected for our retrieval system contains 12 properties:

- Building Shape: An enumerated set of standard building shapes.
- Square Foot Area: Size of the building.
- Census Division: Describes geographic location. Relevant to analyze climate conditions.
- Free Standing: Describes whether a building is isolated from others.
- Number of Floors: Number of useful levels of the building.
- Main Activity: Describes what type of business or activity occurs in the building.
- Number of Businesses: Describes how many businesses exist in a building, for energy assessment.
- Number of Employees in Main Shift: Relevant for energy consumption assessment.
- Open 24 Hours: Relevant for energy consumption assessment.
- Open During Week Days: Relevant for energy consumption assessment.
- Open During Weekends: Relevant for energy consumption assessment.
- Total Weekly Operating Hours: Relevant for energy consumption assessment.

The retrieved feature vector of a good match is partially replicated into the data structure of the parametric model. All the information requested at the query stage is replicated in the CAD model, plus extra information such as building materials, façade properties, glazing and sun protection, which is necessary for energy consumption evaluation. One assumption was that the designer would not normally request cases by focusing explicitly on those extra properties, but would rather expect them as useful information to be learned from the retrieval.

Once the relevant case properties are mapped into a parametric representation, the parametric model is instantiated. The shape of the retrieved case is instantiated from a library of basic shape topology templates (Figure 10). These templates are adjusted to fit the geometric characteristics of the case as well as the orientation of the building as defined by the user. However, chances are that an instantiated version of the case will not perfectly fit the characteristics of a given site or other contextual constraints of the new problem. Therefore the building layout and other associated properties must be adapted. There are two basic approaches to adaptation, namely geometric adaptation and topologic adaptation.
Another important assumption made in this project is that, although non-geometric properties may sometimes drive the topological or geometrical features of a building, it is more common for shape modifications to drive the values of non-geometric properties. Either kind of adaptation can be performed by the user herself or by some automatic procedure. In this work we focus only on automatic adaptation of both building topology and geometry.

Figure 10 Database of standard topology templates for initial retrieval instantiation.

4.1 Geometric Adaptation

Geometric adaptation is the simplest method, and the set of rules for a successful adaptation can be summarized by three basic geometric transform operators: Move, Rotate and Scale. In our system the initial attempts at adaptation must be geometric, so as to accommodate the parametric instance in a given polygonal site by some combination of these operators. However, although geometric adaptation makes a lot of sense from a domain perspective and is part of the natural adaptation capabilities of parametric models, the research team decided not to implement it to its full potential: geometric adaptation can be achieved procedurally, so there is no special need to retain a geometrically adapted form (Figure 11).

Figure 11 Sequence from geometric adaptation to topologic adaptation. The initially instantiated "L" shaped building falls outside the site bounds due to the chosen orientation line. The system should first try to move the shape (incomplete). If some space remains outside, topologic adaptation is triggered so that the shape fits the site. A different topology is reached (but not necessarily a new one).

A more interesting and more promising scenario is provided by topologic adaptation. The idea is that
new topologies have a different meaning from a building design perspective, implying new architectural concepts that are more worth keeping and reusing.

4.2 Topologic Adaptation

If geometric adaptation fails [1], i.e., no combination of Move, Rotate or Scale operators can make a building shape fit completely within a polygonal site, the system performs a topologic adaptation process. The two algorithms that generate the topologic adaptation are called in sequence, based on a simple heuristic rule. Any part (space component) of the building that fails to fit in the site must first search for an empty spot adjacent to its original sector, called a building branch. This rule follows the basic criterion of compatibility between building activities, which states that only compatible spaces (rooms) can be put together. If no empty spot within the same branch is big enough to accommodate the "misfit" space, the space must start a new branch attached to its original branch. If this also fails, the misfit space tries these two steps again with the closest branch, and so on. After all branches have been examined at ground level, the system looks for accommodation on a second level, and so on. Such re-accommodation changes the topology whenever the misfit space creates a new branch attached to the branch closest to its original one. To perform this search, the following adjacency graph data structure was defined:

Figure 12 Adjacency list representation for a building shape topology graph. Note that the almost complete match between an "H" and a "U" is differentiated by the concept of "yard".

[1] As stated in the previous section, geometric adaptation was not fully implemented, because topological adaptation was more relevant given our time constraints.

The topologic adaptation algorithms keep track of node relationships for later classification of standard types and subsumption inference of new types by the ontology reasoner. This also includes the recognition of cycles, to identify shapes such as "Square_with_courtyard".

Figure 13 Example of topology adaptation that leads to a new concept. The initial "Wide_rectangle" type, which is correctly classified by the ontology as a child of the "U" type (two inverted U's), is adapted into a new shape that the reasoner classifies as a child of both the "U" and "H" types.

4.2.1 Issues and Challenges

The representation of the topology graph is incomplete: the current implementation does not include all the information needed to compute the adaptation properly. Furthermore, the current topology adaptation algorithm does not support addition or deletion of space nodes, so it can only adapt the same number of spaces originally instantiated from the retrieval. Further work is needed on a more complete representation of the shape topologies and on a better algorithm that keeps track of more complex outcomes and supports addition or deletion of nodes.

Another important limitation of the current implementation is that space nodes are undifferentiated. This does not correspond to the complexity of real-world buildings, which are made of different types of spaces with different requirements. This additional level of information should lead to a richer set of rules about how spaces might adapt, not only with respect to themselves but also in relation to contextual constraints such as accessibility, sun exposure, energy optimization, etc. At this stage the current implementation worked well as a proof of concept and a starting point for further improvements in that direction.
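
The adjacency-list representation and the cycle recognition mentioned in Section 4.2 can be sketched as follows. This is a minimal Python illustration, assuming an undirected graph without parallel edges; the node names (`s1`, `s2`, ...) and the two example graphs are hypothetical and are not taken from the project's templates.

```python
from typing import Dict, List, Optional, Tuple

Graph = Dict[str, List[str]]

# Hypothetical topologies: each key is a space node, each value lists adjacent spaces.
L_SHAPE: Graph = {"s1": ["s2"], "s2": ["s1", "s3"], "s3": ["s2"]}
SQUARE_WITH_COURTYARD: Graph = {
    "s1": ["s2", "s4"],
    "s2": ["s1", "s3"],
    "s3": ["s2", "s4"],
    "s4": ["s3", "s1"],
}


def has_cycle(adjacency: Graph) -> bool:
    """Return True if the undirected topology graph contains a cycle."""
    visited = set()
    for start in adjacency:
        if start in visited:
            continue
        # Depth-first search that remembers which node each entry was reached from.
        stack: List[Tuple[str, Optional[str]]] = [(start, None)]
        while stack:
            node, parent = stack.pop()
            if node in visited:
                return True  # reached an already visited node by a second route
            visited.add(node)
            for neighbor in adjacency[node]:
                if neighbor != parent:
                    stack.append((neighbor, node))
    return False


if __name__ == "__main__":
    print("L shape has cycle:", has_cycle(L_SHAPE))
    print("Courtyard shape has cycle:", has_cycle(SQUARE_WITH_COURTYARD))
```

A ring of spaces enclosing a courtyard shows up as a cycle, whereas branch-like shapes such as "L" are trees, which is why cycle recognition helps separate the two families.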

5 Revise

The system requires an adaptation that is not only geometric and topological but also energy-efficient. This can be achieved by evaluating the design with respect to energy and modifying it. The CBR-Arch dataset consists of energy parameters associated with the materials used for construction. We use these parameters to evaluate the design produced by the geometric and topologic adaptations. The current evaluation module in CBR-Arch is a standalone module that provides evaluation based on energy consumption.

5.1 Energy Consumption Evaluation

The evaluation module uses knowledge extracted from the domain, i.e., from our master database of energy data for real existing buildings. The knowledge extraction module creates a mapping from all the wall materials and roof materials used to the energy consumed. The energy components taken into
consideration are fuel and electricity consumption associated with the use of materials like glass, wood, bricks, etc. The material-energy mapping has the following format:

Roof Material 1: average associated energy = 14888798.43
Roof Material 2: average associated energy = 5162388.04
Roof Material 3: average associated energy = 2190562.31
….
Wall Material 1: average associated energy = 10516642.5364
Wall Material 2: average associated energy = 21645429.1371
Wall Material 3: average associated energy = 9407973.24345
…..

The evaluation module uses this mapping to look up all the materials used to build the entity and applies an aggregation function to calculate the aggregated energy. The aggregated energy is the sum of the average energies of all the materials used.

Figure 14 A simplified framework for evaluation of energy performance for the adapted building model.

6 Retain

After the adaptation stage comes a retain stage, which completes the classical case-based reasoning cycle. In a design-related domain like architecture it is up to the designer to decide whether the adapted solution is good enough to be stored, based on her expertise or on some performance assessment such as mathematical analysis or simulation. In any case, simple geometric adaptation can be achieved procedurally, so there is not much gain in storing its outcomes. Topologic adaptation, however, can lead to completely different configurations that are worth storing. In this scenario a new
class of shape or building concept can emerge. The retain stage is therefore twofold: it has to store a new meaningful concept as well as the relevant instances that represent that concept. In domains like the travel recommender system, instances were stored directly, without any regard to their meaning. In our domain of architecture, newly generated shapes are stored using an ontology, because shapes stored directly would be of no real use in our domain. We require new shapes, which are essentially new concepts, to be classified as part of the already existing shapes. For example, if a new shape is generated, we first evaluate it using the evaluation approaches proposed in this paper. Based on the results, we check the ontology model to see whether this shape already exists; if it does not, we classify the new shape as part of the already existing shapes. Hence, when the user queries for an H, he can get different varieties of the H shape, such as a combination of an H and an L shape. CBR thus acts as a discovery process that can find new shapes after adaptation takes place.

6.1 Ontology Design Decisions

An ontology is a collection of Concepts and their corresponding Instances. Concepts consist of individuals, which may belong either to a single concept or to more than one concept, provided that these concepts are not disjoint. No instance can belong to two or more disjoint concepts; such individuals cannot exist. For example, consider the popular pizza ontology and two of its concepts: Non-Vegetarian topping and Vegetarian topping. These two concepts are disjoint, i.e., any topping that is a Vegetarian topping cannot be a Non-Vegetarian topping.
A topping that belongs to both the Non-Vegetarian and Vegetarian classes causes an inconsistency in the ontology. Concepts also have Properties with constraints placed on them. These constraints determine which concept a new concept belongs to. By running a reasoner on the ontology, inconsistencies can be identified and automated classification (only in OWL-DL) can be achieved. Several reasoners are available, including Pellet, DIG and a few more. We describe the Pellet (v1.5.2) reasoner in later sections of this report.

Ontologies can be represented in many different formats, such as the Web Ontology Language (OWL), N-Triples, etc. OWL is a very common representation. It has three variants: OWL Lite, OWL-DL and OWL Full. OWL Lite is the syntactically simplest species of OWL. It is intended for situations where only a simple class hierarchy and simple constraints are needed. OWL-DL is much more expressive than OWL Lite and is based on description logics. Description logics are a decidable fragment of first-order logic and are therefore amenable to automated reasoning. OWL Full is the most expressive variant and is used where very high expressiveness matters more than guaranteeing the decidability or computational completeness of the language. Complete automated reasoning is not possible for OWL Full. Therefore, we chose OWL-DL to represent our ontology.

6.2 Ontology Structure

We incorporated basic building shapes into our ontology, which served as the base ontology. We used Protégé 3.4 to create the ontology, consisting of the concepts and the constraints placed on them. Figure 15 shows the Jambalaya view of the ontology.

Figure 15 Jambalaya view of topology ontology: Basic topology concepts and set of standard topologic shapes.

As seen in Figure 15, the ontology consists of the following basic shapes:

• T Shape
• U Shape
• H Shape
• I Shape
• L Shape

The Building Energy Consumption dataset describes building shapes as a larger enumeration, including shapes that were not part of our initial ontology. Such shapes include the Square shape with courtyards and all its derivations, the "X" shape and the "E" shape. The purpose was to define only a minimal set of basic types from which more complicated shapes could be inferred. The superclass of all these shapes is the class Shape, which will contain all the shapes generated in future operations of the CBArch system. The initial shapes serve as a basis under which new shapes are classified. For example, a new shape that is a combination of an "H" and an "L" will be a member of both the "H" and the "L" classes. These shapes are not disjoint with respect to each other, which allows newer concepts to belong to more than one of the basic concepts. The basic classes have the following properties associated with them:

• hasYards
  17. 17. CS7620-A Case-Based Reasoning December 09, 2009 • hasLines • hasBranches • hasPoints • has Angles Figure 16 Asserted and inferred hierarchy of building shapes. A new type gets classified by the Pellet reasoner according to Description Logics rules defined on the right side. A branch is defined as a sequence of three continuous points which result in a straight line. A building has a number of yards like for example consider a “U” shape building, it consists of one yard. Similarly, a “L” shape building has one yard. A “H” shape building has two yards. The angles, lines and points are obvious. Figure 10 shows the constraints placed on individual basic shapes. There are two types of constraints namely: necessary constraints and necessary and sufficient constraints. The necessary constraints determine whether a new class is eligible to be considered as a part of a particular class. On the other hand, necessary and sufficient constraints determine a closure condition such that a new class would be truly classified under a particular class. As shown in Figure 10, cardinality constraints were imposed on each of the properties of each class. These cardinality constraints were selected based on intuition and trial and error. Initially, we started off with only hasLines, hasPoints and hasAngles properties and played around with the constraints in order to get a reasonable classification. However, these properties could not result in a good classification of the new shaped that were being generated. We, therefore, investigated better features that would give us better classification results. Hence we came up with the hasBranches and hasYards properties. By imposing appropriate constraints on these properties and combining them with the other properties, we were able to establish a reasonable ontology structure which can be updated as and when new concepts arise. Figure 10 also depicts the inferred hierarchy computed using the inbuilt Pellet 1.5.2(Direct) reasoner. 
The various levels of classification have been generated by classifying the taxonomy using the Classify Taxonomy function provided by Protégé. Before this step is performed, the ontology should be checked for inconsistencies i.e., a new shape should be validated. Georgia Institute of Technology Page | 17
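
The way cardinality constraints can place one new shape under several non-disjoint classes may be easier to see in a small sketch. The Python below is only an illustration, not the project's Protégé/Pellet setup: the class names, the `Cardinality` helper and every numeric bound are hypothetical placeholders (the actual constraints appear only in Figure 16), and a missing property is treated as a count of zero, which is a closed-world simplification that differs from OWL's open-world reasoning.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Cardinality:
    """Inclusive bounds on how many values a property may have."""
    minimum: int = 0
    maximum: Optional[int] = None  # None means "no upper bound"

    def admits(self, count: int) -> bool:
        if count < self.minimum:
            return False
        return self.maximum is None or count <= self.maximum


# HYPOTHETICAL definitions: the real constraints are in Figure 16 and are
# not reproduced in the text, so these numbers are placeholders only.
SHAPE_DEFINITIONS: Dict[str, Dict[str, Cardinality]] = {
    "I_Shape": {"hasYards": Cardinality(0, 0), "hasBranches": Cardinality(0, 0)},
    "L_Shape": {"hasYards": Cardinality(1), "hasAngles": Cardinality(1)},
    "H_Shape": {"hasYards": Cardinality(2), "hasBranches": Cardinality(2)},
}


def classify(features: Dict[str, int]) -> List[str]:
    """Return every class whose constraints are all satisfied.

    Classes are not disjoint, so more than one name may be returned.
    A property missing from `features` is treated as a count of zero.
    """
    matches = []
    for name, constraints in SHAPE_DEFINITIONS.items():
        if all(c.admits(features.get(prop, 0)) for prop, c in constraints.items()):
            matches.append(name)
    return sorted(matches)


# A made-up shape combining an H and an L.
new_shape = {"hasYards": 3, "hasBranches": 3, "hasAngles": 6}
print(classify(new_shape))  # ['H_Shape', 'L_Shape']
```

Because no disjointness is declared between the classes, the made-up shape satisfies both the H and the L definitions at once, which mirrors the H-plus-L example in the text.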
6.3 Invoking the Pellet reasoner through C# code:

We needed to emulate the operations performed in Protégé in our C# code because the GenerativeComponents API was available in C#. Hence, we needed to invoke the Pellet reasoner from C# and classify the ontology model represented using the Jena framework. Jena and the Pellet reasoner were available in Java. Hence, we used IKVM to make this functionality accessible from our C# code. Figure 17 shows the steps followed to achieve this. The Pellet 1.5.2 package consisted of the Pellet reasoner and the Jena API. We used IKVM to convert it into a .dll file and then imported it as a library reference in the C# code.

Figure 17 Integration of Pellet Reasoner (Java based) with the parametric representation of the building (C# based).

6.4 Retain stage:

The retain stage consisted of updating the existing ontology with a new shape generated in the CBArch cycle of operation. The reasoner was invoked to get either the direct super classes or all the super classes of the new shape. When the reasoner is asked for all the super classes of a given shape, the full inheritance hierarchy of the new shape is returned. If only the direct super-classes function is invoked, only the direct super classes are returned. For example, for a new shape which is a combination of an H and an L, the direct super classes would be H and L. All the super classes of the new shape would list the whole classification hierarchy, i.e., they would also include the super classes of H and L. The importance of this step is discussed in the next section.

6.5 Importance of the Ontology Update:

The ontology update module is an important part of the CBArch cycle, because the addition of new shapes gives a wider range of shapes to choose from in the parametric database.
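
As a rough illustration of the retain stage in section 6.4, the sketch below contrasts direct super classes with all super classes, and adds the reverse lookup that an updated ontology makes possible. It is plain Python over a hand-written dictionary, not the Jena or Pellet API; the class names and the `DIRECT_SUPERS` table are hypothetical, standing in for a hierarchy that the reasoner would infer.

```python
from typing import Dict, List, Set

# HYPOTHETICAL inferred hierarchy: each class maps to its direct super classes.
DIRECT_SUPERS: Dict[str, List[str]] = {
    "Shape": [],
    "H_Shape": ["Shape"],
    "L_Shape": ["Shape"],
    "HL_Shape": ["H_Shape", "L_Shape"],  # new shape combining an H and an L
}


def direct_super_classes(name: str) -> List[str]:
    """Only the immediate parents of `name`."""
    return sorted(DIRECT_SUPERS[name])


def all_super_classes(name: str) -> List[str]:
    """Transitive closure of the direct super-class relation."""
    seen: Set[str] = set()
    stack = list(DIRECT_SUPERS[name])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(DIRECT_SUPERS[current])
    return sorted(seen)


def shapes_under(name: str) -> List[str]:
    """All classes that have `name` among their super classes."""
    return sorted(c for c in DIRECT_SUPERS if name in all_super_classes(c))


print(direct_super_classes("HL_Shape"))  # ['H_Shape', 'L_Shape']
print(all_super_classes("HL_Shape"))     # ['H_Shape', 'L_Shape', 'Shape']
print(shapes_under("H_Shape"))           # ['HL_Shape']
```

The last call is the kind of lookup a query for an H shape could rely on once new shapes have been retained under H.
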
For example, if a user queried for an H shape, the query would consult the ontology for shapes that are similar to (belong to) the H shape and return the different possible shapes relating to an H. Hence, different varieties of an H shape can be instantiated, instead of the single simple H shape that would result in the absence of the parametric database. An additional feature to be implemented here is to store the specifications (features) associated with a new shape in a database. For example, features like site area, region, building activity, etc. corresponding to a new shape classified under an H shape can be stored in a database. When a user issues a query, shapes whose features were stored in the database would be retrieved if the query shows a reasonable level of match with the stored features of a particular shape. The “level” of match is yet to be finalized. As future work, we plan to implement the promising concept of Derivational Analogy, which would store the trace of operations a user performed in order to arrive at a particular solution. These traces can be useful for computing new solutions. This aspect is currently being investigated.

7 Evaluation and Conclusions

This report introduces a novel design support system that integrates Case-Based Reasoning with Parametric Modeling and Ontologies. The system takes as reference the domain of conceptual design of commercial buildings. At the current stage of development our focus was on system integration and proof-of-concept only. The goal of this integration is to take advantage of the complementary capabilities of these three systems to support architectural design processes. While CBR provides a framework to store and retrieve good examples at the instance level, Parametric Modeling offers a framework for rule-based form adaptation. Finally, ontologies are intended to provide a higher layer of abstraction at the semantic level, so that new design concepts can be created and classified. Instances can therefore be organized under this conceptual umbrella, and new forms of design automation can be explored.
Among them, we expect an improvement in the recommendation capabilities of the system by enabling unexpected cases to be brought to the designer, thereby increasing the scope of valid alternatives to be explored. As a proof-of-concept, the system shows very good initial results. It successfully retrieves and adapts shapes according to the specified rules, and then classifies them as new concepts of the ontology when appropriate. However, the retrieval system does not yet refer to the ontology. This step remains to be done, but the basic foundations required to achieve this functionality are in place. Further work has to focus on the evaluation of adapted models, more specifically on aspects related to expected energy consumption. Additional work will also explore fine-grained evaluation of results, alternative approaches for data representation, and performance improvements of the algorithms.