Halo Pcs Kcap2007 V2

We present an approach to the acquisition of process knowledge for the natural sciences. The work was conducted within Project Halo, which is creating advanced knowledge authoring and question answering systems for the natural sciences. An analysis of AP®-level questions for Biology, Chemistry and Physics revealed that process knowledge is the single most frequent type of knowledge required. We therefore developed means to acquire process knowledge, to represent it formally, and to reason about it in order to answer novel questions about the domains.
All these tasks are supported by an abstract process metamodel. It provides the terminology for user-tailored process diagrams, which are automatically translated into executable FLogic code. The metamodel and the code generation are based on the notion of Problem Solving Methods (PSMs), which represent an abstract formalization of the reasoning strategies needed for processes.

Published in: Technology, Education
  • Analysis of the AP syllabus resulted in a number of knowledge types
  • Processes are special concepts that…
  • Partially inspired by existing process ontologies, e.g. EO and GLIF. Many of these preexisting resources were domain-specific. We abstracted them for generic use and filled in the remaining gaps.
  • First two points of the slide: We intended to produce a PSM library that allows describing a particular process and also provides the means to reason about and solve process-related problems. Therefore, we have approached processes as special types of problems and PSMs as the way to represent and solve them.
  • Two main phases: (1) identification of domain-specific processes in the syllabi; (2) decomposition and abstraction of the domain-specific processes into domain-independent processes expressed in terms of the metamodel.
  • Each process category is specialized into a number of generic, abstract processes, each of which can be achieved by one or more PSMs.
  • We focus on one of the methods of the “Split” category: “decompose & recombine”
  • The Process Editor allows SMEs to author process knowledge themselves, without the intervention of Knowledge Engineers. Graphical representations of process metamodel entities and of the methods in the PSM library guide SMEs during Knowledge Formulation and avoid the blank page syndrome. The components of a process diagram are first modelled by choosing a role of the process metamodel from the palette. Then, these components are mapped to concrete domain entities by means of the interface shown. In the example, the domain-level concept Ionic Compound is modelled as a process metamodel Resource. This perspective allows importing rules into process actions, enabling domain-level reasoning within processes.
  • FLogic is the representation and reasoning language. Process diagrams authored by SMEs using the process metamodel and the PSM library are automatically translated into FLogic (OntoBroker) code.
  • The frame of action “Dissolve” comprises all the “Ionic Compounds” of the Solutions contained in the knowledge base. However, action “Crystallize” is applicable exclusively to the “Cations” and “Anions” produced by action “Dissolve”.
  • The execution of an action allows transition from its pre to its post state
  • Though process knowledge is accessible via the overall Question Formulation system, the most comfortable and complete way to use the authored process knowledge is by means of the test & debugging perspective, which executes queries and is also helpful to validate the formulated knowledge bases.
  • Performed in the context of the overall Halo evaluation. This explains why some parts of the syllabus proved not very representative for processes, even though their domains are actually very rich in this kind of knowledge. The intermediate evaluation is not to be understood as a usability test in a formative sense, but as an empirical assessment of DarkMatter’s performance in a setting that is representative in terms of the profile of the recruited SMEs and their assigned tasks. In the case of PCS, since our approach is focused on enabling SMEs to model executable processes at the knowledge level without intervention of KEs, the evaluation was specifically aimed at collecting direct experience of SMEs with PCS knowledge formulation and reasoning. This was measured along two main dimensions: usability and utility. These metrics must be weighed against the actual relevance of the PCS knowledge type with respect to the selected syllabi, which we measured in terms of the number of issues raised by SMEs on PCS in specific domains and the number of processes modeled by SMEs in each domain.
  • Physics SMEs did not use processes. They were not so important for Chemistry SMEs, but Biology SMEs found them very useful, with a score of 3 out of 4. Since Biology was the domain with the most representative amount of process knowledge in the evaluation syllabus (the selected Chemistry and Physics syllabi contained a negligible amount of process knowledge), this mark can be considered representative as well.
  • Key concepts:
      • Encapsulation
      • Reusability: a small number of entities and PSMs in the metamodel and library, which are generic enough to be reused and aligned with particular domains of application
      • SMEs building KBs instead of KEs (who mess up domain KBs and are expensive). SMEs know the domain and are willing to formalize this knowledge
      • Example: heart inheriting from diaphragm pump. Inheritance is not always a good idea (e.g. one chamber in the pump, four in the heart)
      • Process diagrams can easily be moved from one domain to a different one where an analogous process happens
      • The Data Flow view simplifies visualization and keeps the representation at the process level (unlike in Aura)
      • Similarities with CLib (Ken Barker)
      • To provide SMEs with the means required to acquire, formally represent and reason about processes in the target domains by creating an abstract process metamodel that supports these tasks; supporting the creation and editing of user-tailored process diagrams based on this terminology, without intervention of knowledge engineers; and automatically translating process diagrams into executable code
      • Based on Problem Solving Methods (PSMs), which represent an abstract formalization of the reasoning strategies needed for processes

    1. Applying Problem Solving Methods for Process Knowledge Acquisition, Representation, and Reasoning
       Jose Manuel Gómez-Pérez, iSOCO, S.A. [email_address]
       K-CAP 2007
    2. The Halo Project
         • “Project Halo aims to create an intelligent computer system able to answer novel questions in scientific domains with expertise equivalent to AP competence level”
         • Three scientific domains: Chemistry, Biology, and Physics
         • Focused on creating tools that allow Subject Matter Experts (SMEs) to author scientific knowledge (maximum usability)
         • Goals:
             • Reduce average cost of formalizing domain knowledge, from $10,000 to $1,000 per textbook page
             • Reduce brittleness of authored knowledge
    3. The Relevance of Process Knowledge
         • Analysis of the AP syllabus resulted in a number of knowledge types
         • Processes appear in 37% (average) of the syllabus of the three domains
         • The most important knowledge type in Chemistry (53%)
         • Second in Biology (35%)
         • Fourth in Physics (22%)
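The 37% average on this slide can be checked from the three per-domain figures. The sketch below assumes the average is an unweighted mean over the three domains, which the slide does not state explicitly; the variable names are illustrative only.

```python
# Share of the AP syllabus involving process knowledge, per domain,
# as reported on the slide (percentages).
process_share = {"Chemistry": 53, "Biology": 35, "Physics": 22}

# Assumption: the slide's "average" is an unweighted mean over the domains.
average = sum(process_share.values()) / len(process_share)
rounded_average = round(average)

print(f"{average:.1f}% -> about {rounded_average}%")
```

Under that assumption the mean is about 36.7%, which rounds to the 37% quoted.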
    4. Motivation
         • To provide SMEs with the means required to acquire, formally represent and reason about processes in the target domains by
             • Creating an abstract process metamodel that supports these tasks
             • Supporting the creation and editing of user-tailored process diagrams based on this terminology, without intervention of knowledge engineers
             • Automatically translating process diagrams into executable code
         • Based on Problem Solving Methods (PSMs), which represent an abstract formalization of the reasoning strategies needed for processes
    5. Outline: Problem Solving Methods and Process Knowledge
         • The Process Metamodel
         • The PSM library for Process Authoring
         • Enabling SMEs to Formulate Process Knowledge
         • Representing and Reasoning with Process Knowledge
         • Evaluation
    6. A Metamodel for Process Knowledge
         • Processes are special concepts that
             • Relate to the sequence of operations and involved resources leading to the production of some outcome
             • Encapsulate preconditions, results, contents, actors, and causes
         • Examples: chemical precipitation, mitosis in Biology, etc.
         • Process metamodels provide the terminology necessary to express process entities and the relations between them
    7. The Process Metamodel
         • Actions
             • Inspired by Activities in EO and TOVE
         • Decisions (or forks)
         • Relations
         • Resources (roles)
             • Containers of domain concepts
             • Pointers to the types of domain concepts that can play a given role
             • Static and dynamic roles
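The four metamodel entities listed on this slide could be sketched as plain data structures. This is a minimal illustration in Python, not the FLogic that the Halo tools generate. All class and field names are assumptions, as is the reading of "static" versus "dynamic" roles. The Ionic Compound, Dissolve and Crystallize examples are taken from the speaker notes.

```python
from dataclasses import dataclass, field
from enum import Enum


class RoleKind(Enum):
    STATIC = "static"    # assumed reading: available before the process runs
    DYNAMIC = "dynamic"  # assumed reading: produced while the process runs


@dataclass(frozen=True)
class Resource:
    """A role: points to the domain concept types that may play it."""
    name: str
    concept_types: frozenset
    kind: RoleKind = RoleKind.STATIC

    def accepts(self, concept_type: str) -> bool:
        return concept_type in self.concept_types


@dataclass
class Action:
    name: str
    inputs: list = field(default_factory=list)   # Resources the action reads
    outputs: list = field(default_factory=list)  # Resources the action produces


@dataclass
class Relation:
    source: str  # name of the producing action
    target: str  # name of the consuming action


@dataclass
class Decision:
    """A fork: maps a condition label to the name of the next action."""
    name: str
    branches: dict = field(default_factory=dict)


# The domain concept "Ionic Compound" plays a Resource role in "Dissolve".
solute = Resource("Solute", frozenset({"Ionic Compound"}))
ions = Resource("Ions", frozenset({"Cation", "Anion"}), RoleKind.DYNAMIC)
dissolve = Action("Dissolve", inputs=[solute], outputs=[ions])
crystallize = Action("Crystallize", inputs=[ions])
flow = Relation(dissolve.name, crystallize.name)
```

Keeping a role as a pointer to concept types, rather than to concrete instances, is what lets the same diagram be bound to different domain entities.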
    8. Using PSMs for Process Knowledge
         • We have approached processes as special types of problems and PSMs as the way to represent and solve them
         • The terminology provided by the process metamodel is used by a PSM library, which supports both
             • Process formulation across the three domains
             • Reasoning about processes
         • This PSM library contains methods that
             • Define the necessary knowledge in each process step
             • Establish and control the sequence of actions required in each process
    9. Methodology
         • 755 AP questions for the three domains analyzed
         • 100 domain-specific processes
         • Four main process categories: Join, Split, Modify, Locate
         • Phases: Identification; Decomposition and abstraction
    10. The PSM library for Process Modelling
    11. A PSM example: decompose & combine
          name:          decompose & combine
          goal:          member(Recombination set, Element) and member(Constituents set, Piece) and part-of(Piece, Element) and part-of(Piece, Combination) and properties(Element, ep) and properties(Combination, cp) and not equal(ep, cp)
          actions:       decompose, combine
          input action:  decompose
          output action: combine
          input roles:   Recombination set, Decomposer, Combinator
          output roles:  Combination, Byproduct
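The goal of the decompose & combine method can be read as a check over sets and relations. The Python sketch below is one possible reading, not the project's FLogic encoding. The function name, the representation of `part-of` as a set of pairs, and the silver chloride precipitation data are all illustrative assumptions.

```python
def goal_holds(recombination_set, constituents_set, combination, part_of, properties):
    """Return True if some Element and Piece satisfy the PSM goal.

    part_of is a set of (piece, whole) pairs; properties maps a
    substance to a comparable description of its properties.
    """
    for element in recombination_set:
        for piece in constituents_set:
            if (
                (piece, element) in part_of          # part-of(Piece, Element)
                and (piece, combination) in part_of  # part-of(Piece, Combination)
                and properties[element] != properties[combination]  # not equal(ep, cp)
            ):
                return True
    return False


# Illustrative data: a precipitation in which two dissolved salts
# are decomposed into ions that recombine into an insoluble salt.
recombination_set = {"NaCl", "AgNO3"}
constituents_set = {"Na+", "Cl-", "Ag+", "NO3-"}
part_of = {
    ("Na+", "NaCl"), ("Cl-", "NaCl"),
    ("Ag+", "AgNO3"), ("NO3-", "AgNO3"),
    ("Ag+", "AgCl"), ("Cl-", "AgCl"),
}
properties = {"NaCl": "soluble", "AgNO3": "soluble", "AgCl": "insoluble"}

precipitates = goal_holds(recombination_set, constituents_set, "AgCl", part_of, properties)
unchanged = goal_holds(recombination_set, constituents_set, "NaCl", part_of, properties)
```

Here `precipitates` is true because AgCl shares pieces with the original salts but differs in its properties, while `unchanged` is false because a combination identical to one of the elements does not meet the `not equal(ep, cp)` condition.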
    12. The Process Editor: Enabling SMEs to Formulate Process Knowledge
          Screenshot labels:
          • Domain-level reasoning and control flow evaluation
          • Process metamodel
          • PSM library (e.g. decompose & recombine)
          • Domain process to which this process diagram is bound
          • Associated process explanation
    13. Representation and Reasoning Formalism
          • Process diagrams authored by SMEs are automatically translated into FLogic (OntoBroker) code, allowing
              • Single entry point for question answering throughout the whole system
              • Using Rule knowledge for reasoning within the scope of processes
              • Retrieval of meta-level information about processes, e.g. sub-processes and intermediate process results
          • High expressiveness plus compact and concise syntax
          • Synthesized code implements data and control flow within process occurrences
          • Supports reasoning with process knowledge, allowing inference of process outcomes, grounded in the particular domains of application
    14. The Frame Problem
          • On what portion of the knowledge base can a process action have effect?
          • How can the effects of actions on the knowledge base be propagated to successive actions?
    15. Solving the Frame Problem
          • Two states (or modules) per process action
              • Pre state: contains the scope of an action
              • Post state: contains the effects of an action that must be propagated
          • Hence, actions read from their pre state and write into their post state
          • Process rules synthesized at authoring time dynamically create these modules upon process execution
              • Setup rules: build the pre state of the input actions of the process
              • Precedence rules: describe what actions can be connected with each other by means of their outputs and inputs
              • Transition rules: describe the transition between pre and post states
          • Sample transition rule:
              RULE dissolve: FORALL aCation, anAnion, aSolute, water
                aCation:Cation@postState(dissolve) AND
                anAnion:Anion@postState(dissolve) AND
                water:Dissolvent@postState(dissolve)
                <- aSolute:Solute@preState(dissolve) AND
                   water:Dissolvent@preState(dissolve).
    16. Validating Process Knowledge: The Testing & Debugging Perspective
          Screenshot labels: Test set; Query; Test results
    17. Evaluation
          • Performed in the context of the overall Halo evaluation
          • Goal: To collect direct hands-on experience of SMEs on knowledge formulation and reasoning, in particular for process knowledge
          • Two main dimensions: usability and utility
          • Settings
              • 2 Chemistry SMEs, 2 Biology SMEs, and 2 Physics SMEs
              • Syllabus
                  • Chemistry: Stoichiometry, solutions and equilibrium (Brown & Lemay, pages 75-83, 113-133, and 613-653)
                  • Biology: Cell and DNA structure and processes (Campbell and Reece, pages 112-124, 217-223, 239-245, 293-301, 304-311, and 317-319)
                  • Physics: Kinematics and Dynamics (Serway and Faughn, chapters 2, 3, and 4)
    18. Evaluation Results
          • Physics SMEs did not use processes
          • Not so important for Chemistry SMEs
          • SME2 (Biology): “It makes the representation of biological models easier... And asking questions with T&D worked okay”
          • SME3 (Biology): “The modelling of processes is very useful. It must be possible to ask questions about the various states of a process”
          • SU scale
          • SMEs filled in a form with questions about the system, each associated with a quantitative value between 0 and 100
          • Average obtained: 64.5 (OK, but there is still room for improvement)
    19. Conclusions and Future Work
          • The combination of a process metamodel and a PSM library provides SMEs with reusable, domain-independent means to formulate and reason with processes
          • The Process Editor allows SMEs to author process knowledge without the intervention of KEs, reducing brittleness and costs with high usability and high utility
          • Future work includes
              • Collaborative authoring of process knowledge
              • Application to Business Process Modelling (BPM)
              • Formulation and reasoning with decision making processes, in the context of the NeOn project
    20. Thanks for your attention!
          iSOCO http://www.isoco.com
          Jose Manuel Gómez-Pérez [email_address] #T +34 91 334 9778 #M +34 609 077 103
          • iSOCO Valencia: +34 96 3467143, Oficina 107, C/ Prof. Beltrán Báguena 4, 46009 Valencia
          • iSOCO Barcelona: +34 93 5677200, Edifici Testa A, C/ Alcalde Barnils 64-68, St. Cugat del Vallès, 08190 Barcelona
          • iSOCO Madrid: +34 91 3349797, C/ Pedro de Valdivia, 10, 28006 Madrid
