Four ways to represent computer executable rules


Published on

July 27, 2008: "Four Ways to Represent Computer-Executable Rules". Presented at InterSymp 2008 conference sponsored by the International Institute for Advanced Studies
in Systems Research and Cybernetics (IIAS). Paper published in conference proceedings.


  1. 1. Cover Page
Four Ways to Represent Computer-Executable Rules
Author: Jeffrey G. Long (jefflong@aol.com)
Date: July 25, 2008
Forum: Talk presented at the InterSymp 2008 Conference, sponsored by the International Institute for Advanced Studies in Systems Research and Cybernetics (IIAS). Paper published in conference proceedings, available at http://iias.info/pdf_general/Booklisting.pdf

Contents
Pages 1-5: Preprint of Article
Pages 6-26: Slides (but no text) for presentation

License
This work is licensed under the Creative Commons Attribution-NonCommercial 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.

Uploaded June 24, 2011
  2. 2. Four Ways to Represent Computer-Executable Rules
Jeffrey G. Long
jefflong@aol.com

Abstract
Rules have long been used by society but have rarely been studied explicitly in their own right. They are increasingly recognized as interesting and useful abstractions. The recent trend towards business rules has brought the subject front-and-center in the business world, as have interests in work process re-engineering over the past twenty years. Rules for computerized applications currently are represented in three ways:
  • as software instructions
  • as production rules in the rulebase of an expert system
  • as pairs of XML tags.
Each of these has its strengths and weaknesses. This paper discusses these approaches and briefly describes a proposed fourth approach, namely representing most rules in a relational DBMS. I view this as an exercise in notational engineering, i.e. examining alternative representations to select one that is “best” in some engineering sense.

Key Words: Business Rules; Software; Expert Systems; XML; Relational Databases

General Features of Rules
Any manner of representing rules must have several fundamental features, including:
  • what kind of events can initiate a cascade of rule executions
  • the sequence in which rules are to be inspected, if sequence matters (including loops)
  • the various conditions under which each rule is to be inspected and/or fired
  • what happens if no rule, one rule, or multiple rules are found that match selection criteria
  • how to resolve conflicts if multiple actions are prescribed
  • when and how to stop or complete a rule cascade.
To be a rule management system, such a system must also have metadata such as:
  • who created or updated the rule, and when
  • why the rule was created/updated
  • by what device the rule was created or updated (manually, by import, by software, etc.)
  • whether the rule can safely be changed without consulting others
  3. 3.
  • what kind of further “research” ought to be done regarding a rule, if any (e.g. are there questions about the rule? Might it be obsolete?).

Software Rules
Software rules are implemented as lines of code in a computer language such as Java. Such rules are typically called “business logic” rather than “business rules,” and are specified in terms of one of four standard programming constructs:
  • an ordered sequence of instructions
  • loops, used to specify conditional re-iterations of rules
  • If-Then-Else statements that select among two or more options
  • Case statements that select among multiple options.
The result of executing a software rule is that either (a) internal or external data values are updated, or (b) program control goes to a portion of the program that is specified. From there, further rules are found and executed. Because different situations often have similar but slightly different rules, parameters are often specified whereby the code reads the parameter (typically stored as a data element) and branches to another section of code based on the value of the parameter. This allows software designers to anticipate predictable differences in the way different users might want the system to work. An example of a parameter is the definition of a fiscal year-end month, so that accounting systems can handle the fact that any month may be the year-end of a fiscal year for a particular user.

The ability to specify rules as software provides a very fine-grained ability to represent complex and contingent rules. The downside of this is that there are always many such rules, typically thousands or more, and as a result there are thousands to millions of lines of code in a typical software application, or even a single software object. This large code corpus is difficult to comprehend, and, since it must evolve with new rules, ensures significant life-cycle maintenance costs.
As with any complex system, changing one part of the system may have unanticipated consequences for other parts. And since only programmers can update the code, there is always the risk of miscommunication between the subject experts and the programmers.

Production Rules
An expert system has three parts: an inference engine that knows only the rules of inference, a rulebase that specifies the rules (called productions), and an initial set of facts (the environment). Rules are triggered by facts, and all rules that match the current environment are selected. Those rules are added to an agenda, any conflicts are resolved (often via rule prioritization), and the remaining rules are fired. The result of firing a rule is to make a change to the facts (asserting new facts or withdrawing existing facts), which may then cause other rules to be fired. This process continues until a specified end-point is reached, or until there are no more rules on the agenda.
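The match, resolve, and fire cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm of any particular expert-system shell: the names `Production` and `run` are hypothetical, facts are plain tuples, rules only assert facts (no withdrawal), and exactly one rule fires per cycle, chosen by priority. The second rule, `notify-sales`, is invented purely to show one firing enabling another.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set, Tuple

Fact = Tuple[str, str]  # e.g. ("customer", "premium")


@dataclass(frozen=True)
class Production:
    name: str
    conditions: FrozenSet[Fact]  # every condition must be in the fact base
    asserts: FrozenSet[Fact]     # facts added when the rule fires
    priority: int = 0            # higher priority wins a conflict


def run(rules: List[Production], initial_facts: Iterable[Fact],
        max_cycles: int = 100) -> Tuple[Set[Fact], List[str]]:
    """Repeat match / select / execute until the agenda is empty."""
    facts: Set[Fact] = set(initial_facts)
    fired: List[str] = []
    for _ in range(max_cycles):
        # Match: rules whose conditions hold and that would add a new fact.
        agenda = [r for r in rules
                  if r.conditions <= facts and not r.asserts <= facts]
        if not agenda:
            break  # nothing left on the agenda
        # Select: resolve the conflict set by priority.
        chosen = max(agenda, key=lambda r: r.priority)
        # Execute: firing changes the facts, which may enable other rules.
        facts |= chosen.asserts
        fired.append(chosen.name)
    return facts, fired


rules = [
    Production("good-customer-discount",
               frozenset({("product", "regular"), ("customer", "premium")}),
               frozenset({("price-discount", "5%")})),
    # Hypothetical follow-on rule, triggered by the fact asserted above.
    Production("notify-sales",
               frozenset({("price-discount", "5%")}),
               frozenset({("notice", "sent")})),
]
facts, fired = run(rules, {("product", "regular"), ("customer", "premium")})
print(fired)
```

Note that the order of the rules in the list does not determine the order of firing; the state of the facts does.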
  4. 4. Production rules are formulated in an If-Then (sometimes If-Then-Else) format. There can be an unlimited number of If-conditions, used to specify the environmental conditions under which the Then-action(s) will be taken, and an unlimited number of Then-actions. Rules are typically stored in a text file which is loaded into memory at runtime, as are the initial facts. The way rules are defined (formatted) has become important for rule interchange among different systems, and in November 2007 the Object Management Group (OMG) released a Beta version of its Production Rule Representation specification.

This approach has shed light on the kind of thinking that an expert seems to do, namely to look for salient features of a given environment, respond to those features with changes to the environment, and then respond to the changed environment. Its downside is that when the rulebase exceeds a few thousand rules the system may behave in an unexpected manner, for the rule interactions are hard to anticipate, and the order of rule execution is important. Another difficulty is that there are many (possibly thousands of) free-standing, independent rules to manage, even when the rules are grouped into rulesets. Yet future expert systems will need to manage not just thousands but hundreds of thousands, even millions, of rules.

XML Rules
Much work has been done in recent years towards the design and standardization of XML-based Rule Markup Languages. These are intended to make rules more easily maintainable by non-programmers; to serve the semantic web; and to define rules in a manner not tied to any particular vendor’s technology. A primary driver has been the increasing need to communicate and cooperate with numerous systems not only within an organization but now across organizations (e.g. to customers, vendors, regulatory agencies, etc.).
This has led to an interest in externalizing certain rules outside of software so they may be more readily examined and changed.

The Extensible Markup Language (XML) format has been widely adopted as a general framework for the specification of rules (e.g. RuleML, R2ML). XML tags are used to mark the beginning and the end of the operators and relations to check for a particular rule; these may be nested and combined as necessary. Rules so marked may then be searched for and read by multiple applications. There is a W3C Working Group dedicated to producing a Rule Interchange Format (RIF), and the OMG is working on a variety of important areas, and recently released version 1.0 of its Semantics of Business Vocabulary and Business Rules.

One difficulty of this approach is that those who maintain the rules are still left with an enormous number of free-standing, independent rules to manage. Integrity constraints are being developed, but there is still no referential integrity, such that an update can cascade to all places where an entity is referenced. Lastly, there is little query or reporting capability by which one can scan or update rules quickly and easily. These problems are similar to the problems encountered with the software representation of rules. An example of a simple RuleML rule implementation to give a premium customer a 5% discount on any regular product is shown in Figure 1 below.
  5. 5.
    <imp>
      <_head>
        <atom>
          <_opr><rel>discount</rel></_opr>
          <var>customer</var>
          <var>product</var>
          <ind>5.0 percent</ind>
        </atom>
      </_head>
      <_body>
        <and>
          <atom>
            <_opr><rel>premium</rel></_opr>
            <var>customer</var>
          </atom>
          <atom>
            <_opr><rel>regular</rel></_opr>
            <var>product</var>
          </atom>
        </and>
      </_body>
    </imp>

Figure 1: RuleML for a Price Discount Decision

Ultra-Structure Rules
Since 1985 I’ve developed and used a fourth approach, called “Ultra-Structure”. This approach removes all business rules that might ever change from the software, leaving only the control logic for a “competency rule engine” as software. The rest of the rules are represented via relational tables; there are no data or facts in the system, only rules. Rules can be converted from their natural language form (e.g. a policy manual) into one or more rules having a canonical form consisting of:
  • one or more “If” statements, defining conditions under which the rule should be inspected
  • one or more “Then-Consider” statements, defining additional considerations (before deciding what to do next) and/or actions
  • one or more metarule data fields specifying who set up the rule, why, whether it can safely be changed without consulting others, etc.
We can then categorize those rules into a small number of formats called “ruleforms” that are defined by their form and meaning, such that any logically possible rule pertaining to that application area (e.g. order processing) can be expressed in some table in the system. This has the profound effect of reducing the myriad known (and future unknown) rules to a manageably small number of tables, typically fewer than 100 for an enterprise system.

Lastly, we can implement each ruleform as a table. All rules having the same number of If-statements and similar meanings are grouped together into one table, with the If-statements
  6. 6. (called factors) forming columns that constitute the primary key of the table (thereby guaranteeing the uniqueness of each rule). Other columns in the table (called considerations) represent the Then-Consider statements and the metadata about the rule. Thus, most business rules are represented not as software, and not as data in XML tags, but as records (relations) in a modern RDBMS. Questioning decades of focus on software, this approach sees software as more of a problem than a solution, and focuses on rules represented as relational data.

By specifying business rules as records in an RDBMS, the only software that remains is control logic that knows nothing about the world except what tables to look at, in what order, and what to do based on rules selected for execution. Key benefits of this approach are that:
  • the amount of software required is reduced by a factor of 10 to 100
  • since this control logic is unlikely to change over time, the software and data structures stay remarkably stable even as the rules continue to evolve
  • rules can evolve by simply changing data, without any software changes, so many kinds of changes can be implemented immediately
  • subject experts and business managers can explain new rules to business analysts (not only programmers), who can then directly update the rules through the RDBMS.
The key benefits of using a relational database for storing such rules are that the RDBMS:
  • provides access security and logging of changes
  • provides utilities for querying and reporting on large numbers (millions) of rules
  • guarantees referential integrity
  • can easily handle millions of rules as necessary.
This approach is not presented as a perfect solution to the software bottleneck.
Still to be addressed are (a) the need to determine when certain conditions that might arise have not been anticipated by any rule in the system, (b) the difficulty conventional programmers have with looking in two places (the “data” as well as the software) to understand the logic of a situation, and (c) the semantics of data such that each data element (such as “order date”) really means the same thing to all parties. The OMG is working to address this last issue with its new standard.

We recently used this approach to create and install an enterprise system for a US$175M wholesale distributor.

References
Long, J., and Denning, D. (1995); Ultra-Structure: A design theory for complex systems and processes; Communications of the ACM Vol. 38, No. 1 (pp. 105-120)
  7. 7. Four Ways to Represent Computer-Executable Rules
Jeffrey G. Long
jefflong@aol.com
IIAS Baden-Baden Conference, July 2008
  8. 8. Minimum Requirements of Rule Management
  • The sequence in which rules are to be inspected, if sequence matters (including loops)
  • The various conditions under which each rule is to be inspected and/or fired
  • What happens if no rule, one rule, or multiple rules are found that match selection criteria
  • How to resolve conflicts if multiple actions are prescribed
  • When and how to stop/end a rule cascade
  • Exceptions to rules are rules also.
  9. 9. Conventional Ways to Represent Rules
  • Software (e.g. Java, C#)
  • Production Rules (e.g. CLIPS, Jess)
  • XML (e.g. RuleML, JessML)
  • Natural languages
  • Mathematical functions
  • Chemical formulae
  • Music notation
  10. 10. Software Rules
  • If (premium customer) and (regular product)
      – Then (discount is 5%)
      – Else (discount is 0%)
  • Select Case (customer category)
      – Case “Premium”
          • Select Case (product category)
              – Case “Regular”
                  • discount = 5%
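The two forms on this slide can be written out as runnable code. The sketch below uses Python rather than the Java or C# the slides mention; the function names are hypothetical, and the 0% default in the case form is an assumption, since the slide does not show one.

```python
def discount_if(customer_category: str, product_category: str) -> float:
    """If-Then-Else form: one compound condition, two outcomes."""
    if customer_category == "Premium" and product_category == "Regular":
        return 0.05
    return 0.0


def discount_case(customer_category: str, product_category: str) -> float:
    """Nested case form: branch on customer first, then on product."""
    if customer_category == "Premium":
        if product_category == "Regular":
            return 0.05
    # Assumption: the slide gives no default for the case form; 0% is used here.
    return 0.0


print(discount_if("Premium", "Regular"), discount_case("Premium", "Regular"))
```

Both functions encode the same rule, which illustrates how one rule can take several equally valid shapes in software.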
  11. 11. Features of Software as a Notational System
  • Many valid ways to express a given rule – both a strength and a weakness, depending on programmer
  • Seemingly easy to change – but many times changes create new and unexpected problems
  • The starting point, stopping point, and sequence of operations are defined wholly and explicitly by the programmer
  • Control is based on program structure; rules (lines of code) are data-insensitive and ordered
  • One missing bracket changes rule, can make it and entire system inoperable (unexecutable)
  12. 12. XML Rules
    <imp>
      <_head>
        <atom>
          <_opr><rel>discount</rel></_opr>
          <var>customer</var>
          <var>product</var>
          <ind>5.0 percent</ind>
        </atom>
      </_head>
      <_body>
        <and>
          <atom>
            <_opr><rel>premium</rel></_opr>
            <var>customer</var>
          </atom>
          <atom>
            <_opr><rel>regular</rel></_opr>
            <var>product</var>
          </atom>
        </and>
      </_body>
    </imp>
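To show how an application might read a rule marked up this way, here is a small Python sketch that parses the discount rule with the standard-library `xml.etree.ElementTree` module. The helper names `parse_atom` and `parse_rule` are hypothetical, and the code handles only the element layout shown in this example, not RuleML in general.

```python
import xml.etree.ElementTree as ET

RULE_XML = """
<imp>
  <_head>
    <atom>
      <_opr><rel>discount</rel></_opr>
      <var>customer</var>
      <var>product</var>
      <ind>5.0 percent</ind>
    </atom>
  </_head>
  <_body>
    <and>
      <atom>
        <_opr><rel>premium</rel></_opr>
        <var>customer</var>
      </atom>
      <atom>
        <_opr><rel>regular</rel></_opr>
        <var>product</var>
      </atom>
    </and>
  </_body>
</imp>
"""


def parse_atom(atom):
    """Return (relation name, [(argument kind, argument text), ...])."""
    relation = atom.find("_opr/rel").text
    args = [(child.tag, child.text) for child in atom
            if child.tag in ("var", "ind")]
    return relation, args


def parse_rule(xml_text):
    """Split an <imp> rule into its conclusion (head) and conditions (body)."""
    root = ET.fromstring(xml_text)
    head = parse_atom(root.find("_head/atom"))
    body = [parse_atom(a) for a in root.findall("_body/and/atom")]
    return head, body


head, body = parse_rule(RULE_XML)
print("THEN", head)
for condition in body:
    print("IF  ", condition)
```

The parser recovers the rule's structure, but deciding what `discount` or `premium` means is still left to each application that reads the file.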
  13. 13. XML Rule Markup Features
  • Vendor-independent standard. Other rule standardization efforts include RIF, PRR, CL, SBVR; open source rules communities include jBoss Rules, Jess, Prova, OO jDrew, Mandarax, XSB, XQuery
  • Designed for use on Semantic Web – distributed, (partially) open, heterogeneous environments
  • One missing bracket changes rule, can make it unexecutable
  14. 14. Production Rules
    (defrule MAIN::good-customer-discount
      (product is regular)
      (customer is premium)
      =>
      (assert (price-discount is 5%)))
  15. 15. Production Rule Features
  • The knowledge (rules) and the data (facts and instances) are separated, and the inference engine is used to apply the knowledge to the data
  • Rules are data-sensitive and unordered; control is based on data state
  • There are three phases: rule-matching, rule-selection, and rule-execution
  • There are limited choices during rule selection, depending on the inference engine used to resolve a conflict set
  16. 16. Real-World Rules are More Complex
  • Must be inspected from most specific circumstances (exceptions) to most general (whole classes)
  • Have multiple circumstances (3-10 “factors”)
  • Each factor has many possible values (5+)
  • Circumstances trigger further inspection of complex “considerations” (e.g. QOH)
  • After being selected, additional rules may need to determine final outcome (e.g. lowest price)
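One way to picture "most specific to most general" inspection is a lookup that tries the exact combination of factor values first and then falls back to broader classes. The sketch below is only an illustration of that idea, not the mechanism used in Ultra-Structure: the wildcard marker `ANY`, the table `RULES`, the function `lookup`, and the 2% and 0% rates are all hypothetical.

```python
from typing import Dict, Optional, Tuple

ANY = "*"  # hypothetical wildcard meaning "any value of this factor"

# Keys are (customer type, product type); values are discount rates.
RULES: Dict[Tuple[str, str], float] = {
    ("Premium", "Regular"): 0.05,  # exception: most specific
    ("Premium", ANY): 0.02,        # hypothetical class-level rule
    (ANY, ANY): 0.0,               # hypothetical catch-all
}


def lookup(customer: str, product: str) -> Optional[float]:
    """Try the exact pair first, then progressively more general rules."""
    candidates = [
        (customer, product),
        (customer, ANY),
        (ANY, product),
        (ANY, ANY),
    ]
    for key in candidates:
        if key in RULES:
            return RULES[key]
    return None  # no rule anticipated this situation


print(lookup("Premium", "Regular"))
print(lookup("Premium", "Clearance"))
print(lookup("Standard", "Regular"))
```

With two factors there are four candidate keys; with the 3-10 factors the slide mentions, the number of fallback combinations grows quickly, which is part of why such rules are awkward to hand-code.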
  17. 17. But They Don’t Easily Handle Many Rules Having Multiple Factors and Multiple Values
[Flowchart: starting from Order Entry, the decision nodes “Product Type = Regular?” and “Customer Type = Premium?” (each with Yes/No branches) lead to the outcomes Price = Price * 1.00, Price = Price * 0.90, or Price = Price * 0.95.]
  18. 18. Additional Management Requirements
  • Who created or updated the rule, and when was last update
  • Why the rule was created/updated
  • By what device the rule was created or updated (manually, by import, by software, etc.)
  • Whether the rule can safely be changed by a person without consulting others
  • What kind of further “research” ought to be done regarding a rule, if any, e.g. are there questions about the rule? Might it be obsolete?
  19. 19. Merge Tools & Techniques of:
  • Information Management – databases, industrial-strength platforms
  • Knowledge Management – repository for knowledge of organization, both human-oriented and machine-oriented
  • Knowledge Engineering – simulation of expert decision-making with continuous decision process improvement
  20. 20. Ultra-Structure Rules
  21. 21. Ultra-Structure Provides Rules with Place-Value
  • Existing Options
      – freedom of expression means complex syntax
      – semantics is assigned largely by syntax
      – result is great freedom but low manageability
  • Ultra-Structure
      – expression of rules is constrained by ruleforms
      – semantics is assigned positionally
      – result is adequate freedom plus high manageability
  22. 22. Ruleforms Define Place-Value Rule Semantics
  23. 23. Benefits
  • Rule-recognition not triggered by working memory state but by events; different events involve different rules
  • Able to define and manage more complex rules
      – multiple factors and multiple values per factor address need for high number of possible permutations
      – multiple considerations applied during rule-recognition
  • RDBMS permits better management of millions of rules
      – using standard RDBMS tools, report-writers, etc.
      – can be read and managed by subject experts
  • Can exchange tables of rules as data
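The idea of a ruleform as a relational table can be sketched with Python's built-in `sqlite3` module. This is an illustrative assumption about how one small ruleform might look, not the schema of any actual Ultra-Structure system: the table name `discount_ruleform`, its columns, the function `discount_for`, and the 7.5 value used in the update are all hypothetical. The factors form the primary key, and the remaining columns hold a consideration and rule metadata.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE discount_ruleform (
        customer_category TEXT NOT NULL,  -- factor
        product_category  TEXT NOT NULL,  -- factor
        discount_percent  REAL NOT NULL,  -- consideration
        created_by        TEXT NOT NULL,  -- metadata: who set up the rule
        reason            TEXT,           -- metadata: why
        PRIMARY KEY (customer_category, product_category)
    )
""")
conn.execute(
    "INSERT INTO discount_ruleform VALUES (?, ?, ?, ?, ?)",
    ("Premium", "Regular", 5.0, "analyst", "pricing policy"),
)


def discount_for(db, customer, product):
    """Control logic: knows which table to consult, not what the rules say."""
    row = db.execute(
        "SELECT discount_percent FROM discount_ruleform "
        "WHERE customer_category = ? AND product_category = ?",
        (customer, product),
    ).fetchone()
    return row[0] if row else 0.0  # assumption: no matching rule means 0%


before = discount_for(conn, "Premium", "Regular")

# The rule changes by updating data; discount_for itself is untouched.
conn.execute(
    "UPDATE discount_ruleform SET discount_percent = 7.5 "
    "WHERE customer_category = 'Premium' AND product_category = 'Regular'"
)
after = discount_for(conn, "Premium", "Regular")
print(before, after)
```

Because the factors are the primary key, inserting a second rule for the same customer and product categories raises `sqlite3.IntegrityError`, which is how the table enforces the uniqueness of each rule.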
  24. 24. Conclusion
  • The problems with rule management are primarily caused by how we represent rules
  • This is a classic notation/representation problem
  • Ultra-Structure uses a new abstraction (i.e. ruleforms) to provide a time-tested way of assigning meaning by column
  25. 25. References
  • J. Long, D. Denning (1995), “Ultra-Structure: A design theory for complex systems and processes”; Communications of the ACM Vol. 38, No. 1 (pp. 105-120)
  • H. Boley, S. Tabet, G. Wagner, “Design Rationale for RuleML: A Markup Language for Semantic Web Rules” at citeseer.ist.psu.edu/boley01design.html
  • CLIPS Reference Manual (3/28/2008)
  26. 26. Other Articles by JL
  • Long, J., "Automated Identification of Sensitive Information in Documents Using Ultra-Structure". In Proceedings of the 20th Annual ASEM Conference, American Society for Engineering Management (October 1999)
  • Long, J., "Editors Note." In Long, J. (guest editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (1999)
  • Long, J., "A new notation for representing business and other rules." In Long, J. (guest editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (pp. 215-227) (1999)
  • Long, J., "How could the notation be the limitation?" In Long, J. (guest editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (1999)
  27. 27. Writings by Others
  • Shostko, A., “Design of an automatic course-scheduling system using Ultra-Structure.” In Long, J. (guest editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (1999)
  • Oh, Y., and Scotti, R., “Analysis and Design of a Database using Ultra-Structure Theory (UST) – Conversion of a Traditional Software System to One Based on UST,” Proceeding of the 20th Annual Conference, American Society for Engineering Management (1999)
  • Parmelee, M., “Design For Change: Ontology-Driven Knowledgebase Applications For Dynamic Biological Domains.” Master’s Paper for the M.S. in I.S. degree, University of North Carolina, Chapel Hill (November 2002)
  • Maier, C., CoRE576: An Exploration of the Ultra-Structure Notational System for Systems Biology Research. Master’s Paper for the M.S. in I.S. degree, University of North Carolina, Chapel Hill (April 2006)
