Managing and Benefiting from Multi-Million Rule Systems


Published on

October 31, 2007: “Managing and Benefiting from Multi-Million Rule Systems”. Presented at the 2007 Conference of the New England Complex Systems Institute.


  1. Cover Page
     Managing and Benefitting from Multi-Million Rule Systems
     Author: Jeffrey G. Long (
     Date: October 31, 2007
     Forum: Poster session presented at the 2007 Conference of the New England Complex Systems Institute.
     Contents
       Page 1: Abstract
       Pages 2-26: Slides (but no text) for presentation
     License
       This work is licensed under the Creative Commons Attribution-NonCommercial 3.0 Unported License. To view a copy of this license, visit‐nc/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
     Uploaded June 24, 2011
  2. Managing and Benefitting From Multi-Million Rule Systems
     Abstract
     Jeffrey G. Long
     October 31, 2007
     This talk will discuss the idea that better representation and understanding of complex systems will require new abstractions and new uses of existing abstractions. One approach I have been exploring is taking system rules out of software and representing them as data. I will discuss several abstractions I have found useful in representing various kinds of complex business, linguistic, and biological systems as data. These include (1) the notion of tens of thousands of complex, contingent "Competency Rules" that define or describe the behavior of a system; (2) the implementation of those rules partly in software (like an inference engine) and primarily in data (like an expert system); (3) the notion of contingent rules having multiple "factors" or primary drivers and zero or more "considerations" that the system must review before deciding what to do next; and (4) the notion of the form of a rule, as contrasted with its content (like algebra). Reducing complexity cannot mean ignoring details, but must include seeing the larger picture presented by ruleforms. Several specific examples will be given from current and past projects.
  3. Managing & Benefiting from Multi-Million Rule Systems
     International Conference on Complex Systems ICCS2007 – Boston, MA
     Jeffrey G. Long
     October 31, 2007
  4. Studying a Variety of Notational Systems
     Notational systems studied:
       • speech & writing
       • cartography
       • arithmetic & algebra
       • geometry
       • chemical notation
       • dance/movement notation
       • music notation
       • logic notation
       • money
     Questions:
       • What makes them powerful?
       • What is their nature & structure?
       • Can their design be facilitated?
       • How and why did they evolve?
       • Who created them?
       • What accelerated or impeded their general usage and acceptance?
       • What effects did they have on society? On cognition?
       • How do we know if we’re at the limits of usefulness of a notational system?
  5. Key Points
       • Modern society is critically dependent upon a number of different kinds of rule systems. Yet we have (increasingly) enormous problems creating and managing large rule systems.
       • This arises from how we currently represent rules and data. We cannot solve these problems by means of faster computers or other extensions of current representations. Reducing complexity cannot mean ignoring details, but must include seeing an even larger picture.
       • We can look to the past for guidance. Many times in the past, society has overcome “complexity barriers” by means of new notational systems. These events are what I call “notational revolutions”, and they affect how we see the world, how we think about the world, and what we can communicate with others and how readily.
       • My experience is that representing rules and data as an integrated whole, and using a place-value representation, does make large rule systems much more comprehensible, therefore more manageable, and therefore more able to safely grow and change as needed (i.e. evolve). My name for this approach is “Ultra-Structure”.
       • I hope other proposed Rule Calculi will consider the issues and approaches I’m suggesting here.
  6. Rule Systems are Ubiquitous
     Subject: Business Rules, Scientific Rules, Legal Rules, etc.
     # Rules:
       Small       < 1,000
       Medium      < 100,000
       Large       < 10,000,000
       Very Large  > 10,000,000
  7. Many Types of Rules
       • Ontological Rules (what exists, how entities relate)
       • Operating Rules (how a system nominally works)
       • Strategy Rules (how to optimize a process; win; be artful)
       • Ethical Rules (additional guidelines for a clear conscience)
       • Evaluation Rules (how to tell if making progress/“winning”, or detecting that rules are not working well)
       • Learning Rules (rules for changing rules)
       • Historical Rules (past events; custom)
     Rules are multi-notational: largely qualitative but may include quantities or other kinds of abstractions (e.g. musical notes).
     Rules are probabilistic but can be treated as deterministic.
  8. Characteristics of Notational Revolutions
       • Some involve looking at the world from a different viewpoint, e.g. a birds-eye rather than a ground-truth viewpoint, or indirect rather than direct reference to the world.
       • Some involve moving from a relative-value representation to a place-value representation.
       • Some involve the introduction of new abstractions such as zero, musical notes, or map coordinates.
       • Physics has benefited from and might be said to have even co-evolved with improved notational systems such as calculus, Feynman Diagrams, Riemannian geometry, tensors.
       • They all greatly expand the sphere of what can be readily said; the notation is the limitation.
       • They are examples of “notational engineering” occurring without the benefit of systematic guidelines from the experience of others, or of a general theory of notation derived from a longitudinal and comparative study of humanity’s notational systems.
  9. 1. Separation of Algorithms from Data
     Traditional separation contributes to and is caused by an object-centered view of the world.
     In a process-centered worldview, everything is a process and every process is only describable in terms of rules.
  10. Traditional Management Info System
      [Diagram labels: Events, Software/Algorithms, Work, Data]
  11. Conventional Data are Rule Fragments
      Rule fragments:
        Bin  Part  QOH  QOO  etc.
        A    X     5    4
        B    B     15   7
      Satisfies TNF requirements, but is still not flexible enough.
  12. Data-Inclusive Rules Include Conventional Data as Part of Larger Rules
      Universals provide context (the column headings); each row is a simple sourcing rule:
        Loc’n  Part  Qty Type  Qty  etc.
        A      X     QOH       5
        A      X     QOO       4
        B      B     QOH       15
        B      B     QOO       7
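The move from slide 11 to slide 12 can be sketched in a few lines of Python. This is only an illustration under assumptions: the field names `bin` and `part` and the helper `to_rules` are hypothetical, and the only data used are the two rows shown on the slides. Each quantity column of a conventional record becomes a value in its own row, so the quantity type is data rather than schema.

```python
# Conventional layout (slide 11): one column per quantity type.
conventional = [
    {"bin": "A", "part": "X", "QOH": 5, "QOO": 4},
    {"bin": "B", "part": "B", "QOH": 15, "QOO": 7},
]


def to_rules(records, key_fields=("bin", "part")):
    """Unpivot each quantity column into a (location, part, qty_type, qty) row."""
    rules = []
    for record in records:
        key = tuple(record[field] for field in key_fields)
        for field, value in record.items():
            if field not in key_fields:
                # The column name itself becomes a value in the rule.
                rules.append(key + (field, value))
    return rules


rules = to_rules(conventional)
for rule in rules:
    print(rule)
```

With this layout, a further quantity type would arrive as additional rows instead of a new column, which is one reading of why the slide calls the conventional table "not flexible enough".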
  13. 2. Examples of Relative to Place Value
        • Roman to Hindu-Arabic Numerals: 500 BCE, 200 CE, 875 CE, 1200 CE, 1600 CE
        • Neumatic to Staff Notation: 500 CE, 800 CE, 1025 CE, 1300 CE, 1600 CE
        • Peripli to Coordinate-System maps: 500 BCE, 100 CE, 1600 CE
  14. Place-Value by Quantity
        Roman Numerals   Hindu-Arabic Numerals
                         10^3  10^2  10^1  10^0
        IV                                 4
        CXII                   1     1     2
        MCMIX            1     9     0     9
      Without a placeholder, you can’t reliably have columns.
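As a small worked example of the table above, the following Python sketch converts the three Roman numerals from the slide into their place-value columns. The function names are illustrative, not from the talk, and the converter assumes well-formed numerals. Note how the zero in MCMIX is what keeps the 10^1 column in place.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Convert a well-formed Roman numeral to an integer."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        # A smaller symbol before a larger one is subtracted (IV, CM, IX).
        if i + 1 < len(numeral) and value < ROMAN_VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total


def place_value_columns(n, width=4):
    """Return the digits of n for the columns 10^3, 10^2, 10^1, 10^0."""
    return [(n // 10 ** power) % 10 for power in range(width - 1, -1, -1)]


for numeral in ("IV", "CXII", "MCMIX"):
    print(numeral, place_value_columns(roman_to_int(numeral)))
```

In the Roman form the meaning of a symbol depends on its neighbours (relative value); in the column form each position carries a fixed meaning, which is the property the talk wants for rules.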
  15. Place Value by Pitch
      Neume direction indicated voice interval.
      [Figure residue: pitch letters A through G]
  16. Place Value by Coordinates
  17. RuleML Adds More Complexity
      “A car is available for rental if it is physically present, is not assigned to any rental order, is not scheduled for service, and does not require service.”

      <imp>
        <_head>
          <atom>
            <_opr><rel>isAvailable</rel></_opr>
            <var>Car</var>
          </atom>
        </_head>
        <_body>
          <and>
            <atom>
              <_opr><rel>isPresent</rel></_opr>
              <var>Car</var>
            </atom>
            <not>
              <atom>
                <_opr><rel>isAssignedToRentalOrder</rel></_opr>
                <var>Car</var>
              </atom>
            </not>
            <not>
              <atom>
                <_opr><rel>isScheduledForService</rel></_opr>
                <var>Car</var>
              </atom>
            </not>
            <not>
              <atom>
                <_opr><rel>requiresService</rel></_opr>
                <var>Car</var>
              </atom>
            </not>
          </and>
        </_body>
      </imp>

      Source: H. Boley, S. Tabet, G. Wagner, “Design Rationale for RuleML: A Markup Language for Semantic Web Rules”
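For contrast, the same car-availability rule can be held as a single record and evaluated by a generic function. This is a hedged sketch, not RuleML tooling and not the author's notation: the record layout (`conclusion`, `require`, `forbid`) and the function `conclude` are assumptions made for illustration, and missing facts are treated as false.

```python
# The quoted rule as one data record: one required condition, three forbidden ones.
AVAILABILITY_RULE = {
    "conclusion": "isAvailable",
    "require": ("isPresent",),
    "forbid": ("isAssignedToRentalOrder", "isScheduledForService", "requiresService"),
}


def conclude(rule, facts):
    """Return True when every required fact holds and no forbidden fact holds.

    Assumption: a fact that is absent from `facts` counts as false.
    """
    required_ok = all(facts.get(name, False) for name in rule["require"])
    forbidden_ok = not any(facts.get(name, False) for name in rule["forbid"])
    return required_ok and forbidden_ok


car_on_lot = {"isPresent": True}
car_in_shop = {"isPresent": True, "requiresService": True}

print(conclude(AVAILABILITY_RULE, car_on_lot))
print(conclude(AVAILABILITY_RULE, car_in_shop))
```

The rule content lives entirely in the record; the function knows nothing about cars.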
  18. Place-Value of Rules
        • Conventional rules are semantically informal and multiplex (many parts)
        • Exceptions to rules are themselves rules
        • Any conventional rule can be converted into >1 “simple” rules
        • Each “simple” rule has the form “If a and b and c… Then Consider x and y and z”, where there are >= 1 Ifs and >= 0 Then Considers
        • Rules converted into simple form are grouped based on their format (# Ifs, # Then-Considers) and meaning (= function), e.g. agencies versus locations versus products
        • Result is a small (< 100) set of tables, each having different structure and/or function (syntax and semantics)
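The grouping step described above can be sketched as follows. The rules, the `kind` labels, and the function names are hypothetical examples chosen for illustration; only the grouping criterion (number of Ifs, number of Then-Considers, and function) comes from the slide.

```python
from collections import defaultdict

# Hypothetical simple rules: each has >= 1 Ifs and >= 0 Then-Considers.
simple_rules = [
    {"kind": "product", "ifs": ("part_x",), "considers": ()},
    {"kind": "product", "ifs": ("part_y",), "considers": ()},
    {"kind": "location", "ifs": ("bin_a",), "considers": ()},
    {"kind": "sourcing", "ifs": ("bin_a", "part_x"), "considers": ("qty_on_hand",)},
    {"kind": "sourcing", "ifs": ("bin_b", "part_y"), "considers": ("qty_on_hand",)},
]


def ruleform_of(rule):
    """A ruleform is identified here by function plus format (# Ifs, # Then-Considers)."""
    return (rule["kind"], len(rule["ifs"]), len(rule["considers"]))


def group_by_ruleform(rules):
    """Place every simple rule into the table for its ruleform."""
    tables = defaultdict(list)
    for rule in rules:
        if len(rule["ifs"]) < 1:
            raise ValueError("a simple rule needs at least one If")
        tables[ruleform_of(rule)].append(rule)
    return dict(tables)


tables = group_by_ruleform(simple_rules)
for form, members in sorted(tables.items()):
    print(form, len(members))
```

Five rules fall into three tables here; the claim on the slide is that the number of tables stays small even as the number of rules grows into the millions.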
  19. Each simple rule is represented as one record in one table (out of n tables)
        • Each column of each table has a general meaning that is used to assign context to that part of each rule in that table
        • Initial rule selection for inspection (the If component) constitutes the primary key column(s)
        • Subsequent rule evaluation and possible execution (the Then Consider component) constitutes most other columns
        • There are usually several columns of rule metadata at the end
        • Software implements a Competency Rule Engine that (ideally) doesn’t know anything about the world, only about how to read the rules for a broad application area (e.g. business, games, law)
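A minimal sketch of this layout, assuming a single in-memory table: the class `RuleTable`, the column names, and the pricing rows are all hypothetical and stand in for whatever a real ruleform would hold. The If columns form the key used for rule selection, the remaining columns are the considerations returned to the caller, and the code contains no domain knowledge of its own.

```python
class RuleTable:
    """One ruleform: If columns act as the primary key, the rest are considerations."""

    def __init__(self, if_columns, consider_columns):
        self.if_columns = tuple(if_columns)
        self.consider_columns = tuple(consider_columns)
        self.records = {}

    def add(self, **row):
        """Store one rule as one record; re-adding the same key replaces the rule."""
        key = tuple(row[column] for column in self.if_columns)
        self.records[key] = {column: row[column] for column in self.consider_columns}

    def consider(self, **stimulus):
        """Select the rule whose If columns match the stimulus, or None."""
        key = tuple(stimulus[column] for column in self.if_columns)
        return self.records.get(key)


# Hypothetical pricing ruleform with two factors and two considerations.
pricing = RuleTable(
    if_columns=("agency", "product"),
    consider_columns=("discount_pct", "needs_approval"),
)
pricing.add(agency="wholesale", product="widget", discount_pct=20, needs_approval=False)
pricing.add(agency="wholesale", product="gadget", discount_pct=35, needs_approval=True)

print(pricing.consider(agency="wholesale", product="gadget"))

# Changing behavior means changing a record, not the code.
pricing.add(agency="wholesale", product="gadget", discount_pct=30, needs_approval=False)
print(pricing.consider(agency="wholesale", product="gadget"))
```

A real engine would also carry the metadata columns mentioned on the slide; they are left out here to keep the sketch short.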
  20. Rule systems have several kinds of Existential Ruleforms as a foundation
        • agencies
        • products/services
        • locations
        • time periods
      Existential rules are referenced by foreign key constraints to form Compound Ruleforms
        • network ruleforms define relations among same kind of entities
        • attribute ruleforms define characteristics of entities
        • authorization ruleforms define relations among different kinds of entities
        • protocol ruleforms define processes
      Most columns are foreign keys to a particular existential table (this can cause problems with some RDBMS)
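The relationship between existential and compound ruleforms can be sketched with Python's built-in `sqlite3` module. The table and column names below are assumptions for illustration, not the schema of any CoRE system: two existential tables hold the entities, and an authorization ruleform relates them through foreign keys, so a rule cannot mention an entity that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Existential ruleforms: what exists.
    CREATE TABLE product  (name TEXT PRIMARY KEY);
    CREATE TABLE location (name TEXT PRIMARY KEY);

    -- Compound (authorization) ruleform: relates different kinds of entities.
    CREATE TABLE product_location_authorization (
        product  TEXT NOT NULL REFERENCES product(name),
        location TEXT NOT NULL REFERENCES location(name),
        PRIMARY KEY (product, location)
    );
    """
)

conn.execute("INSERT INTO product VALUES ('X')")
conn.execute("INSERT INTO location VALUES ('A')")
conn.execute("INSERT INTO product_location_authorization VALUES ('X', 'A')")

# Location 'Z' has no existential rule, so this compound rule is rejected.
try:
    conn.execute("INSERT INTO product_location_authorization VALUES ('X', 'Z')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute(
    "SELECT COUNT(*) FROM product_location_authorization"
).fetchone()[0]
print(rejected, row_count)
```

With almost every column being a foreign key, tables like this one lean heavily on the database's constraint handling, which may be related to the RDBMS caveat on the slide.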
  21. 3. Competency Rule Engine (CoRE)
        • Very small amount of code in engine (~100K LOC)
        • Conventional data is absorbed into rules; everything is a rule!
      [Diagram labels: Stimulus, Control Logic, Response, Competency Rules]
  22. Benefits
      Representing rules as “data” rather than software decreases required amount of software by 1-2 orders of magnitude:
        • reduced amount of software may reduce initial development cost
        • reduced amount of software definitely reduces chances for bugs, thus reducing development and maintenance costs
      Rules as “data” can be directly accessed and managed by subject experts, without reliance on programmers:
        • changes in rules normally do not require changes in software, reducing software maintenance costs
        • reduces/eliminates communication requirements from subject expert to programmer
      As rules are externalized, corporate knowledge can be seen, studied, and improved by many:
        • with added metadata regarding each rule, and hyperlinks, this can become a true knowledgebase
  23. Exploratory CoREs
        • CoRE650 – Business (wholesaler with 10,000 orders/day)
        • CoRE415 – Language (search documents for concepts)
        • CoRE576 – Biology (various toy models in proteomics lab)
      I’m always eager to try this theory on new kinds of rule systems.
  24. Rule Counts Over Time
      [Chart: # Existential Rules, vertical axis 0 to 450,000, monthly dates from 9/22/2005 to 9/22/2007; legend: Agency, Location, Product or Service, Master Protocol]
  25. Summary
        • Rules are a type of abstraction, and should be studied as such. There are higher-level abstractions than individual rules, and many rule types; we need a discipline whose object of study is rules/laws.
        • Putting more rules into software is not the solution, nor is building new layers on top of existing layers. Software is the problem. It substitutes for a formalized, place-value representation of rules, enforces a divide between algorithms and data, and obscures the rules with significant ancillary syntax.
        • Rule systems must be conceived at a higher level of abstraction to be manageable while still maintaining all necessary detail
            < 100 ruleforms and their interactions are comprehensible
            1+ million individual rules are not comprehensible
        • The resulting system must be able to perform both deductive inference and computations, and be managed directly by subject experts (not programmers)
  26. Questions
        • Might the problems of large rule systems arise from the way we represent them?
        • What is the optimal representation of large numbers (millions) of complex, contingent rules?
        • What might a place-value system for representing rules look like?
        • What is the relationship of algorithms and data? Is there benefit in conceiving and representing both as rules?
  27. Ultra-Structure References
        • Long, J., and Denning, D., “Ultra-Structure: A design theory for complex systems and processes.” In Communications of the ACM (January 1995)
        • Long, J., “A new notation for representing business and other rules.” In Long, J. (guest editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (1999)
        • Shostko, A., “Design of an automatic course-scheduling system using Ultra-Structure.” In Long, J. (guest editor), Semiotica Special Issue: Notational Engineering, Volume 125-1/3 (1999)
        • Long, J., “Automated Identification of Sensitive Information in Documents Using Ultra-Structure.” Proceedings of the 20th Annual ASEM Conference, American Society for Engineering Management (1999)
        • Oh, Y., and Scotti, R., “Analysis and Design of a Database using Ultra-Structure Theory (UST) – Conversion of a Traditional Software System to One Based on UST,” Proceeding of the 20th Annual Conference, American Society for Engineering Management (1999)
        • Parmelee, M., “Design For Change: Ontology-Driven Knowledgebase Applications For Dynamic Biological Domains.” Master’s Paper for the M.S. in I.S. degree, University of North Carolina, Chapel Hill (November 2002)
        • Maier, C., “CoRE576: An Exploration of the Ultra-Structure Notational System for Systems Biology Research.” Master’s Paper for the M.S. in I.S. degree, University of North Carolina, Chapel Hill (April 2006)