From Legal Language to
  Computer Language


                   Radboud Winkels
                     Emile de Maat
Outline

- Leibniz Center for Law
- From sources of law to ICT applications
  - Structure
  - References
  - Content
- Empirical results
- Conclusions and current research

09/08/2010
Leibniz Center for Law

- Computational Legal Theory and Legal Knowledge Management
- (Formal) models of:
  - Legal knowledge
    - Sources? Elementary legal concepts? Constituents of norms, coherence, …
  - Valid legal reasoning
    - Case assessment, causality, legal comparison, …
Leibniz Center for Law (2)

- Applied topics:
  - Improve the quality of legal products
    - Legislation, decisions, advice, etc.
  - Improve access to legal information and knowledge
  - Support teaching and learning of legal knowledge and skills
  - Legal organisations and change management
Norms and Language




“Legal Engineering”

- Legislation can be seen as the specification of a normative system.
- Legislation is underspecified.
- It suffers from anomalies:
  - inconsistencies
  - circular reasoning
  - open evaluative terms
  - ambiguities
From Sources of Law to ICT Applications

[Diagram: sources of law (legislation, case law, case law on legislation, doctrine) are translated into formal models (concepts, norms, tasks and reasoning, plus meta-knowledge; e.g. p1, p2, …; q1, q2, …; O(α | β)) that drive applications such as a glossary (“Term: this means that and has relations with those”). Example frameworks and systems: FOLaw, LLD, Sartor, LRI-core, CLIME, e-Court.]
Sources of Law

- Most important source of 'knowledge'
- Explicit links between sources and knowledge models are essential for:
  - Validation
  - Maintenance (traceability)
  - Justification
- Link at the right level of detail (granularity)
From Sources of Law to Formal Models

- Automatic support:
  - Increase the quality of the models and the efficiency of the process
  - Increase inter-coder reliability

[Pipeline: NL text → (recognizing and classifying) → structured text with explicit and typed references → (model fragment suggestions) → model of individual provisions → integrated model of meaning]
Text Structure
Structure Marking

[Hierarchy: Hoofdstuk (chapter) → Paragraaf (section) → Artikel (article) → Lid (member). Example marking:]

Hoofdstuk 1
  Paragraaf 1
    Artikel 1
      Lid 1
      Lid 2
    Artikel 2
    Artikel 3
  Paragraaf 2
  …
Hoofdstuk 2
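The marking step above can be sketched as a small scanner for the four Dutch structure labels; the function and the depth numbers below are illustrative assumptions, not the Leibniz Center's actual tooling:

```python
import re

# Illustrative nesting depths for the Dutch structure labels
STRUCTURE_LEVELS = {"Hoofdstuk": 0, "Paragraaf": 1, "Artikel": 2, "Lid": 3}
MARKER = re.compile(r"^(Hoofdstuk|Paragraaf|Artikel|Lid)\s+(\d+)")

def mark_structure(lines):
    """Return (depth, label, number) for every structure marker found."""
    marks = []
    for line in lines:
        m = MARKER.match(line.strip())
        if m:
            label, number = m.group(1), int(m.group(2))
            marks.append((STRUCTURE_LEVELS[label], label, number))
    return marks
```

From the (depth, label, number) sequence, rebuilding the tree is a matter of pushing and popping a stack on depth changes.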
Relations between Sources of Law

[Diagram: Legislation, Case Law, Administrative Case Law and Doctrine, all linked to each other by references.]
Characteristics of Sources of Law

- Legislation
  - Precise grammar for references; clear identity and version criteria
- (Adm.) Case Law
  - Precise grammar for references; precise identity; no versions
- Doctrine
  - Sloppy references; no identity markings; sloppy versioning
The Structure of References: Simple References

- Name
  "Customs Law"
- Label and number
  "Article 1"
- Label, number and publication date
  "The law of April 13th, 2006"
- Indirect references
  "That article"
The Structure of References: Complex References

- Multi-valued references
  "Articles 1, 5 and 12"
- Multi-layered references
  "Customs Law, article 5, first member"
- Multi-valued, multi-layered references
  "Customs Law, articles 1, 5, first member, and 12"
The Structure of References: Ordering

- Zooming in
  "Customs Law, article 5, first member"
- Zooming out
  "first member, article 5, Customs Law"
- Zooming in, then zooming out
  "article 5, first member, Customs Law"
The Structure of References: Miscellaneous

- Opening words
  "Article 12, opening words and parts 1 and 2"
- Exceptions
  "Articles 5-21, with the exception of article 9"
- Each time
  "Articles 5-10, each time the first member"
Complete and incomplete references

- Complete references
  - Mention the document being referred to
    "Customs Law, article 5, first member"
- Incomplete references
  - Do not mention the document being referred to
    "Article 5, first member"
Finding references

- Use a context-free grammar, e.g.:

  <article> →
      "article"
      <designation> [[","] <lower_level>]
      [ "-" <designation> [[","] <lower_level>] ]
      [
        ( [","] <designation> [[","] <lower_level>] )*
        "and" <designation> [[","] <lower_level>]
      ]
      [[","] ["or"] <higher_level>]
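For simple cases, the grammar above can be approximated by a regular expression. The sketch below is our own simplification (lower and higher levels omitted, designation reduced to a number with an optional letter); it handles single articles, enumerations and ranges:

```python
import re

# Simplified stand-in for the <article> rule of the grammar above
DESIGNATION = r"\d+[a-z]?"
ARTICLE_REF = re.compile(
    rf"articles?\s+{DESIGNATION}"
    rf"(?:\s*-\s*{DESIGNATION})?"          # range: "articles 5-21"
    rf"(?:\s*,\s*{DESIGNATION})*"          # enumeration: "1, 5"
    rf"(?:\s*,?\s*and\s+{DESIGNATION})?",  # final conjunct: "and 12"
    re.IGNORECASE,
)

def find_article_references(text):
    return [m.group(0) for m in ARTICLE_REF.finditer(text)]
```

For example, `find_article_references("See articles 1, 5 and 12; article 9 does not apply.")` yields `["articles 1, 5 and 12", "article 9"]`.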
Problems

- Names cannot be recognised
  - Add names as a list to the grammar
- Headings will (falsely) be recognised as references
  - Mark headings beforehand; use MetaLex as input
Resolving references

- Incomplete references
  - The reference needs to be completed from context
  - Within a regulation, an incomplete reference refers to the regulation itself
  - Within commentaries, incomplete references refer back to an earlier complete reference
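These two completion rules can be sketched as follows; the function name and the (document, locator) data shape are assumptions for illustration, not the actual resolver:

```python
def resolve(refs, regulation=None):
    """Resolve incomplete references, given (document, locator) pairs in
    reading order, where document is None for an incomplete reference.

    Inside a regulation (`regulation` given), incomplete references point
    at that regulation; in a commentary (`regulation` is None), they point
    back at the most recent complete reference."""
    resolved, last_seen = [], None
    for document, locator in refs:
        if document is not None:
            last_seen = document                       # complete reference
            resolved.append((document, locator))
        else:
            resolved.append((regulation or last_seen, locator))
    return resolved
```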
Automatic Parsing

1. Determine the identity of the source
   - In the document: title, citation title
   - In metadata
2. Parse the document
   - "Natural language" – model sentences
3. Find references
4. Determine the type of each reference
   - E.g. attribution and delegation of power; definitions; enactment; change
5. Determine the identity of the target (i.e. the thing referred to)
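The five steps can be strung together as a skeleton. Every helper below is a deliberately trivial stand-in of our own; steps 4 and 5 are only indicated:

```python
import re

def determine_identity(document):
    # Step 1 (stand-in): take the title from the first line of the document
    return document.splitlines()[0].strip()

def find_references(sentence):
    # Step 3 (stand-in): only bare article references
    return re.findall(r"articles?\s+\d+", sentence, re.IGNORECASE)

def parse_source(document):
    identity = determine_identity(document)                          # step 1
    sentences = [s for s in document.splitlines()[1:] if s.strip()]  # step 2 (crude)
    references = [find_references(s) for s in sentences]             # step 3
    # steps 4 and 5 (typing each reference and resolving its target)
    # would follow here
    return identity, sentences, references
```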
Results of the simple parser

- 99% of all simple references correctly identified
- 95% of all complex references correctly identified
- Few false positives
- Has been adapted for Flemish law (Opsomer, 2009)
Causes of errors

- Failing to detect a reference
  - Missing labels or names
  - Textual errors
- False positives
  - Homonyms: a label has a second meaning in addition to being part of a reference
    "the first member"
Conclusions

- Automatic detection of references is entirely feasible
- No complicated methods are needed; regular grammars may suffice
From Sources of Law to Formal Models

From structured text to models of individual sentences…

[Pipeline: NL text → (recognizing and classifying) → structured text with explicit and typed references → (model fragment suggestions) → model of individual provisions → integrated model of meaning]
Towards Automatic Modelling
Automatic modelling – Sentences (1)

- Start with sentences
  - Independent units
  - Often marked; otherwise easy to recognize
- Different types of sentences require different translations and different models
Conclusions From Earlier Research

- Dutch law:
  - Provisions usually match one sentence
  - Several types of sentences can easily be distinguished
  - Limited number of language constructs per type
- Automatic recognition and classification seems doable
- Types are not specific to Dutch law (cf. Tiscornia et al. for Italian law)
Categories

1. Definitions
2. Deeming Provision
3. Norm – Right/Permission
4. Norm – Obligation/Duty
5. Application Provision
6. Value Assignment
7. Change*
8. Delegation
9. Enactment Date
10. Citation Title
11. Penalization

Each category uses specific language constructs that can be used to identify it.
Example: Penalisation Provision

Penalisation provisions set punishments for breaking the law, and mark such an act as either a misdemeanour or a crime.

Mining Act, article 133
1. Breaking article 43, sub 2, is punished with a monetary fine of the second category.
2. The fact marked as punishable by this article is a misdemeanour.
Example: Norms (1)

- Normative sentences form the core of each regulation, stating obligations and rights
- Rights can be denoted by a wide range of verbs: can, may, is allowed to, has a right to, …
- Similarly, obligations can be denoted by certain verbs: is prohibited, is charged with
- Many variations
Example: Norms (2)

- However, obligations are often represented as a "statement of fact"

  Funeral Act, article 46, section 1
  No bodies are interred on a closed cemetery.

- May be about any subject
- No common signal words or patterns
- Preferred by the Guidelines for Legal Drafting
Experiment (1)

- Classifier
  - Based on 88 patterns
  - Implemented in Java
- Based on input in which sentences and quoted text have already been marked (MetaLex)
- Assumes a statement-of-fact norm if no explicit pattern is found
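A toy version of such a pattern classifier is sketched below. The four patterns are invented English stand-ins for the 88 Dutch patterns; only the default-category behaviour mirrors the experiment:

```python
import re

# Invented example patterns; the real classifier uses 88 Dutch patterns.
PATTERNS = [
    ("definition", re.compile(r"is understood by", re.IGNORECASE)),
    ("penalisation", re.compile(r"is punished with", re.IGNORECASE)),
    ("norm - right/permission",
     re.compile(r"\b(may|is allowed to|has a right to)\b", re.IGNORECASE)),
    ("norm - obligation/duty",
     re.compile(r"\b(is prohibited|is charged with)\b", re.IGNORECASE)),
]

def classify(sentence):
    for category, pattern in PATTERNS:
        if pattern.search(sentence):
            return category
    return "norm - statement of fact"   # default when no pattern matches
```

On the examples above, "Breaking article 43 … is punished with a monetary fine" comes out as penalisation, while "No bodies are interred on a closed cemetery" falls through to the statement-of-fact default.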
Experiment (2) – Lists

- Lists are classified based on their header, if it contains a pattern; otherwise, each item is classified on its own (without the header)

  Tobacco Act, article 1
  In this law, and in the stipulations based on it, is understood by:
  a. tobacco products: … ;
  b. Our Minister: …;
  c. appendix: …;
  …
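The list rule can be sketched like this; `classify` is passed in as any sentence classifier that returns the default category when no pattern matches (a hypothetical interface assumed here):

```python
DEFAULT = "norm - statement of fact"

def classify_list(header, items, classify):
    """Classify by the header if it carries a pattern; otherwise classify
    each item on its own, without the header."""
    header_category = classify(header)
    if header_category != DEFAULT:
        return [header_category] * len(items)
    return [classify(item) for item in items]
```

In the Tobacco Act example, the header carries the definition pattern ("is understood by"), so every item inherits the definition category.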
Experiment – Test Set

- 18 texts
  - One royal decree
  - Three new bills
  - Fourteen amending bills
  - All 'recent'
  - No overlap with the training set
- 654 sentences
  - 592 'regular' sentences
  - 62 lists
Results per Document (1)

Source                        | Sentences: total / correct / % | Lists: total / correct / partial / % | Type
Royal Decree Stb. 1945, F 214 | 26 / 23 / 97%                  | 4 / 4 / 0 / 75%                      | New
Bill 20 585 nr. 2             | 31 / 30 / 97%                  | 4 / 3 / 1 / 75%                      | New
Bill 22 139 nr. 2             | 22 / 20 / 91%                  | 2 / 2 / – / 100%                     | New
Bill 27 570 nr. 4             | 21 / 16 / 76%                  | –                                    | Change
Bill 27 611 nr. 2             | 11 / 11 / 100%                 | 1 / 1 / – / 100%                     | Change
Bill 30 411 nr. 2             | 141 / 128 / 91%                | 25 / 20 / 3 / 80%                    | New
Bill 30 435 nr. 2             | 40 / 39 / 98%                  | 4 / 3 / 1 / 75%                      | Change
Bill 30 583 nr. A             | 27 / 27 / 100%                 | –                                    | Change
Bill 31 531 nr. 2             | 3 / 3 / 100%                   | –                                    | Change

Note: the relatively low list score for Bill 30 411 is due to a misapplied pattern (3×).
Results per Document (2)

Source            | Sentences: total / correct / % | Lists: total / correct / partial / % | Type
Bill 31 537 nr. 2 | 29 / 29 / 100%                 | 2 / 2 / 0 / 100%                     | Change
Bill 31 540 nr. 2 | 7 / 7 / 100%                   | –                                    | Change
Bill 31 541 nr. 2 | 8 / 8 / 100%                   | –                                    | Change
Bill 31 713 nr. 2 | 7 / 6 / 86%                    | 2 / 2 / 0 / 100%                     | Change
Bill 31 722 nr. 2 | 31 / 22 / 71%                  | 6 / 5 / 0 / 83%                      | Change
Bill 31 726 nr. 2 | 78 / 67 / 86%                  | 2 / 1 / 1 / 50%                      | Change
Bill 31 832 nr. 2 | 7 / 7 / 100%                   | 3 / 3 / – / 100%                     | Change
Bill 31 833 nr. 2 | 4 / 4 / 100%                   | –                                    | Change
Bill 31 835 nr. 2 | 99 / 90 / 91%                  | 7 / 4 / 3 / 57%                      | Change
Total             | 592 / 537 / 91%                | 62 / 50 / 9 / 81%                    |

Note: the relatively low score for Bill 31 835 is due to a pattern appearing in an auxiliary sentence (5×).
Overall Results

- 91% of all regular sentences have been correctly classified
  - 71%–100% over individual laws
- 81% of all lists have been correctly classified
  - 50%–100% over individual laws
Results per Type (1)

Type                               | In corpus  | Missed | False
Definition                         | 2% (12)    | 1      | 0
Norm – Right/Permission            | 11% (64)   | 4      | 13
Norm – Duty                        | 5% (29)    | 0      | 1
Delegation                         | 3% (19)    | 6      | 0
Publication Provision              | 1% (4)     | 0      | 0
Application Provision              | 7% (40)    | 1      | 8
Enactment Date                     | 3% (17)    | 1      | 0
Citation Title                     | 1% (3)     | 0      | 0
Value Assignment/Change            | 0% (1)     | 0      | 0
Penalisation                       | 0% (0)     | 0      | 2
Change                             | 41% (241)  | 16     | 8
Mixed Type                         | 1% (3)     | 3      | 0
Norm – Statement of Fact (default) | 27% (159)  | 23     | 23
Total                              | (592)      | 55     | 55
Results per Type (2)

- Mostly norms and modifications
  - right/permission 11%
  - obligation/duty 27% + 5%
  - change 41%
- Several definitions and application provisions
- Barely any of the others
Results – Patterns Used

Type                    | Patterns known | Patterns used
Definition              | 14             | 5
Norm – Right/Permission | 17             | 3
Norm – Obligation/Duty  | 15             | 8
Delegation              | 7              | 5
Publication Provision   | 1              | 1
Application Provision   | 5              | 5
Enactment Date          | 1              | 1
Citation Title          | 2              | 2
Value Assignment        | 8              | 1
Penalisation            | 3              | 1
Change – Scope          | 2              | 2
Change – Insertion      | 4              | 4
Change – Replacement    | 3              | 3
Change – Repeal         | 2              | 1
Change – Renumbering    | 3              | 2
Total                   | 87             | 44

- About 50% of the known patterns have been used
- Difference in age between the test and training sets?
- Underrepresented sentence types
Problems (1)

- Patterns appearing in auxiliary sentences instead of the main sentence
- Mostly happens with rights and application provisions:
  - If x has the right to …
  - If x is able to …
  - If x applies …
Problems (2)

- Lists need a more serious approach
  - Some can be classified by the header only;
  - Some can be classified by the list items only;
  - Some can only be classified by the header combined with an item.
- Lists need to be converted to individual sentences (header plus list item)
Minor problems

- Missing patterns
- Mixed sentences
  - Difficult to solve, but does not occur often
- Patterns used for other purposes
  - Repeal of fines instead of repeal of regulations
- Specific patterns for specific laws
  - E.g. tax law (value assignment)
Conclusions

- This (symbolic) approach is feasible
- Using obligation as the default category seems acceptable
  - No major categories are missing
- We expect it to generalise to other Dutch regulations
- The approach could be used for other (civil-law) jurisdictions and languages
  - Biagioli et al. (2005) obtained similar results for Italian law, but with a statistical approach
Next Step

[Pipeline: NL text → (recognizing and classifying) → structured text with explicit and typed references → (model fragment suggestions) → model of individual provisions → integrated model of meaning]
Next Step

- Divide each sentence into terms that are linked through relations
- The classification (and base pattern) gives a rough division and a rough relation
- A more detailed division of the sentences is needed
  - Using Dutch grammar parsers
Current Research (1)
Automatic modelling – Reference parser

- References are important in legal texts
  - Useful when the computer understands them better
  - Better understanding is possible
- References do not fit well into "normal Dutch sentence structure"
- Hence a separate reference parser
Things to think about – Granularity

- Granularity: how far do we want to go in splitting up the text?

  Liquor: those drinks, that, at a temperature of twenty degrees Celsius, consist of alcohol for at least fifteen volume percent, with the exception of wine.
Things to think about – Norms

- The classification distinguishes only a limited set of norms
- Do we need more distinctions?
  - For computer calculations?
  - For interaction with the user?
Things to think about – Procedures

- Procedures use the same language constructs as other norms (at least in Dutch), but:
- Procedures have a more specific context
- Procedures have a stronger ordering
Overall Conclusions

- The distance from legal language to computer language is too big to cross in one step
- Automatic modelling support is already partially possible:
  - Structure and references
  - Classification of sentences in legislation
- Generalisation to all Dutch legislation is possible
- Same method for other languages and jurisdictions
- Generalisation to other sources of law is more difficult
Questions?




                  winkels@uva.nl e.demaat@uva.nl
                       www.LeibnizCenter.org

More Related Content

Similar to From legal Language to computer language (2009)

Principles of Health Informatics: Terminologies and classification systems
Principles of Health Informatics: Terminologies and classification systemsPrinciples of Health Informatics: Terminologies and classification systems
Principles of Health Informatics: Terminologies and classification systemsMartin Chapman
 
EDF2012 Andrew Farrow - (Copy)right information in the digital age
EDF2012   Andrew Farrow - (Copy)right information in the digital ageEDF2012   Andrew Farrow - (Copy)right information in the digital age
EDF2012 Andrew Farrow - (Copy)right information in the digital ageEuropean Data Forum
 
Finish the term paper using the following outline. In addition to th.docx
Finish the term paper using the following outline. In addition to th.docxFinish the term paper using the following outline. In addition to th.docx
Finish the term paper using the following outline. In addition to th.docxernestc3
 
Automated Discovery of Logical Fallacies in Legal Argumentation
Automated Discovery of Logical Fallacies in Legal ArgumentationAutomated Discovery of Logical Fallacies in Legal Argumentation
Automated Discovery of Logical Fallacies in Legal Argumentationgerogepatton
 
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATIONAUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATIONijaia
 
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATIONAUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATIONgerogepatton
 
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATIONAUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATIONgerogepatton
 
Akoma Ntoso 2
Akoma Ntoso 2Akoma Ntoso 2
Akoma Ntoso 2tbruce
 
AN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGE
AN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGEAN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGE
AN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGEIJwest
 
An Intersemiotic Translation of Normative Utterances to Machine Language
An Intersemiotic Translation of Normative Utterances to Machine LanguageAn Intersemiotic Translation of Normative Utterances to Machine Language
An Intersemiotic Translation of Normative Utterances to Machine Languagedannyijwest
 
Legal Information: an introduction for Information Science students
Legal Information: an introduction for Information Science studentsLegal Information: an introduction for Information Science students
Legal Information: an introduction for Information Science studentsEmily Allbon
 
Lri Owl And Ontologies 04 04
Lri Owl And Ontologies 04 04Lri Owl And Ontologies 04 04
Lri Owl And Ontologies 04 04Rinke Hoekstra
 
Semantic Modeling for Information Federation
Semantic Modeling for Information FederationSemantic Modeling for Information Federation
Semantic Modeling for Information FederationCory Casanave
 
Stat Int - What is Parliament's intention?
Stat Int - What is Parliament's intention?Stat Int - What is Parliament's intention?
Stat Int - What is Parliament's intention?shummi
 

Similar to From legal Language to computer language (2009) (18)

Principles of Health Informatics: Terminologies and classification systems
Principles of Health Informatics: Terminologies and classification systemsPrinciples of Health Informatics: Terminologies and classification systems
Principles of Health Informatics: Terminologies and classification systems
 
Logic and Law
Logic and LawLogic and Law
Logic and Law
 
Theory Cyberspace
Theory CyberspaceTheory Cyberspace
Theory Cyberspace
 
EDF2012 Andrew Farrow - (Copy)right information in the digital age
EDF2012   Andrew Farrow - (Copy)right information in the digital ageEDF2012   Andrew Farrow - (Copy)right information in the digital age
EDF2012 Andrew Farrow - (Copy)right information in the digital age
 
Finish the term paper using the following outline. In addition to th.docx
Finish the term paper using the following outline. In addition to th.docxFinish the term paper using the following outline. In addition to th.docx
Finish the term paper using the following outline. In addition to th.docx
 
Automated Discovery of Logical Fallacies in Legal Argumentation
Automated Discovery of Logical Fallacies in Legal ArgumentationAutomated Discovery of Logical Fallacies in Legal Argumentation
Automated Discovery of Logical Fallacies in Legal Argumentation
 
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATIONAUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
 
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATIONAUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
 
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATIONAUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
AUTOMATED DISCOVERY OF LOGICAL FALLACIES IN LEGAL ARGUMENTATION
 
Akoma Ntoso 2
Akoma Ntoso 2Akoma Ntoso 2
Akoma Ntoso 2
 
AN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGE
AN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGEAN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGE
AN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGE
 
An Intersemiotic Translation of Normative Utterances to Machine Language
An Intersemiotic Translation of Normative Utterances to Machine LanguageAn Intersemiotic Translation of Normative Utterances to Machine Language
An Intersemiotic Translation of Normative Utterances to Machine Language
 
Legal Information: an introduction for Information Science students
Legal Information: an introduction for Information Science studentsLegal Information: an introduction for Information Science students
Legal Information: an introduction for Information Science students
 
Lri Owl And Ontologies 04 04
Lri Owl And Ontologies 04 04Lri Owl And Ontologies 04 04
Lri Owl And Ontologies 04 04
 
Semantic Modeling for Information Federation
Semantic Modeling for Information FederationSemantic Modeling for Information Federation
Semantic Modeling for Information Federation
 
Machine Aided Indexer
Machine Aided IndexerMachine Aided Indexer
Machine Aided Indexer
 
Resource Description and Acess
Resource Description and AcessResource Description and Acess
Resource Description and Acess
 
Stat Int - What is Parliament's intention?
Stat Int - What is Parliament's intention?Stat Int - What is Parliament's intention?
Stat Int - What is Parliament's intention?
 

From legal Language to computer language (2009)

  • 1. From Legal Language to Computer Language Radboud Winkels Emile de Maat
  • 2. Outline  Leibniz Center for Law  From sources of law to ICT applications  Structure  References  Content  Empirical results  Conclusions and current research 2 09/08/2010
  • 3. Leibniz Center for Law Computational Legal Theory and Legal Knowledge Management (Formal) Models of:  Legal Knowledge  Sources? Elementary legal concepts? Constituents of norms, coherence, …  Valid Legal Reasoning  Case assessment, causality, legal comparison, …
  • 4. Leibniz Center for Law -2 Applied Topics:  Improve quality of legal products  Legislation; decisions; advises, etc.  Inprove access to legal information and knowledge  Support teaching and learning of legal knowledge and skills  Legal organisations and change management
  • 5. Norms and Language 5 09/08/2010
  • 6. “Legal Engineering” Legislation can be seen as specification of a normative system. Legislation is underspecified. It suffers from anomalies: • inconsistencies • Circle reasoning • open evaluative terms • ambiguities 09/08/2010
  • 7. From Sources of Law to ICT Applications Formal Sources Applications Models G Term: This means doctrine that and has Case law on relations legislation Case law with those legislation p1,p2,… q1,q2,… O(α І β) concepts norms Tasks and reasoning Meta-knowledge FOLaw LLD Sartor CLIME LRI-core e-Court …
  • 8. Sources of Law  Most important source of ‘knowledge’  Explicit links between sources and knowledge models essential for:  Validation  Maintenance (traceability)  Justification  Link at right level of detail (granularity)
  • 9. From Sources of Law to Formal Models  Automatic support:  Increase quality of models and efficiency of the process  Increase inter-coder reliability [Diagram: pipeline from NL text, via recognizing and classifying, to structured text with explicit and typed refs; via model suggestions to models of individual provisions; then to an integrated model of meaning]
  • 11. Structure Marking [Diagram: hierarchical structure of Dutch legislation: Hoofdstuk (chapter) → Paragraaf (section) → Artikel (article) → Lid (subsection); e.g. Hoofdstuk 1 contains Paragraaf 1 with Artikel 1 (Lid 1, Lid 2), Artikel 2 and Artikel 3, then Paragraaf 2, Hoofdstuk 2, …]
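The structure marking of this slide can be sketched in code: a small routine that turns a flat sequence of Dutch structure markers into a nested tree. The marker names come from the slide; the regex, dictionary tree and level table are illustrative assumptions, not the authors' implementation.

```python
import re

# Fixed nesting depth of the Dutch legislative structure markers (assumed).
LEVELS = {"Hoofdstuk": 0, "Paragraaf": 1, "Artikel": 2, "Lid": 3}
MARKER = re.compile(r"^(Hoofdstuk|Paragraaf|Artikel|Lid)\s+(\w+)")

def mark_structure(lines):
    """Build a nested tree from lines that start with a structure marker."""
    root = {"label": "root", "children": []}
    stack = [(-1, root)]                      # (level, node)
    for line in lines:
        m = MARKER.match(line.strip())
        if not m:
            continue                          # text content, not a marker
        level = LEVELS[m.group(1)]
        node = {"label": f"{m.group(1)} {m.group(2)}", "children": []}
        while stack[-1][0] >= level:          # pop siblings and deeper levels
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root
```

For the example in the diagram, `mark_structure(["Hoofdstuk 1", "Paragraaf 1", "Artikel 1", "Lid 1", "Lid 2", "Artikel 2"])` nests both Lid elements under Artikel 1.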
  • 12. Relations between Sources of Law [Diagram: relations between legislation, case law, administrative case law and doctrine]
  • 13. Characteristics of Sources of Law  Legislation  Precise grammar for reference, clear identity and version criteria  (Adm.) Case Law  Precise grammar for reference, precise identity, no versions  Doctrine  Sloppy reference, no identity markings, sloppy versioning
  • 14. The Structure of References: Simple References  Simple references  Name Customs Law  Label and number Article 1  Label, number and publication date The law of April 13th, 2006  Indirect references That article
  • 15. The Structure of References: Complex References  Multi-valued references Articles 1, 5 and 12  Multi-layered references Customs Law, article 5, first member  Multi-valued, multi-layered references Customs Law, articles 1, 5, first member, and 12
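A multi-valued reference like the slide's "Articles 1, 5 and 12" ultimately denotes several individual targets. A minimal sketch of that expansion, assuming the English surface form used in the slides (the function name is hypothetical):

```python
import re

def expand_multi_valued(ref: str):
    """Split 'articles 1, 5 and 12' into one reference per article."""
    m = re.match(r"articles?\s+(.*)", ref, re.IGNORECASE)
    if not m:
        return [ref]                          # not an article reference
    # Designations are separated by commas and/or a final "and".
    numbers = re.split(r",\s*|\s+and\s+", m.group(1))
    return [f"article {n}" for n in numbers if n]
```

Multi-layered references (Customs Law, article 5, first member) would additionally need the layer to be attached to each expanded value.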
  • 16. The Structure of References: Ordering  Zooming in Customs Law, article 5, first member  Zooming out first member, article 5, Customs Law  Zooming in, then zooming out article 5, first member, Customs Law
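Whichever ordering a reference uses — zooming in, zooming out, or a mix — the same target is meant, so the orderings can be reduced to one canonical coarse-to-fine path. A sketch under the assumption that components have already been parsed into (level, value) pairs; the level table is illustrative:

```python
# Coarse-to-fine ranking of reference layers (assumed three-layer model).
LEVEL_ORDER = {"law": 0, "article": 1, "member": 2}

def canonical_path(components):
    """Order (level, value) components from coarse to fine."""
    return sorted(components, key=lambda c: LEVEL_ORDER[c[0]])
```

Both "Customs Law, article 5, first member" and "first member, article 5, Customs Law" then map to the same path, as does the mixed "article 5, first member, Customs Law".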
  • 17. The Structure of References: Miscellaneous  Opening words Article 12, opening words and parts 1 and 2  Exceptions Articles 5-21, with the exception of article 9  Each time Articles 5-10, each time the first member
  • 18. Complete and incomplete references  Complete references  Mention the document that is being referred to Customs Law, article 5, first member  Incomplete references  Do not mention the document that is being referred to Article 5, first member
  • 19. Finding references  Use a context-free grammar, e.g.: <article> → “article” <designation> [[“,”] <lower_level>] [ “-“ <designation> [[“,”] <lower_level>] [ ( [“,”]<designation> [[“,”] <lower_level>])* “and” <designation> [[“,”] <lower_level>] ] [[“,”] [“or”] <higher_level>]
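As the conclusion slide notes, regular grammars may suffice for detection, so the grammar above can be approximated with a regular expression. A simplified sketch covering single articles, ranges and enumerations — not the authors' full grammar:

```python
import re

DESIGNATION = r"\d+[a-z]?"                    # e.g. "5", "21a"
ARTICLE_REF = re.compile(
    rf"articles?\s+{DESIGNATION}"             # "article 5"
    rf"(?:\s*-\s*{DESIGNATION})?"             # optional range: "5-21"
    rf"(?:\s*,\s*{DESIGNATION})*"             # enumeration: ", 7, 9"
    rf"(?:\s+and\s+{DESIGNATION})?",          # final conjunct: "and 12"
    re.IGNORECASE,
)

def find_references(text):
    """Return every article reference found in the text."""
    return [m.group(0) for m in ARTICLE_REF.finditer(text)]
```

Lower and higher levels ("first member", law names) would be added as further alternatives, which is where the name list and heading problems of the next slide come in.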
  • 20. Problems  Names cannot be recognised  Add names as a list to the grammar  Headings will (falsely) be recognised as a reference  Mark headings beforehand; use Metalex as input
  • 21. Resolving references  Incomplete references  Reference needs to be completed from context  Within a regulation, an incomplete reference refers to the regulation itself  Within commentaries, incomplete references refer back to an earlier complete reference
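The two resolution rules on this slide can be sketched directly, assuming detection has already produced references in document order, each tagged with its target or `None` when incomplete (the data shapes are illustrative):

```python
def resolve(references, doc_type, doc_id):
    """Resolve incomplete references; references is a list of (text, target_or_None)."""
    resolved = []
    last_complete = None
    for text, target in references:
        if target is not None:                # complete: remember it as context
            last_complete = target
            resolved.append((text, target))
        elif doc_type == "regulation":        # rule 1: the regulation itself
            resolved.append((text, doc_id))
        else:                                 # rule 2: earlier complete reference
            resolved.append((text, last_complete))
    return resolved
```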
  • 22. Automatic Parsing 1. Determine identity source  In doc: Title, citation title  In metadata 2. Parse document  “Natural language” – model sentences 3. Find references 4. Determine type reference  E.g. attribution and delegation of power; definitions; enactment; change 5. Determine identity goal  I.e. the thing it refers to
  • 23. Results simple parser  99% of all simple references correctly identified  95% of all complex references correctly identified  Few false positives  Approach adapted for Flemish law  Opsomer (2009)
  • 24. Causes of errors  Failing to detect a reference  Missing labels or names  Textual errors  False positives  Homonyms: a label has a second meaning in addition to being part of a reference the first member
  • 25. Conclusions  Automatic detection of references is entirely feasible  No complicated methods are needed; regular grammars may suffice
  • 26. From Sources of Law to Formal Models From structured text to models of individual sentences… [Diagram: NL text → structured text with explicit and typed refs → models of individual provisions → integrated model of meaning]
  • 28. Automatic modelling – Sentences (1)  Start with sentences  Independent unit.  Often marked, otherwise easy to recognize  Different types of sentences require different translation, different model
  • 29. Conclusions From Earlier Research  Dutch Law:  Provisions usually match one sentence  Several types of sentences can be easily distinguished  Limited number of language constructs per type  Automatic recognition and classification seems doable  Types not specific for Dutch law (cf. Tiscornia et al. for Italian law)
  • 30. Categories 1. Definitions 2. Deeming Provision 3. Norm – Right/Permission 4. Norm – Obligation/Duty 5. Application Provision 6. Value Assignment 7. Change* 8. Delegation 9. Enactment Date 10. Citation Title 11. Penalization Each category uses specific language constructs that can be used to identify them.
  • 31. Example: Penalisation Provision Penalisation provisions set punishments for breaking the law, and mark such an act as either a misdemeanour or a crime. Mining Act, article 133 1. Breaking article 43, sub 2, is punished with a monetary fine of the second category. 2. The fact marked as punishable by this article is a misdemeanour.
  • 32. Example: Norms (1)  Normative sentences form the core of each regulation, stating obligations and rights  Rights can be denoted by a wide range of verbs: can, may, is allowed to, has a right to, …  Similarly, obligations can be denoted by the use of certain verbs: is prohibited, is charged with  Many variations
  • 33. Example: Norms (2)  However, obligations are often represented as a “statement of fact” Funeral Act, article 46, section 1 No bodies are interred on a closed cemetery.  May be about any subject  No common signal words or patterns  Preferred by the Guidelines for Legal Drafting
  • 34. Experiment (1)  Classifier  Based on 88 patterns  Java  Based on input in which sentences and quoted text have already been marked (MetaLex)  Assumes a statement-of-fact norm if no explicit pattern is used
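The classifier's core idea — explicit patterns with "statement of fact" as the fallback — can be sketched in a few lines. The three patterns below are simplified English stand-ins drawn from the earlier norm slides; the real classifier used 88 Dutch patterns:

```python
import re

# Illustrative subset of patterns; category names follow the slides.
PATTERNS = [
    ("definition", re.compile(r"\bis understood by\b|\bmeans\b", re.I)),
    ("right/permission", re.compile(r"\bmay\b|\bis allowed to\b|\bhas a right to\b", re.I)),
    ("obligation/duty", re.compile(r"\bis prohibited\b|\bis charged with\b", re.I)),
]

def classify(sentence):
    """Return the first matching category, or the default statement-of-fact norm."""
    for category, pattern in PATTERNS:
        if pattern.search(sentence):
            return category
    return "norm - statement of fact"         # default category
```

The default captures obligations phrased as statements of fact, like the Funeral Act example ("No bodies are interred on a closed cemetery").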
  • 35. Experiment (2) - Lists  Lists are classified based on their header, if it contains a pattern; otherwise, each item is classified (without the header) Tobacco Act, article 1 In this law, and in the stipulations based on it, is understood by: a. tobacco products: … ; b. Our Minister: …; c. appendix: …; …
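The list rule is a small wrapper around a sentence classifier. This sketch assumes a `classify` function that returns `None` when no explicit pattern matches (an assumption made for illustration; the experiment's classifier falls back to a default instead):

```python
def classify_list(header, items, classify):
    """Header decides for all items if it matches a pattern; else classify each item."""
    cat = classify(header)
    if cat is not None:
        return [cat] * len(items)
    return [classify(item) for item in items]
```

On the Tobacco Act example, the header's "is understood by" pattern classifies every list item as a definition.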
  • 36. Experiment – Test Set  18 texts  One royal decree  Three new bills  Fourteen amending bills  All ‘recent’  No overlap with the training set  654 sentences  592 ‘regular’ sentences  62 lists
  • 37. Results per Document (1)

| Source | Sentences: total | Correct | % | Lists: total | Correct | Partial | % | Type |
|---|---|---|---|---|---|---|---|---|
| Royal Decree Stb. 1945, F 214 | 26 | 23 | 97% | 4 | 4 | 0 | 75% | New |
| Bill 20 585 nr. 2 | 31 | 30 | 97% | 4 | 3 | 1 | 75% | New |
| Bill 22 139 nr. 2 | 22 | 20 | 91% | 2 | 2 | | 100% | New |
| Bill 27 570 nr. 4 | 21 | 16 | 76% | | | | | Change |
| Bill 27 611 nr. 2 | 11 | 11 | 100% | 1 | 1 | | 100% | Change |
| Bill 30 411 nr. 2 | 141 | 128 | 91% | 25 | 20 | 3 | 80% | New |
| Bill 30 435 nr. 2 | 40 | 39 | 98% | 4 | 3 | 1 | 75% | Change |
| Bill 30 583 nr. A | 27 | 27 | 100% | | | | | Change |
| Bill 31 531 nr. 2 | 3 | 3 | 100% | | | | | Change |

Note on slide: relatively low score due to a misapplied pattern (3×).
  • 38. Results per Document (2)

| Source | Sentences: total | Correct | % | Lists: total | Correct | Partial | % | Type |
|---|---|---|---|---|---|---|---|---|
| Bill 31 537 nr. 2 | 29 | 29 | 100% | 2 | 2 | 0 | 100% | Change |
| Bill 31 540 nr. 2 | 7 | 7 | 100% | | | | | Change |
| Bill 31 541 nr. 2 | 8 | 8 | 100% | | | | | Change |
| Bill 31 713 nr. 2 | 7 | 6 | 86% | 2 | 2 | 0 | 100% | Change |
| Bill 31 722 nr. 2 | 31 | 22 | 71% | 6 | 5 | 0 | 83% | Change |
| Bill 31 726 nr. 2 | 78 | 67 | 86% | 2 | 1 | 1 | 50% | Change |
| Bill 31 832 nr. 2 | 7 | 7 | 100% | 3 | 3 | | 100% | Change |
| Bill 31 833 nr. 2 | 4 | 4 | 100% | | | | | Change |
| Bill 31 835 nr. 2 | 99 | 90 | 91% | 7 | 4 | 3 | 57% | Change |
| Total | 592 | 537 | 91% | 62 | 50 | 9 | 81% | |

Note on slide: relatively low score due to a pattern appearing in an auxiliary sentence (5×).
  • 39. Overall Results  91% of all regular sentences have been correctly classified  71%-100% over laws  81% of all lists have been correctly classified  50%-100% over laws
  • 40. Results per Type (1)

| Type | In corpus (%) | In corpus (n) | Missed | False |
|---|---|---|---|---|
| Definition | 2% | 12 | 1 | 0 |
| Norm - Right/Permission | 11% | 64 | 4 | 13 |
| Norm - Duty | 5% | 29 | 0 | 1 |
| Delegation | 3% | 19 | 6 | 0 |
| Publication Provision | 1% | 4 | 0 | 0 |
| Application Provision | 7% | 40 | 1 | 8 |
| Enactment Date | 3% | 17 | 1 | 0 |
| Citation Title | 1% | 3 | 0 | 0 |
| Value Assignment/Change | 0% | 1 | 0 | 0 |
| Penalisation | 0% | 0 | 0 | 2 |
| Change | 41% | 241 | 16 | 8 |
| Mixed Type | 1% | 3 | 3 | 0 |
| Norm - Statement of Fact (default) | 27% | 159 | 23 | 23 |
| Total | | 592 | 55 | 55 |
  • 41. Results per Type (2)  Mostly norms and modifications  right/permission 11%  obligation/duty 27% + 5%  change 41%  Several definitions and application provisions  Barely any of the others
  • 42. Results – Patterns Used

| Type | Patterns known | Patterns used |
|---|---|---|
| Definition | 14 | 5 |
| Norm - Right/Permission | 17 | 3 |
| Norm - Obligation/Duty | 15 | 7 |
| Delegation | 8 | 5 |
| Publication Provision | 1 | 1 |
| Application Provision | 5 | 5 |
| Enactment Date | 1 | 1 |
| Citation Title | 2 | 2 |
| Value Assignment | 8 | 1 |
| Penalisation | 3 | 1 |
| Change - Scope | 2 | 2 |
| Change - Insertion | 4 | 4 |
| Change - Replacement | 3 | 3 |
| Change - Repeal | 2 | 1 |
| Change - Renumbering | 3 | 2 |
| Total | 87 | 44 |

 About 50% of the known patterns have been used  Difference in age between test and training set?  Underrepresented sentence types
  • 43. Problems (1)  Patterns appearing in auxiliary sentences instead of the main sentence  Mostly happens with rights and application provisions:  If x has the right to …  If x is able to …  If x applies …
  • 44. Problems (2)  Lists need a more serious approach  Some can be classified by the header only;  Some can be classified by the list item only;  Some can only be classified by the header combined with the item.  Lists need to be converted to individual sentences (header plus list item)
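The conversion proposed in the last bullet — combining the header with each list item into a complete sentence that can then be classified on its own — can be sketched as follows (the punctuation handling is an assumption for illustration):

```python
def list_to_sentences(header, items):
    """Combine the list header with each item into one classifiable sentence."""
    stem = header.rstrip(":")                 # drop the colon introducing the list
    return [f"{stem} {item.rstrip(';.')}." for item in items]
```

Applied to the Tobacco Act example, each item becomes a full "In this law … is understood by …" sentence.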
  • 45. Minor problems  Missing patterns  Mixed sentences  Difficult to solve, but does not occur often  Patterns used for other purposes  Repeal of fines instead of repeal of regulations  Specific patterns for specific laws  E.g. Tax Law (value assignment)
  • 46. Conclusions  This (symbolic) approach is feasible  Using obligation as a default category seems acceptable  No major categories are missing  We expect it to generalise to other Dutch regulations  The approach could be used for other (civil) jurisdictions and languages  Biagioli et al. (2005) similar results for Italian law but statistical approach
  • 47. Next Step [Diagram: NL text → structured text with explicit and typed refs → models of individual provisions → integrated model of meaning]
  • 48. Next Step  Divide sentence into different terms that are linked through relations  Classification (and base pattern) gives a rough division, and a rough relation  More detailed division of the sentences is needed  Use Dutch grammar parsers
  • 50. Automatic modelling – Reference parser  References are important in legal texts  Useful when the computer understands these better  Better understanding is possible  References do not fit well in “normal Dutch sentence structure”  Separate reference parser
  • 51. Things to think about – Granularity  Granularity – How far do we want to go with the splitting of text? Liquor: those drinks that, at a temperature of twenty degrees Celsius, consist of alcohol for at least fifteen volume percent, with the exception of wine.
  • 52. Things to think about – Norms  Classification distinguishes only a limited set of norms  Do we need more distinctions?  For computer calculations?  For interaction with the user?
  • 53. Things to think about - Procedures  Procedures use the same language constructs as other norms (at least in Dutch), but:  Procedures have a more specific context  Procedures have a stronger ordering
  • 54. Overall Conclusions  Distance from Legal Language to Computer Language is too big to cross in one step  Automatic modelling support is already partially possible:  Structure and References  Classification of sentences in legislation  Generalisation to all Dutch legislation possible  Same method for other languages and jurisdictions  Generalisation to other sources of law more difficult
  • 55. Questions? winkels@uva.nl e.demaat@uva.nl www.LeibnizCenter.org