Legal Language Explorer
  Presentation @ 24th International Conference on
Legal Knowledge and Information Systems (Jurix 2011)




                             daniel martin katz
                           michael j bommarito ii
                                julie seaman
                               adam candeub
                              eugene agichtein
Law is characterized by a
relatively small number
  of highly influential
cases and jurists whose
conceptualization of law
   comes to dominate
Most Judges and Cases
Are Quickly Forgotten
But A Select Few Persist ...
Extreme Skewing is a Typical
      Feature of Legal Systems




  Katz, et al. (2011): American Legal Academy
  Katz & Stafford (2010): American Federal Judges
  Geist (2009): Austrian Supreme Court
  Smith (2007): U.S. Supreme Court
  Smith (2007): U.S. Law Reviews
  Post & Eisen (2000): NY Ct of Appeals
In Part,
innovative use of
  language and
   metaphor is
How Holmes
became “Holmes”
and how Posner
became “Posner”
And
 how certain cases
came to dominate the
 rest of the corpus
We Believe

 The decision corpus is our
   archeological record
  and that record can be
usefully explored using the
  tools of computational
         linguistics
We Want

 To Democratize the Exploration
      of Legal Language ...

to folks who are not programmers

      to folks who are not
       technically inclined
We Develop a Simple Web
Interface that Leverages
    the Visual Cortex
Relies on Our Ability
    to Engage in
 Pattern Detection
To Explore Linguistic
 Patterns in Large
  Corpora of Legal
     Documents
We Start with the Full Text
 Corpus of Decisions of the
United States Supreme Court
        1791-2005
Develop an N-Gram Based Explorer
with a Portal to the Full Text, etc.
From Text to N-Grams...
Generate the N-Gram Mapping
  (1) extract the opinion text from each
  case and store it as a sequence of
  characters

  (2) convert this sequence of characters
  into a sequence of words through Word
  Tokenization (Penn Treebank Algorithm)

  (3) iterate over words in the length M
  sequence from index 1 to index M + 1 – N,
  taking the N consecutive words that start
  at each index as one N-gram
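
The three steps above can be sketched in a few lines of Python. This is an illustration only, not the project's actual code. The function names (tokenize, ngrams, ngram_counts) are hypothetical, and the regular-expression tokenizer is a simplified stand-in for Penn Treebank tokenization, which handles contractions and punctuation more carefully.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into word and punctuation tokens.

    Assumption: a simple regex stands in for the Penn Treebank
    tokenizer named on the slide.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def ngrams(words, n):
    """Return every run of n consecutive words.

    The slide's 1-based indices 1 .. M + 1 - N correspond to the
    0-based start positions 0 .. M - N used here.
    """
    m = len(words)
    return [tuple(words[i:i + n]) for i in range(m - n + 1)]


def ngram_counts(text, max_n=3):
    """Count all N-grams of length 1 .. max_n in one opinion's text."""
    words = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(" ".join(gram) for gram in ngrams(words, n))
    return counts


if __name__ == "__main__":
    words = tokenize("Congress may regulate interstate commerce.")
    print(ngrams(words, 2))
```

A sequence shorter than N yields no N-grams, because the range of start positions is empty.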
Legal Language Explorer:
 A Brief Tour of The Interface
Go to
LegalLanguageExplorer.com
 And Access This Interface
Notice The Default Search Is
Interstate Commerce, Railroad, Deed
For Each Comma Separated Phrase
The Frequency Plot Appears Below
Currently Supporting Every U.S. Supreme
   Court Decision From 1791 - 2005
Years Can Be Changed
   By The End User
Click Here to Access Various
 Advanced Search Features
The Advanced
Search Features
Are the Observed Trends
a Function of changes in the
 volume of the case docket?
Normalization allows End
 User to control for the
 Size of the Case Docket
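
One way to picture this normalization is to divide each year's phrase count by the size of that year's docket. The sketch below assumes "size" means the number of decisions per year; the slides do not say which measure the site uses. The function name and the sample numbers are hypothetical.

```python
def normalize_by_docket(phrase_counts, docket_sizes):
    """Divide each year's phrase count by that year's docket size.

    phrase_counts: dict mapping year -> raw count of the phrase
    docket_sizes:  dict mapping year -> number of decisions that year
                   (assumed measure of docket size)
    Years with no recorded docket are skipped to avoid dividing by zero.
    """
    normalized = {}
    for year, count in phrase_counts.items():
        size = docket_sizes.get(year, 0)
        if size > 0:
            normalized[year] = count / size
    return normalized


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the corpus.
    raw = {1900: 10, 1950: 30}
    docket = {1900: 100, 1950: 600}
    print(normalize_by_docket(raw, docket))
```

In this made-up example the raw count triples between the two years while the normalized rate falls by half, which is the kind of docket-driven effect the normalization option is meant to expose.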
Change the Graph Type
   By Clicking Here
An Alternative Presentation
        of the Data
Export the
    Chart Data
(If You Want to Replot in
   Stata, Excel, R, etc.)
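
As a rough picture of what an exported file could contain, the sketch below writes one row per year and one column per phrase. The column layout and the function name series_to_csv are assumptions made for illustration; the site's actual export format may differ.

```python
import csv
import io


def series_to_csv(series):
    """Render phrase time series as CSV text.

    series: dict mapping phrase -> {year: value}
    Output has a 'year' column followed by one column per phrase
    (sorted alphabetically); missing years are written as 0.
    """
    phrases = sorted(series)
    years = sorted({year for values in series.values() for year in values})
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["year"] + phrases)
    for year in years:
        writer.writerow([year] + [series[p].get(year, 0) for p in phrases])
    return buffer.getvalue()


if __name__ == "__main__":
    # Illustrative values only.
    print(series_to_csv({"railroad": {1900: 3, 1901: 5}, "deed": {1900: 1}}))
```

A file in this shape opens directly in Excel and loads in Stata or R with their standard CSV import commands.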
The Results
Access a Case List and
  Full Text Results
Here are the
 Returned
  Results
Download
 List in
  Excel
(or any .csv)
Access the
Full Text of a
Particular
   Case
Click Through
  to Access
Results from
BulkResource.org
For More Information -
     Full Version of the Paper




http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1971953
For
Implementation
    Details
 (Including the Key Value
   Storage Method, etc.)



 http://www.michaelbommarito.com/blog/2011/12/16/building-legal-language-explorer-interactivity-and-drill-down-nosql-and-sql/
Slides will Be Posted to CLS Blog
      http://computationallegalstudies.com/
This is Only The Beginning
Stay Tuned for More in 2012

            :)





Presentation @ 24th International Conference on Legal Knowledge and Information Systems ( Jurix 2011 - Vienna )