Open Legal Data
Workshop
Stanford University CodeX Center
Harry Surden
Professor of Law, University of Colorado
Affiliated Faculty: Stanford CodeX Center
Overview
• Computation has revolutionized many fields
• Law is not one of them
• Data is required for computational analysis
• Legal data: neither accessible nor high-quality
Computational Legal Analysis
• Computational Law (Rules-based, deductive)
– Rules-based systems computing legal outcomes
– Represent laws in computer-understandable form
– Examples: TurboTax; computable contracts
• Machine Learning & Law (often Statistical)
– Algorithms that learn patterns from data
– Widely used: self-driving cars, translation, etc.
– Example: Supreme Court prediction project
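A minimal sketch of the rules-based approach described above: a contract term written in computer-understandable form, so that a program can compute the outcome deductively. The clause itself (a 5-day grace period and a 2% late fee), the `Invoice` type, and the `late_fee` function are all hypothetical and invented for illustration; they do not come from any real contract or statute.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Invoice:
    amount_due: float
    due_date: date
    paid_date: Optional[date]  # None means the invoice is still unpaid


# Hypothetical contract terms, invented for this example only.
GRACE_DAYS = 5
LATE_FEE_RATE = 0.02  # 2% of the amount due


def late_fee(invoice: Invoice, today: date) -> float:
    """Rule: a fee is owed if payment arrives after the due date plus the grace period."""
    settled_on = invoice.paid_date or today
    days_late = (settled_on - invoice.due_date).days
    if days_late > GRACE_DAYS:
        return round(invoice.amount_due * LATE_FEE_RATE, 2)
    return 0.0


if __name__ == "__main__":
    invoice = Invoice(amount_due=1000.0, due_date=date(2024, 1, 1), paid_date=date(2024, 1, 10))
    print(late_fee(invoice, today=date(2024, 2, 1)))
```

The point of the sketch is that the outcome follows from the encoded rule alone, with no training data involved, which is what separates this approach from the statistical one.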
Problem
• Computation has revolutionized:
– Finance, medicine, engineering, science, etc.
– Machine learning and computation used for
• Prediction, automation, outlier detection, analysis,
• New drug discovery, etc.
• But computation has barely touched law
– Why?
To do computation, we need data to analyze
• Think of Law as data to be analyzed
– Federal statutes and administrative rules
– State and local laws and codes
– Judicial orders and opinions
– Lawsuit motions and evidence, etc.
Quality legal data is not widely available for analysis
The Legal Data Bottleneck
• Legal data exists, but it is not
– Openly accessible (behind pay-walls)
– Structured in a way that makes analysis feasible
• Lack of widely accessible legal data
– Bottleneck to really interesting work in
• Machine learning and Law
• Computational law
For really interesting computational work in law we need
• High-quality legal data that is
– Open and Accessible (little or no cost)
– Structured (machine readable)
– Standardized (common encoding formats)
– Coded (human-tagged and organized)
– Semantic (embedded with meaning)
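As a rough illustration of what the properties above could look like in practice, the sketch below models one judicial opinion as a structured record and serializes it to JSON, a common machine-readable encoding. The `OpinionRecord` schema, its field names, and the sample values are assumptions made for this example, not an existing legal data standard.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class OpinionRecord:
    """Hypothetical schema for one judicial opinion (not a real standard)."""
    court: str                      # structured: a discrete field, not buried in prose
    decision_date: str              # standardized: ISO 8601 date string
    outcome: str                    # coded: label assigned by a human annotator
    topics: list[str] = field(default_factory=list)          # coded: human-tagged subjects
    cited_statutes: list[str] = field(default_factory=list)  # semantic: explicit links to other law
    text: str = ""                  # the opinion text itself


def to_json(record: OpinionRecord) -> str:
    """Serialize a record to a JSON string with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


def from_json(payload: str) -> OpinionRecord:
    """Rebuild a record from its JSON form."""
    return OpinionRecord(**json.loads(payload))


if __name__ == "__main__":
    # Placeholder values, invented for illustration.
    record = OpinionRecord(
        court="Example District Court",
        decision_date="2015-01-01",
        outcome="affirmed",
        topics=["contracts"],
        cited_statutes=["Example Code § 1"],
        text="(opinion text)",
    )
    print(to_json(record))
```

Once records share a schema like this, the same analysis code can run over opinions from many courts without per-source parsing.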
Possibilities
harrysurden.com
Possibilities
lexpredict.com
Possibilities
• With high quality, structured legal data:
– Predictions of federal and state court decisions
– Finding patterns or biases in legal data
– More computational law systems
– Advanced legal data visualizations
– Discovery of unknown connections or structures
– Outlier detection
– ... and many more
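To make one item from the list above concrete, here is a minimal sketch of outlier detection over a structured field, using the median absolute deviation (MAD). The function name `mad_outliers`, the 3.5 cutoff, and the sample numbers (imagined days-to-disposition for a handful of cases) are assumptions chosen for illustration; the numbers are synthetic and are not real court data.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median of the absolute deviations from the median.
    """
    med = median(values)
    abs_devs = [abs(v - med) for v in values]
    mad = median(abs_devs)
    if mad == 0:
        # At least half the values equal the median; the score is undefined.
        return []
    return [v for v, d in zip(values, abs_devs) if 0.6745 * d / mad > threshold]


if __name__ == "__main__":
    # Synthetic example values, invented for illustration.
    days_to_disposition = [120, 135, 128, 140, 131, 900]
    print(mad_outliers(days_to_disposition))
```

A median-based measure is used here rather than mean and standard deviation because a single extreme case would otherwise inflate the spread and hide itself. This kind of analysis only works once the underlying field is available in structured form.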
Open Legal Data
• Legal data for computation that is:
– Open and Accessible (little or no cost)
– Structured (machine readable)
– Standardized (common encoding formats)
– Coded (human-tagged and organized)
– Semantic (embedded with meaning)
