Elasticsearch Goes to Congress
Indexing law
ari@xcential.com
xcential.com
@arihersh
Ari Hershowitz
Who am I?
Ari Hershowitz
• Neuroscience, environmental advocacy, etc. (prehistory - 2007)
• Lawyer (Georgetown Law 2007)
• Legal technology (2008 - Now)
• Dad
ari@xcential.com
@arihersh
github.com/aih/FlatGov
blog.linkedlegislation.com
My goals today
1. Show how we’re using search in law.
2. Get you to help us do it better.
Overview
• A brief history of legal ‘search’
• Unique challenges of search for law
• Our work in Congress and beyond
• A vision for modern legal search
Lawyers like lists
https://www.loc.gov/resource/rbpe.24203100/
Incremental indexing
In the 1800s, Shepard’s citator printed gummed paper to add to law books, to track new references to prior law
Legal search
• Lawyers hate relevancy
  Completeness >>> sort order
  Precision >>> efficiency
• Lawyers hate stop-words: every ‘and’ is sacred
• Legal researchers love booleans
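One way these preferences could translate into Elasticsearch is sketched below: an analyzer with no stop-word filter, and a boolean query in filter context so that matching is a yes/no test with no relevance scoring. This is an illustration under assumptions, not the configuration used in the talk; the field name `text` and the analyzer name `legal_text` are hypothetical.

```python
# Hypothetical index settings: a custom analyzer with no stop-word filter,
# so every "and" stays in the index and remains searchable.
INDEX_SETTINGS = {
    "settings": {
        "analysis": {
            "analyzer": {
                "legal_text": {  # analyzer name is an assumption
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase"],  # note: no "stop" filter
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "text": {"type": "text", "analyzer": "legal_text"}  # field name is an assumption
        }
    },
}


def boolean_filter_query(must_phrases, must_not_phrases=()):
    """Build a bool query whose clauses run in filter context (no scoring)."""
    return {
        "query": {
            "bool": {
                "filter": [{"match_phrase": {"text": p}} for p in must_phrases],
                "must_not": [{"match_phrase": {"text": p}} for p in must_not_phrases],
            }
        },
        # Sort by index order rather than by relevance score.
        "sort": ["_doc"],
    }


query = boolean_filter_query(["covered location", "Vice President"], ["repeal"])
```

The resulting dictionaries could be passed to an index-creation call and a search call respectively; retrieving a complete result set would additionally need pagination, which is left out here.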
Our work
Office of the House Parliamentarian
• Cards listing daily precedents for more than 200 years
• Thousands of precedents
• About 1 GB of data
Index cards -> SQL -> Elasticsearch
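The SQL -> Elasticsearch step could look roughly like the sketch below, which reads rows from a relational table and formats them as a newline-delimited body for the Elasticsearch Bulk API. The talk does not describe the database or schema, so the SQLite table `precedents`, its columns, and the sample row are all placeholders.

```python
import json
import sqlite3


def rows_to_bulk(conn, index_name="precedents"):
    """Turn SQL rows into an NDJSON body for the Bulk API.

    Each document takes two lines: an action line, then the document source.
    Table and column names here are hypothetical.
    """
    lines = []
    cursor = conn.execute(
        "SELECT id, decided_on, summary FROM precedents ORDER BY id"
    )
    for row_id, decided_on, summary in cursor:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(row_id)}}))
        lines.append(json.dumps({"decided_on": decided_on, "summary": summary}))
    # The Bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Placeholder data standing in for the digitized index cards.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE precedents (id INTEGER PRIMARY KEY, decided_on TEXT, summary TEXT)"
)
conn.execute(
    "INSERT INTO precedents VALUES (1, '1900-01-01', 'Placeholder precedent text')"
)
bulk_body = rows_to_bulk(conn)
```

At about 1 GB, the whole collection could plausibly be re-sent in batches like this whenever the mapping changes, rather than updated in place.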
Our work
Bill similarity
• Open source
• “Plagiarism” software for bills in Congress
• Using Elasticsearch/Lucene ‘more like this’
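A ‘more like this’ search for one bill section might be expressed as the query body below. This is a minimal sketch: the field name `text` and the parameter values are assumptions for illustration, not the settings used in the project.

```python
def more_like_this_query(section_text, size=10, min_term_freq=1, min_doc_freq=1):
    """Build a more_like_this query body for one section of bill text.

    The field name "text" and the default thresholds are hypothetical.
    """
    return {
        "size": size,
        "query": {
            "more_like_this": {
                "fields": ["text"],
                "like": section_text,
                # Low thresholds, since a single section is short and
                # distinctive terms may appear only once.
                "min_term_freq": min_term_freq,
                "min_doc_freq": min_doc_freq,
            }
        },
    }


query = more_like_this_query(
    "IMPROVING ACCESS TO INFLUENTIAL VISITOR ACCESS RECORDS"
)
```

Each hit returned for such a query carries a score, which can serve as a section-to-section similarity measure.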
Bill similarity: one section
SEC. 2. IMPROVING ACCESS TO INFLUENTIAL VISITOR ACCESS RECORDS.
(a) Definitions.—In this section:
(1) COVERED LOCATION.—The term “covered location” means—
(A) the White House;
(B) the residence of the Vice President; and
(C) any other location at which the President or the Vice President regularly conducts official business.
(2) COVERED RECORDS.—The term “covered records” means information relating to a visit at a covered location, which shall include—
(A) the name of each visitor at the covered location;
Bill similarity: many sections
Bill 116hr1500ih
Sections
1
100
101 => similar sections in other bills
102
200
…
Many ways to calculate cumulative similarity.
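As one of those many ways, the sketch below takes the best-matching score per other bill for each section and sums those scores across sections. The aggregation rule, the input shape, and the bill identifiers `bill_A` and `bill_B` are illustrative assumptions, not the method used in the project.

```python
from collections import defaultdict


def cumulative_similarity(section_hits):
    """Combine section-level scores into bill-level scores.

    section_hits maps a section id to a list of (other_bill_id, score)
    pairs, e.g. the hits of a similarity search run on that section.
    For each section, only the best score per other bill is counted.
    """
    totals = defaultdict(float)
    for hits in section_hits.values():
        best = {}
        for bill_id, score in hits:
            if score > best.get(bill_id, 0.0):
                best[bill_id] = score
        for bill_id, score in best.items():
            totals[bill_id] += score
    # Highest cumulative score first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for two sections of one bill.
ranking = cumulative_similarity({
    "101": [("bill_A", 12.5), ("bill_B", 3.0), ("bill_A", 8.0)],
    "102": [("bill_A", 4.5)],
})
```

Other choices, such as normalizing by section length or counting the number of matched sections, would rank bills differently.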
Our work
State law
• Ingest XML
• Link to sources
• Translate user query to ES DSL
• Create a user interface for legal search
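Translating a user query to ES DSL could start from something like the sketch below. It handles only a tiny, hypothetical query syntax (phrases joined by uppercase AND, OR and a leading NOT, with no parentheses) and assumes a field named `text`; a real translator would need a proper parser.

```python
def to_es_dsl(user_query, field="text"):
    """Translate 'a AND b OR NOT c' into an Elasticsearch bool query.

    OR separates alternative groups; within a group, AND joins phrases
    and a leading NOT excludes a phrase. The syntax is an assumption.
    """
    should = []
    for group in user_query.split(" OR "):
        clause = {}
        for term in group.split(" AND "):
            term = term.strip()
            if not term:
                continue
            if term.startswith("NOT "):
                key, phrase = "must_not", term[4:].strip()
            else:
                key, phrase = "must", term
            clause.setdefault(key, []).append({"match_phrase": {field: phrase}})
        if clause:
            should.append({"bool": clause})
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


dsl = to_es_dsl("covered location AND visitor OR NOT White House")
```

Using `match_phrase` keeps multi-word legal terms such as "covered location" intact instead of matching the words independently.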
Summary
• Law is a natural fit for indexing, search and similarity metrics
• Small data (1 GB) is important, too
• Stop working on logs and work with us on legislative text