Recent progress in incorporating word order and semantics into the decades-old, tried-and-tested bag-of-words representation of text meaning has yielded promising results in computational text classification and analysis. This development, together with the availability of a large number of legal rulings from the PTAB (Patent Trial and Appeal Board), motivated us to revisit the possibilities for practical, computational models of legal relevance, starting with this narrow and approachable niche of jurisprudence. We present results from our analysis and experiments towards this goal using a corpus of approximately 8000 rulings from the PTAB. This work makes three important contributions towards the development of models for legal relevance semantics: (a) using state-of-the-art Natural Language Processing (NLP) methods, we characterize the diversity and types of semantic relationships that are implicit in practical judgements of legal relevance at the PTAB; (b) we achieve new state-of-the-art results on practical information retrieval using our customized semantic representations on this corpus; and (c) we outline promising avenues for future work in the area, including preliminary evidence from human-in-the-loop interaction and new forms of text representation developed using input from over a hundred interviews with practitioners in the field. Finally, we argue that PTAB relevance is a practical and realistic baseline for performance measurement, with the desirable property of evaluating NLP improvements against “real world” legal judgement.