Why We Are Open Sourcing ContraxSuite and Some Thoughts About Legal Tech and the Modern Information Economy
July 3, 2017
Chicago, IL USA
For Immediate Release
WHY WE’RE OPEN-SOURCING CONTRAXSUITE™
LEGALTECH IN THE MODERN INFORMATION ECONOMY
I. OUR ANNOUNCEMENT
Over the last decade, we’ve spent many thousands of effort-hours developing the contract
analytics and document analytics tools that we use with clients. These tools, based on enterprise-
quality open source frameworks for natural language processing, machine learning, and optical
character recognition, have allowed us to quickly and easily attack many problems, from securities
filings and court opinions to articles of incorporation and lease agreements.
Today, we are proud to announce that we plan to open source the development of our core
platform for contract analytics and document analytics - ContraxSuite. Starting on August 1st, this
code base and our public development roadmap will be hosted on GitHub under a permissive
open-source licensing model that will allow most organizations to quickly and freely implement
and customize their own contract and document analytics. Like Red Hat does for Linux, we will
provide support, customization, and data services to "cover the last mile" for those organizations
who need it.
We believe that a very important future for law lies in its central role in facilitating and regulating
the modern information economy. But unless we start treating law itself like the production of
information, we’ll never get there. Before we can solve big problems with smart contracts, we
need to start by structuring existing legacy contracts. We hope our actions today will help
lawyers, companies, and other LegalTech providers accelerate the pace of improvement and
innovation through more open collaboration.
II. THE RISE OF THE OPEN SOURCE ECONOMY
Over the course of human history, many models of economic development and innovation have
emerged. Some of these models, like the public stock company, are quite new in the scheme of
things. Others, like the community of "natural" philosophers (scientists) or mutual insurance, are
very old. But in all cases - whether intangible insurance or tangible iPad - there is a flow from idea
to execution to economic value.
One crude way of comparing these economic models is to examine how information is held -
publicly or privately. Information, not opposable thumbs, as Cesar Hidalgo elegantly explains, is
the secret weapon of our species – our only defense against the intrinsic chaos and decay of the
universe. And in the mode of economic analysis descended from Adam Smith, private information
is the key. Knowledge and know-how enable enterprises to produce and profit. This principle of
private information has guided most enterprises over the last few centuries, from the coveted
trade route maps of the 18th century to the modern Coca-Cola recipe.
So why, especially in the last half-century, have we seen the "open source" or "peer-to-peer"
model of information grow? Google, Apple, and Facebook, three companies worth nearly two
trillion dollars combined, have given away thousands of software projects worth billions of dollars
in effort-hour cost. Have we all gone mad? Or is there something else going on – an emergent value
to holding knowledge more publicly in the modern world?
There are many attempts to answer this fascinating question, and these attempts both challenge
and enlighten our understanding of human behavior, economics, and law. However, especially as it
relates to software, we lean on the words of Prof. David Agarwal:
"For many [...], software is only a semi-finished good that generates little value until the
code has undergone revision by the user. Thus, creating the ultimate finished product will
require a sequence of motivating incentives."
III. LAW AS OPEN SOURCE AND THE LAST MILE PROBLEM
The first sentence above is as true of legal documents - articles of incorporation, operating
agreements, corporate filings, contracts and agreements - as it is of software. While there are
thousands of resources for corporate governance or contracting available online, none of these are
"finished" goods for another party. In some cases, "finishing" may simply require updating the
parties and dates, like a restaurant microwaving a pre-cooked cut of meat. In other cases,
"finishing" may actually involve novel structures or more creative redrafting, like a chef growing
his own produce or combining flavors in unexpected ways.
As to the second sentence - "[...] the ultimate finished product will require a sequence of
motivating incentives" - we arrive back at ContraxSuite. Software products designed to assist in
the drafting and analysis of legal documents are perfect examples of semi-finished goods. Alone,
no software product can incorporate an entity or enter into a sales agreement or manage the
geopolitical risk of supplier networks (let’s ignore Smart Contracts and DAO-like organizations, for
now). Only with the assistance of legal professionals can this semi-finished software deliver value
to the organization and its clients.
In our experience, few legal departments and law firms debate that their legal documents contain
valuable information. Analytics can provide insights into a wide array of opportunities and risks.
Standardization can remove frictions for core business operations and increase the rate and
quality of transactions.
But when you ask these organizations to pay a per-document fee for software that almost always
requires additional customization or produces "unfinished" results, their excitement turns into
hesitation. Contract analytics software, like Google’s TensorFlow or the Linux Kernel, does not
generate value by itself; human capital is required to "cover the last mile" that actually solves the
problem. This is why data scientists have not yet been put out of business by TensorFlow and why
Red Hat and Oracle still "sell Linux" for billions of dollars per year.
We have a "last mile" problem in the legal arena. Much like Linux and data science, contract
analytics is largely about the combination of well-known practices with large-scale, high-quality
data. Contracts are natural language encoded in either analog or digital format, and this language
is unlocked and encoded with technologies like optical character recognition (OCR) and natural
language processing (NLP). These encodings are then mapped back to real-world business
problems through techniques like clustering or classification, two types of machine learning (ML)
algorithms. None of the technologies or techniques above is proprietary or novel, and some
version of these ideas has been available nearly as long as there have been digital computers.
The real challenge in contract or other related forms of document analytics is to develop the
so-called "training data" - the set of documents and labels used to "teach" the machine what separates
a lease agreement from a purchase/sale agreement from a retirement benefits plan. Herein lies
the true value of the current software and service providers. But, paradoxically, almost all
providers get their information from one of two sources - either public sources of agreements, like
the SEC’s EDGAR database or evidence from public courts, or private sources of agreements - their
clients. Many organizations have therefore paid for the privilege of giving away their own
information so that someone else can profit.
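To make the role of training data concrete, here is a minimal sketch of document classification in Python. It is not ContraxSuite code: the six toy documents, the three labels, and the function names (tokenize, train, classify) are all invented for illustration, and the technique is a simple multinomial naive Bayes classifier with add-one smoothing, chosen only because it fits in a few lines of standard-library Python. The sketch assumes OCR has already turned each document into plain text.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy "training data": (document text, label) pairs.
# Real training sets contain many full-length, labeled agreements.
TRAINING_DATA = [
    ("the landlord leases the premises to the tenant for monthly rent", "lease"),
    ("tenant shall pay rent to landlord and maintain the premises", "lease"),
    ("the seller agrees to sell and the buyer agrees to purchase the goods", "purchase"),
    ("buyer shall pay the purchase price to seller upon delivery of goods", "purchase"),
    ("the plan provides retirement benefits to each eligible participant", "benefits"),
    ("participant contributions vest under the retirement plan", "benefits"),
]


def tokenize(text):
    """Very simple NLP step: lowercase the text and split it into words."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per label; these counts are the whole 'model'."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    tokens = [t for t in tokenize(text) if t in vocabulary]  # skip unseen words
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(doc_count / total_docs)  # prior for this label
        for token in tokens:
            # Add-one smoothing so a zero count does not rule a label out.
            smoothed = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_DATA)
print(classify("tenant must pay rent for the premises each month", model))
```

The algorithm here is decades old and freely available; what determines the quality of the result is how many documents are in the training set and how carefully they were labeled.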
By open-sourcing ContraxSuite, we hope to change this dynamic. The analysis and standardization
of contracts and corporate governance material is key to the transformation of our economy. But
blockchain and Smart Contracts aside, there are significant improvements in risk management,
compliance, and profitability that can be gained by treating contracts as valuable data. Until legal
departments and law firms can be "sequentially motivated," to borrow Professor Agarwal’s
language, we will not see this maturation of the industry.
In the near future, we’ll be revealing more details about this open source strategy - including more
detail on academic and industry partnerships, support and customization services, and our open-
source license model. In the meantime, we hope to get everyone thinking fundamentally about
how we do business in legal tech. What does the client really want - your software license or a
IV. A CHALLENGE TO PROVIDERS, DEVELOPERS & CUSTOMERS
While some in the legal technology community will certainly tout their ability to outperform our
open source framework on some use cases today, we believe that they will find themselves on the
wrong side of history. We encourage others in legal technology and related domains to follow our
lead and think hard about whether closed code is really the right strategy. By committing to open
source and allowing others to validate, improve, and maintain our shared infrastructure, any gaps
in quality or functionality will soon fade away.
More critically, the hype and "vaporware" that pervade legal technology do a disservice to the
overall efforts towards innovation. We would like to move the field forward through the hard
work of iterative, open methodological improvement. This hype and vaporware have been enabled
by a lack of transparency and, to be honest, a lack of sophistication on the part of customers, but
going forward, both the consumer and developer communities should be able to inspect the quality of
any existing offering. Our co-founders, as leading educators of the next generation of attorneys,
have committed to making sure that the General Counsel and Managing Partner of the future
won’t be fooled. Others can chase the temporary rents of the middle game, but we’re here to make
moves that support the long-term maturation of the overall legal industry. We hope you will join
Michael J. Bommarito, CEO @ LexPredict
Daniel Martin Katz, CSO @ LexPredict