The document discusses how innovation and intellectual property are evolving with new technologies. It notes that innovation is becoming more data-driven as artificial intelligence can analyze large amounts of data from sources like patents, documents, emails and reviews to help identify new ideas, problems to solve, and market opportunities. It encourages companies to document all aspects of their work and innovation so this data can be used to guide the innovation process and budget funding appropriately. The future of intellectual property may involve AI playing a bigger role in the innovation pipeline from idea generation to patent applications.
Please feel free to tweet from your phones using the hashtag and Bath Digital handles set out on this slide.
Plus a big thanks to the sponsors and organisers – it’s been an amazing festival this year.
PS: this talk does not constitute legal advice. It’s mostly me rambling.
This talk is going to be in roughly three parts.
First we will have a look at the pace of change and innovation.
Then we will look at some areas where common understanding clashes with experience.
Finally, for the meat of the talk, we will look at possible ways you can take on the challenge of innovation and win.
Okay. A quick show of hands: who feels like innovation is accelerating?
Let’s quickly fly over some facts.
Buckle up.
Now we’ll have some fun trying to guess the graph.
What’s this? Any guesses?
GDP (current US$) from the World Bank - https://data.worldbank.org/indicator/NY.GDP.MKTP.CD .
That’s trillions of dollars. There is now 8 times more wealth in the world than when I was born.
What’s this? Any guesses?
Number of patent applications filed per year (from WIPO - http://www.wipo.int/ipstats/en/).
People say to me: has my idea been done before?
I say: it’s fairly difficult to know the contents of tens of millions of patent applications.
What’s this? Any guesses?
Internet Users Worldwide - https://ourworldindata.org/internet/.
It’s probably higher now. That’s > 50% of the world’s population. You want a market? There’s your market.
What’s this? Any guesses?
The number of human genome DNA base pairs which can be sequenced for one US$ - https://ourworldindata.org/technological-progress/ .
I was listening to a podcast featuring an interview with Eric Lander of the Broad Institute - https://art19.com/shows/talking-machines/episodes/b9bc5a6a-048f-455e-9ea9-aa27dcf236af - who was saying that in 1986 the biologists at a major US centre for biology decided only at the last minute to add a computer room. Why would a biologist need a computer?
What’s this? Any guesses?
From the Digital Universe Study - https://www.emc.com/collateral/analyst-reports/idc-the-digital-universe-in-2020.pdf
An exabyte equals one quintillion bytes, or 1 billion Gigabytes - https://en.wikipedia.org/wiki/Exabyte. This is also probably old and an underestimate by now.
DeepMind’s WaveNet – from research paper to production in Google’s Assistant in one year: https://deepmind.com/blog/wavenet-launches-google-assistant/ .
Hot off the press – in 40 days AlphaGo Zero goes from a complete beginner in Go to being the best IN THE WORLD - https://deepmind.com/blog/alphago-zero-learning-scratch/
Some initial takeaways.
Now we will have a look at some common misunderstandings.
Most innovations are made by domain experts who have immersed themselves in the problem for a long period (months or years).
Realisation comes slowly and often needs to be tested, iterated and refined.
Large corporations often have enough trouble capturing their own internal ideas.
For example, it is typical for a large multinational company to receive 1000s of invention submissions every year and proceed with fewer than 50% of them.
You need to think of large corporations as having the turning circle of an oil tanker.
Smaller companies will typically only come onto the radar when they have competing sales that are driven by the innovation.
Ideas are easy; execution is hard.
If all you have is an idea then chances are that it is too early to protect anything. If you have a working prototype then you are in a much better position.
Often a working implementation of an initial idea looks quite different. The most valuable intellectual property tends to occur iterations down the line, typically as a small incremental improvement.
Most granted patents only become valuable 10 years down the line.
You often hear stories of how a company’s most valuable patent only just escaped the chopping block during its early life.
Many defunct technologies actually turn out to cover later cutting edge technologies – for example some Wireless Application Protocol (WAP) patents from the early 2000s ended up being relevant to later smartphone standards.
Some more takeaways.
Okay.
So we know we are all strapped in on an exponential rocket.
We know the future is uncertain and exciting. How can we capture and encourage innovation in our own businesses?
20 years ago the general procedure for capturing innovation was something like this.
A group of inventors would work on a product. They would sit down and describe the product and highlight anything they thought was particularly new or exciting.
A patent attorney would then be passed the description and asked to draft a patent application.
All this happened on paper.
Now the world is global and electronic.
Even in small companies you often deal with inventors on different continents.
In many large technology companies there will be a procedure to submit ideas to a secure cloud database. For example, this may be part of a pre-release check, or an internal business proposal for further funding.
These ideas are described and stored electronically. They typically start out as ~2 pages of text (or possibly paper length if you are lucky).
A small set of internal managers or patent attorneys periodically review these submissions, beginning with any that are flagged as having an imminent public disclosure. Some work on a number of ratings, e.g. 0 to 5 – business value, marketing value, ease of detection, value to cross-licences etc.
The ratings and review allow a shortlist of ideas to be selected. For example, less than half of submitted ideas may make it onto the shortlist.
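As a rough illustration of that review step, here is a minimal sketch. The criterion names, the `Submission` class and the "keep the top half" cut-off are all assumptions made up for the example; real review processes will weight and combine ratings in their own way.

```python
from dataclasses import dataclass

# Hypothetical rating criteria, each scored 0 to 5.
CRITERIA = ("business_value", "marketing_value", "ease_of_detection", "cross_licence_value")


@dataclass
class Submission:
    title: str
    ratings: dict                      # criterion name -> score from 0 to 5
    imminent_disclosure: bool = False  # flagged as about to be made public


def total_score(sub):
    """Sum the ratings; a missing criterion counts as 0."""
    return sum(sub.ratings.get(name, 0) for name in CRITERIA)


def review_order(subs):
    """Imminent public disclosures first, then highest total score."""
    return sorted(subs, key=lambda s: (not s.imminent_disclosure, -total_score(s)))


def shortlist(subs, fraction=0.5):
    """Keep the top-scoring fraction of submissions (at least one if any exist)."""
    ranked = sorted(subs, key=total_score, reverse=True)
    keep = max(1, int(len(ranked) * fraction)) if ranked else 0
    return ranked[:keep]
```

Keeping the review order separate from the shortlist reflects the two steps in the notes: urgency decides what gets looked at first, the ratings decide what goes forward.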
This shortlist is then scheduled to become patent applications. The patent applications may be farmed out to a number of different service providers around the world.
So if we add the accelerating technology and extrapolate from the current processes what do we get?
Can we even remove the human link altogether?
Is this all pie in the sky?
All US Patent Office patent publications and grants can be downloaded here: https://www.uspto.gov/learning-and-resources/bulk-data-products (have a couple of TB handy).
European Patent Office data may be accessed (for free up to a limit) here: http://www.epo.org/searching-for-patents/technical/espacenet/ops.html#tab-1
Wikipedia Database Dumps may be downloaded here: https://dumps.wikimedia.org/enwiki/latest/
(You’ll want a file with a name like enwiki-20171001-pages-articles-multistream.xml.bz2 - this contains compressed XML at 14.2 GB – watch that crash your XML parser.)
(You can either download the files in chunks – or code your own way to read the XML inside the compressed file without loading everything into memory – can be done – have a Google and check out lxml.)
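Here is one way the "read it without loading everything into memory" approach might look. This sketch uses only the standard library (`bz2` plus `xml.etree.ElementTree.iterparse`); lxml offers a similar `iterparse` if you prefer it. The `page`, `title` and `text` element names are assumed from the MediaWiki export layout, and `iter_pages` is a name invented for the example.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(path):
    """Yield (title, text) for each <page>, streaming from the .bz2 file."""
    with bz2.open(path, "rb") as stream:
        context = ET.iterparse(stream, events=("start", "end"))
        _, root = next(context)  # the first event is the start of the root element
        title, text = None, None
        for event, elem in context:
            if event != "end":
                continue
            name = local_name(elem.tag)
            if name == "title":
                title = elem.text
            elif name == "text":
                text = elem.text or ""
            elif name == "page":
                yield title, text
                title, text = None, None
                root.clear()  # drop finished pages so memory use stays flat
```

You would then loop with `for title, text in iter_pages("enwiki-...xml.bz2"):` and process one article at a time.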
Everybody else can access the patent data and Wikipedia. No one else has your business data.
Many enterprise software packages have APIs that allow access to data, e.g. Outlook 365 has a REST API - https://msdn.microsoft.com/en-us/office/office365/api/mail-rest-operations.
Most of this information will be unstructured – it is good to assume you are working with raw text (i.e. basically large strings).
As an example of how companies are manoeuvring in this space: Richard Socher is a very clever person from Stanford – his natural language processing company was acquired by Salesforce in 2016 - https://techcrunch.com/2016/04/04/saleforce-acquires-metamind/ .
How do you get a lot of this data? You need to document everything.
No one likes documentation.
The key is to build it into natural development. For example, doc strings can be added to code as it is being written. Tools like Evernote and OneNote can be used as electronic notebooks.
When most people currently refer to an “AI system” what they actually mean is: I have a server (either locally or in the cloud) with some virtualised Linux instances running Python and a bundle of clever Python libraries. To access this black box of joy you can stick a web front end on (e.g. using Flask to serve JSON through an API). Then you have some nice looking JavaScript framework working on the browser side.
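To make the "serve JSON through an API" part concrete, here is a minimal sketch. To stay dependency-free it uses Python's built-in `http.server` rather than Flask, so treat it as a stand-in for the Flask route you would really write. The `analyse` function is a placeholder for whatever the clever libraries would do, and the `{"text": ...}` request shape is an assumption.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def analyse(text):
    """Placeholder for the real model: returns simple text statistics."""
    return {"characters": len(text), "words": len(text.split())}


class ApiHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"text": "some document"}.
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
            body, status = analyse(payload.get("text", "")), 200
        except (ValueError, AttributeError):
            body, status = {"error": "expected a JSON object with a text string"}, 400
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8000):
    """Run the API until interrupted; the browser front end would POST to it."""
    HTTPServer(("127.0.0.1", port), ApiHandler).serve_forever()
```

Calling `serve()` starts the server locally; the browser-side framework then only ever sees JSON going in and JSON coming out.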
If you are clever you can use Docker to spin these up and down wherever you like.
If you are a masochist you can try doing this with Node and JavaScript. If that isn’t enough punishment you could try it in .NET.
Here are some examples of clever Python libraries for machine learning and AI (all free):
TensorFlow - https://www.tensorflow.org/get_started/get_started
PyTorch - http://pytorch.org/
Jupyter (for your documentation) - http://jupyter.org/
Natural Language Toolkit (NLTK) – http://www.nltk.org/
spaCy (now my preference over NLTK) - https://spacy.io/
scikit-learn - http://scikit-learn.org/stable/
NumPy - http://www.numpy.org/
Keras - https://keras.io/
Gensim - https://radimrehurek.com/gensim/
Your main cost in all this is labour. There aren’t that many machine learning experts out there!
Sentiment analysis is relatively easy these days. You just need some labelled data.
If you are able to categorise 1000s of reviews into good and bad, you can take the bad reviews (or the portions of reviews with negative sentiment) and cluster terms within those reviews.
This can identify key problem areas with products which you can apply limited resources to solve. Solutions to those problems may then form the next iteration of protectable intellectual property.
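Here is a minimal sketch of the second half of that idea, assuming the reviews have already been labelled good or bad. It uses plain counting rather than true clustering: it ranks terms by how many more bad reviews than good reviews mention them. The stop-word list and the function names are made up for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "to", "of", "i", "was", "very", "after", "but"}


def tokens(text):
    """Lower-case word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def problem_terms(reviews, top_n=3):
    """reviews: iterable of (text, label) pairs, label being 'good' or 'bad'.

    Returns the terms mentioned by more bad reviews than good ones,
    largest difference first.
    """
    bad, good = Counter(), Counter()
    for text, label in reviews:
        target = bad if label == "bad" else good
        target.update(set(tokens(text)))  # count each term once per review
    scores = {term: bad[term] - good[term] for term in bad}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, score in ranked[:top_n] if score > 0]
```

Subtracting the good-review counts stops words that everybody uses about the product (its name, its main feature) from dominating the list.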
Marketing material may be a website or PDFs of all your product leaflets.
Technology categorisation may be based on something as simple as term frequency (or you can use some of the advanced topic modelling features of Gensim and the like).
If you do this for yourself and your competitors you can identify spaces in the technology landscape. These spaces may be ripe for innovation.
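The simple term-frequency version might look something like this sketch. The categories and their trigger terms are invented for illustration; in practice you would build them from your own field, or let a topic model find them for you.

```python
import re
from collections import Counter

# Hypothetical technology categories and the terms that signal them.
CATEGORY_TERMS = {
    "wireless": {"bluetooth", "wifi", "antenna"},
    "battery": {"battery", "charging", "cell"},
    "display": {"screen", "display", "oled"},
}


def category_profile(documents):
    """Count how often each category's terms appear across a set of documents."""
    words = Counter(re.findall(r"[a-z]+", " ".join(documents).lower()))
    return {cat: sum(words[t] for t in terms) for cat, terms in CATEGORY_TERMS.items()}


def landscape_gaps(profiles):
    """Categories that no company's marketing material mentions at all."""
    return sorted(
        cat for cat in CATEGORY_TERMS
        if all(profile[cat] == 0 for profile in profiles.values())
    )
```

For example, build one profile from your own leaflets and one per competitor, put them in a dict keyed by company name, and pass that to `landscape_gaps`.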
To spot market opportunities you can mine your sales data for changes and patterns.
For example, you may use statistical analysis to look for correlations and patterns by variable (e.g. time / geography / customer). Changes may be detected by modelling your data as a mixture of Gaussians: when the mixture that best fits the data changes to another mixture, a change is deemed to have occurred.
For example, a detectable change in modelled distribution may occur for a cluster of customers in the spring. Work out what causes this and feed it into product / service improvements.
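A heavily simplified sketch of that kind of change detection is below. Note the simplification: it models the earlier data as a single Gaussian rather than a mixture, and flags a change when the mean of a recent window sits too far from it. Fitting a proper mixture (e.g. with scikit-learn's `GaussianMixture`) would be the fuller version. The window size and threshold are arbitrary assumptions.

```python
from statistics import mean, stdev


def detect_change(values, window=4, threshold=3.0):
    """Return the start index of the first window that departs from earlier data.

    Everything before the window is modelled as one Gaussian (mean and
    standard deviation). A change is flagged when the window mean lies more
    than `threshold` standard errors from that baseline mean. `window` must
    be at least 2. Returns None if no change is found.
    """
    for start in range(window, len(values) - window + 1):
        baseline = values[:start]
        recent = values[start:start + window]
        standard_error = stdev(baseline) / window ** 0.5
        if abs(mean(recent) - mean(baseline)) > threshold * standard_error:
            return start
    return None
```

The returned index is where the flagged window starts, so the change itself lies somewhere in `values[start:start + window]`.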
You can use your email records (and possibly social media) to model social interactions between innovators.
For example, using graph theory and modelling relations, you can model who works well together.
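As a starting point, here is a minimal sketch that turns email records into a weighted graph. It assumes you can reduce each message to a (sender, recipient) pair, and it treats message volume as a crude proxy for "works well together", which is a big assumption. The function names are invented for the example.

```python
from collections import Counter


def collaboration_graph(emails):
    """Build an undirected weighted graph from (sender, recipient) pairs.

    Returns a Counter mapping a sorted (person, person) pair to the number
    of messages exchanged in either direction.
    """
    edges = Counter()
    for sender, recipient in emails:
        if sender != recipient:  # ignore notes-to-self
            edges[tuple(sorted((sender, recipient)))] += 1
    return edges


def strongest_ties(edges, person, top_n=3):
    """People who exchange the most messages with `person`, strongest first."""
    ties = Counter()
    for (a, b), weight in edges.items():
        if person in (a, b):
            ties[b if a == person else a] += weight
    ranked = sorted(ties.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]
```

From here a graph library could take over for the heavier graph theory (communities, centrality and so on).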
If you have a database of notebooks, invention submissions and patent prosecution results, you can rank the most successful inventors in your company. You can then pair them intelligently (e.g. with people they work well with but who have had less success, to transfer skills).
Also you can then make efforts to keep this talent.
(Picture by Martin Grandjean)
Budgets are sexy. So are spreadsheets.
£5,000 - patents are not really right for you yet. Try to release a product based on existing technology to build up sales / cash reserve / funding.
£50,000-£100,000 - a good time to start. Put 10% towards starting the IP process. By now you should also have been able to prototype (in secret) some improvements to a working product / service. Set a budget, work out development / sales projections and timelines, then give these to a patent attorney and say: I would like this in, e.g., 1 year.
£250k-£5m - for a technology company, 5-10% goes on IP development costs. The question is more: what can you protect for the budget?
A good rule of thumb is to spend 5-10%.
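Since spreadsheets are sexy, here is the rule of thumb as a tiny sketch. It reads the 5-10% as a share of the overall budget figure, which is my assumption about how to apply it; the function name is made up.

```python
def ip_budget_range(budget_pounds, low_percent=5, high_percent=10):
    """Return the (minimum, maximum) IP spend under the 5-10% rule of thumb."""
    return budget_pounds * low_percent / 100, budget_pounds * high_percent / 100


for budget in (50_000, 100_000, 250_000, 5_000_000):
    low, high = ip_budget_range(budget)
    print(f"£{budget:,}: £{low:,.0f} to £{high:,.0f} on IP")
```

So at £250k you would be looking at roughly £12,500 to £25,000, which is why the question becomes what you can protect for that money.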
Blackberry, IBM and Nokia are good examples – IP allows you to keep going and pivot into new spaces.
Some more takeaways.
So.
We’ve seen that the world is accelerating. History so far has been exponential. The problem is our expectations also scale exponentially.
We’ve seen that innovation comes about through spending a long period of time within a particular field or related field. It is unlikely large companies will steal your ideas but how you execute your ideas may have value a decade into the future. Also there is a game for funding and registrable protection is part of this game.
How can you guide innovation in your company? You need to gather as much information as possible in digital form. You then need to target some information engineers at it (eventually there may be a retail solution). By using the business data that you have, and combining it with freely available data sources, you can focus your innovation on the problems and opportunities that are unique to you. By iterating this process, you can be guided to success.
Finally, registrable IP fits in as milestones within this iterative process to help lock in innovation, obtain investment and provide for the future.
Any questions?