Analyzing Patent Full-Text
A Study
April 7, 2014
Analysing Patent Full Text
Richard Gynn - LexisNexis

II-SDV 2014 Analysing Patent Full Text – Comparison against analysis of abstract and bibliographic data, and lessons learned (Richard Gynn - LexisNexis, UK)


  1. Analyzing Patent Full-Text: A Study. Analysing Patent Full Text. Richard Gynn - LexisNexis. April 7, 2014.
  2. Agenda
     1) Full Text Availability
     2) Analyzing full text
        - Discussion/considerations
        - Big picture analysis
        - Detailed analysis
        - Study
     3) Conclusions
     Full-text content available from vendors has evolved to the point where most of the top publishing authorities are readily available.
  3. Analysing Patent Full Text. Availability
  4. Full Text Availability – Top 10 Publishing Authorities (available from most big vendors)
     • China, Korea and Japan are not the big deal they used to be!
     • Text can be available to analyse in English
  5. Full Text Availability – Authorities available from at least one vendor
  6. Full Text Availability by volume – > 100k publications
     [Bar chart, 0 to 25 million publications per authority: JP US CN DE EP KR GB FR WO CA AU TW SU ES AT SE IT RU CH NL BE FI BR DK IN NO PL IL DD ZA MX HU PT CS AR IE NZ CZ GR]
  7. Full Text Availability by volume – > 100k publications (same chart as slide 6)
     • 31 of these 39 are currently available from vendors
     • They account for the vast majority of total volume
  8. Full Text Availability by volume – < 100k publications
     [Bar chart, 0 to 100,000 publications per authority: HK YU RO SG TR MY LU BG PH UA TH CL EA ID HR SK CO SI VN PE UY OA EG IS EC]
  9. Full Text Availability by volume – < 100k publications (same chart as slide 8)
     • Much smaller amounts currently available from vendors: ~300,000
     • If all were to become available, they would add about 1.5% to the full text that is currently available, e.g. equivalent to Spain or Taiwan
  10. Full Text Availability by volume – < 10k publications
     [Bar chart, 0 to 10,000 publications per authority: MA AP VE EE LV GT CU LT MD CR PA CY DO MC ZM ZW SV SM JO PY GE DZ KE MT HN MW NI ME TJ GC BO MN BA KZ BY TT]
  11. Full Text Availability by volume – < 10k publications (same chart as slide 10)
     • One is currently available from vendors
     • In total these would add about 0.1% to the full text that is currently available
  12. Full Text Availability – Bringing You The World
     • Are we nearly there yet?
     • There’s a lot of full text available to make use of
     • Most vendors have significant volumes
     • Rapidly diminishing returns for each authority added
     • We are already in a good place
     • In terms of % availability at least
  13. Analysing Patent Full Text. Discussion/considerations
  14. Full Text – What Is It?
     • Everything of course?!
       ― …will concentrate on:
  15. Considerations
     There’s clearly a lot out there, so why don’t we see so much analysis of patent full text?
  16. Considerations - Language
     • Can only compare like for like in the same language
       …non-Latin character issues too
     • Noise – patent full text likes to state things like
       …the complete opposite of what it’s about!
     How I might introduce myself …if I was a patent!
     Korean: 나는 사람들이 밥, 앤드류, 데이브 앨런 같은 이름이, 이름이. 나는 밥, 앤드류, 데이브 나 앨런 아니에요. 내 이름은 리처드입니다
     English: I have a name, people have names like Bob, Andrew, Dave and Alan. I’m not Bob, Andrew, Dave or Alan. My name is Richard
     Japanese: 私は人々がボブ、アンドリュー、デイブとアランのような名前を持っている、名前を持っています。私はボブ、アンドリュー、デイブかアランないよ。私の名前はリチャードです
  17. Considerations
     Other Considerations:
     • Massive amounts of data
       – Time?
       – How to deal with it?
     • Will it contain anything useful? Will the benefit outweigh the effort?
     • Tools
       – Big picture?
       – Details?
  18. Big Picture - Landscape Analysis
     Big picture, topographic mapping (Discussion)
     Here more full text could provide:
     • Broader country analysis (often full text is not available)
     • More consistency across authorities – e.g. more claims
       ― Compare like for like, e.g. not claims, title & abstract against title
     • Full text is more useful for details
     • Themes/commonalities are easier to find using claims, title, abstract
     • Whilst useful, the vast majority of landscape analysis is done elsewhere,
       …i.e. details rather than big picture
  19. Analysing Patent Full Text. Study
  20. The Details - Study
     Detailed analysis – looking for what?
     • New/emerging, different
     • Competitive/market comparisons
     • Strength, weakness, opportunity, threat
     What can I find using the full text that I couldn’t using title, abstract and bibliography?
  21. The Details - The Technology
     • Terahertz analysis, e.g. imaging, spectroscopy?
     • Terahertz radiation - between infrared and microwave
  22. The Details - The Search
     • Broad Strategy
       ― Analysis IPCs + Terahertz Radiation Synonyms
       ― Keyword Terahertz Imaging & Spectroscopy
     5,955 documents / 3,365 families
  23. Study - PatentOptimizer
     PatentOptimizer™ analysis of EP, PCT & US results
     • English translations
     Analysis Details:
     • Small/emerging areas of 6-7 families
     • Look at terms & phrases, parts, claim elements
     (all numbers represent families)
  24. PatentOptimizer – Terms & Phrases
     • Diagnosis - General
  25. PatentOptimizer – Terms & Phrases
     • Not found in Title, Abstract (or claims) – All
     • From Spectral Image Inc
     • Learned – Something seemingly unique to them
     • SAME DOCUMENTS
  26. PatentOptimizer – Terms & Phrases
     • Not found in Title, Abstract (or claims) – All
     • Monitoring vitamin K concentration in blood
     • Learned – A more recent (emerging?) use
     • Diagnosis - General
  27. PatentOptimizer – Parts
     • Remote monitoring, e.g. of Bluetooth® headset user
     • Learned – Interesting, but not a massively relevant result; would like to investigate applications further
     • Diagnosis - General
  28. PatentOptimizer – Claim Elements
     • Looking for infiltration or extravasation during intravenous infusion
     • Learned – New, possibly interesting area, seemingly dominated by one organisation
     • Diagnosis – General
     • A61M – introducing remedies
  29. Study - VantagePoint
     VantagePoint analysis of TotalPatent full-text results
     • English translations
     Analysis Details:
     • Data statistics
     • Terms uniquely appearing in full text
     • Highly occurring terms used in small numbers of documents
     • Investigate terms unique to 2013 priority onward
  30. VantagePoint - Statistics
     A very low percentage of the terms and words available for analysis are actually in the title and abstract.
     Title & Abstract
     • 42,614 words & phrases
     • 16,251 words
     Claims
     • ~132k words and phrases not in Title or Abstract
     • ~44k words in Title or Abstract
     Full text
     • ~1.3M unique words & phrases
     • ~650k unique words
  31. VantagePoint – Terms only appearing in full text 2013 onwards
  32. VantagePoint – Terms only appearing in full text 2013 onwards
     • Detection of tetracycline drug – concern in resistance to antibiotics
     • Learned – New area (clearer language in full text)
     • Optical investigation
  33. VantagePoint – Terms only appearing in full text 2013 onwards
     • Looking for gas hydrates (fracking)
     • Learned – New area (uncovered by more consistent repetition in full text)
     • General investigation, sampling
  34. Analysing Patent Full Text. Conclusions
  35. Findings
     • Full text useful
     • Claims less so (in this case)
     • Most words and phrases in the “full text” did not appear in Abstract & Title
     • Text mined wasn’t necessarily applications, but pointed towards them
     • More consistent repetition in full text
     • Helped mainly to find new/niche applications
     • Probably wouldn’t have found them other ways
     • Interesting companies & technologies to look at further
  36. Conclusions
     Conclusions (noise and huge amounts of info):
     • Background did not really come in as an issue
     • Used English translations to avoid language issues
     • Most noise was from search results
     • My judgement – about 50% proved somewhat interesting upon further investigation
     • Can this be automated/put into a process?
     • 4/5+ family groupings seem to be about the sweet spot
  37. What More?
     Further this:
     • Life Sciences
     • Define processes
     • Dedicated machine?
     • Detailed full-text analysis
     • Study analysis of parts
     • Sellers, inventors, manufacturers etc.
     Easier than expected
     More possible & better timescales
  38. Questions
  39. Analysing Patent Full Text. Study – Additional Examples
  40. PatentOptimizer – Terms & Phrases
     • 2 of 6 have tattoo in Abstract OR Title (same if claims are included)
     • Learned – THz radiation can be used for tattoo removal
     • Diagnosis, surgery - General
  41. PatentOptimizer – Terms & Phrases
     • Not found in Abstract & Title (one claimed - Optical Diagnostics)
     • Determining microorganism presence/kind
  42. PatentOptimizer – Claim Elements
     • SAME DOCUMENTS
     • Identifying/determining antimicrobial resistance of Burkholderia cepacia
     • Learned – Smaller, more niche areas?
  43. PatentOptimizer – Terms & Phrases
     • Not found in Title, Abstract (or claims) – All
     • Some detectors, some looking for heavy metal contamination
     • Learned – Some areas to investigate further?
  44. PatentOptimizer – Claim Elements
     • Glucose Monitoring – Far-IR (5/7 have in Abstract & Title)
     • Learned – Not much more than from Title & Abstract
     • Blood measurement
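
The analysis steps listed on slide 29 (terms appearing only in full text, then terms confined to a small number of documents) can be sketched roughly in Python. This is an illustrative sketch only, not how PatentOptimizer or VantagePoint work internally: the record fields (`family_id`, `title`, `abstract`, `description`), the function names, the sample records and the simple word tokenizer are all hypothetical, and the 4 to 7 family window is an assumption loosely based on the "6-7 families" and "4/5+ family groupings" remarks on slides 23 and 36.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Return the set of lowercase tokens (a letter followed by letters or hyphens)."""
    return set(re.findall(r"[a-z][a-z-]+", text.lower()))


def full_text_only_terms(records):
    """Map each term used in a description, but in no title or abstract of
    the result set, to the set of family ids whose description uses it."""
    front_terms = set()
    body_terms = defaultdict(set)
    for rec in records:
        front_terms |= tokenize(rec["title"]) | tokenize(rec["abstract"])
        for term in tokenize(rec["description"]):
            body_terms[term].add(rec["family_id"])
    return {t: fams for t, fams in body_terms.items() if t not in front_terms}


def small_groupings(term_families, min_families=4, max_families=7):
    """Keep terms whose family count falls inside the chosen window (assumed bounds)."""
    return {t: fams for t, fams in term_families.items()
            if min_families <= len(fams) <= max_families}


# Hypothetical records, invented purely for illustration.
records = [
    {"family_id": f"F{i}",
     "title": "Terahertz imaging device",
     "abstract": "A device for terahertz spectroscopy.",
     "description": "The device may detect tetracycline in a sample."}
    for i in range(1, 6)
]
records.append({"family_id": "F6",
                "title": "Terahertz detector",
                "abstract": "A detector.",
                "description": "The detector supports imaging of gas hydrates."})

only_full_text = full_text_only_terms(records)
candidates = small_groupings(only_full_text)
print(sorted(candidates))
```

The sketch does no stop-word removal or phrase extraction, so common words such as "the" survive alongside terms such as "tetracycline"; a real analysis would need both, plus a priority-date filter to reproduce the "2013 onwards" view.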
