IC-SDV 2018: Larry Cady (IFI Claims) Google’s BigQuery offers a new Way to access, explore and analyze public and private Patent Data

BigQuery allows researchers to easily combine data from multiple sources to gain new insights. A large collection of public patent data is available, provided by Google, the USPTO, and EBI ChEMBL. Private collections are available through subscriptions from IFI CLAIMS and CPA Global, and private company information can be combined with public data. A direct connection to Tableau enables visualization of BigQuery results. Examples will be provided that illustrate the unique nature of this new research tool.

Published in: Internet

  1. Google BigQuery – A New Way to Access Patent Data (April 2018)
  2. A lot of data – from a lot of sources (© 2017)
     • R&D Data – Lab Results
     • Trials & Test Results
     • Sales & Market Data
     • Private Corporate Data
     • Private Vendor Data
     • Public Data Providers
     • Collaborators & Partners
  3. Patent Data on BigQuery (© 2018)
     Not really “search” – instead, SQL Query/Join
     • Free Database – Google/IFI Public Patents
     • Free Database – EBI Data
     • Free Database – USPTO PAIR/PEDS
     • Paid Table – IFI Data Enrichments & others
     • Private Personal Data
     • Private Corporate Data
     Google BigQuery: Reports based on Connecting Data from Multiple Tables Together
  4. What is BigQuery?
     • Enterprise Cloud Data Warehouse
     • BigQuery is Google's fully managed, petabyte-scale, low-cost enterprise data warehouse.
     • Low cost – but not free.
     • A powerful Big Data analytics platform
     • Analyze large datasets to find meaningful insights using familiar SQL
     • Join public, private, free and paid datasets – including Patent Data
  5. Example – Full Text Search
     IFI Global Patent Database – IFI CLAIMS Direct
     "VEGF receptor kinase inhibitor"~3 (vascular endothelial growth factor)
     2,294 Results
     Assign a “Relevance Score” and load into BigQuery as a private table.
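Slide 5 ends with loading the scored results into BigQuery as a private table. The sketch below shows one way the file for that load might be prepared, assuming the two columns shown on the following slide (`lc_number`, `lc_score`). The helper name `build_private_table_csv` is hypothetical, and because the slides do not say how the relevance score is calculated, the scores are simply taken as given.

```python
import csv
import io


def build_private_table_csv(scored_hits):
    """Serialize (publication number, relevance score) pairs as CSV text.

    Column names follow the LCPatents table shown in the slides; the
    function name itself is a hypothetical helper, not part of any tool.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["lc_number", "lc_score"])
    for number, score in scored_hits:
        writer.writerow([number, int(score)])
    return buffer.getvalue()


# Three rows copied from the LCPatents slide; scores are taken as given.
sample_hits = [
    ("US-20130053409-A1", 112),
    ("WO-2009053737-A2", 116),
    ("US-20030144298-A1", 126),
]

csv_text = build_private_table_csv(sample_hits)
print(csv_text)
```

A file like this could then be uploaded through the BigQuery web console or the `bq load` command-line tool, into whatever dataset and table name you choose.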
  6. VEGF_Receptor.LCPatents – private table, ordered by my private relevance field
     Table: LCPatents (private)

     | lc_number         | lc_score |
     |-------------------|----------|
     | US-20130053409-A1 | 112      |
     | WO-2009053737-A2  | 116      |
     | US-20030144298-A1 | 126      |
     | US-20030055006-A1 | 130      |
     | JP-2002536414-A   | 132      |
     | CN-103702990-A    | 137      |
     | WO-2008078091-A1  | 142      |
     | EP-2269603-A1     | 146      |
     | EP-2783686-B1     | 146      |
     | US-8410131-B2     | 146      |
  7. LCPatents (private) – Public Data, ordered by my private relevance field
     Tables: LCPatents (private), patents-public-data.patents.publications

     | Row | lc_number        | lc_score | text |
     |-----|------------------|----------|------|
     | 1   | US-8778962-B2    | 148      | Treatment of solid tumors with rapamycin derivatives |
     | 2   | WO-2013014448-A1 | 148      | 2 - (2, 4, 5 - substituted -anilino) pyrimidine derivatives as egfr modulators useful for treating cancer |
     | 3   | EP-2269604-A1    | 147      | Treatment of solid tumours with rapamycin derivatives |
     | 4   | RU-2325906-C2    | 147      | Cancer medical treatment |
     | 5   | EP-2269603-B1    | 146      | Treatment of breast tumors with a rapamycin derivative in combination with exemestane |
     | 6   | EP-2783686-A1    | 146      | Combination of a rapamycin derivative and letrozole for treating breast cancer |
     | 7   | EP-2269604-B1    | 146      | Treatment of solid kidney tumours with a rapamycin derivative |
     | 8   | US-8877771-B2    | 146      | Treatment of solid tumors with rapamycin derivatives |
     | 9   | EP-2269603-A1    | 146      | Treatment of solid tumours with rapamycin derivatives |
  8. LCPatents (private) – Public Data, ordered by my private relevance field

         SELECT
           lc.lc_number,
           lc.lc_score,
           ttl.text
         FROM
           `patents-public-data.patents.publications` AS ppd,
           UNNEST(title_localized) AS ttl
         JOIN
           `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS lc
           ON lc.lc_number = ppd.publication_number
         WHERE
           ttl.language = "en"
         ORDER BY
           lc.lc_score DESC
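Since BigQuery is low cost but not free, it can be useful to rehearse the shape of the slide 8 join locally before running it. The sketch below uses Python's built-in `sqlite3` module as a stand-in, which is an assumption of this example and not something the slides use. SQLite has no `UNNEST`, so the repeated `title_localized` field is modelled as a separate table with one row per language; the local table names, the German title, and the unmatched publication are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for the private LCPatents table.
    CREATE TABLE lc_patents (lc_number TEXT, lc_score INTEGER);
    -- Stand-in for the public table: title_localized flattened to
    -- one row per (publication, language).
    CREATE TABLE publication_titles (
        publication_number TEXT, language TEXT, text TEXT
    );
""")

# Private rows: numbers and scores copied from the slides.
conn.executemany(
    "INSERT INTO lc_patents VALUES (?, ?)",
    [
        ("EP-2269603-B1", 146),
        ("US-8778962-B2", 148),
        ("EP-2269604-A1", 147),
    ],
)

# Public rows: English titles from the slides, plus two hypothetical
# rows (a non-English title and a publication not in the private list).
conn.executemany(
    "INSERT INTO publication_titles VALUES (?, ?, ?)",
    [
        ("US-8778962-B2", "en",
         "Treatment of solid tumors with rapamycin derivatives"),
        ("EP-2269604-A1", "en",
         "Treatment of solid tumours with rapamycin derivatives"),
        ("EP-2269604-A1", "de", "(hypothetical German title)"),
        ("EP-2269603-B1", "en",
         "Treatment of breast tumors with a rapamycin derivative "
         "in combination with exemestane"),
        ("US-0000000-A1", "en", "(hypothetical unrelated publication)"),
    ],
)

# Same join shape as the slide: match on publication number, keep
# English titles, and order by the private relevance score.
rows = conn.execute("""
    SELECT lc.lc_number, lc.lc_score, ttl.text
    FROM publication_titles AS ttl
    JOIN lc_patents AS lc
      ON lc.lc_number = ttl.publication_number
    WHERE ttl.language = 'en'
    ORDER BY lc.lc_score DESC
""").fetchall()

for row in rows:
    print(row)
```

Only publications present in the private list survive the inner join, and the private score drives the ordering of the public titles, which is the point the slide makes.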
  9. LCPatents - IFI Private Data: COUNT
     Tables: LCPatents, IFI Data Enrichments

     | Row | assignee                       | Total |
     |-----|--------------------------------|-------|
     | 1   | AstraZeneca AB                 | 102   |
     | 2   | Astex Therapeutics Ltd         | 84    |
     | 3   | Cancer Research Technology Ltd | 33    |
     | 4   | NeuPharma Inc                  | 32    |
     | 5   | Novartis AG                    | 25    |
     | 6   | ForSight Vision4 Inc           | 22    |
     | 7   | Merck Sharp & Dohme Corp       | 21    |
     | 8   | Kinex Pharmaceuticals LLC      | 19    |
     | 9   | Eisai R&D Management Co Ltd    | 16    |
     | 10  | Novartis Pharma GmbH           | 15    |
     | 11  | University of Chicago          | 11    |
     | 12  | Medimmune Ltd                  | 10    |
  10. LCPatents - IFI Private Data: COUNT

          SELECT
            assignee,
            COUNT(IFI.publication_number) AS Total
          FROM
            `striking-joy-185312.IFIDataEnrichments.IFIDataEnrichments` AS IFI,
            UNNEST(original_assignee) AS assignee
          JOIN
            `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS lc
            ON IFI.publication_number = lc.lc_number
          GROUP BY
            assignee
          ORDER BY
            Total DESC
  11. LCPatents (private) – Public Data – IFI Paid Data
      Tables: LCPatents, patents-public-data, IFI Data Enrichments

      | Row | lc_number        | lc_score | family_id | priority_date | current_assignee                  | legal_status |
      |-----|------------------|----------|-----------|---------------|-----------------------------------|--------------|
      | 1   | JP-2004525899-A  | 152      | 26245731  | 20010219      |                                   | Granted      |
      | 2   | WO-2013014448-A1 | 148      | 46875901  | 20110727      |                                   |              |
      | 3   | US-8778962-B2    | 148      | 26245731  | 20010219      | Novartis Pharmaceuticals Corp     | Active       |
      | 4   | EP-2269604-A1    | 147      | 26245731  | 20010219      | Novartis AG                       | Granted      |
      | 5   | RU-2325906-C2    | 147      | 26245731  | 20010219      |                                   |              |
      | 6   | CA-2438504-A1    | 146      | 26245731  | 20010219      | Novartis AG                       | Granted      |
      | 7   | CA-2438504-C     | 146      | 26245731  | 20010219      | Novartis AG                       | Active       |
      | 8   | EP-2764865-A2    | 146      | 26245731  | 20010219      | Novartis Pharma GmbH; Novartis AG | Withdrawn    |
      | 9   | EP-2762140-A1    | 146      | 26245731  | 20010219      | Novartis AG                       | Granted      |
  12. LCPatents (private) – Public Data – IFI Paid Data

          SELECT
            lc.lc_number,
            lc.lc_score,
            ppd.family_id,
            ppd.priority_date,
            IFI.current_assignee,
            IFI.legal_status
          FROM
            `striking-joy-185312.IFIDataEnrichments.IFIDataEnrichments` AS IFI
          JOIN
            `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS lc
            ON IFI.publication_number = lc.lc_number
          JOIN
            `patents-public-data.patents.publications` AS ppd
            ON lc.lc_number = ppd.publication_number
          ORDER BY
            lc.lc_score DESC
  13. SureChEMBL for LCPatents
      Tables: LCPatents, ebi_surechembl

      | Row | lc_number         | schembl_id    | smiles                          | inchi_key                   | field |
      |-----|-------------------|---------------|---------------------------------|-----------------------------|-------|
      | 1   | US-20100092474-A1 | SCHEMBL8755   | COC1=CC=C(CN)C=C1               | IDPURXSQCKYKIJ-UHFFFAOYSA-N | 5     |
      | 2   | WO-2008044041-A1  | SCHEMBL8755   | COC1=CC=C(CN)C=C1               | IDPURXSQCKYKIJ-UHFFFAOYSA-N | 5     |
      | 3   | WO-2008044045-A1  | SCHEMBL8755   | COC1=CC=C(CN)C=C1               | IDPURXSQCKYKIJ-UHFFFAOYSA-N | 5     |
      | 4   | US-20090306079-A1 | SCHEMBL8755   | COC1=CC=C(CN)C=C1               | IDPURXSQCKYKIJ-UHFFFAOYSA-N | 5     |
      | 5   | US-20070021494-A1 | SCHEMBL8755   | COC1=CC=C(CN)C=C1               | IDPURXSQCKYKIJ-UHFFFAOYSA-N | 5     |
      | 6   | US-7329660-B2     | SCHEMBL104340 | CC(C)C1=CC(N)=CC=C1             | XCCNRBCNYGWTQX-UHFFFAOYSA-N | 5     |
      | 7   | WO-2008002674-A2  | SCHEMBL133876 | COC1=CC(O)=C(C=O)C=C1           | WZUODJNEIXSNEU-UHFFFAOYSA-N | 5     |
      | 8   | WO-2014037750-A1  | SCHEMBL309636 | CCOC1=CC(Br)=CC=C1[N+]([O-])=O  | SVFZXFVVGNPTEF-UHFFFAOYSA-N | 5     |
      | 9   | US-20100092474-A1 | SCHEMBL383820 | COC1=CC(C(O)=O)=C(C=C1)C(O)=O   | JKZSIEDAEHZAHQ-UHFFFAOYSA-N | 5     |
      | 10  | WO-2008044041-A1  | SCHEMBL383820 | COC1=CC(C(O)=O)=C(C=C1)C(O)=O   | JKZSIEDAEHZAHQ-UHFFFAOYSA-N | 5     |
  14. SureChEMBL for LCPatents

          SELECT
            v.lc_number,
            ebi.schembl_id,
            ebi.smiles,
            ebi.inchi_key,
            ebi.field
          FROM
            `patents-public-data.ebi_surechembl.map` AS ebi
          JOIN
            `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS v
            ON v.lc_number = ebi.patent_id
          WHERE
            ebi.field = "5"
          LIMIT 100
  15. ChEMBL Compound, Site, Target

      | Row | standard_inchi_key          | compound_name | site_name                                                     | target_type    | pref_name                              |
      |-----|-----------------------------|---------------|---------------------------------------------------------------|----------------|----------------------------------------|
      | 61  | HUMNYLRZRPPJDN-UHFFFAOYSA-N | Benzaldehyde  | Tyrosinase, Tyrosinase domain                                 | SINGLE PROTEIN | Tyrosinase                             |
      | 62  | HUMNYLRZRPPJDN-UHFFFAOYSA-N | Benzaldehyde  | Tyrosinase, Tyrosinase domain                                 | SINGLE PROTEIN | Tyrosinase                             |
      | 63  | HUMNYLRZRPPJDN-UHFFFAOYSA-N | Benzaldehyde  | Tyrosinase, Tyrosinase domain                                 | SINGLE PROTEIN | Tyrosinase                             |
      | 64  | WGQKYBSKWIADBV-UHFFFAOYSA-N | Benzyl amine  | Phenylethanolamine N-methyltransferase, NNMT_PNMT_TEMT domain | SINGLE PROTEIN | Phenylethanolamine N-methyltransferase |
      | 65  | WGQKYBSKWIADBV-UHFFFAOYSA-N | Benzyl amine  | Phenylethanolamine N-methyltransferase, NNMT_PNMT_TEMT domain | SINGLE PROTEIN | Phenylethanolamine N-methyltransferase |
      | 66  | WGQKYBSKWIADBV-UHFFFAOYSA-N | Benzyl amine  | Phenylethanolamine N-methyltransferase, NNMT_PNMT_TEMT domain | SINGLE PROTEIN | Phenylethanolamine N-methyltransferase |
      | 67  | WGQKYBSKWIADBV-UHFFFAOYSA-N | Benzyl amine  | Phenylethanolamine N-methyltransferase, NNMT_PNMT_TEMT domain | SINGLE PROTEIN | Phenylethanolamine N-methyltransferase |
      | 68  | XKJCHHZQLQNZHY-UHFFFAOYSA-N | SID144208998  | Monoamine oxidase A, Amino_oxidase domain                     | SINGLE PROTEIN | Monoamine oxidase A                    |
      | 69  | XKJCHHZQLQNZHY-UHFFFAOYSA-N | SID144208998  | Monoamine oxidase B, Amino_oxidase domain                     | SINGLE PROTEIN | Monoamine oxidase B                    |
  16. ChEMBL Compound, Site, Target

          SELECT
            cs.standard_inchi_key,
            cr.compound_name,
            bs.site_name,
            td.target_type,
            td.pref_name
          FROM
            `patents-public-data.ebi_chembl.target_dictionary_23` AS td
          JOIN
            `patents-public-data.ebi_chembl.binding_sites_23` AS bs
            ON td.tid = bs.tid
          JOIN
            `patents-public-data.ebi_chembl.predicted_binding_domains_23` AS pbd
            ON bs.site_id = pbd.site_id
          JOIN
            `patents-public-data.ebi_chembl.activities_23` AS act
            ON pbd.activity_id = act.activity_id
          JOIN
            `patents-public-data.ebi_chembl.compound_structures_23` AS cs
            ON act.molregno = cs.molregno
          JOIN
            `patents-public-data.ebi_chembl.compound_records_23` AS cr
            ON cr.molregno = cs.molregno
          WHERE
            cs.standard_inchi_key IN (
              "JOXIMZWYDAKGHI-UHFFFAOYSA-N",
              "XKJCHHZQLQNZHY-UHFFFAOYSA-N",
              "RMVRSNDYEFQCLF-UHFFFAOYSA-N",
              "WVDDGKGOMKODPV-UHFFFAOYSA-N",
              "VODUKXHGDCJEOZ-YUMQZZPRSA-N",
              "LGRFSURHDFAFJT-UHFFFAOYSA-N",
              "WGQKYBSKWIADBV-UHFFFAOYSA-N",
              "VOLRSQPSJGXRNJ-UHFFFAOYSA-N",
              "WFQDTOYDVUWQMS-UHFFFAOYSA-N",
              "HUMNYLRZRPPJDN-UHFFFAOYSA-N",
              "RWZYAGGXGHYGMB-UHFFFAOYSA-N",
              "DGJKKXAFDOWIQI-UHFFFAOYSA-N",
              "KHBQMWCZKVMBLN-UHFFFAOYSA-N",
              "KWOLFJPFCHCOCG-UHFFFAOYSA-N"
            )
  17. EBI – European Bioinformatics Institute on BigQuery
      ebi_chembl:
      • Activities
      • Assays
      • Components
      • Compounds
      • Drug Indications
      • Drug Mechanisms
      • Molecules
      • Proteins
      • Targets
      ebi_surechembl:
      • Smiles
      • Inchi_Key
      • Patent_ID
  18. BigQuery Console
  19. LCPatents – USPTO OCE Office Actions
      Tables: LCPatents (private), uspto_oce_office_actions
      1 of 1495 rows shown

      | Row | publication_number | app_id   | action_type                   | claim_numbers |
      |-----|--------------------|----------|-------------------------------|---------------|
      | 1   | US-8298578-B2      | 13252942 | 103                           | 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26 |
      | 2   | US-9556120-B2      | 14554495 | nonstatutory double patenting | 6,7,8,9,10,11,12,13,14,15,16,17 |
      | 3   | US-9572800-B2      | 15086485 | 103                           | 1,2,3,4,5,6,7,8,9,10,11,12,13 |
      | 4   | US-9737544-B2      | 15000304 | nonstatutory double patenting | 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 |
      | 5   | US-9707202-B2      | 15044424 | nonstatutory double patenting | 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38 |
      | 6   | US-9616050-B2      | 14840342 | nonstatutory double patenting | 23,24,25,26,27,28,29,30 |
      | 7   | US-9707248-B2      | 15279361 | nonstatutory double patenting | 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24 |
      | 8   | US-8642610-B2      | 13366726 | 112                           | 9,11,12,13,14,15,16,17,18,19,20,21,35,48,49,50,51,52,53,112 |
      | 9   | US-8673906-B2      | 13765850 | nonstatutory double patenting | 1,2,3,4,5,6,7,8,9,10,11,12 |
      | 10  | US-8277830-B2      | 13252998 | 103                           | 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29 |
  20. USPTO OCE Office Actions

          SELECT
            patents.publication_number,
            oa.app_id,
            oa.action_type,
            oa.claim_numbers
          FROM
            `patents-public-data.uspto_oce_office_actions.rejections` AS oa
          JOIN
            `patents-public-data.uspto_oce_office_actions.match_app` AS match
            ON oa.app_id = match.app_id
          JOIN
            `patents-public-data.patents.publications` AS patents
            ON match.application_number = patents.application_number
          JOIN
            `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS VEGF
            ON VEGF.lc_number = patents.publication_number
  21. USPTO PTAB Cases

          SELECT
            ptab.PatentOwnerName,
            ptab.PatentNumber,
            ptab.PetitionerPartyName,
            ptab.TrialNumber,
            ptab.ProsecutionStatus,
            IFI.original_assignee
          FROM
            `patents-public-data.uspto_ptab.trials` AS ptab
          JOIN
            `patents-public-data.uspto_ptab.match` AS ptab_match
            ON ptab.ApplicationNumber = ptab_match.ApplicationNumber
          JOIN
            `patents-public-data.patents.publications` AS PUBLIC
            ON PUBLIC.application_number = ptab_match.application_number
          JOIN
            `striking-joy-185312.IFIDataEnrichments.IFIDataEnrichments` AS IFI
            ON IFI.publication_number = PUBLIC.publication_number
          JOIN
            `civil-dolphin-136720.VEGF_Receptor.LCPatents` AS VEGF
            ON VEGF.lc_number = IFI.publication_number

      | Row | PatentOwnerName | PatentNumber | PetitionerPartyName              | TrialNumber   | ProsecutionStatus              | original_assignee             |
      |-----|-----------------|--------------|----------------------------------|---------------|--------------------------------|-------------------------------|
      | 1   | Lane et al      | 8410131      | Breckenridge Pharmaceutial, Inc. | IPR2017-01592 | Notice OF Filing Date Accorded | Novartis Pharmaceuticals Corp |
  22. Patent Data on BigQuery Enhances your Search Results
      JOIN a Search Result, Portfolio or Any List with:
      • Public Patent Data (free)
      • SureChEMBL (free)
      • IFI Data Enrichments (paid)
      • USPTO Office Actions (free)
      • USPTO PAIR, PTAB (free)
      • Private, On-Premise Data (e.g., Docket)
      Result: A Better Search Report
  23. Collaboration
      Share with Google accounts or groups.
  24. What this means to you
      • BigQuery does not replace your text, semantic or structure-based search tools
      • BigQuery does let you make your search results more useful for:
        • Your Legal Team
        • Your Business Sponsors
        • Your Research Partners
  25. SECRET Data Fields!
      VEGF patents with a secret data code. The secret code must not be visible to Google (not even in the Google Enterprise Cloud).
  26. Tableau Desktop: Local Excel to BigQuery
  27. Tableau Desktop + BigQuery
      • BigQuery SQL Query
      • Excel file on Desktop (pub number)
      The data join is created using publication_number. The “Secret Code” is never transmitted to Google.
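The idea on slide 27, joining on the desktop so that the secret field never leaves it, can be sketched in a few lines. This is an illustration of the principle only, not a description of how Tableau performs the join. The function name `join_locally`, the `secret_code` field name, and the code values are hypothetical; the publication numbers and legal statuses are taken from the earlier slides.

```python
def join_locally(bigquery_rows, local_secrets):
    """Attach local-only fields to downloaded rows, keyed by publication number.

    The join runs entirely on the local machine, so the secret values are
    never part of any request sent to the cloud service.
    """
    joined = []
    for row in bigquery_rows:
        secret = local_secrets.get(row["publication_number"])
        if secret is not None:
            joined.append({**row, "secret_code": secret})
    return joined


# Stand-in for rows already downloaded from a BigQuery query.
bigquery_rows = [
    {"publication_number": "US-8778962-B2", "legal_status": "Active"},
    {"publication_number": "EP-2269604-A1", "legal_status": "Granted"},
    {"publication_number": "CA-2438504-C", "legal_status": "Active"},
]

# Stand-in for the Excel file on the desktop; the codes are invented.
local_secrets = {
    "US-8778962-B2": "ALPHA",
    "CA-2438504-C": "BRAVO",
}

report = join_locally(bigquery_rows, local_secrets)
for line in report:
    print(line)
```

Because the lookup table is only ever read locally, the rows returned by the query stay unchanged and the combined report exists only on the desktop.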
  28. Tableau Visualization
  29. Resources
      • https://cloud.google.com/ - Google Cloud Platform > Launcher for Google Patents Public Data
      • Google Announcement
      • https://github.com/google/patents-public-data - GitHub Home for Google Patents
      • Public Patent Data Now Available on Google BigQuery - IFI Blog Post on BigQuery, with examples
      • IFI Data Enrichments – Information on IFI’s paid data enrichments
      • W3Schools SQL - SQL Reference
      Support comes with an IFI Data Enrichments Subscription!
  30. Thank You!
      Larry Cady, Senior Analyst, IFI CLAIMS Patent Services
      larry.cady@ificlaims.com
      info@ificlaims.com
