The Fung Institute Patent Lab:
Products and Future Plans
Lee Fleming, Director of the Coleman Fung Institute
for Engineering Leadership
May 2015
With Gabe Fierro, Ben Balsmeier, Guan-Cheng Li, Kevin
Johnson, Aditya Kaulagi, Douglas O'Reagan, Bill Yeh
We gratefully acknowledge support from the National
Science Foundation Grant #1064182, the US Patent and
Trademark Office, and the American Institutes for Research
My objectives for today’s chat
•  Give you an understanding of our work
– Disambiguation (upcoming JEMS paper)
– Visualization and tools
– Future plans (PAIR)
•  Get your feedback on our research
•  Help me understand bigger picture of data
efforts in innovation and entrepreneurship
– I want to get our stuff used
– and at the same time, aid replication and help our
field to stop re-inventing inferior wheels
Continuing opportunity w/ patent data
•  Despite many papers, basic data remain
inaccessible
– Unstructured and dirty text difficult to aggregate across entities
– (Semi) manual and uncoordinated efforts to date for granted patents
•  We provide parsing, a database, and automated disambiguation of grants + applications:
– inventors
– assignees
– patent lawyers’ firms
– location
•  Everything made public and supportive of complementary efforts (mainly AIR and USPTO)
Basic data flow (~2-3 weeks)
Conceptual database schema
database-simplified.svg (10/18/13)
•  Tables: Patent, Lawyer, Assignee, Inventor, RawLawyer, RawInventor, RawAssignee, Location, RawLocation, USPC, Citation, IPCR, MainClass, SubClass, USRelDoc, OtherReference, Application
•  Relationship labels: <lawyers, patents>, <assignees, patents>, <patents, inventors>, <rawlawyers, lawyer>, <inventor, rawinventors>, <assignee, rawassignees>, <assignees, locations>, <locations, inventors>, <location, rawlocations>, <rawlocations, rawinventor>, <rawassignee, rawlocations>, <classes, patent>, <ipcrs, patent>, <mainclass, uspc>, <subclass, uspc>, <patent, usreldocs>, <patent, otherreferences>, <application, patent>, <patent, citations>, <patent, rawassignees>, <patent, rawinventors>, <rawlawyers, patent>
•  Partial labels: reldocs>, citedby>
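To make the raw-versus-disambiguated split in the schema concrete, here is a minimal sketch using Python's built-in sqlite3 module. The three tables are a heavily simplified subset, and every column name and data row is a hypothetical placeholder rather than the lab's actual schema or data.

```python
import sqlite3

# Hypothetical, simplified subset of the schema. Column names are
# illustrative assumptions, not the lab's real column names.
SCHEMA = """
CREATE TABLE patent (id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE inventor (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE rawinventor (
    id INTEGER PRIMARY KEY,
    patent_id TEXT REFERENCES patent(id),
    inventor_id TEXT REFERENCES inventor(id),
    name_as_printed TEXT
);
"""


def build_demo_db():
    """Create an in-memory database holding a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO patent VALUES (?, ?)",
        [("P1", "Speech coder"), ("P2", "Voice search"), ("P3", "Wind turbine")],
    )
    conn.executemany(
        "INSERT INTO inventor VALUES (?, ?)",
        [("I1", "Inventor A"), ("I2", "Inventor B")],
    )
    # Two raw spellings resolve to the same disambiguated inventor I1.
    conn.executemany(
        "INSERT INTO rawinventor (patent_id, inventor_id, name_as_printed) "
        "VALUES (?, ?, ?)",
        [("P1", "I1", "A; Inventor"), ("P2", "I1", "A.; Inv."), ("P3", "I2", "B; Inventor")],
    )
    return conn


def patents_per_inventor(conn):
    """Count distinct patents per disambiguated inventor."""
    rows = conn.execute(
        """
        SELECT i.name, COUNT(DISTINCT r.patent_id) AS n
        FROM inventor AS i
        JOIN rawinventor AS r ON r.inventor_id = i.id
        GROUP BY i.id
        ORDER BY n DESC, i.name
        """
    )
    return rows.fetchall()
```

The point of the join is that counts are taken over the disambiguated identifier, so differently printed names on the raw records still roll up to one inventor.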
Accessible data: monthly disambiguated grant and application data, Jan ’75 – Dec ’14
http://funglab.berkeley.edu/database
•  Parse, clean, disambiguate:
– inventors
– geography (Google lookup)
– assignee (crude Jaro-Winkler)
– lawyer (crude Jaro-Winkler)
•  Consistent inventor identifiers
•  Cites, claims, non-patent references…
•  .csv download or SQL query
•  Future: blocking, tech control
•  > 300M observations (not all characterized yet); ~50GB
Will the real Matt Marx please stand up?
Plainview, NY; Everett, MA; Mt View, CA
Class 704
Disambiguation: a classifier problem
•  Popular methods: we currently use last three
– Manual
– Linear weighting + manual tuning
– Naïve Bayes, supervised and semi-supervised
– String matching
– K-means intra and inter cluster optimization
– Look up (Google provided access to library)
•  Active research topic in machine learning
•  Julia Lane is planning a contest
•  Previously used a more complex approach (Li et al. 2014)
– latest is simpler, faster, supportable, improvable
– though not as accurate yet: tends to oversplit
Inventor disambiguation
•  Start with (block on) exact name matches
•  Euclidean distance for exact attribute matches
•  Balance min intra cluster and max inter cluster distances
•  Stop when there is no further improvement
– 4 in this case
•  Re-label each column with a cluster
•  Relax exact name match and merge
•  Use correlation of co-authors as well
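The first four steps above can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not the lab's implementation: the one-hot encoding, the farthest-point initialisation, the score (mean inter-centroid distance minus mean intra-cluster distance) and all function names are choices made for this example. The later steps (relaxing the name match, co-author correlation) are left out.

```python
import math
from collections import defaultdict


def one_hot(records, attrs):
    """Encode each record as a 0/1 vector, one slot per (attribute, value)."""
    slots = sorted({(a, r[a]) for r in records for a in attrs})
    index = {s: i for i, s in enumerate(slots)}
    vecs = []
    for r in records:
        v = [0.0] * len(slots)
        for a in attrs:
            v[index[(a, r[a])]] = 1.0
        vecs.append(v)
    return vecs


def dist(u, v):
    """Euclidean distance; on one-hot vectors it grows with attribute mismatches."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def kmeans(vecs, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [vecs[0]]
    while len(centroids) < k:
        centroids.append(max(vecs, key=lambda v: min(dist(v, c) for c in centroids)))
    labels = [0] * len(vecs)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(v, centroids[j])) for v in vecs]
        for j in range(k):
            members = [v for v, lab in zip(vecs, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def score(vecs, labels, centroids):
    """Assumed objective: mean inter-centroid minus mean intra-cluster distance."""
    intra = sum(dist(v, centroids[lab]) for v, lab in zip(vecs, labels)) / len(vecs)
    pairs = [(i, j) for i in range(len(centroids)) for j in range(i + 1, len(centroids))]
    inter = sum(dist(centroids[i], centroids[j]) for i, j in pairs) / len(pairs) if pairs else 0.0
    return inter - intra


def disambiguate(records, attrs):
    """Block on exact name, then grow k within each block until the score stalls."""
    blocks = defaultdict(list)
    for i, r in enumerate(records):
        blocks[r["name"]].append(i)
    result = {}
    for name, idxs in blocks.items():
        vecs = one_hot([records[i] for i in idxs], attrs)
        max_k = len({tuple(v) for v in vecs})
        best, best_labels = None, [0] * len(vecs)
        for k in range(1, max_k + 1):
            labels, centroids = kmeans(vecs, k)
            s = score(vecs, labels, centroids)
            if best is not None and s <= best:
                break  # no further improvement
            best, best_labels = s, labels
        for i, lab in zip(idxs, best_labels):
            result[i] = f"{name}#{lab}"
    return result
```

Each record is a dict with a "name" key plus string-valued attributes such as city, assignee and class. Note that this toy score splits a block as soon as any attribute differs, so it should be read as an illustration of the mechanics only.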
Future of inventor disambiguation
•  Relax strict matching
•  Bring in additional data
– All tech fields
– Lexical overlap
– Law firms
– Prior art citations and non patent references
• New algorithms
• Make everything public and support AIR
tournament
Assignee disambiguation
•  Jaro-Winkler after simple string cleaning
•  Unique assignees from 6,700,000 to 507,000
•  Identifier, raw and cleaned name available
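As a rough sketch of "Jaro-Winkler after simple string cleaning", the block below implements the standard Jaro-Winkler similarity and a greedy grouping pass. The cleaning rules, the suffix list, the 0.9 threshold and the greedy strategy are all assumptions made for illustration; the slides do not specify them.

```python
import re

# Assumed list of corporate suffixes to strip; not taken from the slides.
SUFFIXES = {"INC", "CORP", "CORPORATION", "CO", "COMPANY", "LTD", "LLC"}


def clean(name):
    """Uppercase, drop punctuation, strip trailing corporate suffixes."""
    tokens = re.sub(r"[^A-Z0-9 ]", " ", name.upper()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def jaro(s, t):
    """Standard Jaro similarity in [0, 1]."""
    if s == t:
        return 1.0
    ls, lt = len(s), len(t)
    if ls == 0 or lt == 0:
        return 0.0
    window = max(max(ls, lt) // 2 - 1, 0)
    s_flags, t_flags = [False] * ls, [False] * lt
    matches = 0
    for i, ch in enumerate(s):
        for j in range(max(0, i - window), min(lt, i + window + 1)):
            if not t_flags[j] and t[j] == ch:
                s_flags[i] = t_flags[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    s_m = [c for c, f in zip(s, s_flags) if f]
    t_m = [c for c, f in zip(t, t_flags) if f]
    transpositions = sum(a != b for a, b in zip(s_m, t_m)) // 2
    return (matches / ls + matches / lt + (matches - transpositions) / matches) / 3


def jaro_winkler(s, t, p=0.1):
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    j = jaro(s, t)
    prefix = 0
    for a, b in zip(s[:4], t[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def cluster(names, threshold=0.9):
    """Greedy pass: join the first cluster whose representative is similar enough."""
    reps, ids = [], {}
    for raw in names:
        cleaned = clean(raw)
        for k, rep in enumerate(reps):
            if jaro_winkler(cleaned, rep) >= threshold:
                ids[raw] = k
                break
        else:
            reps.append(cleaned)
            ids[raw] = len(reps) - 1
    return ids
```

With these assumed settings, spellings such as "General Electric Co." and "Genral Electric Inc" land in one cluster while an unrelated name starts a new one. A greedy pass is order-dependent, which is one reason to treat this as a sketch rather than a production method.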
Future of assignee disambiguation
•  Coordinate with NBER and HBS efforts
– The field needs to curate and maintain cumulative progress
•  CONAME data from USPTO
•  Normalize common affixes
•  Train with manually developed NBER disambiguation
•  Apply inventor algorithm
•  Provide Compustat identifier
•  Add subsidiary information
– BvD sample of 6,000 major U.S. firms revealed 50,000 subsidiaries under parental control (>50% in 2012)
– GE: 250 subsidiaries, ~98% patents filed under GE
Law firms
•  Similar algorithms to assignees
•  Not aware of any applications yet
Locations
•  Use Google’s geocoding API
•  Unique cities from 333K to 66K
•  City, region, country
– Lat and Long being developed
– Do not provide street level data
If you’re allergic to SQL: http://rosencrantz.berkeley.edu
Approximate results (full 2014 data in process)
http://funglab.berkeley.edu/database
Tools and applications
•  Look for this stuff and high level explanations at:
– http://www.funginstitute.berkeley.edu/blog-categories/faculty-directors-blog#
Visualizations
• Clean tech inventions mapped by type and source
• Inventor mobility movies
• Patent location in technology “space”
• The convergence and divergence, the coalescence and reconfiguration of components (the flow of technology) over time
• Visualizing the patent application process
Clean Tech Patent Mapper
•  Li, G., K. Paisner, “A List of Clean Tech Patents.”
•  http://funglab.berkeley.edu/cleantechx/
•  Energy: wind, solar, bio, hydro, geo, nuclear
•  Assignee: VC backed, university, government, large and small incumbents, no assignee
VC patents 1990-1999
Innovation and Entrepreneurship
in Clean Energy: Nanda, Younge, Fleming
Note scale of funding activity 1990-1999
VC patents 2000-2009
See Nanda, R., K. Younge, and L. Fleming, “Innovation and Entrepreneurship in Clean Energy,” forthcoming in Rethinking Science and Innovation Policy, NBER.
Much greater funding activity 2000-2009
Midwest clean tech
Kansas City clean tech
Mobility mapper: http://funglab.berkeley.edu/mobility/
• Larger states
• Example: 1987 immigration to MI (note one IL inventor):
Panels: 1987, 1982
Illustrates causal impact of noncompetes on brain drain (Marx, Singh, Fleming, forthcoming in Research Policy)
Variety of states
Visualizing an
acquisition
Acknowledgment of government support
– Hillary Greene, Dennis Yao, Guan Cheng
• What proportion of 2015 patents can be traced to govt?
5M patent applications as a Markov process?
Starting with an analysis of Bilski vs. Kappos
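One way to read the Markov-process question is to estimate first-order transition probabilities between application events. The sketch below does only that. The state names in the usage note are hypothetical placeholders, not actual PAIR event codes, and the first-order assumption is itself part of what the slide asks about.

```python
from collections import Counter, defaultdict


def transition_matrix(histories):
    """Estimate P(next event | current event) from ordered event sequences."""
    counts = defaultdict(Counter)
    for events in histories:
        for current, following in zip(events, events[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for current, c in counts.items()
    }
```

For example, given three made-up histories, filed-rejected-amended-allowed, filed-allowed, and filed-rejected-abandoned, the estimated probability of moving from "filed" to "rejected" is 2/3.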
Network Interface – http://douglasoreagan.com/socialnetwork/
Semiconductor patents in 438/283 from 1998-2000
Method to illustrate network around seed inventors
Cool pics – but what do they mean?
– Need to validate visualizations with ground truth
– Mixed visualization and historical study of
biggest semiconductor breakthrough of last
decade – the FinFET
Why FinFET?
•  Study intended to explore/develop
breakthrough visualization tools
– tie to reality w/o conflating variables
• All patents Northern CA 1995-2000
• Ranked by future citations
• Tech distance
– from our brains, close but moldy
•  Geographic distance
– about 40 yards
•  Social distance
– head of search committee that hired me
– neighbor
Quintessential architectural breakthrough (BT)
Source: King 2012
Inventors brokered social and academic/industry networks
But they also integrated outsiders
The flow of technology
1)  Words are components -> little differentiation, this is so incremental
2)  No geographic localization of trajectories
3)  How did university plop in and do this?
4)  FinFET may have been only govt supported patent
Coming attractions
• Blocking actions – better than citations as a measure of patent impact?
• Lexical novelty
– First appearance of new word in corpus
– First pair-wise combination of words
• Lexical distance between classes
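The two lexical novelty measures above can be sketched as a single pass over a chronologically ordered corpus. The tokenisation (lowercase alphabetic runs) and the function name are assumptions for this example. Lexical distance between classes is not covered.

```python
import re
from itertools import combinations


def lexical_novelty(docs):
    """docs: iterable of (doc_id, text) pairs in chronological order.

    For each document, report the words and the word pairs that have not
    appeared in any earlier document.
    """
    seen_words, seen_pairs = set(), set()
    out = {}
    for doc_id, text in docs:
        words = sorted(set(re.findall(r"[a-z]+", text.lower())))
        pairs = list(combinations(words, 2))  # sorted words give canonical pairs
        out[doc_id] = {
            "new_words": [w for w in words if w not in seen_words],
            "new_pairs": [p for p in pairs if p not in seen_pairs],
        }
        seen_words.update(words)
        seen_pairs.update(pairs)
    return out
```

A document can score zero on new words and still introduce a new pair, which is the case the pair-wise measure is meant to catch. Pair sets grow quadratically with vocabulary per document, so a real corpus would need a more economical representation.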
Identification of blocking patents – pdf challenges:
OCR 101,195 PDF files…
•  Sample text: “Claim Rejections – 35 USC 103 3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness”
•  Processing steps: Detail Enhancement, Noise Reduction, OCR
OCRed blocking data
First results from 2012
• 2011 now complete as well
• Need to characterize each type of action
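A first pass at characterizing actions could tag each OCRed document by the statute sections it cites. The sketch below assumes a tolerant regular expression and a short list of sections (101, 102, 103, 112); both are illustrative choices, not the lab's method, and real OCR output would likely need fuzzier matching.

```python
import re

# Tolerates "35 USC 103", "35 U.S.C. 103(a)" and "35 U.S.C. § 102".
# The section list is an assumption made for this sketch.
STATUTE = re.compile(
    r"35\s*U\.?\s*S\.?\s*C\.?\s*(?:§\s*)?(101|102|103|112)(?!\d)",
    re.IGNORECASE,
)


def rejection_sections(ocr_text):
    """Return the sorted, de-duplicated statute sections cited in the text."""
    return sorted(set(STATUTE.findall(ocr_text)))
```

Applied to the sample text shown on the OCR slide, this returns only section 103, the obviousness provision.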
I may come to you tin cup in hand…
•  Download, parse, clean, disambiguate, store and serve up > 300M observations (and weekly updates)
– Julia Lane taking over part of this
•  Blocking data: must OCR ~400M documents
•  Disambiguation takes weeks, PAIR takes years
– ~$150K hardware alone past year
– database person in Silicon Valley (~$140K + Cal tax)
•  Mention maintenance in NSF proposal => ding
•  Public good (~50,000 downloads)
•  Talking with firms and private philanthropy

The Fung Institute Patent Lab: Products and Future Plans

  • 1. The Fung Institute Patent Lab: Products and Future Plans Lee Fleming, Director of the Coleman Fung Institute for Engineering Leadership May 2015 With Gabe Fierro, Ben Balsmeier, Guan-Cheng Li, Kevin Johnson, Aditya Kaulagi, Douglas O'Reagan, Bill Yeh We gratefully acknowledge support from the National Science Foundation Grant #1064182, the US Patent and Trademark Office, and the American Institutes for Research
  • 2. My objectives for today’s chat •  Give you an understanding of our work – Disambiguation (upcoming JEMS paper) – Visualization and tools – Future plans (PAIR) •  Get your feedback on our research •  Help me understand bigger picture of data efforts in innovation and entrepreneurship – I want to get our stuff used – and at the same time, aid replication and help our field to stop re-inventing inferior wheels
  • 3. Continuing opportunity w/ patent data •  Despite many papers, basic data remain inaccessible – Unstructured and dirty text difficult to aggregate across entities – (Semi) manual and uncoordinated efforts to date for granted patents •  We provide parsing, dbase, auto disambig of grants + apps: •  inventors •  assignees •  patent lawyers’ firms •  location • Everything made public and supportive of complementary efforts (mainly AIR and USPTO)
  • 4. Basic data flow (~2-3 weeks)
  • 5. Conceptual database schema (database-simplified.svg, 10/18/13) •  Tables: Patent, Application, Inventor, RawInventor, Assignee, RawAssignee, Lawyer, RawLawyer, Location, RawLocation, USPC, MainClass, SubClass, IPCR, Citation, OtherReference, USRelDoc •  Relationships to Patent: <lawyers, patents>, <assignees, patents>, <patents, inventors>, <patent, rawassignees>, <patent, rawinventors>, <rawlawyers, patent>, <classes, patent>, <ipcrs, patent>, <patent, citations>, <patent, otherreferences>, <patent, usreldocs>, <application, patent> •  Raw to disambiguated: <rawlawyers, lawyer>, <inventor, rawinventors>, <assignee, rawassignees>, <location, rawlocations> •  Location links: <assignees, locations>, <locations, inventors>, <rawlocations, rawinventor>, <rawassignee, rawlocations> •  Classification: <mainclass, uspc>, <subclass, uspc>
  • 6. Accessible data: monthly disambiguated grant, app data Jan ‘75 – Dec ‘14: http://funglab.berkeley.edu/database •  Parse, clean, disambiguate: – inventors – geography (Google lookup) – assignee (crude Jaro-Winkler) – lawyer (crude Jaro-Winkler) – consistent inventor identifiers – cites, claims, non-pat refs… – .csv download or SQL query – future: blocking, tech control – > 300M observations (not all characterized yet); ~50GB
  • 7. Will the real Matt Marx please stand up? Plainview NY Everett MA Mt View CA Class 704
  • 8. Disambiguation: a classifier problem •  Popular methods: we currently use last three – Manual – Linear weighting + manual tuning – Naïve Bayes, supervised and semi-supervised – String matching – K-means intra and inter cluster optimization – Look up (Google provided access to library) •  Active research topic in machine learning •  Julia Lane is planning a contest •  Had more complex approach (Li et al. 2014) – latest is simpler, faster, supportable, improvable • though not as accurate yet – tends to oversplit
  • 9. Inventor disambiguation •  Start with (block on) exact name matches •  Euclidean distance for exact attribute matches •  Balance min intra cluster and max inter cluster distances
  • 10. •  Look for no further improvement – 4 in this case
  • 11. •  Re-label each column with a cluster •  Relax exact name match and merge •  Use correlation of co-authors as well
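
The blocking and distance steps in slides 9 to 11 can be sketched roughly as follows. This is a minimal illustration, not the lab's code: it blocks on exact name, measures Euclidean distance over exact attribute matches (0 if a field matches, 1 if not), and then merges records within a hypothetical threshold `max_dist`. The threshold merge is a simplified stand-in for the K-means cluster-count optimization the slides describe, and the field names and sample records are invented for the example.

```python
from collections import defaultdict
from math import sqrt


def attribute_distance(a, b, fields):
    """Euclidean distance over 0/1 mismatch indicators, one per field."""
    return sqrt(sum(0 if a[f] == b[f] else 1 for f in fields))


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def disambiguate(records, fields=("city", "assignee", "tech_class"), max_dist=1.0):
    """Return one cluster label per record (assumed field names, assumed threshold)."""
    # Step 1: block on exact name matches.
    blocks = defaultdict(list)
    for i, rec in enumerate(records):
        blocks[rec["name"]].append(i)

    labels = [None] * len(records)
    next_id = 0
    for idxs in blocks.values():
        # Step 2: within a block, merge records that are close in attribute space.
        parent = {i: i for i in idxs}
        for pos, i in enumerate(idxs):
            for j in idxs[pos + 1:]:
                if attribute_distance(records[i], records[j], fields) <= max_dist:
                    parent[find(parent, i)] = find(parent, j)
        # Step 3: re-label each record with its cluster id.
        roots = {}
        for i in idxs:
            root = find(parent, i)
            if root not in roots:
                roots[root] = next_id
                next_id += 1
            labels[i] = roots[root]
    return labels


# Hypothetical records: three patents under one name, one under another.
records = [
    {"name": "J SMITH", "city": "Everett MA", "assignee": "Acme", "tech_class": "704"},
    {"name": "J SMITH", "city": "Mt View CA", "assignee": "Acme", "tech_class": "704"},
    {"name": "J SMITH", "city": "Plainview NY", "assignee": "Other Co", "tech_class": "123"},
    {"name": "A JONES", "city": "Everett MA", "assignee": "Acme", "tech_class": "704"},
]
labels = disambiguate(records)
```

In this toy input the first two records differ only in city and end up in one cluster, while the third shares nothing but the name and is split off. Relaxing the exact name match, as slide 11 suggests, would mean comparing cluster representatives across blocks as a later pass.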
  • 12. Future of inventor disambiguation •  Relax strict matching •  Bring in additional data – All tech fields – Lexical overlap – Law firms – Prior art citations and non patent references • New algorithms • Make everything public and support AIR tournament
  • 13. Assignee disambiguation •  Jaro-Winkler after simple string cleaning •  Unique assignees reduced from 6,700,000 to 507,000 •  Identifier, raw and cleaned name available
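
As a rough illustration of "Jaro-Winkler after simple string cleaning", the sketch below implements the standard Jaro-Winkler similarity in plain Python and uses it to group cleaned names greedily. The cleaning rules, the 0.95 threshold, the greedy grouping strategy, and the sample names are all assumptions made for the example; the slides do not specify them.

```python
import re


def clean_name(raw):
    """Assumed cleaning: uppercase, drop punctuation, collapse whitespace."""
    return " ".join(re.sub(r"[^A-Z0-9]+", " ", raw.upper()).split())


def jaro(s1, s2):
    """Standard Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, p=0.1, max_prefix=4):
    """Jaro similarity boosted by the length of the common prefix (up to 4)."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def group_assignees(raw_names, threshold=0.95):
    """Greedy grouping: attach each name to the first close-enough canonical name."""
    canonical, ids = [], []
    for raw in raw_names:
        name = clean_name(raw)
        for idx, canon in enumerate(canonical):
            if jaro_winkler(name, canon) >= threshold:
                ids.append(idx)
                break
        else:
            canonical.append(name)
            ids.append(len(canonical) - 1)
    return ids, canonical


ids, canonical = group_assignees(
    ["General Electric Co.", "GENERAL ELECTRIC CO", "General Electrc Co", "Intel Corp"]
)
```

A greedy pass like this depends on input order and compares every name against every canonical name, so at the scale quoted in the slides some form of blocking would be needed first.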
  • 14. Future of assignee disambiguation •  Coordinate with NBER and HBS efforts – The field needs to curate and maintain cumulative progress •  CONAME data from USPTO •  Normalize common affixes •  Train with manually developed NBER disambiguation •  Apply inventor algorithm •  Provide Compustat identifier •  Add subsidiary information -  BvD sample of 6,000 major U.S. firms revealed 50,000 subsidiaries under parental control (>50% in 2012) -  GE: 250 subsidiaries, ~98% patents filed under GE
  • 15. Law firms •  Similar algorithms to assignees •  Not aware of any applications yet
  • 16. Locations •  Use Google’s geocoding API •  Unique cities from 333K to 66K •  City, region, country – Lat and Long being developed – Do not provide street level data
  • 17. If you’re allergic to SQL: http://rosencrantz.berkeley.edu
  • 18. Approximate results (full 2014 data in process) http://funglab.berkeley.edu/database
  • 19. Tools and applications •  Look for this stuff and high level explanations at: – http://www.funginstitute.berkeley.edu/blog-categories/faculty-directors-blog#
  • 20. Visualizations • Clean tech inventions mapped by type and source • Inventor mobility movies • Patent location in technology “space” • The convergence and divergence, the coalescence and reconfiguration of components – the flow of technology - over time • Visualizing the patent application process
  • 21. Clean Tech Patent Mapper •  Li, G., K. Paisner, “A List of Clean Tech Patents.” •  http://funglab.berkeley.edu/cleantechx/ •  Energy: wind, solar, bio, hydro, geo, nuclear •  Assignee: VC backed, university, government, large and small incumbents, no assignee
  • 22. VC patents 1990-1999 Innovation and Entrepreneurship in Clean Energy: Nanda, Younge, Fleming Note scale of funding activity 1990-1999
  • 23. VC patents 2000-2009 Innovation and Entrepreneurship in Clean Energy: Nanda, Younge, Fleming See Nanda, R. and K. Younge, L. Fleming. “Innovation and Entrepreneurship in Clean Energy,” Forthcoming at Rethinking Science and Innovation Policy, NBER. Much greater funding activity 2000-2009
  • 26. Mobility mapper: http://funglab.berkeley.edu/mobility/ • Larger states • Example: 1987 immigration to MI (note one IL inventor):
  • 27. Mobility maps, 1982 and 1987 • Illustrates causal impact of noncompetes on brain drain (Marx, Singh, Fleming, forthcoming RP)
  • 30. Acknowledgment of government support – Hillary Greene, Dennis Yao, Guan Cheng • What proportion of 2015 patents can be traced to govt?
  • 31. 5M patent applications as a Markov process? Starting with an analysis of Bilski vs. Kappos
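
One way to read "patent applications as a Markov process" is to treat each application's prosecution history as a sequence of states and estimate first-order transition probabilities from the observed sequences. The sketch below does only that counting step. The state names and histories are hypothetical placeholders, not PAIR event codes, and nothing here reflects the lab's actual model.

```python
from collections import Counter, defaultdict


def transition_probabilities(histories):
    """Estimate P(next state | current state) from ordered state sequences."""
    counts = defaultdict(Counter)
    for history in histories:
        for current, nxt in zip(history, history[1:]):
            counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for current, c in counts.items()
    }


# Hypothetical prosecution histories, one list per application.
histories = [
    ["filed", "rejected", "allowed"],
    ["filed", "rejected", "abandoned"],
    ["filed", "allowed"],
]
P = transition_probabilities(histories)
```

Comparing matrices estimated on applications examined before and after a decision such as Bilski v. Kappos would be one simple way to look for a shift in rejection behavior, assuming the states are coded consistently across the two periods.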
  • 34. Method to illustrate network around seed inventors
  • 35. Cool pics – but what do they mean? – Need to validate visualizations with ground truth – Mixed visualization and historical study of biggest semiconductor breakthrough of last decade – the FinFET
  • 36. Why FinFET? •  Study intended to explore/develop breakthrough visualization tools – tie to reality w/o conflating variables • All patents Northern CA 1995-2000 • Ranked by future citations • Tech distance – from our brains, close but moldy •  Geographic distance – about 40 yards •  Social distance – head of search committee that hired me – neighbor
  • 39. But they also integrated outsiders
  • 40. The flow of technology 1)  Words are components -> little differentiation, this is so incremental 2)  No geographic localization of trajectories 3)  How did university plop in and do this? 4)  FinFET may have been only govt supported patent
  • 41. Coming attractions • Blocking actions – better than citations as a measure of patent impact? • Lexical novelty – First appearance of new word in corpus – First pair-wise combination of words • Lexical distance between classes
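
The two lexical novelty measures listed above can be sketched as a single chronological pass over the corpus. This is a minimal illustration under assumptions: documents arrive sorted by date, tokenization is a plain lowercase letter match, and the document ids and texts are invented.

```python
import re
from itertools import combinations


def lexical_novelty(docs):
    """docs: iterable of (doc_id, text) in chronological order.

    Returns, per document, the words seen for the first time in the corpus
    and the number of word pairs seen together for the first time.
    """
    seen_words, seen_pairs = set(), set()
    result = {}
    for doc_id, text in docs:
        words = sorted(set(re.findall(r"[a-z]+", text.lower())))
        pairs = set(combinations(words, 2))  # sorted input gives canonical pairs
        result[doc_id] = {
            "new_words": [w for w in words if w not in seen_words],
            "new_pairs": len(pairs - seen_pairs),
        }
        seen_words.update(words)
        seen_pairs |= pairs
    return result


# Hypothetical three-document corpus, oldest first.
docs = [
    ("p1", "fin field effect transistor"),
    ("p2", "field effect transistor gate"),
    ("p3", "fin gate"),
]
novelty = lexical_novelty(docs)
```

Here `p3` introduces no new word but does introduce a new pairing of two existing words, which is the distinction between the two measures. Storing every pair grows quadratically with document length, so a real corpus would likely need stop-word filtering or a restricted vocabulary.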
  • 42. Identification of blocking patents – pdf challenges: OCR 101,195 PDF files…
  • 43. Claim Rejections – 35 USC 103 •  Sample scanned text: "3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness" •  Steps shown: Detail Enhancement, Noise Reduction, OCR
  • 45. First results from 2012 • 2011 now complete as well • Need to characterize each type of action
  • 46. I may come to you tin cup in hand… •  Download, parse, clean, disambiguate, store and serve up > 300M data (and weekly updates) – Julia Lane taking over part of this •  Blocking data: must OCR ~400M documents •  Disambiguation takes weeks, PAIR years – ~$150K hardware alone past year – database person in Si Valley (~$140K + Cal tax) •  Mention maintenance in NSF proposal => ding •  Public good (~50,000 downloads) •  Talking with firms and private philanthropy