Intelligent Trademark Analysis: Experiments in Large-Scale Evaluation of Real-World Legal AI (ICAIL 2013)


Presented at the International Conference on Artificial Intelligence and Law, Rome, June 2013.


The intelligent trademark analysis system developed by Onomatics is a trademark information system based on an AI model of trademark similarity (likelihood of confusion). The basic technology can be used as a general trademark search engine as well as for more specific purposes ranging from trademark candidate ranking (TrademarkNow NameRank) to comprehensive risk analysis (TrademarkNow NameCheck) and IPR enforcement. This paper presents a brief overview of the system and the concrete applications available as of April 2013. The system has been evaluated on a large number of actual trademark opposition cases, over 30 000 from the USPTO TTAB and over 20 000 from the OHIM Opposition Division, currently with a precision of 79.9% and a recall of 94.9%.

    1. Intelligent Trademark Analysis: Experiments in Large-Scale Evaluation of Real-World Legal AI
       Anna Ronkainen, Chief Scientist, Onomatics, Inc.; miscellaneous, University of
    2. “These papers had a low rate of consideration of evaluation issues reflecting common practice in research biased development environments. These results confirm that more attention to evaluation is needed in the legal knowledge based systems domain.”
       Hall & Zeleznikow (ICAIL 2001): Acknowledging Insufficiency in the Evaluation of Legal Knowledge-based Systems: Strategies Towards a Broad-based Evaluation Model
       2013-06-12 Anna Ronkainen - @ronkaine, Intelligent Trademark Analysis
    3. “Relevancy ranking is currently a foreign concept in the trademark legal industry, as assigning numeric weight to potentially conflicting citations based on objective computer analysis is currently avant-garde.”
       Anderson & Cary (2010): Navigating the Challenges of U.S. Pharmaceutical Trademark Clearance Research
       “Yet trademark law is supposed to be subjective in the sense of being based on personal opinions—it is just that the relevant opinion is that of the average consumer [...]”
       Lisa Larrimore Ouellette (forthcoming): The Google Shortcut to Trademark Law
    5. TrademarkNow™ NameCheck™
       • a system for trademark risk analysis
       • relevancy-ranked trademark search
       • based on a model of trademark (mark and product) similarity, derived from (but not directly based on) MOSONG (Ronkainen 2010)
       • additional features for absolute grounds, word meanings etc.
       • current name since March 2013
       • originally launched as Onomatics Quick Search (2012) without risk analysis
       • web service, currently free preview but otherwise by subscription only
    6. TrademarkNow™ NameRank™
       • a tool for ranking 2–5 trademark candidates to find the least risky one
       • same basic technology as NameCheck™
       • not actively offered at the moment
    8. Evaluation: MOSONG
       First round: 30 most recent (2002) relevant cases:
       • 20 from the Opposition Division and
       • 10 from the Boards of Appeal
       Result*: all cases predicted correctly
       * when coded into the system by a domain expert
    9. Evaluation: MOSONG
       Second round: Non-expert validation:
       • done by non-law students taking a course in intellectual property law (n=75)
       • original validation set in two parts (15+15 cases)
       • at the beginning and the end of the course
       • completed non-interactively through a web form
       • correct answer: 54.6±6.5%
       • incorrect answer: 25.9±7.5%
       • no answer: 19.5±5.2% (± = σ)
    10. Evaluation: MOSONG
        Second round: Non-expert validation:
        % ± stderr        before      after       total
        group 1 (n=15)    41.3±1.7    65.8±2.8    53.5±1.7
        group 2 (n=12)    46.1±2.0    65.0±3.0    55.6±1.9
        group 3 (n=48)    43.3±1.3    65.9±1.3    54.7±0.9
        total (n=75)      43.4±1.0    65.8±1.1    54.6±0.8
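
The ± values in the table above are standard errors, while the preceding slide reports standard deviations (σ). A minimal sketch of how the two relate, assuming the standard error is the usual σ/√n over the n=75 students (the slides do not state the formula; the function name is illustrative):

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean for n independent observations."""
    return sigma / math.sqrt(n)


# Figures taken from the slides: sigma = 6.5 percentage points for the
# share of correct answers, n = 75 students.
se_total = standard_error(6.5, 75)
print(round(se_total, 1))  # 0.8
```

Under that assumption the result agrees with the 54.6±0.8 reported in the "total" cell.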
    11. Testing NameCheck™
        • system must work in real-world conditions ⇒ has to be robust
        • as wide a range of realistic test cases as possible is desirable
        • system is based on a predictive model for trademark opposition cases so such cases can be used (almost) directly for testing
        • specific service subcomponent for evaluating cases in bulk: text file input but otherwise same process
        • inputs: mark and plain-text product(s)
    12. Test cases
        • closed, unappealed first-instance opposition cases
        • only those where the prior right is found in the database:
          – same jurisdiction
          – still valid
          – word or combination mark
        • typically multiple prior rights per case
        • readily available as XML (but USPTO TTAB XML is write-only)
        • grand total: 55435 cases
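
The selection criteria above can be read as a simple filter over the prior rights cited in each case. The following is a sketch only: the `PriorRight` record, its field names, and the mark-type labels are hypothetical and not taken from the actual system.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PriorRight:
    """Hypothetical record for one prior right cited in an opposition."""
    jurisdiction: str   # e.g. "EU" or "US"
    is_valid: bool      # still in force
    mark_type: str      # e.g. "word", "combination", "figurative"
    in_database: bool   # found in the trademark database


# Assumed labels for the mark types the slide lists as usable.
ELIGIBLE_MARK_TYPES = {"word", "combination"}


def usable_prior_rights(case_jurisdiction: str,
                        prior_rights: Iterable[PriorRight]) -> List[PriorRight]:
    """Keep prior rights that meet all criteria listed on the slide."""
    return [
        p for p in prior_rights
        if p.in_database
        and p.jurisdiction == case_jurisdiction
        and p.is_valid
        and p.mark_type in ELIGIBLE_MARK_TYPES
    ]
```

A case would then be kept as a test case only if this list is non-empty.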
    13. Test cases: EU
        • source: OHIM (EU TM office) Opposition Division
        • for marks filed in 1996–2011
        • opposition considered successful:
          – opposition upheld (1563 cases)
          – mark withdrawn (10516 cases)
        • opposition unsuccessful:
          – opposition rejected (4217 cases)
          – opposition withdrawn (5368 cases)
    14. Test cases: US
        • source: USPTO Trademark Trial and Appeal Board (TTAB)
        • for marks filed in 1980, 1985, 1990, 1995–2011
        • opposition successful:
          – opposition upheld (18431 cases)
          – mark withdrawn (7329 cases)
        • opposition unsuccessful:
          – opposition dismissed (8897 cases)
          – opposition withdrawn (8603 cases)
        • categories and outcomes determined heuristically so some overlap (duplicates removed for the grand totals)
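
The EU and US slides both reduce each case outcome to a binary label. A minimal sketch of that mapping, using the category names from the slides as string keys (the string representation and the function name are assumptions; the heuristics that assign a case to a category are not described in the slides and are not reproduced here):

```python
# Outcome categories as named on the EU and US slides.
SUCCESSFUL = {"opposition upheld", "mark withdrawn"}
UNSUCCESSFUL = {
    "opposition rejected",    # EU wording
    "opposition dismissed",   # US wording
    "opposition withdrawn",
}


def opposition_successful(outcome: str) -> bool:
    """Map an outcome category to the binary label used for evaluation."""
    key = outcome.strip().lower()
    if key in SUCCESSFUL:
        return True
    if key in UNSUCCESSFUL:
        return False
    raise ValueError(f"unknown outcome category: {outcome!r}")
```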
    15. Performance criteria
        • result considered correct based on the highest similarity score for all prior rights iff
          – opposition successful for at least some of the goods and services and score ≥ 0.5
          – opposition unsuccessful and score < 0.5
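
The criterion above amounts to thresholding the highest similarity score among a case's prior rights at 0.5 and comparing the result with the actual outcome. A minimal sketch, with illustrative function and parameter names:

```python
from typing import Iterable

THRESHOLD = 0.5  # cut-off stated on the slide


def prediction_is_correct(scores: Iterable[float],
                          opposition_successful: bool,
                          threshold: float = THRESHOLD) -> bool:
    """Apply the slide's criterion to one test case.

    `scores` holds the similarity score for each prior right in the case;
    at least one score is expected, since cases without a usable prior
    right are not part of the test set.
    """
    highest = max(scores)
    predicted_conflict = highest >= threshold
    return predicted_conflict == opposition_successful
```

For example, a case with scores 0.2 and 0.7 counts as correct if the opposition succeeded for at least some goods and services, and as incorrect otherwise.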
    16. Overall performance
        • EU (21162 cases): precision 79.2%, recall 97.3%, F 87.3%
        • US (33773 cases): precision 80.4%, recall 93.3%, F 86.4%
        • total (55435 cases): precision 79.9%, recall 94.9%, F 86.7%
        • low-risk cases hurt precision: erring on the side of safety
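
The slide gives "F" without a definition. Assuming it is the balanced F-score (the harmonic mean of precision and recall), it can be recomputed from the reported precision and recall. Because those inputs are rounded to one decimal place, the recomputed value may differ from the slide in the last digit.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, F as reported on the slide)
reported = {
    "EU": (0.792, 0.973, 0.873),
    "US": (0.804, 0.933, 0.864),
    "total": (0.799, 0.949, 0.867),
}

for name, (p, r, f_reported) in reported.items():
    print(f"{name}: recomputed F = {f_score(p, r):.4f}, slide F = {f_reported:.3f}")
```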
    17. High-risk performance
        Case type                   n        listed in top 10   listed in top 100   correct decision
        EU: opposition upheld       1473     544 (36.9%)        921 (62.5%)         1425 (96.7%)
        EU: opposition withdrawn    9515     4836 (50.8%)       6865 (72.1%)        9261 (97.3%)
        US: opposition upheld       18409    5618 (30.5%)       9955 (54.1%)        17138 (93.1%)
        US: opposition withdrawn    7278     2367 (32.5%)       4129 (56.7%)        6776 (93.1%)
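
The "listed in top 10" and "listed in top 100" columns are shares of cases per category. A minimal sketch of such a top-k share, assuming each case is represented by the 1-based rank of its prior right in the search results, or `None` if it was not listed at all (this representation is an assumption, not the system's actual data format):

```python
from typing import Optional, Sequence


def top_k_share(ranks: Sequence[Optional[int]], k: int) -> float:
    """Fraction of cases whose prior right is ranked within the top k."""
    hits = sum(1 for rank in ranks if rank is not None and rank <= k)
    return hits / len(ranks)


# Toy example: four cases ranked 1st, 5th, 11th, and not listed.
print(top_k_share([1, 5, 11, None], k=10))  # 0.5

# The percentages in the table follow from the counts, e.g. for
# EU "opposition upheld": 544 of 1473 cases listed in the top 10.
print(f"{544 / 1473:.1%}")  # 36.9%
```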
    18. What is this good for (absolutely nothing?)
        • necessary but not sufficient
        • inputs quite noisy: many cases decided on factors other than likelihood of confusion: reviewing all of them and removing the noise the only way to >99% (not going to happen)
        • mainly useful for regression testing (but too slow for that, we mostly use two 1000-case samples)
        • only measures one particular aspect of system performance (and not the most relevant one)
        • additional specific test case sets needed for various elements in similarity ordering: developed (and under development) from scratch, typically based on actual marks
        • to be continued...
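
The slide mentions two 1000-case samples for regression testing but not how they are drawn. One common approach, shown here purely as an assumption, is to draw each sample once with a fixed random seed so that successive test runs evaluate the same cases:

```python
import random
from typing import Iterable, List


def fixed_sample(case_ids: Iterable[int], size: int, seed: int) -> List[int]:
    """Draw a reproducible random sample of case identifiers."""
    rng = random.Random(seed)  # local generator; global state is untouched
    return sorted(rng.sample(list(case_ids), size))


# Hypothetical identifiers 0..55434 standing in for the 55435 test cases.
all_cases = range(55435)
sample_a = fixed_sample(all_cases, 1000, seed=1)
sample_b = fixed_sample(all_cases, 1000, seed=2)
```

Keeping the samples fixed means a change in the measured figures reflects a change in the system rather than in the selection of cases.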
    19. Thank you!
        Further reading:
        • @ronkaine on Twitter
        • (my research blog)
        • Ronkainen, Anna (2010): MOSONG, a Fuzzy Logic Model of Trade Mark Similarity
        • Ronkainen, Anna (2013): Redefining Trademark Clearance with Intelligent Legal Technology. IPRinfo 1/2013.
        • Ronkainen, Anna (forthcoming): Scaling Intelligent Trademark Analysis from Prototype to Production: From MOSONG to Onomatics Quick Search. Presented at the 1st International Workshop on Artificial Intelligence and Intellectual Property.