



Published in: Business, Education


  1. Derek Sturdy, Tikit Granite & Comfrey: Non-legal content integration – issues, methods, benefits
  2. Our founders: Sir William Granite (1738–1813) and Rev. Dr. Nicholas Comfrey (1742–1818)
  3. Their first employee: Miss Emma Hardfarthing, c. 1801
  4. KM in perspective
       [Diagram labels: matter documents; know-how; external resources (including primary law, government, online commentary, non-legal content, "trade" sites, CDs, etc); internal; marketing, project documents]
  5. Outline
       • Who needs to link to non-legal content?
       • Linking via taxonomies
           • implications for internal taxonomies
           • taxonomy to taxonomy
           • taxonomy to full text
       • Linking by straight search
  6. Who are the users?
       • Not primarily researchers, because they know how to set about it anyway
       • Non-legal content linking is mainly for
           • lawyers at their desks
           • marketing people
           • services staff, eg IT, secretaries
  7. What do our users have in common?
       • They want a complex issue made simple, which is impossible
           • all that silly stuff about "integration" and "just give me a simple box", which results in 75,000 hits or nothing
       • They will gratefully accept handsomely presented guidance
  8. What's wrong with Google?
       • Nothing at all, except that
           • your users do not know what is verified and what is rubbish
           • even the "advanced" search is just one of those oh-so-nineties field things
           • 50 pages * 20 hits at legal costs = ruin
           • basically, far too much information because of all the junk on the web
  9. What does this actually mean?
       • That all successful attempts to integrate valuable content are trying their own methods of getting round the structured – unstructured issue
       • Is there a one-size-fits-all answer? No, there isn't. Let's look at that ...
  10. The internal-only answer
       • Relational databases (ie organised metadata)
           • handle precision and recall
           • handle the updating issues
           • handle lateral linking
           • but sadly ....
       • Outside your control is all the other external stuff, which is still unstructured "content" – ie straight text – low value, but lots of it!
       • This is a temporary phase, but it will see most of us out .....
  11. Ways to approach this
       • Autonomy – designed for science, brilliant at science, rubbish for law
       • Metadata – which means the taxonomy stuff in terms of added value – designed for soft subjects like law and social science
       • Hybrid systems – like xrefer – which use ingenious software to try and cut down the costs of the metadata approach
  12. Why not purely automatic software?
       • Because of the tiny legal vocabulary – 5000 terms, instead of 250,000 – with meaning dependent on context
       • Because of the citation problem – not to be discussed in detail today
       • In essence: automatic software needs one word to have one meaning, which is true in science (normally) and often not true of law (except at the highest level)
  13. What must a taxonomy deliver?
       • Real help in finding things
       • Therefore – no ambiguity!
       • Comfort for users of collections
           • have I got everything relevant? – comprehensiveness
           • have I avoided irrelevance? – accuracy
           • can I easily find similar stuff? – lateral linking
       • Is it still true?
           • if the firm knows anything about anything on which practice is based, do I know it too?
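The "comprehensiveness" and "accuracy" questions on this slide correspond to what information retrieval calls recall and precision. A minimal sketch of how the two could be measured for a single search follows; the document IDs and the function name are invented for illustration and do not come from the slides.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for one search.

    recall    = share of the relevant documents that were found
    precision = share of the found documents that are relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    recall = len(hits) / len(relevant) if relevant else 1.0
    precision = len(hits) / len(retrieved) if retrieved else 1.0
    return recall, precision


# Hypothetical example: 4 documents are truly relevant, the search returns 5.
relevant_docs = {"kh-001", "kh-002", "kh-003", "kh-004"}
search_results = ["kh-001", "kh-002", "kh-003", "misc-17", "misc-42"]

recall, precision = recall_and_precision(search_results, relevant_docs)
print(f"comprehensiveness (recall): {recall:.2f}")
print(f"accuracy (precision): {precision:.2f}")
```

A "simple box" that returns 75,000 hits would score well on the first number and very badly on the second, which is the trade-off the slide is pointing at.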
  14. Components: taxonomies
       • Thesauri
           • legal subject, legal work type
           • geog./jurisdiction, industry/sector, assets
       • Authority files built up for
           • cases
           • legislation
           • own know-how documents
           • grey paper
  15. The Three C’s
       • Classification – subject matter
       • Categorisation – types of work
       • Citation – reference to other documents, but especially to legal authorities
           • Cases
           • Legislation
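One way to picture the Three C's is as three separate fields on a know-how record, with citations checked against an authority file. This is a sketch under assumptions: the class, the field names, and the ID scheme are hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class KnowHowRecord:
    """Metadata for one know-how document (hypothetical structure)."""
    title: str
    subjects: set = field(default_factory=set)    # Classification: subject matter
    work_types: set = field(default_factory=set)  # Categorisation: type of work
    citations: set = field(default_factory=set)   # Citation: authority-file IDs


# Hypothetical authority file: one canonical ID per case or statute,
# so every document refers to the same authority in the same way.
AUTHORITY_FILE = {
    "case:donoghue-v-stevenson-1932": "Donoghue v Stevenson [1932] AC 562",
    "leg:data-protection-act-1998": "Data Protection Act 1998",
}


def add_citation(record, authority_id):
    """Attach a citation only if it exists in the authority file."""
    if authority_id not in AUTHORITY_FILE:
        raise KeyError(f"unknown authority: {authority_id}")
    record.citations.add(authority_id)


note = KnowHowRecord(
    title="Practice note on product liability",
    subjects={"tort", "negligence"},
    work_types={"advice"},
)
add_citation(note, "case:donoghue-v-stevenson-1932")
print(sorted(note.citations))
```

Keeping the three fields apart is a small instance of the segmented approach: each field draws on its own short, unambiguous list.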
  16. Where might these be applied?
       [Diagram labels: The Firm – document management, practice management, know-how management; External Resources – general www resources, paid-for online resources, primary law resources]
  17. Example: Search Engine Applications
       [Diagram labels: general www resources; paid-for online resources; primary law resources; document management; practice management; know-how management]
  18. Classification – subject thesauri
       [Diagram labels: general www resources; primary law resources; document management; practice management; know-how management; paid-for online resources]
  19. Categorisation – type of work
       [Diagram labels: general www resources; primary law resources; paid-for online resources; practice management; document management; know-how management]
  20. Authority Files – exact Citations
       [Diagram labels: general www resources; paid-for online resources; primary law resources; document management; practice management; know-how management]
  21. Metadata density and classification workloads
       [Diagram labels: Matter documents; Know How; Matter metadata; KH m’data; Relatively simple, high volume; Millions of documents; Specialist, complex, low volume; External Sources; Specialist know-how]
  22. Conclusions so far
       • Only certain materials within the firm – know-how – will have detailed classification, categorisation and citation work done on them
       • Most other materials in the firm will be classified at a high level only, or be classified by inheritance (eg documents within a matter file)
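Classification by inheritance can be sketched in a few lines: the matter file is classified once, and each document picks up that metadata unless it carries its own. The field names and sample values below are hypothetical.

```python
def effective_metadata(document, matter):
    """Combine a document's own metadata with what it inherits from its matter.

    Values set on the document itself win; anything missing is taken
    from the matter file the document belongs to.
    """
    return {**matter, **document}


# Hypothetical matter file, classified once at a high level.
matter = {"subject": "commercial property", "work_type": "lease negotiation"}

# Hypothetical documents within that matter; most carry no classification of their own.
letter = {"title": "Letter to landlord's solicitors"}
advice = {"title": "Advice on break clause", "subject": "landlord and tenant"}

print(effective_metadata(letter, matter))
print(effective_metadata(advice, matter))
```

The letter costs nothing to classify, while the advice note, as know-how, gets the more specific subject term by hand.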
  23. Direct taxonomy-taxonomy linking
       • For the users – seriously cool and dead easy
       • For IS staff – match terms not by their letters and spaces but by a one-off human reconciliation of meanings and context – ie some work
       • Illustration: PLC
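The "one-off human reconciliation" amounts to a hand-built mapping table between term IDs in the two taxonomies. The sketch below assumes both taxonomies expose stable term IDs; every ID and label in it is invented for illustration and none comes from PLC or any other provider.

```python
# Hypothetical internal and external taxonomies, keyed by term ID.
INTERNAL_TERMS = {"int-12": "Employment: dismissal", "int-40": "Property: leases"}
EXTERNAL_TERMS = {"ext-a7": "Termination of employment", "ext-c3": "Commercial leases"}

# The one-off human reconciliation: a person decided these terms mean the
# same thing in context. The labels differ, so matching on letters and
# spaces alone would not have paired them.
RECONCILED = {"int-12": ["ext-a7"], "int-40": ["ext-c3"]}


def external_links(internal_id):
    """Return the external labels reconciled with an internal term."""
    return [EXTERNAL_TERMS[ext_id] for ext_id in RECONCILED.get(internal_id, [])]


print(INTERNAL_TERMS["int-12"], "->", external_links("int-12"))
```

Once the table exists, the user's click on an internal term is a plain lookup, which is why it feels "dead easy" from their side.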
  24. Hybrid methodologies
       • Use the taxonomy to guide the inexperienced user's thoughts to the topic concerned
       • Drill-down and drill-up techniques are both useful
           • drill down: start with the general, go to the particular
           • drill up: choose a particular term, see if it exists, see the context and alternative terms
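Drill-down and drill-up can both be sketched over one small hierarchy. The taxonomy fragment, the alternative-term table, and the function names are assumptions made for this example.

```python
# Hypothetical taxonomy fragment: each term maps to its narrower terms.
NARROWER = {
    "Property": ["Leases", "Mortgages"],
    "Leases": ["Break clauses", "Rent review"],
    "Mortgages": [],
    "Break clauses": [],
    "Rent review": [],
}

# Hypothetical alternative (non-preferred) terms pointing at preferred ones.
ALTERNATIVES = {"Tenancies": "Leases"}


def drill_down(term):
    """General to particular: list the narrower terms under a term."""
    return NARROWER.get(term, [])


def drill_up(term):
    """Particular to general: check a term exists and show its context.

    Returns None if the term is unknown, otherwise the preferred term,
    its broader term (None at the top level) and its sibling terms.
    """
    preferred = ALTERNATIVES.get(term, term)
    if preferred not in NARROWER:
        return None
    for broader, children in NARROWER.items():
        if preferred in children:
            siblings = [c for c in children if c != preferred]
            return {"term": preferred, "broader": broader, "siblings": siblings}
    return {"term": preferred, "broader": None, "siblings": []}


print(drill_down("Property"))
print(drill_up("Tenancies"))
```

Drill-up is the more forgiving route for the inexperienced user: typing a plausible word either lands on the preferred term with its context or fails cleanly.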
  25. Transfer the idea
       • You then use this approach, developed for your own internal resources – intranets, knowledge systems, DMS – to link out to external resources
       • Illustration using xrefer here ....
  26. Implications for your taxonomies
       • Ambiguity remains the big enemy!
       • Other enemies:
           • "gosh aren't I clever" terms
           • jobs for the boys/girls – which usually result in loss of jobs for the ......
       • Pointless complexity is the source of most ambiguity – simplify!
           • segmented taxonomies are the neat way to simplify
  27. Ambiguity – continued
       • If your taxonomies are not simple enough to avoid ambiguity, you should not be meddling with the idea at all
       • Complex taxonomies are for academics with time and a clear need for lateral thinking to the nth degree
       • In legal and governmental practice, your users (as defined) may have the brain, but not the time, or may not have the brain
  28. Ambiguity – continued
       • Most ambiguity comes from a failure to grasp the point behind the metadata approach, which is "make it easy to find"
       • Classification and categorisation are simply tools, not ends in themselves
       • Ambiguity is what search engines do!
  29. Direct search
       • The user sees a word or phrase ...
           • and does not understand it
           • and wants to know more about it
       • In the ideal world, she highlights it, clicks it, and gets seven organised results
       • In the real world, this does not happen ... but
  30. Reference linking, concept blow-ups
       • The jury is out on this at present
       • If you do not know your topic, then you can be misled very easily
       • If you have a smattering of knowledge, you can probably navigate successfully
       • A little knowledge is much less dangerous than none, despite the proverb to the opposite effect!
  31. Where does this leave us?
       • Correct choice of provider
       • This remains the only way at present to handle the "integration" and "unstructured data" problem
       • pure software doesn't do it – for us
       • "integrators" are fine for bulletins, but often useless for research and briefings
       • therefore the human element has to be introduced at some point or another
  32. Where is the point of human input?
       • the metadata approach: after content has been published, the key content is indexed and abstracted
       • the source selection approach: from certain sources, the content is of sufficient quality that it does not need weeding – it's already abstracted, in other words
       • two sides of the same coin?
  33. Conclusions
       • If you develop a single, large, unsegmented taxonomy, you will be stuck with search-engine approaches to external non-legal content
       • If you think beyond legal, to office (ie admin), industries/sectors (ie marketing), and so on, you can develop hybrid approaches
       • These will be more powerful than just search engines, though you need those too for esoterica
       • The key to this remains: choose your external content providers – give them the problem!
  35. How it's done
       [Diagram labels: classification by subject and work type; identification of legal references; authority files: subject, work type, cases, legislation; Doc 1 to Doc 5; "link tables": piece of text to identified authority]
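A "link table" row, as described on this slide, ties a piece of text in a document to an identified authority. The sketch below is one possible reading of that idea. It uses simple regular expressions to spot references, which is an assumption for illustration only; the slides do not say how references are identified, and the IDs and patterns are hypothetical.

```python
import re

# Hypothetical authority file: canonical ID -> pattern that recognises
# references to that authority in running text.
AUTHORITIES = {
    "leg:data-protection-act-1998": re.compile(r"Data Protection Act 1998"),
    "case:donoghue-v-stevenson": re.compile(r"Donoghue v\.? Stevenson"),
}


def build_link_table(doc_id, text):
    """Return link-table rows: (doc_id, start, end, authority_id)."""
    rows = []
    for authority_id, pattern in AUTHORITIES.items():
        for match in pattern.finditer(text):
            rows.append((doc_id, match.start(), match.end(), authority_id))
    # Order rows by where the reference appears in the text.
    return sorted(rows, key=lambda row: row[1])


text = "Following Donoghue v Stevenson, see also the Data Protection Act 1998."
for row in build_link_table("doc-1", text):
    print(row, "->", text[row[1]:row[2]])
```

Because each row stores the authority's ID and not its wording, every document citing the same case links to the same entry, however the citation is phrased.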