
IAs, Language and Lego -- an introduction to Semantic Analysis

This presentation will introduce Semantic Analysis – a way in which content can be analysed and classified through its linguistic basis, rather than through its overt meaning. It will achieve this by using Lego as a metaphor for language and demonstrating that by examining the building blocks of language a deeper understanding of content can be gained.

Published in: Technology, Business
  • Transcript of "IAs, Language and Lego -- an introduction to Semantic Analysis"

    1. 1. IAs, Language and Lego™ – an Introduction to Semantic Analysis Matthew Hodgson Regional-lead, Web and Information Management, Canberra Australia 12 April 2008
    2. 4. IA Tools for understanding content
    3. 5. Content analysis…
    4. 6. Information: we all think differently …
        • We all:
            • Think about information in different ways
            • Write about information in different ways
    5. 7. … we all even write differently …
    6. 8. Jeffrey Veen on analysing content
        • “a mind-numbingly detailed odyssey through your web site...
        • … this process…is a relatively straightforward process of clicking through your web site and recording what you find.”
        • Source: http://www.adaptivepath.com/ideas/essays/archives/000040.php
    7. 9. When analysing content …
    8. 10. An extract of medical restrictions text
    9. 11. What is this content?!
        • Medical restrictions text
            • Free-text built in Word and hand-crafted (*grrr*)
            • Unclassified
            • Varied consistency within and between texts
            • Highly complex sentence structures in pseudo-legalese
            • Style reflects the author rather than the meaning in the communication
        • Content needed for re-use
            • Content output was needed for re-use by others
            • Multiple audiences
            • Multiple purposes for re-use
        • Codification
            • Codification by 3rd parties (after authoring) takes too long
            • Need to reduce timeframes!
    10. 12. The task . . .analyse and codify
    11. 13. What tools would be appropriate?
        • ?
    12. 14. Linguistics … a whole discipline devoted to the study of language…
        • preposition, verb, adjective, noun, determiner, subject, object, conjunction, semantics, sentence structure
        • all language has structure
    13. 15. Language is like Lego™
        • Building blocks
            • Subject (S)
            • Verb (V)
            • Object (O)
        • Order of blocks
            • Differs depending on the language
    14. 16. Language is like Lego™
        • SVO languages: English, French, Chinese, Bulgarian, Swahili
        • SOV: Japanese, Turkish, Korean
        • VSO: Classical Arabic, Celtic and Hawaiian
        • VOS: Fijian, Yoda’s amusing phrases
    15. 17. Lego bricks: subjects, verbs and objects
        • Sometimes, though, the SVO structure is hidden:
            • “The Lego is red” or
            • “Those Lego bricks are [some] red Lego bricks”?
        • Uncovering the hidden structure helps to differentiate between the subject and the object and identify the who and what
    16. 18. Uncovering hidden meaning
        • If the LEGO trademark is used at all, it should always be used as an adjective, not as a noun.
        • For example, say "MODELS BUILT OF LEGO BRICKS".
        • Never say "MODELS BUILT OF LEGOs".
        • Source: http://everything2.com/title/legOS
    17. 19. Lego trees…
    18. 20. Semantic analysis
        • Medical restrictions wording:
            • Restricted benefit Gastro-oesophageal reflux disease; Scleroderma oesophagus;
            • Authority required Peptic ulcer
    19. 21. Semantic analysis (cont.)
        • Actual sentence
            • Peptic ulcer
        • Implied sentence
            • The prescription of medicine is restricted to the initial treatment of patients with peptic ulcer
    20. 22. Semantic structure of ‘peptic ulcer’
    21. 23. Semantic model for restrictions text
    22. 24. Semantics describing “Who Treated”
    23. 25. Semantics describing “Authority Action”
    24. 26. High-level semantic overview
    25. 27. Yes, it can be codified!
        • Medical restrictions:
            • Did have structure
            • Did have underlying logic
            • Were based on repeatable business processes
            • Could be codified
        • Could we make a ‘system’ to reinforce the structure at the point of authoring?
    26. 28. Demo
        • Putting it together in a system:
            • Supporting building of content restrictions in a codified way
            • Prototyping with Axure
    27. 37. The semantic analysis advantage vs
        • Identifies:
            • Themes in content
        • Identifies:
            • Themes in content
            • Work processes
            • Folk taxonomies used
            • ‘Things’ written about
    28. 38. What else could you use it for?
        • When you need to understand:
            • Business processes that create content
        • When you want to disassemble content for:
            • FAQs
            • A-Z indexes
            • Help files
    29. 39. How can I add this to my toolbox??!
        • Theory is important
            • An understanding of semantics: sentence trees and grammar
            • Textbooks by authors like Fromkin and Rodman can help through the tricky bits
        • Need good tools
            • Connexor: http://www.connexor.eu/technology/machinese/demo/
            • Big sheets of paper (and an electronic whiteboard)
            • Visio (not PowerPoint!)
    30. 40. Demo
        • Connexor: http://www.connexor.eu/technology/machinese/demo/
    31. 41. Connexor
    32. 42. Connexor – machine tagger
    33. 43. Connexor – machine syntax
    34. 44. Why should I care about this?
        • Google uses semantic analysis to index content
        • Translation software uses semantic analysis to identify ‘components’ for translation
        • Good sentence structure equals:
            • Accurate indexing
            • Higher rank relevance of content
            • Happy people (they find what they’re looking for)
    35. 45. Why should I care about this?
    36. 46. ‘ Calais’ by Reuters
    37. 47. Summing up
        • Content is still king!
        • But how can you tell if your content:
            • Is of good quality?
            • Matches your website’s categories?
            • Accurately reflects your metadata?
            • Can be found by people?
        • Semantic analysis can:
            • Make your content audits more objective
            • Inform processes to improve the quality of the content
            • Inform processes to improve search engine indexing
            • Inform metadata creation
            • Inform choice of taxonomy
    38. 48. Take-home message
        • Semantic analysis can help IAs:
            • Infer
                • How people think about, and structure, their information
            • Describe
                • Business processes that produce content
            • Identify
                • Where content quality is poor so it can be improved
                • Critical components of the sentence for codification
            • Design
                • Taxonomies and describe folk taxonomies
            • Build
                • Systems to help bring some structure to content authoring
    39. 49. Fin
    40. 50. IAs, Language and Lego™ an Introduction to Semantic Analysis
    41. 51. by Matthew Hodgson Regional-lead, Web and Information Management SMS Management & Technology Canberra Australia
    42. 52. by Matthew Hodgson Email [email_address] Blog magia3e.wordpress.com Slideshare www.slideshare.net/magia3e Twitter magia3e
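
The codification idea from slides 20, 21 and 27 can be illustrated with a small sketch. This is not the system shown in the talk (the demo slides are not in the transcript); it is a hypothetical Python example that assumes the two category labels seen on slide 20 are the only ones, and that conditions are separated by semicolons. The names `Restriction`, `parse_restrictions` and `implied_sentence` are invented for illustration, and the sentence template is only modelled on slide 21 (it leaves out "initial", which cannot be recovered from the terse wording).

```python
from dataclasses import dataclass

# Category labels taken from the slide 20 example; a real list would be longer (assumption).
CATEGORIES = ("Restricted benefit", "Authority required")


@dataclass(frozen=True)
class Restriction:
    category: str   # e.g. "Authority required"
    condition: str  # e.g. "Peptic ulcer"


def parse_restrictions(line: str) -> list[Restriction]:
    """Split one line of restrictions wording into codified records."""
    text = line.strip()
    for category in CATEGORIES:
        if text.startswith(category):
            rest = text[len(category):]
            # Conditions are assumed to be separated by semicolons.
            conditions = [c.strip() for c in rest.split(";") if c.strip()]
            return [Restriction(category, c) for c in conditions]
    raise ValueError(f"no known restriction category in: {line!r}")


def implied_sentence(restriction: Restriction) -> str:
    """Expand a terse condition into a full sentence (hypothetical template)."""
    condition = restriction.condition[0].lower() + restriction.condition[1:]
    return (
        "The prescription of medicine is restricted to the treatment "
        f"of patients with {condition}"
    )


if __name__ == "__main__":
    wording = [
        "Restricted benefit Gastro-oesophageal reflux disease; Scleroderma oesophagus;",
        "Authority required Peptic ulcer",
    ]
    for line in wording:
        for restriction in parse_restrictions(line):
            print(restriction.category, "->", implied_sentence(restriction))
```

Holding the category and the condition as separate fields is what makes the text reusable: the same record could feed an index, a help file or a structured authoring form, which is the kind of re-use the slides argue for.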