Semantic Analysis in IA

    Semantic Analysis in IA - Presentation Transcript

    1. Semantic Analysis in IA Matthew Hodgson ACT regional-lead, Web and Information Management 23 Sept 2007
    2.  
    3.  
    4. Jeffrey Veen on analysing content
      • “a mind-numbingly detailed odyssey through your web site...
      • …this process…is a relatively straightforward process of clicking through your web site and recording what you find.”
      Source: http://www.adaptivepath.com/ideas/essays/archives/000040.php
    5.  
    6. Content overview – first take
      • Medical restrictions text
        • Free-text built in Word and hand-crafted (*grrr*)
        • Unclassified
        • Varied consistency within and between texts
        • Highly complex sentence structures in pseudo-legalese
        • Style reflects the author rather than the meaning in the communication
      • Content needed for re-use
        • Content output was needed for re-use by others
        • Multiple audiences
        • Multiple purposes for re-use
      • Codification
        • Codification (after authoring) takes too long
        • Need to reduce timeframes!
    7. The task... analyse and codify
      • Linguistics
        • … a whole discipline devoted to the study of language
    8. “You’re joking!?”
      • All language has structure – even someone’s pseudo-legal English
      • Analysing language is actually easier than you might think
    9. The approach
      • Analyse semantics of content
      • There is a predictable structure
      • It’s all just Lego™ building blocks (nouns, verbs, adjectives, etc)
      • Implied meaning can be made overt
      • New tools for IAs to play with!
      • Understand semantics, the structure of sentences, and you can analyse, categorise and codify English!
    10. Language as Lego™
      • Building blocks
        • Subject (S)
        • Verb (V)
        • Object (O)
      • Order of blocks
        • Differs depending on the language
    11. Order from chaos
      • SVO languages
        • English, French, Chinese, Bulgarian, Swahili
      • SOV
        • Japanese, Turkish, Korean
      • VSO
        • Classical Arabic, Celtic and Hawaiian
      • VOS
        • Fijian, Yoda’s amusing phrases
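
The block orders on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `arrange` and the example words are assumptions, and the output is an English gloss of the block order, not a translation into any of the languages listed.

```python
# Illustrative sketch only: reorder Subject/Verb/Object "blocks" to show
# how the same three blocks line up under different word orders.
# The function name and the orders tuple are assumptions for this example.

ORDERS = ("SVO", "SOV", "VSO", "VOS")


def arrange(subject, verb, obj, order):
    """Return the three blocks joined in the requested order."""
    if order not in ORDERS:
        raise ValueError(f"unsupported order: {order}")
    blocks = {"S": subject, "V": verb, "O": obj}
    # Walk the letters of the order string and pick the matching block.
    return " ".join(blocks[letter] for letter in order)
```

Calling `arrange("the doctor", "treats", "the patient", "SOV")` gives "the doctor the patient treats", which shows the same blocks in a different sequence.
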
    12. Subjects, verbs and objects
      • Sometimes, though, the SVO structure is hidden:
      • The apple is red or
      • The apple is a red apple?
      • Uncovering the hidden structure helps to differentiate between the subject and the object and identify the who and what
    13. Sentences as (apple) trees
    14. Semantic analysis
      • Medical restrictions wording:
      • Restricted benefit Gastro-oesophageal reflux disease; Scleroderma oesophagus;
      • Authority required Peptic ulcer
    15. Semantic analysis (cont.)
      • Actual sentence
        • Peptic ulcer
      • Implied sentence
        • The prescription of medicine is restricted to the initial treatment of patients with peptic ulcer
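
A rough Python sketch of making the implied sentence overt follows. The two prefixes come from the restriction wording on the previous slide; the function names, the dictionary layout and the sentence template are assumptions. The template says "treatment" rather than "initial treatment" because nothing in the short wording itself supplies "initial".

```python
# Illustrative sketch only: split restriction wording into its parts and
# render an overt sentence. The prefixes come from the slide text; the
# template wording and all names here are assumptions.

PREFIXES = ("Restricted benefit", "Authority required")

TEMPLATE = (
    "The prescription of medicine is restricted to the treatment "
    "of patients with {condition}"
)


def parse_restriction(text):
    """Split e.g. 'Authority required Peptic ulcer' into type and conditions."""
    for prefix in PREFIXES:
        if text.startswith(prefix):
            rest = text[len(prefix):]
            # Conditions are separated by semicolons; drop empty fragments.
            conditions = [c.strip() for c in rest.split(";") if c.strip()]
            return {"type": prefix, "conditions": conditions}
    raise ValueError(f"unrecognised restriction wording: {text!r}")


def make_overt(restriction):
    """Render one explicit sentence per condition using the fixed template."""
    return [
        TEMPLATE.format(condition=c[0].lower() + c[1:])
        for c in restriction["conditions"]
    ]
```

For example, `make_overt(parse_restriction("Authority required Peptic ulcer"))` returns a one-item list containing the overt sentence for peptic ulcer.
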
    16.  
    17.  
    18. “Who Treated” semantic model
    19. “Authority Action” semantic model
    20. High-level semantic overview
    21. How did the ‘trees’ help?
      • Inferred
        • How people think about and structure content
      • Described
        • Business processes that produce content
      • Identified
        • Where content quality is poor so it can be improved
        • Critical components of the sentence for codification
      • Designed
        • Taxonomies, and described folk taxonomies
      • Built
        • Systems to help bring some structure to content authoring
    22. How can I do this stuff too?! (a side-step)
      • Theory is important
        • An understanding of semantics: sentence trees and grammar
        • Textbooks by authors like Fromkin and Rodman can help through the tricky bits
      • Need good tools
        • Connexor: www.conexor.fi/demo/syntax
        • Big sheets of paper (and an electronic whiteboard)
        • Visio (not PowerPoint!)
    23. Demo
      • Connexor
      • www.conexor.fi/demo/syntax
    24. Introducing ways to codify restrictions
      • How are we actually going to codify the stuff?!
        • Give people Lego™ or ‘fridge-magnets’ to build sentences
        • Build a prototype to explore and demonstrate conceptual design
      • Communicate
        • Talk about ideas with business owners
        • Explore possibilities with end-users
        • Build ‘no surprises’ into change management
      • Iterate
        • Iterate and refine concepts and design before anything is built
      • Inform
        • Developers of intent and requirements
        • The building of a ‘tool’ for codifying content (hooray for Axure!)
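
The ‘fridge-magnet’ idea can be sketched in Python as a builder that only accepts magnets from a controlled vocabulary. Everything here is hypothetical: the slot names loosely echo the “Who Treated” and “Authority Action” models, but the slots, the vocabulary and the output format are assumptions and not the scheme used in the actual tool.

```python
# Illustrative sketch only: a "fridge-magnet" sentence builder in which
# authors pick one magnet per slot from a controlled vocabulary.
# Slot names and vocabulary entries are assumptions, not the real scheme.

MAGNETS = {
    "authority": ["Restricted benefit", "Authority required"],
    "action": ["treatment of"],
    "who": ["patients with"],
    "condition": ["peptic ulcer", "scleroderma oesophagus"],
}
SLOT_ORDER = ("authority", "action", "who", "condition")


def build_sentence(choices):
    """Validate each chosen magnet against its slot, then join in slot order."""
    parts = []
    for slot in SLOT_ORDER:
        magnet = choices.get(slot)
        if magnet not in MAGNETS[slot]:
            raise ValueError(
                f"{magnet!r} is not an allowed magnet for slot {slot!r}"
            )
        parts.append(magnet)
    # The authority type leads, followed by the remaining blocks in order.
    return parts[0] + ": " + " ".join(parts[1:])
```

Because every slot is checked against its list, free-text wording is rejected at authoring time instead of being codified afterwards.
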
    25. Demo
      • Prototyping with Axure
    26.  
    27.  
    28.  
    29.  
    30.  
    31.  
    32.  
    33.  
    34. Why should I care about this?
      • Google uses semantic analysis to index content
      • Translation software uses semantic analysis to identify ‘components’ for translation
      • Good sentence structure equals:
        • Accurate indexing
        • Higher rank relevance of content
        • Happy people (they find what they’re looking for)
    35. Summing up
      • Content is still king, but:
      • Is its quality any good?
      • Does it match your website’s categories?
      • Is your metadata ok?
      • Can people find the content they need?
      • Do you need to understand your content better?
      • Semantic analysis can:
      • Make your content audits more objective
      • Inform processes to improve the quality of the content
      • Inform processes to improve search engine indexing
      • Inform metadata creation
      • Improve website navigation design
      • email: [email_address] web: www.smsmt.com
      • blog: magia3e.wordpress.com twitter: magia3e community: iacanberra.org
      • cartoons: © Gary Larson
      Please Sir, can I have some more…?
      • Fin

    English is a messy and chaotic language, with excep…

    CC Attribution-NonCommercial-ShareAlike License