Document versus content: getting quality information across the web
Kate Forbes-Pitt, 15th June 2006
Documents vs content
  • Why bother?
      • Users like documents
      • Why?
  • What has this got to do with quality?
      • Quality of understanding
      • Quality of dissemination
  • Information is in-forming: it is a change in a person from an encounter with data
  • Getting ideas across can be a tricky process. Human inventiveness draws on all resources – down to the look, the feel, and even the smell coming off a document.
Users and their documents
  • Users ask for documents – why?
  • This is how they get information and share it
  • They think documents are efficient
  • They know them and trust them
  • They think documents communicate information effectively
  • They have them already – it is less work
What is a document?
  • A piece of paper with writing on it
  • We can scan this and ‘pdf’ it
  • A way of capturing / imparting information
The letter
  • How do we know it’s a letter?
  • It is not a piece of paper with writing on it
  • It contains no ‘information’ at all
  • But: it contains the information that it is a letter
  • What can be said about this information?
      • It is not held as words
      • Our recognition of it is almost instant
      • We all agree on it
  • Social, implicit, learned knowledge about our world
What can be said about documents?
  • They contain information separate from their content
  • They ‘wrap’ the content and give information about it
  • They are layered:
      • Properties
      • Class
      • Text
  • This is what makes a document a document
Documents on the web
  • Is the document information successfully captured and reproduced?
  • No (arguably)
  • Why? Two principal reasons:
      1. The document’s ‘old’ social rules are masked by ‘new’ rules
      2. Social context is not ‘automatable’
1. Rules of interaction
  • Normative and pragmatic access
  • Normative access – rights and obligations surrounding a document
      • A letter: is it addressed to you? Are you obliged to reply to it?
  • Pragmatic access – governed by relevance and opportunity
      • A notice pinned to a notice board
      • Relevance is determined by those walking past
      • Opportunity – walking past and being literate or tall enough to see it
Computer access
  • Pragmatic access
      • Opportunity is not immediate
      • Level of literacy required is different
  • Normative access
      • Rights and obligations are different on the web
      • Communication of it is unnecessary
      • Social knowledge of the same kind is unnecessary
  • The implicit information that makes a document a document is superfluous
  • It retains it as a printed document
2. What can be automated
  • A document requires social knowledge in order to interpret it
  • The social knowledge required is not available within the document
  • i.e. to interpret the document one must have access to implicit rules not contained within the system
  • This is arguably impossible to ‘automate’
      • Dreyfus
      • Collins and Kusch
Dreyfus
  • Four levels of intelligence
  • Highest level – natural language translation
  • Original: First I was afraid, I was petrified / Kept thinking I could never live without you by my side / But I spent so many nights thinking how you did me wrong / I grew strong / I learned how to carry on / and so you’re back from outer space
  • After machine translation: First had I keep thinking fear I petrified I could without to apart from my side never live but I had spent thought so much nights how you made yourselves me wrongly I developed much I learned how and to continue in such a way you are from return of the special atmospheric area
  • Natural language translation requires social knowledge
  • Documents are the same
Collins and Kusch
  • Two types of action
  • Mimeomorphic
      • Repetitive – robotic
      • Spot welding a car
  • Polimorphic
      • Rule bound BUT
      • Impossible to write a recipe for
      • HSBC adverts
  • Concur with Dreyfus: where social rules are involved you cannot automate
 
2. What can be automated?
  • We know we cannot automate a social process
  • We know the document to be a social artefact
  • Therefore we know that the document will be impoverished by automating it
Where does that leave us?
  • The web as a new postal system?
  • Or as a way of getting content to users?
Web
  • The computer has its own ‘rules of engagement’
      • It destroys pragmatic access
      • It has its own set of expectations
      • It masks the document’s own rules
  • Text is never ‘naked’
      • It is always in context
      • Understand the context
  • At best, document rules are confused
      • Resulting in confused users
  • At worst, rules are lost
      • Resulting in lost information
Content
  • Content wins out
  • Content uses the rules of the web
  • Content is enriched by this environment
  • Documents are impoverished by it
Quality information
  • Users clear about purpose of text
  • Users able to interpret without confusion

Delivering Information: Document vs. Content
