Cloud of Knowing



How content analytics can be brought into research. The presentation was given as a webinar for the IE Business School, where John Griffiths is a visiting professor. It features examples of the use of Purefold transmedia as a research methodology and the use of demographic replicator research bots. Part of the Cloud of Knowing project.

Published in: Business
  • Online research is a horseless carriage: it looks a lot like offline research, and it hasn't yet taken on unique forms suited to the internet.
  • Ridley Scott Associates: directors came up with short-film stories, the concept being that these would be co-created with web content, funded by brand sponsorship, and would be a collaboration between commercial storytellers (the film industry), the content of the web, and brands, creating entertainment which allowed them to dialogue with their customers. A transmedia concept using the Creative Commons: with copyright-free content, anyone could write stories, make films or hyper-documents as part of the collaboration. Script ideas were put into a wiki, with RSS feeds used to collect story seeds and feed them into a FriendFeed forum where visitors would grade them, link to other content or bring in their own. This extended to storyline characters and set design. The shaped stories were then presented to brand sponsors, who could work branded entertainment storylines, characters and themes into them.
  • Data gathered by RSS feed: research participants go out tagging relevant content, so finding more relevant data, while other participants grade what comes through the FriendFeed/wiki. Analysis and grading are carried out by research professionals. How does this look as a hybrid research methodology? Sampling? Yes. Getting selection and explanation from respondents? Yes. This broadens the ORC (online research community) into something much richer, where participants play a much more active part, and the richness comes from the data flowing through, with less pressure on moderating and surveying the internal community. Recognisably a research hybrid, with a similar cost structure to ORCs.
  • Going deeper: demographic replicators are living samples who represent a quantitative sample, whose blogging and tweeting is gathered in real time, who respond to live questions, who have been aggregated to research sample criteria, but who may not even know that they are subjects of research. Demographic replicators also incorporate other forms of data: GPS, potentially spending data, media consumption. So they represent an aggregation of actual engagement and behavioural data: still fulfilling sampling criteria, but now a strange combination of machine and human; still recognisable as research, but becoming more abstract.
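  • The aggregation described above can be pictured as a single record that gathers several live streams for one sample definition. A minimal sketch, assuming invented names and stream types (nothing here comes from the actual project):

```python
from dataclasses import dataclass, field

# Hypothetical "demographic replicator" record: one object that
# aggregates posts, GPS fixes and spending data for people who
# match a sample definition. All names are illustrative.

@dataclass
class Replicator:
    sample_def: dict                                   # e.g. {"age": 20, "city": "london"}
    posts: list = field(default_factory=list)          # blog/tweet text, gathered in real time
    locations: list = field(default_factory=list)      # GPS fixes
    spend: list = field(default_factory=list)          # transaction amounts

    def ingest(self, stream: str, value):
        """Route an incoming data point to the right stream."""
        {"posts": self.posts,
         "locations": self.locations,
         "spend": self.spend}[stream].append(value)

rep = Replicator(sample_def={"age": 20, "city": "london"})
rep.ingest("posts", "tired of sales calls")
rep.ingest("spend", 12.5)
```

The point of the sketch is that the replicator still satisfies sampling criteria (the `sample_def`) while its content is machine-aggregated behaviour rather than answers to questions.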
  • Research needs to establish two things, who and what: who people are, and what they think, what they feel, how they behave. In quant, sampling questions come at the end of surveys, used for analysis but not for content. In qual, sampling questions are asked during recruitment, using a profile questionnaire. Sampling is not only distinct but kept separate. The rule is that you can't use the same piece of data both to identify the participant (who) and as evidence of their opinion (what); the strain of keeping them separate prevents research from using data whenever it is not possible to identify who produced the material. Secondly, sampling is literal: as far as possible, age, demographics and usage are to be verified, by the word of the research respondent.
  • This is analogous to the development of the computer, when instructions and data used to sit on two different media, as in the card-based knitting machine. Alan Turing's papers conceptualised the computer, and John von Neumann separated the storing of data from its processing; Turing's concept allowed instructions and data to share the same medium. Von Neumann introduced a bottleneck because processors cost so much more than memory, so programmes were stored and fed through a CPU. That bottleneck is still with us today. I suggest that the separate identification of the sample from the content is a similar bottleneck, one which is preventing research from going any further.
  • Sampling: first probabilistic, using a tagging system to mark content with either sample tags or thematic content tags. Cookies are already in use for identifying web users. Once we have overcome the bottleneck, there is no reason why we shouldn't identify content first and then use cookies/sampling tags to eliminate all the content which is off-sample. This starts to make available to us the riches of the web: real-time search engines and Google profiles.
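  • The "identify content first, then eliminate the off-sample" step can be sketched in a few lines. Everything here (the criteria, tag names and items) is invented for illustration; in practice the tags might be derived from cookies or profile data:

```python
# Content is gathered first; sample tags attached to each item are
# then used to eliminate whatever is off-sample. Hypothetical sketch.

SAMPLE_CRITERIA = {"gender": "M", "city": "london", "age_range": "20-29"}

def on_sample(sample_tags: dict) -> bool:
    """Keep an item only if every criterion matches its tags."""
    return all(sample_tags.get(k) == v for k, v in SAMPLE_CRITERIA.items())

items = [
    {"text": "hate my desk job",
     "tags": {"gender": "M", "city": "london", "age_range": "20-29"}},
    {"text": "lovely day out shopping",
     "tags": {"gender": "F", "city": "leeds", "age_range": "40-49"}},
]

in_sample = [i["text"] for i in items if on_sample(i["tags"])]
```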
  • Will research go this route? At present it is not allowed, and vested interest in not going this route makes it unlikely. Qualitative researchers function rather like portrait painters in the 19th century: not immediately displaced by the invention of photography, but increasingly irrelevant as it became possible for people to take their own photos, not as good but much quicker and much more cost-effective. Quantitative research has colonised the internet with a close analogue of what offline surveys offered, a successful industrial model of data collection. The fact that online panels became discredited so quickly is a sign that the drawbacks of offline surveys were simply imported without being addressed: artificial, overly structured, measuring themselves rather than capturing reliable, insightful claims about customer behaviour.
  • I hope it does, for the sake of research, because research has a great future following customers. This is where CRM is going, without the data warehouses. Research methodologies are badly needed where customer data is being aggregated and marketing decisions are being made client-side. If research doesn't make this jump, it risks being demoted to a specialism: "if you don't know, ask the customer", at a time when more and more customer decision-making is made without the customer.
  • Parting shot: as lines of demarcation are redrawn around companies, it is possible that customers become just another stakeholder group with whom client companies are continually in dialogue, and they won't call it research any more. When I started my career, within a week I was attending meetings at which the most senior marketing people were present. If we don't broaden research to include content analytics, it won't stop clients analysing content as part of decision support; they may just not ask us to do it for them. So a junior researcher in 2013 (30 years after I started) may see rather less of the marketing director.

    1. The Cloud of Knowing: content analytics and the future of market research. John Griffiths, Planning Above and Beyond, Jan 20th
    2. Ground I am going to cover
       • Work in progress from the Cloud of Knowing project
       • Paper for the Market Research Society conference
       • Online content: how it can be part of MR
       • MR stuck using offline research paradigms
       • Will introduce 2 new approaches
       • A way to bring content into research
    3. The questions we need answers to
       • Is sampling central to research, or does research have to change?
       • If we don't ask questions, is it still research?
       • Do we need permission to use people's content?
       • Do we need to protect anonymity?
       • How can research use internet content for business decision support?
       • Is there such a thing as real-time research?
    4. Redrawing the lines around research. Will research advance to include content? Or will research retreat and become a specialism: "the guys who do interviews"?
    5. Research cultures
       • Farmer/cultivators
       • Hunter-gatherers
       • Scavengers
    6. Blog site
    7. Scavenging on the web
       • But it is only one blogger, even with comments from others
       • Who is the blogger? Where are they based?
       • Are they representative? All internet posters are atypical
       • That person may be misrepresenting themselves: corporate blogging/PR
    8. Current resources to analyse online content
       • Social media measurement experiments, e.g. #measurementcamp
       • Netnography: online behaviour and the content coming from it
       • E-anthropology: social analysis using web artefacts
       • Bricolage: support to qualitative research
       • Research practice says:
          • research data needs to come from a valid sample representative of the population
          • if projective materials are used, research participants need to explain them
    9. So can online research help? Online survey panels, online focus groups, face-to-face/CATI surveys, offline discussion/focus groups, research communities, open platforms (e.g. FB group)
    10. Business decision support using more and more data sources
       • Business process
       • Web analytics
       • Customer satisfaction
       • Corporate reputation tracking
       • Transaction data
       • Customer research?
       A lot of these aren't being managed by the marketing department
    11. Research at the crossroads
    12. Purefold: open-source co-created transmedia (Ridley Scott Assoc/Ag 8). Film directors' story 'seeds' → RSS feeds gather raw content → FriendFeed hopper → visitors/linkers grade and link to other web content → storylines amplified → brand owners sponsor and create characters and storylines → output put into production
    13. Purefold model applied to research
    14. Benefits of the Purefold model
       • Continues to use real people as participants
       • Dual role of research participants as graders and taggers
       • Sampling of participants can be controlled
       • Content can be filtered and amplified
    15. Demographic replicators: research bots. 'Wefeelfine' emotional wrapper → DGR Felix text bot → Twitter/Posterous/FriendFeed network → analysis of others who share the bot's tastes; comments/retweets of those who dialogue with the bot; findings aggregated with other bots
    16. Meet Felix
       • Emotional wrapper: ShowFeelings?feeling&limit=50 gender=M city=london agerange=20
       • emotion_keywords: { sad: [divorce, desk], great: [Floyd, Doors], depressed: [sales, weight, divorce, client], lonely: [at home, the divorce, miss my kids, desk], tired: [work, boss, home, hangover, sales] }
       • # Times of day: time_keywords: { 04: [work, sales, weight, conference, procurement, desk] }
       • weekend_time_keywords: { 10: [home, kids] }
       • # Words to fall back on if nothing else matches, or to add randomly: character_keywords: [work, boss, divorce, fat]
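The keyword tables on this slide suggest a simple selection rule: draw from the emotion and time-of-day pools, falling back to the character keywords when nothing matches. A hypothetical sketch of that rule (the real Felix bot's logic is not documented here; the tables mirror the slide):

```python
import random

# Keyword tables copied from the slide; the selection logic is assumed.
EMOTION_KEYWORDS = {
    "sad": ["divorce", "desk"],
    "great": ["Floyd", "Doors"],
    "depressed": ["sales", "weight", "divorce", "client"],
    "lonely": ["at home", "the divorce", "miss my kids", "desk"],
    "tired": ["work", "boss", "home", "hangover", "sales"],
}
TIME_KEYWORDS = {4: ["work", "sales", "weight", "conference", "procurement", "desk"]}
CHARACTER_KEYWORDS = ["work", "boss", "divorce", "fat"]  # fallback words

def pick_keyword(emotion: str, hour: int, rng=random) -> str:
    """Pool the emotion and time-of-day keywords; fall back to
    character keywords if the pool is empty."""
    pool = EMOTION_KEYWORDS.get(emotion, []) + TIME_KEYWORDS.get(hour, [])
    return rng.choice(pool or CHARACTER_KEYWORDS)
```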
    17. What is Felix? A projective device for researchers to reframe research questions? A dynamic sample/panel which can be observed and with which the researcher can interact. A lure/decoy for drawing out customer response
    18. Benefits of replicators
       • Continues to use real people as participants
       • You don't need to recruit research participants unless you want to
       • Sampling of participants can be controlled
       • Content is filtered and amplified
       • Content can be used to stimulate online response
       • Content can be used to inspire the researcher as analyst
    19. Past the sample/content bottleneck. Quant and qual research keep sample and content questions separate; the inability to control the sample is what makes web content analysis problematic. Sampling is literal: 20, male, works in local govt, lives in London
    20. Computing has a similar bottleneck: instructions/data, memory/CPU
    21. Knitting machines: two different media, programme cards and wool
    22. Into the cloud..
       • Sampling
       • Permission
       • Anonymity
    23. Sampling
       • Probabilistic scoring has been standard in lifestyle database marketing and geo-demographic marketing for 20 years
       • Use sample and content tags to mark up data; cookies could serve as a type of sample tag
       • Use scoring models to decide whether to include content within the sample
       • Use best-fit algorithms to aggregate content and sample tags
       • Identify content first, then use sample tags to eliminate all the content which is off-sample
       This starts to make available to us the riches of the web: real-time search engines and Google/Yahoo alerts
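The probabilistic scoring this slide proposes might look like the following sketch, where each matching sample tag contributes a weight and content is included once the total clears a threshold. The weights and threshold are invented for illustration, not taken from any real scoring model:

```python
# Hypothetical probabilistic inclusion score in the spirit of
# lifestyle-database scoring models.
WEIGHTS = {"gender": 0.2, "city": 0.3, "age_range": 0.5}
THRESHOLD = 0.7

def inclusion_score(tags: dict, target: dict) -> float:
    """Sum the weight of every sample tag that matches the target."""
    return sum(w for k, w in WEIGHTS.items() if tags.get(k) == target.get(k))

def include(tags: dict, target: dict) -> bool:
    return inclusion_score(tags, target) >= THRESHOLD

target = {"gender": "M", "city": "london", "age_range": "20-29"}
```

Unlike exact tag matching, a scoring model tolerates incomplete tags: content with a missing or mismatched tag can still qualify if the remaining evidence is strong enough.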
    24. Permission
       • If we source content probabilistically, permission is no longer an issue
       • We can use people's content without ever disturbing them
       • Is this permissible within research or not?
       • The volume of online content that could become the subject of research is potentially so great that data protection becomes meaningless; permission becomes unnecessary
    25. Anonymity
       • The MR principle protects research respondents from being sold to
       • Anonymity can only be granted in the context of a customer interview; there is always a possibility with behavioural targeting that the digital marketer knows something about you that you don't
       • Probabilistic systems already used for behavioural targeting and direct marketing would identify respondents anyway, but they would be at no greater disadvantage than anyone else who is the subject of behavioural targeting
    26. To summarise
       • To bring content within research we need new methods
       • Purefold co-creation using taggers and graders
       • Use of DGR research bots
       • Probabilistic sampling
       • Challenge the use of permission and the protection of anonymity
    27. Vested interest in MR preventing the incorporation of online content. Qual: research by hand. Quant: self-measurement
    28. CRM: the natural starting point
       • The need for engagement means that CRM is no longer focussed on the hard sell
       • Multiple contact points with customers, very few falling within conventional research practice
       • The natural starting point for analysing web content
       • Research needs to be part of it
    29. Final thought: customers are tribal stakeholders, an extension of the company
       • Perhaps customers are willing to be involved in regular dialogue as part of CRM programmes
       • Customers as valued stakeholders
       • Extended members of the company 'tribe'
       • Is it possible that the research industry is perpetuating the divide:
          • research respondents as outsiders?
          • researchers as essential intermediaries?
    30. Thank you.. Questions?
    31. Some references you may find useful..
       • purefold
       • [email_address]
       • johngriffiths7 on Twitter