Cloud of Knowing MRS 2010 conference slides - the award winning paper


The actual presentation I gave at the Market Research Society conference 2010. An earlier preview is also on SlideShare, from (I think) the Cloud 2 meetup, where I did a dummy run in front of the group. The presentation was shortlisted for best presentation and won best new thinking, funnily enough the last time the award was given.

Published in: Marketing
  • The Cloud of Knowing project came about to address the growing use of web content in research. There is now so much data across so many platforms that it is inevitable that marketers are using it more and more.
    But its use has been problematic. There are issues around sampling – who is posting the content? How can we find out about them? Are we required to ask their permission? Are we bound to protect their anonymity if we never recruited them in the first place?
    And as our ability to gather this data is virtually instant, we are moving to real-time research where there is no time to validate or to ask for permission: research without asking questions.
    Whatever may be researchers’ misgivings, marketers are striding in to grasp this data with both hands. If we hang back we don’t prevent the use of this data – what we risk is irrelevance as companies make it central to their way of working.
  • So the research industry has a choice to make.
    Do we advance, conquistador-like, to absorb web content as standard research practice? Can we do it properly?
    Or do we retreat to the safety of the keep as the people who ask questions?
  • Here is a blog page – a classic piece of content which certainly Lego ought to be interested in.
    You can readily arrange to have it hoovered up and delivered to your computer gratis every time the blog changes in any way, using an RSS feed.
    Sounds perfect.
    But we don’t know anything about Mike Walsh.
    This may not be his name. He may live in another part of the world.
    We don’t know where he fits in a taxonomy of Lego users.
    He may be a journalist or a lobbyist. Perhaps he works for Lego.
    Actually he may not even be human. As we shall see, the use of bots on the web is increasing, and the content may be aggregated and republished from lots of sources.
    Actually we can’t trust Mike Walsh at all.
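To make the Mike Walsh problem concrete, here is a minimal sketch of what an RSS item actually hands over. The feed text, blog name and URL are hypothetical stand-ins, not the blog shown on the slide; in practice the XML would be fetched from the blog's feed address. The point is that the feed carries a self-declared author name and nothing that would let us place him in a sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content, standing in for what a blog's RSS feed returns.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Lego blog</title>
    <item>
      <title>My new Lego build</title>
      <link>http://example.com/posts/1</link>
      <author>Mike Walsh</author>
      <pubDate>Mon, 01 Mar 2010 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return one dict per <item>, holding only what the feed itself declares."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "author": item.findtext("author"),  # self-declared and unverified
            "published": item.findtext("pubDate"),
        })
    return items


items = read_items(SAMPLE_FEED)
print(items[0]["author"])
```

Everything a researcher would want for sampling (age, location, relationship to the brand, whether the author is a person at all) is simply absent from the record.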
  • But there are so many Mike Walshes posting out there that we have been forced to find workarounds.
    Here are some. Measurement Camp is an initiative where those working in social media specifically address the issue of how to get behind the surface of web content to measure what is happening behind the scenes.
    Netnography has brought in ethnographic techniques, watching what people do and say over time. Unlike ethnography, however, there is no requirement to hear the participants’ side of the story.
    E-anthropologists get off the hook by explaining that they are studying social rituals rather than people, so they can look at the artefacts. Again this means they don’t have to address questions directly. We researchers cannot get off that hook quite so easily.
    Which is why we use triangulation: if the online behaviours and artefacts are similar to those of respondents about whom we know much more, then we triangulate and infill.
    I include CRM to underline another danger: Lego may not care where or who Mike Walsh is as long as he’s useful and influences other people. Of which more later!
  • Last year at Research 2009 I likened research, as I was trained in it, to farming: cultivation using fixed quantities and orderly rows.
    But then we discovered that there was wild fruit and wild game which tasted better. So hunting came back in. Hunters go out and catch fresh game or gather wild fruit. Hunters work very differently from farmers – they follow the game – they catch it fresh – and they don’t pick up just anything. Research on the internet is a lot more like hunting than farming.
    Notice also that farmers and hunters have never really got on very well together.
    There is a third type of activity – scavenging. Picking up something interesting you found on the net and bringing it in for examination. Scavengers pick up anything. They don’t ask hard questions about where it is from or who it represents. I suggest that a lot of scavenging is going on at present.
    I would also suggest that reputable researchers should have nothing to do with it. However, farmers doing a spot of hunting and hunters trying to show they are applying farming methods can start to look a lot like scavengers!
  • Ray has been sceptical about our ability to sample properly any more. I take a different view. We need to have some notion of representativeness. I think we need to think much more creatively about how we sample.
  • And as a researcher I also struggle with the notion of never having the time or luxury to allow people to interpret the content they have placed online. I think central to research is people giving their own account and not having professionals interpret them at one remove.
  • I come back to this issue of decision support. Businesses are now getting so much of their decision support information online that they really don’t have time for my research scruples. Day by day research is losing its share of decision support – I don’t think we have a choice – we have to find a way to use web content better as researchers to continue to maintain our share of decision support.
    So here, very quickly, are 3 different models for doing just that.
  • My first example is Purefold. Purefold is a project developed by Ag8 for Ridley Scott Associates for creating transmedia plotlines which could be made into films, video games and other types of online content. Directors came up with story seeds. These were amplified by using RSS feeds to draw relevant content into a Friendfeed area, where visitors would grade them, link to other content or bring in their own. This content grew the seeds into storylines, characters and set designs, which were then presented to brand sponsors who could work branded entertainment storylines, characters and themes into them. Drawn from the web and co-created with active web users. When I worked on this project last year it seemed to me a useful model we could borrow for drawing web content into a research framework.
  • The model looks rather like a jet engine. We start with a research brief, which uses keywords to gather relevant data. The RSS feeds flow back into the Friendfeed area. Research participants also go out hunting relevant content, using tagging systems to tag it. And so this data too is grabbed and flows back into the Friendfeed hopper. Other research participants operate like farmers, grading what comes through the Friendfeed/wiki.
    And note that we have a sampling framework for those who hunt and those who grade.
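The flow described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Purefold or Friendfeed implementation: the `Item` record, the keyword match, the 1 to 5 grading scale and the mean-grade cut-off are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    text: str
    source: str                                  # "rss" or "hunter"
    tags: list = field(default_factory=list)     # applied by hunters/taggers
    grades: list = field(default_factory=list)   # (grader_id, score) pairs


def matches_brief(text, keywords):
    """True if any keyword from the research brief appears in the text."""
    lowered = text.lower()
    return any(k.lower() in lowered for k in keywords)


def fill_hopper(feed_texts, hunter_finds, keywords):
    """Feed items are filtered by the brief; hunter finds arrive already tagged."""
    hopper = [Item(t, "rss") for t in feed_texts if matches_brief(t, keywords)]
    hopper += [Item(text, "hunter", list(tags)) for text, tags in hunter_finds]
    return hopper


def grade(item, grader_id, score):
    """A grader (farmer) scores an item; the scale is an assumption."""
    item.grades.append((grader_id, score))


def amplified(hopper, min_mean):
    """Keep graded items whose mean grade reaches the cut-off."""
    kept = []
    for item in hopper:
        if item.grades:
            mean = sum(s for _, s in item.grades) / len(item.grades)
            if mean >= min_mean:
                kept.append(item)
    return kept


hopper = fill_hopper(
    ["New margarine recipe ideas", "Football results"],
    [("Forum thread on spreads", ["margarine", "health"])],
    keywords=["margarine"],
)
grade(hopper[0], "grader-1", 4)
grade(hopper[0], "grader-2", 5)
grade(hopper[1], "grader-1", 2)
shortlist = amplified(hopper, min_mean=3.0)
```

Because every grade and tag is tied to a known participant, the sample of hunters and graders stays under the researcher's control even though the content itself comes from the open web.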
  • The benefits are that we continue to use real people as participants.
    Research participants play complementary active roles as graders/farmers and tagger/hunters.
    The sampling of respondents can be controlled.
    And the content can be filtered and amplified – rather like the turbine inside a jet engine, the flow of data can be controlled using those participating in the research.
    How does this look as a hybrid research methodology? This broadens out the online research community to something much richer where participants play a much more active part. And the richness comes from the data flowing through, with less pressure on moderating and surveys of the internal community. Recognisably a research hybrid, with a cost structure similar to that of MROCs.
  • We turn next to example no 2. And the use of research robots.
    Called demographic replicators (DGRs) and developed by Philter Phactory, these are text-based robots that draw their content from real people whose blogs and tweets are gathered in real time. They respond to live questions. They have been aggregated to research sample criteria. But their constituent members may not even know that their content is being used for research.
    Demographic replicators also incorporate GPS data, media behaviour and spending behaviour, as well as emotional states. So they represent an aggregation of human behaviours.
    So still fulfilling sampling criteria but now a weird combination of machine and human. Still recognisable as research but more abstract.
  • To give a practical example, we could create an interaction between Margery, a bot which picks up on web content about margarine,
    and Wilma, a bot made up of a live panel of 200 housewives in Wilmslow, the northern heartland of qual research. The interaction is almost like a peal of bells or Tibetan music bowls. As the harmonics interact they are rich and complex, much more so than the programmed output of specific research projects. This gives us a broader understanding of the lives of the users of a particular margarine brand. And it is much more generative.
  • There’s no time to explore the different ways these bots can be used. This morning I want to take the safe option and consider them as decoys for drawing out the interest, response and participation of real people who don’t realise that the bot is an elaborate form of stimulus.
    DGRs are already being tested by Brain Juicer working with Kraft. The outputs can be used to study small customer segments in phenomenal detail. At nominal cost. Well below that for conventional research studies.
  • What are the benefits of Replicators?
    They continue to use real people
    These participants aren’t technically respondents because we are not asking them questions and their content is effectively public.
    Sampling can be controlled
    Content is filtered and amplified
    And can be used to stimulate online response
  • Armed with two live examples, let’s go to the next level, which is more speculative.
    Research has a historic bottleneck. It has always treated sample and content data separately. This is true for both quant and qual research. Research needs to establish 2 things – who and what: who they are, and what they think, what they feel, how they behave.
    The rule is that you can’t use the same piece of data to identify the participant (who) and as evidence of their opinion (what). The strain of keeping them separate is preventing research from using similar data for content and for sampling information.
    Secondly, sampling is to be literal. As far as possible, age, demographics and usage are to be verified, usually by the respondents themselves.
  • I want to suggest that it is time we started to sample probabilistically. To tag internet content with probability scores on who posted it. And to use the connections to fill in the picture. And to aggregate samples that way. This is not unlike the way geo-demographics has been used to project the behaviour of a sample of households onto the rest of the household population.
    The fact that much of the internet content is produced by the same people across different platforms means we have extended data sets to validate samples. As long as we do it probabilistically.
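One way to picture probabilistic tagging is the sketch below. It is an assumption-laden illustration rather than a method from the project: the segment names, the prior and the per-signal likelihoods are invented, and the signals from different platforms are treated as conditionally independent (a naive Bayes update), which real data may well violate.

```python
def combine(prior, likelihoods):
    """Naive Bayes update: multiply the prior by each signal's likelihood,
    then renormalise so the segment probabilities sum to 1.
    Assumes at least one segment keeps a non-zero probability."""
    posterior = dict(prior)
    for lik in likelihoods:
        for seg in posterior:
            posterior[seg] *= lik[seg]
    total = sum(posterior.values())
    return {seg: p / total for seg, p in posterior.items()}


def expected_composition(tagged_items):
    """Sum membership probabilities to get the expected count per segment."""
    counts = {}
    for probs in tagged_items:
        for seg, p in probs.items():
            counts[seg] = counts.get(seg, 0.0) + p
    return counts


# Hypothetical author seen on two platforms.
prior = {"under_35": 0.5, "35_plus": 0.5}
blog_vocabulary = {"under_35": 0.2, "35_plus": 0.6}   # invented likelihoods
twitter_network = {"under_35": 0.5, "35_plus": 0.5}   # uninformative signal

author = combine(prior, [blog_vocabulary, twitter_network])

# A second author whose segment is known from verified sample data.
verified = {"under_35": 1.0, "35_plus": 0.0}

composition = expected_composition([author, verified])
```

No item is ever declared to be from a given segment; each contributes fractionally, and the aggregate is what gets checked against the sample criteria. Verified respondents simply enter with probability 1.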
  • The recent launch of Google Social Search is an excellent example: the recommendations given to me are amplified because Google works from the recommendations of the friends of my friends and family.
    These strands of relationships are like the tendrils of a jellyfish – they make accurate guessing possible.
    Sampling is on the cusp of a revolution.
  • We are so used to the miracle of a content search engine – where Google uses 200 variables to target the most relevant content to my enquiry.
    It is time to consider what would happen if search engines used a similar number of variables to assign a sampling score to every piece of internet data. What would be the implication for constructing an instant sample using an advanced search mechanism?
    How many engines do you need? One per study? One per client? Or one to rule them all? The incorporation of internet content and samples could be the internet disintermediation which has transformed so many markets and which has yet to hit the research industry. All we have done so far is to move offline methodologies online. Real disintermediation is still to come.
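As a thought experiment, a sampling engine could work like a search engine whose query is a sample specification. The sketch below is hypothetical throughout: the attribute names, the probabilities attached to each item and the scoring rule (the product of per-attribute probabilities, which assumes the attributes are independent) are assumptions for illustration, not a description of any existing engine.

```python
def sampling_score(item_probs, target):
    """Probability that an item's author matches every target attribute.
    Attributes with no estimate score 0, so unknown authors are excluded."""
    score = 1.0
    for attribute, wanted in target.items():
        score *= item_probs.get(attribute, {}).get(wanted, 0.0)
    return score


def instant_sample(items, target, size, min_score=0.0):
    """Rank items by sampling score and return the best `size` above the cut-off."""
    scored = [(sampling_score(probs, target), item_id)
              for item_id, probs in items.items()]
    scored = [pair for pair in scored if pair[0] > min_score]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(item_id, score) for score, item_id in scored[:size]]


# Hypothetical probability tags on three pieces of content.
items = {
    "post-a": {"gender": {"female": 0.9, "male": 0.1},
               "region": {"north_west": 0.8, "other": 0.2}},
    "post-b": {"gender": {"female": 0.4, "male": 0.6},
               "region": {"north_west": 0.5, "other": 0.5}},
    "post-c": {"gender": {"female": 0.7, "male": 0.3}},  # region unknown
}
target = {"gender": "female", "region": "north_west"}

sample = instant_sample(items, target, size=2)
```

The returned scores could equally be kept as weights rather than used as a hard cut-off, which is closer to the weighting and infill idea discussed next.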
  • What would be the benefits of using sampling engines?
    EITHER the engine starts with real sample data, then uses probabilistic tags and best fit to build virtual sample frames
    OR it recruits seed samples, then uses probabilistic sampling for special studies
    Sampling of participants can be controlled
    Content is already there and can be weighted or excluded
    Oversampling and infill possible – cf Photoshop
  • Will research agencies go this route willingly? Of course not: the pain will be too great.
    Qualitative researchers function rather like portrait painters in the 19th century – not immediately displaced by the invention of the camera but increasingly irrelevant as it became possible for people to take their own photos – not as good but much quicker and much more cost effective.
    Quant researchers have colonised the internet with a literal analogue of what offline surveys have offered as a successful industrial model of data collection. The fact that online panels have become discredited so quickly is a sign that the drawbacks of offline surveys were simply imported without being addressed – artificial, overly structured, measuring themselves rather than capturing reliable, insightful claims of customer behaviour. Quantitative research is rather like a 19th-century photographer needing large numbers of people to stand still. While the photographer fumbles under a black cloth there is a large explosion, and the output is a single static photographic plate.
    What happens when marketers have their own cameras? Which is what branded communities are turning into.
  • As lines of demarcation are redrawn around companies, customers will become another stakeholder group with whom client companies are continually in dialogue, and they won’t call it research any more. Customer dialogue will become a normal part of trading.
    We need to consider how research can inform and amplify stakeholder models. Tribe Research in Australia takes a radical view of employees, suppliers and customers. All are extended members of the tribe and can be researched continually. At low cost.
    Researchers may have to reinvent themselves as facilitators and stop thinking that they will be allowed to continue as simple intermediaries.
    It’s a different place. But it’s not a bad place to be. And web content will be integral.
  • The Cloud of Knowing project is work in progress – we’ve only just started and we’re not close to being done yet.
    I have argued that online content has to be part of research for research to have a future.
    I have suggested that Market Research is stuck in offline research paradigms which are ill suited to the new environment.
    And I have put forward several models, some in use today, for incorporating this content.
    It’s still research, folks. But not as we know it.
  • I end with Avatar – is the future of research to be tribal? If so then we have to stop being interlopers and have to start making the tribe work better. Using the content they naturally use. That is the future of research.

    1. The Cloud of Knowing Project: Content analytics and the future of market research. John Griffiths, Planning Above and Beyond, March 24th 2010
    2. Web content: the questions we need answers to
        • SAMPLING
        • RESEARCH WITHOUT QUESTIONS
        • PERMISSION
        • % OF BUSINESS DECISION SUPPORT
        • REAL TIME
        • ANONYMITY
    3. Redrawing the lines around research. Will research advance to include content (CONQUISTADORS)? Or retreat and become a specialism, the people who ask questions (THE KEEP)?
    5. Workarounds
        • #measurementcamp
        • Netnography
        • E-anthropology
        • Triangulation
        • CRM
    6. Research cultures
        • Farmer/cultivators
        • Hunter gatherers
        • Scavengers
    7. Research data needs to come from a valid sample of the population
    8. If projective materials are used then research participants need to explain it
    9. Purefold: open source co-created transmedia (Ridley Scott Assoc/Ag8). Diagram labels: film director story ‘seeds’; RSS feeds gather raw content; Friendfeed hopper; visitors/linkers grade and link to other web content; storylines amplified; brand owners sponsor & create characters, storylines; output put into production
    10. Purefold model applied to research
    11. Benefits of the Purefold model
        • Continues to use real people as participants
        • Dual role of research participants as graders and taggers
        • Sampling of participants can be controlled
        • Content can be filtered and amplified
    12. Demographic replicators: research bots. Diagram labels: ‘Wefeelfine’ emotional wrapper; DGR; Wilma text bot; twitter, posterous, friendfeed network; analysis of others who share bot tastes; comments/retweets of those who dialogue with the bot; findings aggregated with other bots; Margery – food bot
    13. Harmonics – the music of the spheres
        • Margery – keywords related to margarine – interacts with…
        • Wilma – circa 200 housewives in Wilmslow to produce..
        • Their interaction triggers range of harmonics which current research cannot anticipate or address
    14. How could DGRs be used?
        • Stimulus for researchers/clients to reframe research projects?
        • A lure/decoy for drawing out customer response
        • A dynamic sample/panel which can be observed and with which the researcher/client can interact
    15. Benefits of Replicators
        • Continues to use real people as participants
        • You don’t need to recruit research participants unless you want to
        • Sampling of participants can be controlled
        • Content is filtered and amplified
        • Content can be used to stimulate online response
        • Content can be used to inspire the researcher as analyst
    16. Past the sample/content bottleneck (Sample vs Content)
        • Inability to control sample is what makes web content analysis problematic
        • Sampling is literal: 45 yr old woman, works part time, lives in Wilmslow
        • Quant & Qual research keep sample and content questions separate
    17. Into the cloud..
        • Probabilistic Sampling
        • Outside the code if they’re not respondents
        • Permission not necessary
        • Anonymity becomes irrelevant
        • Free to build extended profiles across different platforms
        • Assign probable demogs based on content and proximity to others
    18. Probabilistic sampling – Google social search. [Diagram: me, my friends and their friends, with postcode, income band and size of HH]
    19. Content + Sampling engines
    20. Benefits of sampling engines – 2 routes: 1. Engine constructs virtual sample frames using best fit algorithms using found data, OR 2. Engine builds special studies from seed samples
        • Sampling of participants can be controlled
        • Content is already there and can be weighted or excluded
        • Oversampling and infill possible – cf Photoshop
    21. Vested research interests carrying on regardless. Quant researchers: 19th C photographers. Qual researchers: 19th C portrait painters
    22. Time to get tribal
        • CRM a better starting point than conventional MR – web content now integral
        • Stakeholder model of involvement – part of the tribe, cf Tribe Research in Australia
        • Researchers no longer intermediaries?
        • Emergent role as facilitators
    23. To conclude
        • Cloud of Knowing project – Work in progress
        • Online content should be part of research
        • MR stuck in offline research paradigms
        • Alternative models for incorporating content
        It’s still research but not as we know it..
    24. Thank you ..