Content rating behind the firewall<br />April 2011<br />Presented for SIKM by David Thomas, Deloitte<br />
Background<br />Dave Thomas is a Product Manager for the U.S. Intranet at Deloitte. He previously worked in client service at Deloitte Consulting in their SharePoint Practice and as a Project Manager for Global Consulting Knowledge Management (GCKM) on their Portal: the Knowledge Exchange (KX). <br /><ul><li> KX is available globally and attracts over 20,000 unique visitors per month
KX is built on a heavily customized version of SharePoint 2003
Users typically come a few times a month to retrieve documents, use communities, and complete other knowledge-related tasks</li></li></ul><li>Content rating behind the firewall<br />
<ul><li>Content rating has been common on the internet for some time, but there seem to be few examples of successful rating systems behind the firewall. Internal usage of ratings at our organization has historically been below 5% on the two platforms on which it was deployed
Stan Garfield posed the question “Has anyone had positive experiences with 1-5 star content rating mechanisms inside a firewall?” in January 2010. Here is a selection of responses (thank you to SIKM members):</li></ul>“I think that 5-star rating systems are ideal for apples to apples comparisons. Most knowledge objects (and of course people) cannot be compared in this manner”<br />“I think there is an added complication in that inside the firewall it might also be important to know who is doing the rating. The CEO's rating might carry a little more weight than the janitor's”<br />“With process documents, I'd like the idea of ratings because the purpose of the document is clear. For other documents, the case is muddier.”<br />“rating a book or toaster is very different from rating a specific piece of content. Products purchased on Amazon tend to have more common use cases”<br /> On a deployed content rating (CR) system: “At first, pretty much no one rated. We suspect this was for several of the reasons that you pose in your document but also because it was unclear what they were being asked to rate - the quality of the writing, whether or not they agreed with the author, whether or not they thought highly of the author, or whether they liked the quality of the document. In an effort to encourage participation, the sponsors clarified the intent of the ratings”<br />
Content rating behind the firewall<br /><ul><li>In March 2010, a project was initiated to provide a content rating system integrated into the Portal. The perceived value was that a content rating system would allow higher-rated content to be easily identified and promoted accordingly. Later on, weaker content could be removed/archived earlier (separate project)
Our Portal runs on SharePoint 2003, so this project entailed custom work (no third-party web parts were available). The rating system integrated with:</li></ul>Published content pages, giving the ability to rate<br />Search user interface, allowing retrieval based on rating<br /><ul><li>The first part of the project was research on various options for rating. We determined that most rating systems fell into one of three buckets: </li></ul>‘favorites and flags’<br />‘this or that’ <br />‘rating schemes’<br />
Type 1: ‘Favorites and Flags’<br />Design: Single value rating scheme - usually positive<br />
Type 2: ‘This or That’<br />Design: Positive/Negative value: Yes/No, Like/Dislike, Up/Down<br />
Type 3: Rating Schemes<br />Design: Traditional 1-X rating scheme (1-5 and 1-10 are common)<br />
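The three designs above differ mainly in what a single vote records and how votes roll up into a score. The following is a minimal Python sketch of that difference, not the Portal's actual implementation; the function names and the choice of aggregation for each type are assumptions made for illustration.<br />

```python
from statistics import mean


def favorites_score(flags):
    """Type 1 ('favorites and flags'): count the users who flagged the item."""
    return sum(1 for flagged in flags if flagged)


def this_or_that_score(votes):
    """Type 2 ('this or that'): share of positive votes, or None with no votes."""
    if not votes:
        return None
    return sum(1 for positive in votes if positive) / len(votes)


def rating_scheme_score(stars, scale_max=5):
    """Type 3 (1-X rating scheme): mean star value on a 1..scale_max scale."""
    if not stars:
        return None
    if any(s < 1 or s > scale_max for s in stars):
        raise ValueError(f"ratings must be between 1 and {scale_max}")
    return mean(stars)


if __name__ == "__main__":
    # Hypothetical votes for one piece of content under each design.
    print(favorites_score([True, True, False]))
    print(this_or_that_score([True, False, True, True]))
    print(rating_scheme_score([5, 4, 4, 3]))
```

Only the Type 3 score carries enough granularity to rank items against each other, which is the trade-off weighed in the design decision later in the deck.<br />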
Some of the concerns identified pre-deployment<br /><ul><li>‘Inside the firewall’ is not ‘outside the firewall’ – user behavior might be different
Scale could be an issue (will there be enough people rating enough content to be meaningful)
There is a lag between the time a knowledge asset is accessed and the time a rating can fairly be made. The user may also no longer be logged into the repository at the time the rating could be applied
When someone watches a short video, they can usually rate it right away because they have consumed the media asset and the rating mechanism is easy and convenient. The value of a document is not known until after it has been downloaded and read, and that can take time.
We could experience cultural resistance when trying to implement content rating
Lack of desire to rate content as poor will likely be evident
People are not used to rating content inside the firewall</li></li></ul><li>Typical ratings distributions<br />Outside the firewall: generally ‘J Curves’ exist. The authors of Building Web Reputation Systems did research on ratings of various Yahoo sites <br /><ul><li>“Eight of these graphs have what is known to reputation system aficionados as J-curves - where the far right point (5 Stars) has the very highest count, 4-Stars the next, and 1-Star a little more than the rest.”
“a J-curve is considered less-than ideal for several reasons: The average aggregate scores all clump together between 4.5 to 4.7 and therefore they all display as 4- or 5-stars and are not-so-useful for visually sorting between options. Also, this sort of curve begs the question: Why use a 5-point scale at all? Wouldn't you get the same effect with a simpler thumbs-up/down scale, or maybe even just a super-simple favorite pattern?”
“If a user sees an object that isn't rated, but they like, they may also rate and/or review, usually giving 5-stars - otherwise why bother - so that others may share in their discovery. People don't think that mediocre objects are worth the bother of seeking out and creating internet ratings”
“There is one ratings curve not shown here, the U-curve, where 1 and 5 stars are disproportionately selected”
Product or service based sites with either a) tightly knit communities, b) incentivization, or c) huge user groups can also generate U curves (Amazon.com is often cited as an example)</li></li></ul><li>Typical ratings distributions cont'd<br /><ul><li>One of the groups evaluated (Autos Custom) generated a ‘W Curve’. This actually represented a preferred distribution for our deployment, and we later speculated on whether we would achieve it.
“The biggest difference is most likely that Autos Custom users were rating each other's content. The other sites had users evaluating static, unchanging or feed-based content in which they don't have a vested interest”
“Looking more closely at how Autos Custom ratings worked and the content was being evaluated showed why 1-stars were given out so often: users were providing feedback to other users in order to get them to change their behavior. Specifically, you would get one star if you 1) Didn't upload a picture of your ride, or 2) uploaded a dealer stock photo of your ride”
“The 5-star ratings were reserved for the best-of-the-best. Two through Four stars were actually used to evaluate quality and completeness of the car's profile. Unlike all the sites graphed here, the 5-star scale truly represented a broad sentiment and people worked to improve their scores.”</li></li></ul><li>Deciding on a content rating design<br />
Based on the W distribution example, we asked some questions to determine whether a 1-5 rating scheme would work and whether we could get the desired W.<br />Ultimately, we decided to custom develop a 1-5 Rating Scheme (Type 3). Other drivers also informed this decision.<br />
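To make the J-curve versus W-curve contrast concrete, here is a small Python sketch that summarizes a 1-5 star histogram. The counts are invented purely for illustration (they are not taken from the Yahoo research or from KX), and the function name is hypothetical.<br />

```python
def summarize_distribution(counts):
    """Summarize a {star: number_of_ratings} histogram for a 1-5 scale."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no ratings to summarize")
    return {
        # Weighted mean of the star values.
        "mean": sum(star * n for star, n in counts.items()) / total,
        # Share of ratings that are a 4 or a 5.
        "top_two_share": (counts.get(4, 0) + counts.get(5, 0)) / total,
        # Share of ratings that are a 1.
        "one_star_share": counts.get(1, 0) / total,
    }


# Hypothetical histograms (100 ratings each), for illustration only.
J_CURVE = {1: 8, 2: 3, 3: 6, 4: 23, 5: 60}
W_CURVE = {1: 25, 2: 10, 3: 25, 4: 10, 5: 30}

if __name__ == "__main__":
    for name, counts in (("J", J_CURVE), ("W", W_CURVE)):
        print(name, summarize_distribution(counts))
```

With a J-shaped histogram most items end up with a mean above 4, so the averages barely separate one item from another; a W-shaped histogram spreads the mean across the scale, which is why it was the preferred outcome.<br />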
Deploying a rating system<br /><ul><li>The business drivers for implementing a 1-5 Rating scheme
Simple, familiar model to rate published content
Granularity – ability to get an average score and promote / remove content as needed
Alignment with SharePoint 2010 (future platform), reducing disruption for users when we moved
Resource constraints meant deferring some functionality to future releases. For the first release:
Single classification of knowledge asset - published content. (Other types would later follow).
No mechanism for comments (even though they often go hand in hand with ratings) </li></ul>Reasons: concern about moderation team requirements, some risk aversion, and worry that comments would either not be used (people not comfortable) or would be inappropriate in some cases<br /><ul><li>We didn’t give explicit guidance on what each of the ratings meant – just used the ‘1. Not recommended – 5. Highly recommended’ nomenclature. A suggestion to provide something like the below was not pursued:
Marketing and promotion at the launch of rating meant that the rating activity was essentially incentivized for the user. This had an impact on the usage as you will see.</li></li></ul><li>Content rating data<br />
Rating events / unique pieces of content rated<br />Commentary<br /><ul><li>An average of 836 unique pieces of content (about 2.5% of content available) was rated each month.
Additional rating capabilities were deployed for qualifications in October/November, which potentially raised awareness of rating in general.
KX has seasonality effects in user visits</li></li></ul><li>Content Conversion<br />Commentary<br /><ul><li>This graph normalizes the absolute number of ratings to the page views for the item. The long-term average is around 1-1.5%.</li></li></ul><li>Ratings per user / monthly new users<br />Commentary<br /><ul><li>Average of around 2.75 ratings applied by each user.
Average of around 270 new raters each month although heavily skewed by the first 2 months.
There are repeat raters using the system. The current new-rater run rate is around 100 users a month</li></li></ul><li>Average score and rating distribution<br />Commentary<br /><ul><li>There is some fluctuation in the share of 4 and 5 ratings, but the long-term average is 73%.
Average rating is extremely steady and has been from month 1.</li></li></ul><li>Cumulative table of results<br />
What did we learn from the experience? Can content rating happen behind the firewall effectively?<br /><ul><li>You can build a simple custom content rating system and it will get some meaningful use: over the last ten months, 10,000+ rating events following a fairly typical J-Curve distribution, with ~70% of ratings a 4 or a 5.
There are no real benchmarks for ‘success’. The project's target was that 1-3% of viewed content would be rated. We noted that if 1-2% of our monthly unique visitors (MUVs) rate content, that equates to 2500 total ratings a year. (On YouTube, ratings from 0.1 – 0.5% of viewers are common (sign in to rate has impact?))
Value for knowledge assets can be situational – “one man's trash is another man's treasure”. Without a comments system it is difficult to understand why something is rated a certain score.
We feel that our users are predisposed to rate a lot of content 3-4. They get the concept of best in class. We have firm-wide methodologies that are broadly used – they would equate those with best in class / 5 star.
We experienced excessive rating and, of course, self-rating.
Still some level of fear that if they rate something a 1, then the document author will find out – to the extent that we put that in the FAQs for the system to address this.
Incentivization had an impact. Require more data to see exactly how much.</li></li></ul><li>What else could be done in the future?<br /><ul><li>Identification and promotion of high quality content in a meaningful way. Some challenges given the sheer volume of content and the distribution of our business and the interests of individual users.
A model for removal/archive of ‘low quality content’. Business rule definition is still outstanding around this. Basic idea: If there is compelling evidence (multiple ratings) look to possibly retire the content earlier.
Additional marketing and promotions, sponsorship. Incentivization will drive a temporary increase in rating activity, but is not sustainable.
Demo rating as part of onboarding materials for new hires – set it as an expectation to rate assets that are used. This should be part of a broader initiative about the value of a knowledge sharing culture.
Authored or contributed content appears on your profile. Add rating of that content as well.
Deployment of a DVD-by-mail type model for rating (‘Blockbuster’). Essentially these are communications asking the user to take the proactive step of rating a consumed asset. Note: This was discussed but de-scoped as the only options at the time were heavily manual.</li></li></ul><li>Closing thoughts and Q&A<br /><ul><li>We still feel there is value in content rating. We experienced both technical and organizational challenges implementing it behind the firewall, but we learned from the experience, and at least the capability is there for users to use.
Rating is built into common platforms that we use for document management, collaboration and knowledge sharing. In theory if rating is pervasive, and if users see it all the time, they *may* use it more.
User behavior was generally consistent</li></ul>Generally more ratings were applied by junior to mid-level users (time, familiarity with the Portal)<br />Common for users to rate items a ‘4’ or a ‘5’<br />Long-term usage is around 1% of page views, and 2% of total content in the content store is rated in any one month<br />Strong impact when the process is incentivized (not that surprising)<br />Concern about the time it would take to get meaningful ratings on a lot of the content on the Portal<br />Outliers are likely ‘power raters’ although might not be self-directed!<br />
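The usage figures quoted in this deck (ratings as a share of page views, and the share of monthly unique visitors who rate) reduce to simple ratios. The Python sketch below shows one way to compute them; the function names, the one-rating-per-rater-per-month default, and the sample inputs are assumptions for illustration, not figures or formulas taken from the KX reporting.<br />

```python
def conversion_rate(rating_events, page_views):
    """Ratings applied as a fraction of page views for the same period."""
    if page_views <= 0:
        raise ValueError("page_views must be positive")
    return rating_events / page_views


def projected_annual_ratings(monthly_unique_visitors, rater_share,
                             ratings_per_rater=1.0):
    """Rough yearly rating volume if a fixed share of visitors rate each month.

    ratings_per_rater is an assumed monthly figure (default: one rating).
    """
    return monthly_unique_visitors * rater_share * ratings_per_rater * 12


if __name__ == "__main__":
    # Hypothetical month: 1,000 rating events against 80,000 page views.
    print(f"{conversion_rate(1000, 80000):.2%}")
    # 20,000 monthly unique visitors (the KX figure), 1% of whom rate once.
    print(projected_annual_ratings(20000, 0.01))
```

Under these assumptions, 1% of 20,000 monthly visitors rating once each gives a yearly volume of the same order as the 2500 ratings mentioned earlier; changing `ratings_per_rater` or `rater_share` shows how sensitive that projection is.<br />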
YouTube’s position on rating drove a change for them…<br />“Seems like when it comes to ratings it's pretty much all or nothing. Great videos prompt action; anything less prompts indifference. Thus, the ratings system is primarily being used as a seal of approval, not as an editorial indicator of what the community thinks about a video. Rating a video joins favoriting and sharing as a way to tell the world that this is something you love”. (22/9/09 YouTube blog)<br />Questions? <br />Feel free to contact email@example.com<br />