Navigation Through Social Computing (Enterprise Search Summit 2008)


  • Thanks for having me. I want to talk about three things today. First, how an epochal shift in the Web, and in computing in general, is creating a need for enterprises to fight for user attention and regain control over how information relevant to their business is made accessible on the Web. Then I want to describe how social computing can provide semantically expressive metadata that enables a rich search and navigation user experience, supporting the creation of authoritative resources that can recapture user attention. Finally, I’ll illustrate this approach with some work we have been doing over the last year in the development of information hubs for provisioning information to external audiences.
  • The walls are coming down between: producers and consumers of information; catalogers and researchers; IT and users; content and data inside and outside the firewall. This is an epochal change, like the end of the Cold War… or the invention of the printing press. Call it Web 2.0, 3.0, Enterprise 2.0, just the Web, or computing in general… whatever. We’re just at the beginning of this transition.
  • Clay Shirky’s book “Here Comes Everybody”: epochal technology transitions trigger the mass amateurization of established professions. Shirky’s example of the impact of the printing press on the profession of scribe: the Abbot of Sponheim’s 1492 book defending scribes was itself printed. Today: photographers, journalists, publishers… and now librarians. Nancy Pearl (the Librarian Action Figure) is really shushing us not to keep us quiet, but so that we don’t let the cat out of the bag. Tremendous tumult, a profession in crisis, in spite of the fact that all things information retrieval are deeply rooted in library science, e.g., Google’s origins in NSF-funded digital library research. FRBR, RDA/DCMI (which we are helping fund), etc. are attempts to grapple with this transition. Other professions under siege: taxonomists and information architects.
  • Hat tip to David Weinberger: “everything is metadata and everything is connected.” This is a conceptual map of the various information objects in Flickr. Not just documents (images), but many types of objects. Rich not just with attributes of, but with relationships between, documents, people, concepts/tags, places, etc. Lower quality but greater quantity than metadata created using traditional approaches.
  • The mass amateurization of metadata creation means that everyone is not only a publisher but also a cataloger. Some librarians would dispute this. Martha Yee’s definition of cataloging attempts to carve out the masses, but it’s a stretch. Counterexamples: LibraryThing, Flickr. Ultimately, the Linked Open Data initiative takes collaborative cataloging and combines it with Semantic Web metadata standards to yield the next-generation, web-based equivalent of the name authority resources of the library world.
  • The notion of a static collection of documents is increasingly dated. Recency trumps archival longevity. Constant, near-real-time updates are coming from everywhere: 12-14 million blogs in the USA; 900,000 blogs in the USA making money; the top 200,000 blogs exceed $250 monthly; the top 20,000 blogs exceed $2,000 monthly. Standard formats for syndicated content and data (RSS, Atom) dramatically lower the costs of aggregation and integration (mashups). For enterprises this is hugely important: most of the relevant content and data about your products and services comes from somewhere other than you, from outside your firewall. Sources: individual and collaborative blogs, news organizations, feed aggregators, Twitter lifestreams. Herb Simon: an abundance of information yields a scarcity of attention. Enterprises are losing control of how information about them gets to their customers, partners, and press. They’re losing their ability to retain attention.
  • Sue Feldman’s work on the current state and future evolution of query traffic on the Web: traffic is shifting from gateways (Google, Yahoo, MSN) to hubs. A hub is an aggregation of content and data focused on a specific subject area, organized according to an editorial point of view. Authoritative hubs attract attention and provide value to the audience by being the best place for finding, discovering and exploring information in a given subject area, delivering thought leadership that supports market leadership. Assertion: enterprise search and navigation will focus on building hubs that provision information to external audiences with the enterprise’s editorial point of view. A big problem: a customer searches Google, finds a page on your site, then goes away. A bigger problem: a customer searches Google, finds a blog posting with misinformation about your product, and never gets to your site. Essentially this is a marketing communications problem, which is a communications problem, which is what the Web is all about. Who will build these hubs, and how? Enterprises themselves or third-party publishers? Examples: Zillow vs. MLS, Macenstein vs. Apple, TechCrunch vs. InfoWorld.
  • The question of how to build hubs hinges on giving a user the best possible search and navigation experience. So how do you do that? Marti Hearst’s dichotomy: teleportation vs. orienteering. The dream is teleportation: something that will instantly take you where you want to go. It assumes you know where you want to go, and only indirectly supports exploration and discovery. Great when it works, messy when it doesn’t (The Fly). Teleportation is a holy grail. Google has convinced users that this is what they want. But this is perhaps a great example of an AI-complete problem, in that in the limit it assumes deep understanding of user intent. Perhaps we can chip away at this in specific domains and use cases, but we’re a long way from teleportation that works flawlessly.
  • The reality is orienteering: using hints and cues embedded in an information space to work your way towards where you want to go. Multiple, mutually reinforcing ways to navigate. Search as a filter and an arrow in the quiver, as opposed to a be-all and end-all. Kevin Lynch’s “Image of the City”: contexts and paths as key. Not magic, but fundamentally social. Issue: it’s built on metadata, lots and lots of it.
  • In practice the two are being blended. Marcia Bates’ berrypicking model anticipated this by twenty years: iterative query refinement across multiple collections. Jump, filter down, re-orient, jump again. It’s taken twenty years to slough off the older, simplistic, teleportation-centric paradigm of a single query returning a relevance-ranked list of results. The transition discussed earlier is providing a point of departure for the development of information hubs using this approach.
  • Faceted navigation was the first computational realization of an orienteering approach to information access: Hearst, Endeca, Siderean, and now many products and services. Quick explanation of the diagram: navigation over a collection of documents by hierarchical subject and genre taxonomies. It requires attribute metadata on information objects. It plays well with search as a filter.
  • Relational navigation extends faceted navigation beyond retrieval of a single type of information object by adding relational metadata and multiple object types. Siderean uses semantic technology (RDF/OWL) to represent this information, a tremendously effective representation that exploits the power of Web architecture: URIs and XML namespaces. Pivoting gives different views across these types. Quick explanation of the diagram: faceted navigation over documents, people and organizations, with the ability to find and follow relations and ask queries from different perspectives. Any type can be the initial entry point, supporting both focused finding and serendipitous discovery of information. Zoom in and out, then tumble across related items. This is closer to Bates’ berrypicking model than search or faceted navigation. It is even more dependent on an abundance of metadata. As always the question is: where does the metadata come from?
  • This is where social computing comes in: as a rich source of metadata for navigation. Last November at the first Defrag conference in Denver, Andrew McAfee of Harvard Business School described a model for understanding the impact of social computing in the enterprise. It organizes thinking about tools for communication and collaboration by knowledge worker tie strength. Quick description of the diagram: tie strength goes from strong (people you work with every day) to weak (people you interact with occasionally) to potential (people you know but haven’t met) to none (people you’ll never deal with). McAfee associates with each level of tie strength a particular type of social computing tool, together with the resulting type of intellectual product that the tool enables. It is a model for thinking about where and how to apply social computing to support productivity within the enterprise.
  • Let’s apply this model to the task of metadata production in an editorial team of one or more. The people you work with every day select syndicated feeds from blogs, wikis, and feed generation wrappers around traditional CMSs and DBMSs. Feeds come with simple asset metadata (date, publisher, author, etc.). Organizational metadata comes from social network systems used by contributors, either internal to the application or leveraged from existing SNSs or wrappers around personnel registries. Subject metadata comes from user tagging of navigation results, as a replacement for or a complement to automated entity extraction and autocategorization. Usage metadata comes from search and navigation logs associated with the system, from people you don’t necessarily know: the end users. This is one take on applying McAfee’s model to this problem; there can be others, but this is the one we have used in our applications.
  • Here are some examples of this approach to leveraging semantic technology and social computing to produce information hubs. These were done in the context of ongoing work over the last year with Oracle Corporation in building information hubs as microsites on top of content from both inside and outside Oracle. Target audience: press and analysts, customers, software developers. Goals: improved discoverability, richer user experience, thought leadership. Done using our Seamark relational navigation product, delivered as a hosted service with DNS configured to be reachable through the Oracle domain. Sites use syndicated feeds from Oracle Marketing as well as external sources such as blogs by Oracle experts. This is the Oracle Technology Network Semantic Web microsite. It aggregates feeds about product releases and announcements, as well as developer forums and blogs. Here contributor usernames are exposed, with a distinction made for Oracle ACEs, i.e. acknowledged experts in a given area. This provides a way to identify experts as well as find relevant product and issue information, and supports pivoting between these views.
  • This is a screenshot of the Oracle Pressroom microsite. This is the second version of the first microsite, initially a response to senior management frustration with the accessibility and discoverability of Oracle press releases. Again, some subject metadata comes from automated processes, and usage metadata provides information about popular documents. But now user tagging support is blended in with this as a separate facet to support navigation on the basis of idiomatic or newly emerging subject terms.
  • Finally, this is a screenshot of the Oracle Events microsite. It draws on several feeds of Oracle in-person and web-based marketing events. Facets include relative date, geographical location, subject, and intended audience. Attributes are automatically extracted. Usage metadata from logs shows the level of user interest in events in the near future. Across each of these sites, Oracle is seeing increases in time on site: an indicator of an increase in user attention. Our product direction: reduce the effort of building hubs to the point where marketing organizations and new media publishers can do it in minutes, with a special focus on eliminating the need for technical expertise in metadata extraction and management.
  • Summary: social computing provides semantically expressive metadata that enables a rich search and navigation user experience. The benefit addresses the issue raised earlier with respect to winning and keeping the audience’s attention. By providing better, more productive ways for users to allocate their attention, and by providing opportunities for learning and discovery as well as finding, enterprises can achieve greater user satisfaction and traffic in the long run: a bigger piece of the hub query traffic pie, reclaiming user attention, reasserting control over the conversation with the customer.
  • Thanks very much for your attention
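
The faceted and relational navigation described in the notes for slides 11 and 12 can be illustrated with a small sketch. This is not Siderean's Seamark implementation; it is a minimal in-memory illustration that assumes metadata is held as (subject, predicate, object) triples. All identifiers (`doc:1`, `person:ann`, `org:acme`) and the helper names `instances`, `filter_by`, `facet_counts` and `pivot` are invented for the example; only the predicate labels `dc:creator` and `foaf:member` come from the slide.

```python
from collections import Counter

# Hypothetical (subject, predicate, object) triples; every identifier is invented.
TRIPLES = [
    ("doc:1", "rdf:type", "Document"),
    ("doc:1", "subject", "databases"),
    ("doc:1", "genre", "press release"),
    ("doc:1", "dc:creator", "person:ann"),
    ("doc:2", "rdf:type", "Document"),
    ("doc:2", "subject", "databases"),
    ("doc:2", "genre", "blog post"),
    ("doc:2", "dc:creator", "person:bob"),
    ("doc:3", "rdf:type", "Document"),
    ("doc:3", "subject", "middleware"),
    ("doc:3", "genre", "blog post"),
    ("doc:3", "dc:creator", "person:ann"),
    ("org:acme", "foaf:member", "person:ann"),
    ("org:initech", "foaf:member", "person:bob"),
]


def instances(type_name):
    """All items declared to be of the given type."""
    return {s for s, p, o in TRIPLES if p == "rdf:type" and o == type_name}


def filter_by(items, predicate, value):
    """Faceted filtering: keep the items that carry predicate=value."""
    return {s for s, p, o in TRIPLES if s in items and p == predicate and o == value}


def facet_counts(items, predicate):
    """Facet values and their counts within the current result set."""
    return Counter(o for s, p, o in TRIPLES if s in items and p == predicate)


def pivot(items, predicate, inverse=False):
    """Relational pivot: follow a relation forwards, or backwards if inverse=True."""
    if inverse:
        return {s for s, p, o in TRIPLES if p == predicate and o in items}
    return {o for s, p, o in TRIPLES if p == predicate and s in items}


# Faceted navigation: start from documents, inspect a facet, filter down.
docs = instances("Document")
genre_facet = facet_counts(docs, "genre")
db_docs = filter_by(docs, "subject", "databases")

# Relational navigation: pivot from documents to people to organizations.
authors = pivot(db_docs, "dc:creator")
orgs = pivot(authors, "foaf:member", inverse=True)

# Any type can be the entry point: start from an organization instead.
acme_docs = pivot(pivot({"org:acme"}, "foaf:member"), "dc:creator", inverse=True)

print(sorted(db_docs), sorted(authors), sorted(orgs), sorted(acme_docs))
```

Filtering narrows a set of one object type, while pivoting swaps the result set for a related set of another type. That pairing is what lets a user "zoom in and out, then tumble across related items". A production system would use an RDF store and query language rather than list scans.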

    1. Navigation Through Social Computing. Bradley P. Allen, Founder & CTO, Siderean Software, Inc. Enterprise Search Summit, 21 May 2008
    2. A New Order Is Emerging
    3. Information Organization Is No Longer An Exclusive Profession
    4. Everything Is Metadata Source:
    5. Everyone Is A Cataloger
    6. Feeds Are The New Collections
    7. How Does This Impact Search And Navigation In The Enterprise?
    8. What We Want: Teleportation
    9. What We Got: Orienteering
    10. Orienteering Is A Better Model For Discovery. Bates, M. J. (1989a). Design of browsing and berrypicking techniques for the online search interface. Online Review, 13, 407-424.
    11. Faceted Navigation Was The First Realization Of Orienteering [Diagram: documents classified under a subject taxonomy (nodes s0, s1, s2, s3, …, sm, sn) and a genre taxonomy (nodes g0, g1, g2, g3, …, gm, gn)]
    12. Relational Navigation Adds Semantics And Pivoting [Diagram labels: documents, persons, organizations; subject, genre, title, location, size, industry; dc:creator, frbr:creatorOf, foaf:member, inverse of foaf:member]
    13. McAfee’s Model Of Enterprise 2.0: Organize Tools By Tie Strength
        ties to co-workers    tool                  result
        strong                wikis                 content
        weak                  social networks       information
        potential             blogs                 teams
        none                  prediction markets    answers
    14. Producing Metadata For Navigation In Enterprise 2.0
        ties to co-workers    tool                  result
        strong                feeds                 content
        weak                  social networks       people metadata
        potential             social bookmarking    subject metadata
        none                  search logs           usage metadata
    15. Feeds And People Metadata In Oracle Technology Network
    16. Subject Metadata In Oracle Pressroom
    17. Usage Metadata In Oracle Events
    18. Social Computing Yields Metadata For Navigation
    19. [email_address]
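
The mapping on slide 14 (feeds, social bookmarking and search logs as sources of metadata) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the pipeline used for the Oracle microsites. The feed content, the tagging records, the log line format (`<timestamp> <action> <target>`) and the function names are all hypothetical. People metadata from social networks is omitted for brevity.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# All data below is invented for illustration.
FEED = """<rss version="2.0"><channel><title>Example feed</title>
<item><title>Product release</title><link>http://example.com/a</link>
<pubDate>Wed, 21 May 2008 09:00:00 GMT</pubDate><author>ann@example.com</author></item>
<item><title>Developer event</title><link>http://example.com/b</link>
<pubDate>Thu, 22 May 2008 09:00:00 GMT</pubDate><author>bob@example.com</author></item>
</channel></rss>"""

# (user, link, tag) records, as a social bookmarking tool might store them.
TAGGINGS = [
    ("u1", "http://example.com/a", "database"),
    ("u2", "http://example.com/a", "database"),
    ("u2", "http://example.com/a", "release"),
    ("u3", "http://example.com/b", "conference"),
]

# Hypothetical navigation log, one "<timestamp> <action> <target>" entry per line.
LOG = """2008-05-21T10:00:00 view http://example.com/a
2008-05-21T10:05:00 view http://example.com/b
2008-05-21T10:07:00 view http://example.com/a
2008-05-21T10:09:00 search database"""


def asset_metadata(feed_xml):
    """Feeds: simple asset metadata (title, date, author) keyed by item link."""
    root = ET.fromstring(feed_xml)
    return {
        item.findtext("link"): {
            "title": item.findtext("title"),
            "date": item.findtext("pubDate"),
            "author": item.findtext("author"),
        }
        for item in root.iter("item")
    }


def subject_metadata(taggings):
    """Social bookmarking: user tags become subject terms with counts."""
    tags = defaultdict(Counter)
    for _user, link, tag in taggings:
        tags[link][tag] += 1
    return tags


def usage_metadata(log_text):
    """Search and navigation logs: number of views per link."""
    views = Counter()
    for line in log_text.splitlines():
        _timestamp, action, target = line.split(" ", 2)
        if action == "view":
            views[target] += 1
    return views


def build_records(feed_xml, taggings, log_text):
    """Merge the three metadata sources into one record per feed item."""
    records = asset_metadata(feed_xml)
    tags = subject_metadata(taggings)
    views = usage_metadata(log_text)
    for link, record in records.items():
        record["tags"] = dict(tags[link])
        record["views"] = views[link]
    return records


records = build_records(FEED, TAGGINGS, LOG)
for link, record in records.items():
    print(link, record["author"], record["tags"], record["views"])
```

Each merged record carries attributes that could back a facet: author and date from the feed, tags as an emergent subject vocabulary, and view counts as a popularity signal.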