Intelligent search-in-the-age-of-big-data-may-2013 (Source: KM World)


This is an interesting report that discusses revolutionizing your approach to knowledge management.

Published in: Education, Technology

  1. 1. Best Practices in Intelligent Search in the Age of Big Data
Supplement to KMWorld, May 2013
Premium Sponsor

Andy Moore, page 2: The Purpose-Driven Search Life
Writing about enterprise search is not the cakewalk it used to be. With customers demanding more business value, and vendors responding by becoming more “purpose-driven” and specialized, the search market has fragmented into a series of business applications that only opaquely rely on “the search engine” to accomplish their tasks. I often call it “the technology arc.” At first, all you have to do is say “enterprise search,” and you have the attention of the users and the investors. Then after a while, you have to ask, “What can this new technology do for me?” Then after a while and the shine is off the lily (or however that expression goes), you need to ask, “Where is the business process improvement...”

Jerome Levadoux, HP Autonomy, page 4: Revolutionize Your Approach to Knowledge Management
Ninety percent of the world’s data has been created in just the last two years. But when it comes to information, there is no immediate benefit to simply amassing exabytes of content. The real gains come when organizations are able to translate, understand and apply the insight that is contained within this flood of information. One of the main challenges posed by today’s information explosion is that content is fragmented into disparate “silos,” such as file servers, collaboration suites, email systems and other repository types. Also, more information is quickly migrating to the cloud. In this scenario, traditional KM systems fall short because they are not equipped to derive the intelligence contained in this information. . . .

Excerpted from “Measuring Return on Knowledge in a Big Data World,” Coveo, page 6: Advanced Indexing Technology
Big data. Unstructured data. Semi-structured data. Data is all over the technology news, and for good reason. It is overwhelming organizations, requiring them to find new ways to operate, stay competitive, better serve their customers and bring new products to market faster. Companies are finding themselves with piles of information within multiple channels, locked away in silos: different systems, different departments, different geographies and different data types, making it impossible to connect the dots and make sense of critical business information. Hidden inside streams of structured and unstructured data across cloud, social and on-premise systems are information relationships that answer questions employees haven’t even thought to ask, but need to be asking. . . .

Martin Garland, Concept Searching, Inc., page 7: Solving the Inadequacies and Failures in Enterprise Search
The inability to identify the value in unstructured content is the primary challenge in any application that requires the use of metadata. If you aren’t managing it, you won’t find it. At the most basic level, enterprise search has become inadequate. Bells and whistles abound but the unsolved problem still exists. Search cannot find and deliver relevant information in the right context, at the right time. This laissez-faire approach, starting with executive management on down, illustrates the inability of organizations to elevate search to a key component and critical enabler for improving business outcomes. An information governance approach that creates the infrastructure framework to encompass automated intelligent metadata generation, auto-classification, and the use of goal- and mission-aligned taxonomies is required. . . .
  2. 2. May 2013 S2 KMWorld

The Purpose-Driven Search Life
By Andy Moore, Editorial Director, KMWorld Specialty Publishing Group

Writing about enterprise search is not the cakewalk it used to be. With customers demanding more business value, and vendors responding by becoming more “purpose-driven” and specialized, the search market has fragmented into a series of business applications that only opaquely rely on “the search engine” to accomplish their tasks.

I often call it “the technology arc.” At first, all you have to do is say “enterprise search,” and you have the attention of the users and the investors. Then after a while, you have to ask, “What can this new technology do for me?” Then after a while and the shine is off the lily (or however that expression goes), you need to ask, “Where is the business process improvement I should expect for my (fill in the blank) financial services/manufacturing/healthcare/gardening shop... it no longer is about the technology underpinnings. It’s about the work you need to get done.”

No better example exists than enterprise search over the last few years. And no better interview could have fallen into my lap than the opportunity to speak with Jerome Levadoux, senior vice president for products at (what is now officially known as) HP Autonomy.

Now, here’s what I can tell you about that. Not much. The little bit I know, anyway. Autonomy was once the powerhouse software license godhead for enterprise search. Still is an impressive player in the market, for certain. But in the meantime, many smaller startups, many of them based on open-source software and thus carrying a “hipper than thou” aesthetic, rolled onto the scene. Autonomy (like FAST Search) found itself a member of a much larger, much more complex marketplace. And, being as respected and enduring as they are, they were natural targets for acquisition.

Which they became. I will not dwell here on any of the fall-out regarding that acquisition. It remains to smarter people to sort that out. In fact, when Jerome and I talked, the subject didn’t even come up.

But what DID come up was the vastly and rapidly changing role that search plays in the information management landscape. New markets opening up... new companies experiencing happy upticks in their business... new entries into the space (as I said before…). I wanted to know from Jerome: which of these things seemed to matter most to him?

“The enterprise search market is entering its third wave,” he began. And once Jerome begins, it’s best just to lean in and listen. “The market was created in the early 2000s, and was driven by the adoption of portals and the arrival of more and more sophisticated websites. You need search engines for those things! The second wave happened around information compliance and e-discovery... there was a recession around 2007, and compliance was considered a great way (by the vendors) to drive business. There was a need that could be matched with a budget,” he explained.

“We are now at the beginning of the third wave, and it is being driven by two things: One is big data. I know it’s a buzzword, but every buzzword reflects an underlying truth. And the underlying trend to big data is that people are trying now to analyze new kinds of data in novel ways. And the old techniques don’t work to get real insights into data,” he said.

“The second truth, and we’re really just at the beginning of this, is the appearance of mobile, social and cloud. Information is becoming much more abundant on one hand, but also much more siloed, and a lot harder to find, and that’s causing a lot of headache in terms of productivity. That’s driving a whole new need to integrate those silos and allow people to get value from the information. That’s a great new role for search.

“The thing is this,” he continued. “As an information worker, I have my usual SharePoint and fileshares and content management systems, etc., but on top of that I have SalesForce and WorkDay and DropBox and ... it’s a flood of silos! I can’t connect those sources of information. The same is true of all my social media and collaboration apps. Yammer and LinkedIn... I’m using all these things at once, trying to extract knowledge from this very siloed world. That is the next opportunity for search.”

He continued, and I’m still leaned in. “And on the subject of big data... big data is such a big deal because there are novel forms of information that people want to analyze. In the ‘old days’ (he means like 10 years ago, tops, I’m thinking) people would look primarily at financial data in rows and columns, and use BI to draw pretty charts, and maybe use some basic level of analytics to gain value from the information. Today, for many different reasons, and mainly because of the proliferation of non-database data, people now feel the need to apply analytics to things like social data, data on the Web, input from customers on their websites, and things like that. It is coming in totally unstructured, in random formats and random languages, sometimes in slang...”

He thought for a minute... “Oh, and then there’s video,” he added. “Whether it’s security footage or cameras from surveillance drones, the amount of video is beyond anybody’s ability to process. Universities are streaming their courses, and they need that material to be usable (and thus searchable). Same goes for images and voice. So all these forms of data that nobody bothered to analyze before are suddenly very important to look at, analyze and unlock the value hidden within.

“You can’t swing a cat in the average organization without hitting a ‘content provider.’”

Andy Moore
Andy Moore is the publisher of KMWorld Magazine. In addition, as the editorial director of the KMWorld Specialty Publishing Group, Andy Moore oversees the content of the monthly “KMWorld Best Practices White Paper series,” in print and online, as well as assisting with the creation and content of several single-sponsored “positioning papers” per year. He is also the host and moderator of the popular KMWorld Web event online broadcast series. Moore is based in Camden, Maine, and can be reached at andy_moore@kmworld.com
  3. 3. May 2013 S3 KMWorld

It makes former database operations pale in comparison. “For example, ERP databases are not really that big. After they’re compressed, they’re usually less than a terabyte,” he claims. (Jerome comes from a background at SAP, so I take his word on this.) “I don’t think there’s any company in the world that has a petabyte of ERP data. So the real ‘big’ data these days are things like sensor data from machines, or click-stream data from the Web at large. Then there’s also what I call ‘human data’: text, social feeds, etc. That’s where big data really comes into play. The irony is that the online repositories are actually easier for knowledge workers to access than many of the legacy tools that were never easy to access!”

And in this I agree. BI systems and financial analytic tools have always been cumbersome and “non-democratic.” And that was a problem. But now we have the opposite problem: information is now TOO damn democratic. You can’t swing a cat in the average organization without hitting a “content provider” of one kind or another.

A Brave New Market

Here’s how Jerome describes it in his great article on the following pages:

“Today’s workers are increasingly on-the-go, embracing the newest mobile technologies to stay connected and productive, as they continue to fuel the migration of content to the cloud. In addition to driving the great shift of data to the cloud, there is also fragmentation of knowledge among multiple systems and repositories. Information today has many addresses. It lives in email, on mobile devices, in Dropbox and Evernote, and in whatever applications people may choose to install. The consumerization of content has also meant that devices and applications are used for both professional and personal purposes,” he writes.

“An important distinction to remember about information growth is that it is not just about documents. We are becoming a multimedia-focused world, watching and listening more and reading less. Many meetings are now conducted remotely over video, and training sessions are often recorded. This means your search technology is required to handle these new and pervasive content formats. The number of files, images, records and other digital information is predicted to grow by a factor of 67 from 2009 to 2020, with corresponding growth of IT professionals globally by a wimpy 1.4.”

Coming from a guy from HP Autonomy, the next part of our conversation was rather revealing. I asked Jerome whether it was ironic that Autonomy, which once wanted to be the omnipotent search engine for the masses, was now somewhat softer about that, and was willing to admit that “enterprise search” was kind of a non sequitur... that in fact, enterprise search was more of a strategy than a product, and the key to success was to develop a plan that made it all work together.

“Yes, there are specialized search engines for specialized search problems,” he admitted readily. “For example, we are developing specialized analytics tools for processing unstructured data for healthcare applications. There will be such specialized tools for certain markets. But at the end of the day, for personal productivity, people are still looking for a single way to navigate and access all their data. They don’t have that today. There’s an opportunity there.”

Isn’t this problem being addressed by SharePoint, I wondered, where the solution solves 80% of the problem, and that’s good enough for most people?

“That’s what the IT people are always hoping for... a neat solution where the user can put everything into a nice little sandbox and control what he’s doing. But that world is no more. I use DropBox and SendIt and SharePoint, too... IT would like me to use only one solution, but that just isn’t the way it is anymore. SharePoint is only one of the many things I use.”

The same goes for search in SharePoint. “Microsoft has bundled FAST Search into SharePoint, but that’s all you can search... SharePoint! That’s ignoring the fundamental problem. Information is very distributed,” he exclaimed.

“The way we look at it is this: We want to connect people with their networks of people, associates and repositories, regardless of who they are and what tool they’re using. That was the origin of enterprise search, but it will soon look very different than the original enterprise search because it’s consumed in such a very different way. It has to address information that really didn’t exist 10 years ago, such as mobile and social. Every vendor of content management repositories, whether it’s Microsoft or Google or whomever, all assume that every user is going to put all their information in those repositories. That’s just not gonna happen.”

The Road Ahead

He couldn’t resist putting his marketing hat on for a minute: “We currently have many customers who combine tools for collaboration and information management, and use Autonomy to search across those. We are still developing other connectors. But we are now able to look for data across many different silos, on-premises as well as in the cloud.

“We have customers who have us host their search for cloud-based repositories. But the reality is that most organizations have some information they prefer to keep on premises, behind a firewall, and some they have in the cloud. That hybrid approach covers about everyone. Except for some new start-ups maybe, I know of no company that is willing to put everything in the cloud. So we have to provide a means to search both on-premise information as well as ‘outside’ information. We also maintain the search engines on behalf of many companies. We will soon have a version of IDOL that will run in the cloud only, and we expect that will be the trend that most organizations will follow.”

The next challenge, insists Jerome, will be how companies deal with unstructured data. Companies have a lot of it, but haven’t spent much time thinking about how to use it. How can we extract value from it? How can we add this data to improve a business process?

“As a best practice, you first have to have a strategy,” he said. “Instead of indexing every single piece of data and hoping it might be useful someday, you first have to think about: ‘What kind of data do I have? What can be the value of this? What can I get rid of?’ There’s not a universal solution. It depends on what kinds of business the companies are in... what kind of vertical market do they service... what kinds of problems are they trying to solve...?”

Jerome talks about a really brave new world. So do the other writers in this White Paper. Please read on and join in.

“We want to connect people with their networks of people, associates and repositories, regardless of who they are and what tool they’re using. That was the origin of enterprise search.”
  4. 4. May 2013 S4 KMWorld

Intelligent Search for Big Data
Revolutionize Your Approach to Knowledge Management
By Jerome Levadoux, Senior Vice President of Products, HP Autonomy

Jerome Levadoux
As senior vice president of products for HP Autonomy, Jerome Levadoux is responsible for the development and execution of strategy and product offerings in the areas of big data, content analytics and content management. Prior to joining Autonomy, Jerome was senior vice president and general manager at SAP, where he was responsible for product direction, marketing, partnerships and business development for the IT Management Suite.

“The inability to leverage information ultimately reduces its value and hinders the business’ ability to compete.”

Ninety percent of the world’s data has been created in just the last two years. But when it comes to information, there is no immediate benefit to simply amassing exabytes of content. The real gains come when organizations are able to translate, understand and apply the insight that is contained within this flood of information.

One of the main challenges posed by today’s information explosion is that content is fragmented into disparate “silos,” such as file servers, collaboration suites, email systems and other repository types. Information is also being migrated to cloud deployments, effectively creating another silo. Traditional KM systems, however, are not equipped to derive intelligence from information scattered across different systems. Unfortunately, when an organization is unable to leverage information for its highest value, this can hinder its competitive advantage.

Getting the Most Value From Information

Within the volumes of big data, businesses today have more information than ever before about their employees, their competitors and their customers. This puts a greater emphasis on search capabilities to not only understand information generated by users, but also understand how the information flows and connects between users. In essence, users today each create their own unique social network. Systems that can leverage this shift enable businesses to get more from their information.

Traditional KM vendors have focused on the capture side of the equation: making people enter their information into document management systems, and relying on that process to provide intelligence. But that is not how people work today. To accommodate these changes, a different system is needed, one that mirrors the way people work and think.

In today’s information-rich organization, there are three key capabilities that an intelligent search technology must support to deliver effective knowledge management in the era of big data:

Build a knowledge graph of the organization by analyzing social networks and deriving people’s expertise based on employee behavior. This will expedite knowledge transfer, reduce duplicate efforts, and encourage a collaborative work environment. Generating a knowledge graph is a complex process, in the same way that people’s relationships are multi-faceted and ever-evolving; it cannot be owned by a single content management system. That’s why your search technology must understand relationships by analyzing a variety of information such as communication patterns, work groups, project hierarchies and other attributes.

Deliver contextualized search results personalized to the user. Without a context-aware solution, the same search query will mean different things to different people. The same search query could even mean different things to the same person when executed at different points of the day or on different devices. Your search technology should use context and profile data to not only personalize the delivery of content, but anticipate your needs and proactively push information.

Search across any repository from any device. Information is becoming increasingly fragmented, and the boundaries between personal and work productivity are blurring. In our BYOD world, people are putting personal and work items in Dropbox, Evernote, Yammer, Salesforce, Google Drive, and accessing content from desktops, mobile devices, tablets, “phablets” and any of the latest devices in the market. They are merging personal and work identities in their social networks. Your search technology must be able to access data from all systems, and understand the data in all its disparate forms.

Search for Today’s Worker On-the-Go

Today’s workers are increasingly on-the-go, embracing the newest mobile technologies to stay connected and productive, as they continue to fuel the migration of content to the cloud. In addition to driving the great shift of data to the cloud, there is also fragmentation of knowledge among multiple systems and repositories. Information today has many addresses. It lives in email, on mobile devices, in Dropbox and Evernote, and in whatever applications people may choose to install. The consumerization of content has also meant that devices and applications are used for both professional and personal purposes.

An important distinction to remember about information growth is that it is not just about documents. We are becoming a multimedia-focused world, watching and listening more and reading less. Many meetings are now conducted remotely over video, and training sessions are often recorded. This means your search technology is required to handle these new and
  5. 5. May 2013 S5 KMWorld

pervasive content formats. The number of files, images, records and other digital information is predicted to grow by a factor of 67 from 2009 to 2020, with corresponding growth of IT professionals globally by a wimpy 1.4.

By eliminating barriers between repositories, devices, communication channels and deployment choices, you free the worker to be fully engaged and productive wherever they are. This search experience should be seamless and yield consistent results. To experience real agility, you must be able to search any repository from any device, and then search any file, regardless of its format.

This means your search technology must first have access to the system. If you can’t search it, you can’t find it. Secondly, your search technology should be advanced enough to derive contextual and conceptual signals. If you store all facets of your life in the cloud, for instance, is your search technology smart enough to return work-related items at the top when you query a common keyword? Or at least categorize them according to concept? Can it separate the relevant items from the noise?

There are many advantages to uniting a fragmented data landscape. Most obviously, the ability to find an item quickly will increase a worker’s productivity. But it can also help organizations monitor social media or their email systems, and even analyze embedded or attached rich media, to flag anomalies and find confidential data.

Successful search technologies should operate like a conversation, tailored to the user. Much like in the real world, the phrase “tips for conflict resolution” can mean different things depending on the location or context. For instance, at a customer site, you would want to learn more about customer service-oriented advice. At the office, tips geared toward coworker or managerial resolutions. And at the vendor site, better communication of goals. For this reason, search technology should also respond differently based on your context. Given today’s paradigm of the ever-mobile professional, it’s easy to see how big data can use a wide range of contextual elements (time, location, content, even weather) when delivering search results. This idea is an important one for today’s organizations, one that can be applied in a wide range of business applications.

There is currently a movement underway in the search community to deliver more personalized, intent-based search. But what about going one step beyond and anticipating people’s needs? Here is an example: You arrive in the morning to find an email from your manager asking for a presentation about a corporate alliance formed before you joined the company. You see that your search engine has already begun working by locating and presenting conceptually relevant pieces of information on your desktop. The first piece is a video file. When you click to view the video, you are taken to the exact point in the video that discusses the alliance; there’s no sifting or time-wasting through irrelevant frames. You are also presented with pertinent press releases, a PowerPoint covering key points, and the partnership contract. At your fingertips is information you had no idea existed, which may not have included the exact metadata to produce the same result using a keyword search. Without an intelligent search technology, you may have asked colleagues, searched for files, and then viewed the entire video to get the kernel of information you needed.

Own Your Social Network and Build your Knowledge Graph

Knowledge is not counted by what is produced, but what is shared. Content represents only a subset of that knowledge. To truly maximize an organization’s intellectual assets, you must build a knowledge graph by identifying and continually updating people’s respective skills and social networks. When this happens, people can quickly leverage and share expertise, and connect and collaborate on similar projects. They can leverage existing work that might not live in an enterprise system, but perhaps in the expert’s personal storage. This type of socializing and accessibility encourages mentorship and leverages knowledge across the organization.

A knowledge graph is not something that a single content management system can own. People interact across different systems, and relationships are constantly evolving. But it is something that an intelligent search technology can infer. In the era of big data, we have abundant information regarding people’s content browsing, consumption and contribution habits. We know who emails with which group and which individuals. We know who is working cross-functionally on a certain project. We know who works on the same account team. Using this type of contextual data to deliver precise search results can change the course of business, if it can be understood quickly enough to make a difference.

People often exaggerate or misrepresent their level of expertise, or they just fail to keep their profile updated. But sophisticated technology can properly combine the self-professed profile with an automated analysis of content. Data will tell you not what someone says they know, but what they actually know; not who they claim to know, but who they actually know well. These types of connections can be used to add another layer of context to provide a better, more personalized search experience.

But understanding a knowledge network requires the ability to interact with influencers, those individuals who shape opinion within communities, the people you can turn to for support and insight. While this may sound simple, the challenge lies in analyzing significant volumes of data in near-real time to determine these relationships and influences in a manner that optimizes an employee’s search experience.

Search Should be Data-Driven

Data-driven predictions and decisions are gaining a lot of momentum and visibility. From Nate Silver’s accurate prediction of all 50 states’ results in the last presidential race, to the increasing use of statistics by law enforcement to combat crime, people are finding new ways to apply a data-focused methodology to conventional thinking. This same data-driven rigor should be applied to search. Search should not be a static experience, but constantly deliver customized conversations based on the data related to the user, the environment and the context.

When organizations choose search technology that understands the concepts and context of all information, regardless of where it resides, how it is accessed, or its format, it is possible to remove the irrelevant noise contained in much of big data. Employees that are able to leverage the wide array of information that exists inside and outside their enterprise can gain an immediate competitive advantage that you can only get from one source: your information, in all its forms.

“Today’s workers are increasingly on-the-go, embracing the newest mobile technologies to stay connected and productive.”
  6. 6. A recent Coveo survey of 120 executivesshows only 13% said employees can effec-tively tap into the collective knowledge oftheir organizations.There are three main reasons why:Information overload. Knowledge work-ers, says IDC, spend 15% to 35% of their timesearching for information. And, most peopledon’t know where to look, or how to askfor what they are seeking. This challenge isheightened by the explosion of big data.Inability to find information.With infor-mation scattered across an organization’srepositories, directories and intranets, employ-ees cannot easily locate information they needinordertomakecriticalbusinessdecisions.Theresult? Wasted search efforts and decisionsmade in the absence of information.Recreating knowledge that alreadyexists. Knowledge workers spend more timerecreating existing information than they doturning out information that does not alreadyexist. IDC suggests that 90% of the timeknowledge workers spend in creating newreports or other products is spent in recreat-ing information that already exists.Knowledge workers spend a lot of timelooking for and processing information, at ahigh cost. If we can make it easier foremployees to find the information they need,and gain new insights from it, organizationswill get a higher return on their greatestintangible—knowledge. Employee produc-tivity will rise and profits will soar.The Path to Return on KnowledgeReturn on knowledge is linked to yourpeople’s ability to access collective knowl-edge efficiently.Think of a world in which every piece ofinformation all employees need, from anyand all systems, is instantly organized,indexed and combined in ways consumersfind commonplace on the Internet, and yetcompanies have been unable to achieve.Unified indexing technology makes thisa reality. It is the least disruptive and mosteffective path to on-demand access torelevant knowledge. 
The technology democ-ratizes knowledge access by providing con-textually relevant content to every user andensuring that companies build on pastknowledge rather than recreating the wheel90% of the time.Advanced indexing technology tiestogether the vast variety of systems, both on-premise and in the cloud—email, databases,CRM, ERP, social media, file shares, etc.—to unify, normalize and enrich the informa-tion to uncover hidden relationships fornew insights.Here’s an example: A customer serviceteam with on-demand, actionable insightcan troubleshoot and solve customer prob-lems quickly and consistently—helping cus-tomers to get more value from your productsand services, and retaining their loyalty.There is fierce international competitionfor every dollar of profit in today’s globaleconomy. Organizations must treat knowl-edge and knowledge workers as strategicassets in order to compete and meet the chal-lenges of the future. Very few organizationsare far along the maturity curve in dealingwith big data, but the incentive is there.Increasingly, companies will differenti-ate themselves on the basis of what theyknow—by tapping into their return on col-lective knowledge, and by unlocking thevalue inherent in every company’s disparatesources of data. Gaining real-time, relevantinsight and knowledge is the best way toempower employees and help them performtheir jobs exceedingly well.Coveo’s advanced, Unified Indexing andInsight platform redefines how peopleaccess and share fragmented knowledgearound the social enterprise. Coveo bringstogether the collective and yet fragmentedinformation from cloud-based, social andon-premise systems, and injects it into thecontext of every user, every time. More than2 million people globally and more than 500companies use Coveo to achieve their busi-ness goals.Among Coveo customers are CATechnologies, L’Oreal Switzerland, Lock-heed Martin, YUM! Brands, GEICO andSunGard. 
For more information, visit, follow us on Twitter @coveo or like us on Facebook.

The Key to Return on Knowledge in a Big Data World

Advanced Indexing Technology

The following is an excerpt from the Coveo eBook, "Measuring Return on Knowledge in a Big Data World." for your free copy.

Big data. Unstructured data. Semi-structured data. Data is all over the technology news, and for good reason. It is overwhelming organizations, requiring them to find new ways to operate, stay competitive, better serve their customers and bring new products to market faster.

Companies are finding themselves with piles of information within multiple channels, locked away in silos: different systems, different departments, different geographies and different data types, making it impossible to connect the dots and make sense of critical business information.

Hidden inside streams of structured and unstructured data across cloud, social and on-premise systems are information relationships that answer questions employees haven't even thought to ask, but need to be asking.

The speed at which business moves today, combined with the sheer volume of data created by the digitized world, requires new approaches to deriving value from data and knowledge.

In this big data world, knowledge is your company's greatest asset and best differentiator. So how do you get a return on knowledge?

Leveraging Knowledge

How is collective knowledge being leveraged today? In a word: poorly.

Over the past decade, research firm IDC has regularly conducted research on what NOT finding information might cost an organization.
In IDC's most recent survey of more than 700 knowledge workers, the firm found that most companies are losing more than $50,000 per employee per year in productivity.

What's more, according to a 2000 study by University of Southern California's Marshall School of Business, just over 10% of people reported having access to "lessons learned" in other parts of their organization.

Excerpted from "Measuring Return on Knowledge in a Big Data World,"
  7. 7. identified. The elimination of end-user tagging and the resulting organizational ambiguity enables the enriched metadata to be used by any search engine index, for example, conceptSearch, SharePoint, Solr, Autonomy or Google Search Appliance.

Only when metadata is consistently accurate and trusted by the organization can improvements be achieved in text analytics, e-discovery and litigation support. In the exploding age of big data, and more specifically text analytics, sentiment analysis and even open source intelligence, the ability to harness the meaning of unstructured content in real time improves decision-making and enables organizations to proactively act with greater certainty on rapidly changing business complexities. To achieve an effective information governance strategy for unstructured content, results are predicated on the ability to find information and eliminate inappropriate information. The core enterprise search component must be able to incorporate and digest content from any repository, including faxes, scanned content, social sites (blogs, wikis, communities of interest, Twitter), emails, and websites. This provides a 360-degree corporate view of unstructured content, regardless of where it resides or how it was acquired.

Ensuring that the right information is available to end users and decision makers is fundamental to trusting the accuracy of the information, another key requirement in intelligent search.
Organizations can then find the descriptive needles in the haystack to gain competitive advantage and increase business agility. An intelligent metadata enabled solution for text analytics analyzes and extracts highly correlated concepts from very large document collections. This enables organizations to attain an ecosystem of semantics that delivers understandable and trusted results that are continually updated in real time.

Applying the concept of intelligent search to e-discovery and litigation, traditional information retrieval systems use "keyword searches" of text and metadata as a means of identifying and filtering documents. The challenges and costs of e-discovery and litigation support continue to escalate. The use of intelligent search reduces costs and alleviates many of the challenges. Content can be presented to knowledge professionals in a manner that enables them to more rapidly identify relevant information and increase accuracy. This approach has also been proven to reduce time and effort in collection and forensic investigations, early-case assessment, ESI processing and Web-based document review. Significant benefits can be achieved by removing the ambiguity in content and identifying concepts within a large corpus of information. This methodology delivers expediencies and reduces costs, offering an effective solution that overcomes many of the challenges typically not solved in e-discovery and litigation support.

The need for organizations to access and fully exploit the use of their unstructured content won't happen overnight.
Organizations must incorporate an approach that addresses the lack of an intelligent metadata infrastructure, which is the fundamental problem. Intelligent search, a by-product of the infrastructure, must encourage, not hamper, the use and reuse of information and be rapidly extendable to address text mining, sentiment analysis, e-discovery and litigation support. The additional components of auto-classification and taxonomies complete the core infrastructure to deploy intelligent metadata enabled solutions, including records management, data privacy, and migration. Search can no longer be evaluated on features, but on proven results that deliver insight into all unstructured content.

Concept Searching specializes in metadata generation, auto-classification and taxonomy management, and is a Microsoft managed partner with a Gold competency in Application Development. Its technologies encompass the entire portfolio of unstructured information, in on-premise, cloud or hybrid environments. Clients are using the technologies to improve search, records management, data privacy, migration and text analytics.

Solving the Inadequacies and Failures in Enterprise Search

The inability to identify the value in unstructured content is the primary challenge in any application that requires the use of metadata. If you aren't managing it, you won't find it. At the most basic level, enterprise search has become inadequate. Bells and whistles abound but the unsolved problem still exists. Search cannot find and deliver relevant information in the right context, at the right time. This laissez-faire approach, starting with executive management on down, illustrates the inability of organizations to elevate search to a key component and critical enabler for improving business outcomes. An information governance approach that creates the infrastructure framework to encompass automated intelligent metadata generation, auto-classification, and the use of goal- and mission-aligned taxonomies is required.
From this framework, intelligent metadata enabled solutions can be rapidly developed and implemented. Only then can organizations leverage their knowledge assets to support search, litigation, e-discovery, text mining, sentiment analysis and open source intelligence.

Manual tagging is still the primary approach used to describe content, and often lacks any alignment with enterprise business goals. This subjectivity and ambiguity carry over into search, resulting in inaccuracy and the inability to find relevant information across the enterprise. Metadata used by search engines may consist of end user tags or pre-defined tags, or may be generated using system defined metadata, keyword and proximity matching, extensive rule building, end-user ratings, or artificial intelligence. Typically, search engines provide no way to rapidly adapt to meet organizational needs or account for an organization's unique nomenclature.

More effective is implementing an enterprise metadata infrastructure that consistently generates intelligent metadata using concept identification. In this profoundly different approach, relevant documents, regardless of where they reside, will be retrieved even if they don't contain the exact search terms, because the concepts and relationships between similar content have been

One of the founders of Concept Searching, Martin Garland has more than 21 years' experience in ECM. His understanding of the information management landscape and his business acumen provide a foundation for guiding organizations to achieve their business objectives using best practices, industry experience and technology. Martin's expertise has been instrumental in assisting multinational clients in diverse industries to understand the value of managing unstructured content to improve business processes.

By Martin Garland, CEO, Concept Searching, Inc.
  8. 8. www.infotoday.com

Produced by: KMWorld Magazine Specialty Publishing Group

For information on participating in the next white paper in the "Best Practices" series, or • 561-483-5190

Kathryn Rogals: 561-483-5190
Paul Rosenlund: 561-483-5190
Andy Moore: 207-236-8524 Ext. andy_moore@kmworld.com

For more information on the companies who contributed to this white paper, visit their websites or contact them directly:

www.kmworld.com

Concept Searching, Inc.
8300 Greensboro Drive, Suite 800
McLean VA 22102
PH: 703.531.8567
Twitter: @conceptsearch
Contact: info-usa@conceptsearching.com
Web: www.conceptsearching.com

Coveo
Contact: info@coveo.com
Web: www.coveo.com

HP Autonomy
One Market Plaza
Spear Tower, Suite 1900
San Francisco CA 94105
PH: 415.243.9955
Contact: autonomyinfo@hp.com
Web: