Pingar - The Future of Text Analytics


Presentation given by the Pingar India office to Partners in India entering the Text Analytics Market.

Published in: Technology, Education

  1. The future of Text Analytics
  2. Agenda • Who is Chris • The problem • What is text analytics • Why use it • How text analytics evolved • Use cases • The FUTURE • Who is Pingar?
  3. Who is Chris • Just learned what Cricket is! • VP Marketing @ Pingar • Author in the area of Content Management • Twitter: @HoardingInfo
  4. Unstructured Data Problem • Unstructured content makes up 80% of all digital content * • The value of unstructured content diminishes exponentially after it is published • Metadata is key to making any use of a document after it is published (* ref: 2012)
  5. Why use it? • Without metadata, the time spent producing content is lost, and the content poses a risk for the organization • Extracting metadata without text analytics is a manual process, which is expensive and prone to human error and inconsistency
  6. What is Text Analytics • Technology that extracts value from unstructured content • Turns documents into Keywords and Entities - Metadata • Transforms unstructured to transactional
  7. Evolution of text analytics • Started appearing around 2003 • Initial engines were statistical • Accurate but lots of work • Modern engines use machine learning • Power of disambiguation & Linked Data • Several general purpose engines but mostly vertical solutions
  8. Use Cases
  9. Use Cases • Content Migration and Discovery • Content Classification and Organization • Internal Content Publishing
  10. Content Migration & Discovery - Problem: A large oil and gas company in the US was recently sued and lost ($ millions). Due to poor content control, documents left the organization that should not have. So the company decided to implement an ECM system. But 90% of the organization’s content is stored in a file share, the “Z” drive, and no one knows what is there. To move to ECM, they need to quickly analyze the file share to isolate relevant content, remove what is not relevant, and prepare for migration to ECM.
  11. Content Migration & Discovery - Solution: Analyze the file share to produce a list of content by type and relationships to other content. Determine what content is relevant and what content should be removed, and build an information architecture for a proper ECM platform. Visualize the content based on location, people, etc. to help gain insight and decide how to deal with the content to avoid future litigation.
  12. Content Migration & Discovery - Result • New ECM system with relevant content only • Purged non-relevant content • Better control which means less legal risk • Ability to make better business decisions
  13. Content Classification & Organization - Problem: One of the US’s largest commercial banks produces regular collateral and promotional materials. Because the resulting scripts and media files are poorly organized, the bank is duplicating effort on future campaigns and losing valuable and expensive content. It needs to improve the organization of these assets and the cross-pollination of information.
  14. Content Classification & Organization - Solution: Build a hierarchy of content, a taxonomy to be used to file content. As content is saved to the rich media content repository, have it automatically filed according to the taxonomy. Automatically generate search filters so navigation of the content is more efficient and fewer documents are missed by the team.
  15. Content Classification & Organization - Result • Users spend 50% less time finding content • Content is now organized by topic automatically • Save $750,000 a year in duplicated effort • Improve idea sharing
  16. Internal Content Publishing - Problem: One of the world’s largest chemical manufacturers has many R&D departments. As new chemicals are invented, scientists publish documents discussing the intellectual property of these inventions. The articles are to be published to other scientists so they can use the knowledge to further their research and development. The system for publishing this content is manual and costly. A highly paid chemical scientist has to manually tag and summarize articles before they are saved to a content management system. Scientists have to “search” for content they might find interesting, but they don’t always know what to look for. This is costly and prone to human error, and information is lost.
  17. Internal Content Publishing - Solution: Automatically tag, classify, and summarize content as it’s being published by scientists. Generate emails with summaries and links to articles. Send the emails to scientists based on their profile, showing only content that is relevant to them.
  18. Internal Content Publishing - Result • 70% cost reduction in publishing process • Content is published 150x faster • Scientists no longer have to search; content is pushed to them • The content auditors can focus on other responsibilities
  19. Text Analytics is increasing the value of unstructured content, reducing risk, and making organizations more efficient
  20. The future • Text Analytics will be mandatory for all organizations doing unified information access • Machine Learning Engines take over • BigData and BigContent join forces • The need for Language Scientists and Data Scientists increases • Buzz Words: Unified Information, Content Intelligence, BigContent
  21. Who Is Pingar? • The Text Analytics Subject Matter Experts • Helping you make money with a Text Analytics practice
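
As a rough illustration of the idea on slides 6 and 7 (turning a document into keyword metadata, the way early statistical engines did), here is a minimal Python sketch. It is not Pingar's engine or API: the function names, the tiny stopword list, and the simple term-frequency ranking are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real engine would use a far larger one.
STOPWORDS = {
    "a", "an", "and", "the", "of", "to", "in", "is", "it", "for",
    "on", "that", "this", "with", "as", "are", "be", "by",
}


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stopword terms in text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


def to_metadata(doc_id, text, top_n=3):
    """Wrap the extracted keywords in a simple metadata record (hypothetical shape)."""
    return {"id": doc_id, "keywords": extract_keywords(text, top_n)}


sample = (
    "Metadata matters. Without metadata, content is lost; "
    "good metadata keeps content useful."
)
print(to_metadata("doc-1", sample, top_n=2))
```

Counting word frequency is the crudest form of statistical extraction. The modern engines the deck describes add machine learning, entity recognition, and disambiguation on top of this kind of baseline.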