The Next-Generation SharePoint: Powered by Text Analytics
Usage Rights

© All Rights Reserved

  • What are your primary applications where text comes into play?

Presentation Transcript

  • Alyona Medelyan (Pingar) @zelandiya
    THE NEXT-GENERATION SHAREPOINT: POWERED BY TEXT ANALYTICS
  • AGENDA
    • Information tasks
    • Text analytics
    • APIs
    • Demos
    • Conclusions
  • Information tasks
    What do they cost us?
    How does SharePoint help?
  • [Chart] Avg. hours per week: 14.5, 13.3, 9.6, 9.5, 8.8, 8.3, 6.8, 6.7, 5.6, 5.6, 4.3, 4.2, 1
    = $37K / year / person
    Source: IDC, Hidden Cost of Information (2005)
  • SHAREPOINT SAVES TIME
    • Interact with SP from Outlook
    • Create docs collaboratively
    • Customize search configuration
    • Use sites, sets & libraries
    • Define Managed Metadata
    • Configure forms
    • Design Workflow
  • Text Analytics
    What is it and how does it work?
    What tasks does it solve?
  • WHAT IS TEXT ANALYTICS?
    Unstructured data, Linguistics, Search, Statistics, Data Extraction, Text Processing, Document Organization, Machine Learning, Business Intelligence, Natural Language Processing, Opinion Mining, Text Mining
  • TEXT ANALYTICS SAVES MORE TIME
    • Compose search reports
    • Extract entities
    • Mine opinions & sentiment
    • Cluster search results
    • Redact
    • Summarize
    • Generate metadata
    • Fill databases
    • Profanity check
    … automatically
  • Text Analytics Software
    What companies offer text analytics?
    What are open source tools like?
  • TEXT ANALYTICS: GLOBAL PERSPECTIVE
    User adoption grew by 25% in 2010, creating an $835 million market, because:
    • Unstructured data grows (e.g. social) → Text analytics!
    • Text analytics is central to effective information access
    • Many successes in NLP: IBM Watson, Wolfram Alpha
    Full report by Seth Grimes: http://altaplana.com/TA2011
  • APPLICATIONS OF TEXT ANALYTICS
    • Search & info access 39%
    • Customer experience management 39%
    • Brand management 39%
    • Research 36%
    • Competitive intelligence 33%
    • Customer service 26%
    • E-discovery 15%
    • Life sciences 15%
    • Product design 15%
    • Online commerce 11%
    • Finance 10%
    • Other 9%
    • Content management 8%
    • Insurance & fraud 8%
    • Military intelligence 7%
    • Law enforcement 6%
    Source: http://altaplana.com/TA2011
  • SEARCH & INFO ACCESS METADATA EXTRACTIONDocument Easy to extract: Metadata File type, name & location, creation & modification date, authors Difficult to extract: Keywords, people & companies mentioned, suppliers & addresses mentioned
  • SEARCH & INFO ACCESS: KEYWORD EXTRACTION
    Document → Candidates → Keywords
    "Hi All, As of today, MetaStock has several new functions. The most important new feature is the ability to display forward heat rate charts. Also, notice that the interface looks different -- this reflects and accommodates the new features. If you have any questions regarding this new version of MetaStock, please contact Bella Santuri."
  • SEARCH & INFO ACCESS: KEYWORD EXTRACTION
    Document → Candidates → Properties → Keywords
    Properties: Frequency, Position, Corpus stats, Relatedness
    "Hi All, As of today, MetaStock has several new functions. The most important new feature is the ability to display forward heat rate charts. Also, notice that the interface looks different -- this reflects and accommodates the new features. If you have any questions regarding this new version of MetaStock, please contact Bella Santuri."
  • SEARCH & INFO ACCESS: KEYWORD EXTRACTION
    Document → Candidates → Properties → Scoring → Keywords
    Scoring: Heuristic scoring, Machine learning
    "Hi All, As of today, MetaStock has several new functions. The most important new feature is the ability to display forward heat rate charts. Also, notice that the interface looks different -- this reflects and accommodates the new features. If you have any questions regarding this new version of MetaStock, please contact Bella Santuri."
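
The keyword extraction stages named on the slides above (candidates, properties, scoring) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny stopword list, the function names, and the scoring formula (frequency weighted by how early a phrase first appears). It covers only the "Frequency" and "Position" properties with heuristic scoring; corpus statistics, relatedness, and machine learning are left out, and it is not a description of any vendor's actual method.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not a real linguistic resource.
STOPWORDS = {"a", "all", "also", "and", "any", "as", "has", "have", "hi",
             "if", "is", "of", "please", "that", "the", "this", "to", "you"}

EMAIL = (
    "Hi All, As of today, MetaStock has several new functions. "
    "The most important new feature is the ability to display forward "
    "heat rate charts. Also, notice that the interface looks different -- "
    "this reflects and accommodates the new features. If you have any "
    "questions regarding this new version of MetaStock, please contact "
    "Bella Santuri."
)

def candidates(text, max_len=3):
    """Yield (phrase, relative_position) for each 1..max_len word n-gram
    that neither starts nor ends with a stopword."""
    words = re.findall(r"[A-Za-z]+", text)
    for i in range(len(words)):
        for n in range(1, max_len + 1):
            gram = [w.lower() for w in words[i:i + n]]
            if len(gram) < n:  # ran past the end of the text
                break
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            yield " ".join(gram), i / len(words)

def extract_keywords(text, top_n=5):
    """Rank candidates by a heuristic: frequency * (1 - first position)."""
    freq = Counter()
    first_pos = {}
    for phrase, pos in candidates(text):
        freq[phrase] += 1
        first_pos.setdefault(phrase, pos)  # keep the earliest occurrence
    scores = {p: freq[p] * (1.0 - first_pos[p]) for p in freq}
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:top_n]

print(extract_keywords(EMAIL))
```

A heuristic this simple also ranks generic words such as "new" highly, which is one reason the slides list corpus statistics and machine learning as further ingredients.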
  • SEARCH & INFO ACCESS: NAMES EXTRACTION
    Document → Examples → Properties → Learning → Names
    Examples: Training data (annotations)
    Properties: NLP, Heuristics, Text mining
    Learning: Machine Learning
    "If you have any questions regarding this new version of MetaStock, please contact Bella Santuri."
  • <SEARCH + TEXT ANALYTICS> COMPANIES: Pingar, BasisTech, AlchemyAPI, LanguageComputer, OpenCalais, Extractiv
  • BRAND & CUSTOMER MANAGEMENT → SENTIMENT ANALYSIS
    Documents (Reviews, Tweets, Surveys) → Sentiment Analysis → Visualization, Summary
    Naïve approach: Sentiment-words dictionary!
    Negative: suck, terrible, awful
    Positive: fantastic, excellent, awesome
    BUT: "If you are reading this because it is your darling fragrance, please wear it at home exclusively, and tape the windows shut."
    No sentiment words!
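
The naïve dictionary approach from the slide above, and the way it fails on the fragrance review, can be sketched in a few lines of Python. The word lists are the six words shown on the slide; the function name and the "positive / negative / neutral" labels are assumptions made for this illustration.

```python
import re

# Word lists taken from the slide; a real lexicon would be far larger.
NEGATIVE = {"suck", "terrible", "awful"}
POSITIVE = {"fantastic", "excellent", "awesome"}

def naive_sentiment(text):
    """Count dictionary hits and label the text by the sign of the balance."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

REVIEW = ("If you are reading this because it is your darling fragrance, "
          "please wear it at home exclusively, and tape the windows shut.")

print(naive_sentiment("The support was excellent"))  # positive
print(naive_sentiment(REVIEW))                        # neutral
```

The review is clearly negative to a human reader, yet it contains none of the dictionary words, so the lookup returns "neutral".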
  • BRAND & CUSTOMER MANAGEMENT → SENTIMENT ANALYSIS
    Documents (Reviews, Tweets, Surveys) → Examples → Properties → Learning → Visualization, Summary
    Examples: Training data (annotations)
    Properties: Presence, Position, Part-of-Speech, Negation
    Learning: Lexicon induction, Machine Learning, Generalization
    Important:
    • Identifying sentiment bearing sentences
    • Attaching sentiment to a topic!
  • SENTIMENT ANALYSIS COMPANIES: Attensity, AlchemyAPI, Lexalytics, Saplo, Medallia, SAS
  • RESEARCH → TEXT SUMMARIZATION
    Address: Hi All,
    Announcement: As of today, MetaStock has several new functions.
    Details: The most important new feature is the ability to display forward heat rate charts.
    More details: Also, notice that the interface looks different -- this reflects and accommodates the new features.
    Conclusion: If you have any questions regarding this new version of MetaStock, please contact Bella Santuri.
    Extractive summary: As of today, MetaStock has several new functions.
    Sentence compression: MetaStock has several new functions. The new interface looks different.
    Abstractive summary: MetaStock has new features and a new interface.
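
Of the three summary types on the slide above, the extractive one is the simplest to sketch: pick whole sentences from the document, unchanged. The Python sketch below scores each sentence by the average document-wide frequency of its words and returns the top ones in their original order. The scoring rule and function names are assumptions chosen for brevity (there is no stopword handling, for instance), so it will not necessarily select the same sentence as the slide.

```python
import re
from collections import Counter

# The e-mail from the slide, split into the segments the slide labels.
SEGMENTS = [
    "Hi All,",
    "As of today, MetaStock has several new functions.",
    "The most important new feature is the ability to display forward "
    "heat rate charts.",
    "Also, notice that the interface looks different -- this reflects and "
    "accommodates the new features.",
    "If you have any questions regarding this new version of MetaStock, "
    "please contact Bella Santuri.",
]

def tokens(sentence):
    return re.findall(r"[a-z]+", sentence.lower())

def extractive_summary(sentences, n=1):
    """Return the n highest-scoring sentences, kept in document order."""
    freq = Counter(w for s in sentences for w in tokens(s))

    def score(sentence):
        ws = tokens(sentence)
        # Average frequency, so long sentences are not favoured by length alone.
        return sum(freq[w] for w in ws) / len(ws) if ws else 0.0

    by_score = sorted(range(len(sentences)),
                      key=lambda i: -score(sentences[i]))
    return [sentences[i] for i in sorted(by_score[:n])]

for sentence in extractive_summary(SEGMENTS, n=2):
    print(sentence)
```

Sentence compression and abstractive summarization rewrite the text and need considerably more machinery than this.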
  • TEXT SUMMARIZATION COMPANIES: Lexalytics, Pingar
  • COMPETITIVE INTELLIGENCE: ENTITY & ENTITY RELATION EXTRACTION
    Companies: OpenCalais, Extractiv, Pingar, Evri, AlchemyAPI, Zemanta
  • FRAUD INVESTIGATION: NORMALIZATION OF DATES & NAMES
    Companies: Cicero, BasisTech
  • OPEN-SOURCE TOOLS
    • NLTK – Apache license, Book, Python & academic datasets, nltk.org
    • LingPipe – Commercial licenses, Tutorials, Coreference & Chinese segment, alias-i.com/lingpipe
    • OpenNLP – Apache license, Parsing, MaxEnt ML, incubator.apache.org/opennlp
    • GATE – restricted GPL, Training courses, Applications & framework, gate.ac.uk
    • Stanford NLP – full GPL, Online docs, Full library, nlp.stanford.edu
  • APIs
    What’s an API and how does it work?
    What are the advantages of the API model?
    Which API is the right one for you?
  • Developer → API → ENGINE
    Developer: creates an application
    API: an interface that ensures communication
    ENGINE: software engine that solves a specific task
    API ACCESS:
    • a protocol specifies how XML needs to be encoded (SOAP, REST)
    • a call is an XML message describing the request; includes API authentication
    • calls via a web service
    • SDK usage examples
  • REST API ACCESS FROM A BROWSER
    API request: http://search.yahooapis.com/WebSearchService/V1/webSearch?appid=YahooDemo&query=madonna&context=Italian+sculptors+and+painters+of+the+renaissance+favored+the+Virgin+Mary+for+inspiration
    API response
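
The request URL on the slide above is an endpoint followed by URL-encoded query parameters. As a small sketch, the same URL can be assembled in Python with the standard library. The endpoint and parameter values are copied from the slide; the sketch only builds and prints the URL and makes no network call, since the availability of that demo service is not something this transcript establishes.

```python
from urllib.parse import urlencode

# Endpoint and parameters exactly as shown on the slide.
BASE = "http://search.yahooapis.com/WebSearchService/V1/webSearch"
params = {
    "appid": "YahooDemo",  # API authentication: the application ID
    "query": "madonna",    # the search query
    "context": ("Italian sculptors and painters of the renaissance "
                "favored the Virgin Mary for inspiration"),
}

# urlencode() escapes each value and joins the pairs with "&";
# spaces become "+", as in the URL on the slide.
url = BASE + "?" + urlencode(params)
print(url)
```

Pasting the resulting URL into a browser address bar is what "REST API access from a browser" amounts to: the whole request lives in the URL.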
  • SOAP API ACCESS FROM VS2010
  • SOAP API ACCESS IN POWERSHELL
    Read complete blog post “Bulk metadata extraction in SharePoint”: http://bit.ly/powershell-migrate
  • API = EASY INTEGRATION & FLEXIBILITY
    • Integrate into existing architecture via any programming language
    • Improve known flaws in the current system/process
    • Minimize adoption barriers within the company: little or no training required for staff
    • Only pay for the features you need
    • Flexible deployment:
      • Host API on site = Secure data exchange
      • Access the API in the cloud = Save on tech support & hardware
  • WHICH API IS BEST FOR YOU?
    I need to take some text and get a list of the important entities/keywords/phrases.
    APIs compared: Y: Term Extractor, OpenCalais, BeliefNetworks, OpenAmplify, AlchemyAPI (2nd), Evri (1st)
    Criteria: API restrictions, Supported languages, Quality of results, Semantic links, Synonyms/Duplicates
    Blog post on API comparison: faganm.com/blog
  • HOW TO CHOOSE AN API:
    • Define a specific task
    • Think of what features are important
    • Get prepared:
      • Subscribe for API keys
      • Get SDKs
      • Learn libraries
    • Find representative data
    • Build a test framework
    • Compare results
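
The last two steps above, "Build a test framework" and "Compare results", could look roughly like the following Python sketch for a keyword extraction task. It scores each API's output against a hand-labelled gold set using precision, recall and F1. The API names `api_a` and `api_b`, their outputs, and the gold keywords are all made up for illustration; in practice the outputs would come from real API calls on representative data.

```python
def precision_recall_f1(predicted, gold):
    """Compare two keyword collections case-insensitively."""
    predicted = {p.lower() for p in predicted}
    gold = {g.lower() for g in gold}
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical gold standard for the MetaStock e-mail used on the slides.
gold = {"MetaStock", "heat rate charts", "Bella Santuri"}

# Hypothetical outputs of two candidate APIs for the same document.
outputs = {
    "api_a": ["MetaStock", "new functions", "Bella Santuri"],
    "api_b": ["MetaStock", "interface"],
}

results = {name: precision_recall_f1(kws, gold) for name, kws in outputs.items()}
for name, (p, r, f1) in results.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")

best = max(results, key=lambda name: results[name][2])
print("best by F1:", best)
```

A single document is only enough to show the mechanics; a meaningful comparison averages these scores over a representative set of documents.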
  • METADATA EXTRACTION IN SHAREPOINT
    Demo: Pingar’s add-on for SharePoint 2010, built using a text analytics API
  • INTEGRATING APIS INTO SCANNING
    Video: Using Fuji Xerox SmartConnect and Pingar API to scan documents in batch into SharePoint
    http://www.youtube.com/watch?v=kluVp25upag
  • THE NEXT-GENERATION SHAREPOINT: POWERED BY TEXT ANALYTICS
    • What can be automated?
      • Metadata extraction, Data entry, Opinion mining, Sanitization, Doc approval, Summarization, …
    • How to integrate text analytics into existing SharePoint applications?
      • Easy! Via an API
    • How to find the right text analytics API?
      • Review what’s available → Set up an experiment → Compare results
  • Thank you to all of our Sponsors