II-SDV 2014 A New Approach to Flexible, Meaning-Rich Document Parsing (Paul Barba -- Lexalytics, USA)

Published in: Software, Technology, Education

  1. Nice And how to get what you need.
  2. Lexalytics is
     • A software company
     • We sell the “Salience Engine”
     • Salience is a Text Analytics Engine that fits into your software, services, or applications
     • What we ship is a set of libraries and configuration files
     © 2014 Lexalytics Inc. All rights reserved. lexalytics.com (Salience 5.2)
  3. Market Proven IP: 11 Years of R&D
     Approximately 3 billion documents/day go through Salience.
     Timeline (axis: 2003 to 2014):
     • 2004: Lexalytics launches first commercial text and sentiment analysis engine, Salience v1.0
     • 10/2008: Salience 4.0 released, based on maximum entropy model for detection and labeling of novel entities
     • 08/2010: Salience 4.3 to include custom handling of Twitter and micro-blog content
     • 11/2010: Salience 4.4 released, includes support for first non-English language (French)
     • 10/2011: Salience v5.0 incorporates innovative Concept Matrix functionality
     • 2/2012: Mobile Functionality: port the Salience engine to Android mobile devices
     • 06/2012: Salience v5.1 released, expansion of available options and optimized sentiment analysis functionality
     • 08/2013: Chinese language released; multi-lingual support in 6 languages
     • Q4/2014: Salience v5.2 released with various feature enhancements
     • Q4/2014: Salience v6: new underpinnings, easier tuning, and “Intent” extraction
  4. A Multi-lingual World WLOA (With Lots of Acronyms) and Context Everywhere
     NLP
     • New Labor Party
     • National Landcare Program
     • Network Layer Packet
     • NeuroLinguistic Programming
     Wicked • Sick • Hack
     Lexalytics Salience Training prepared for Analytics8
  5. Always running to catch up…
  6. New Tools
  7. God Bless Moore’s Law and Librarians
  8. Unsupervised learning is the key
  9. Meaning Matters
     “It’s not that I don’t like tea, I just prefer coffee.”
  10. Meaning Matters
     “Jane will be joining a team already with a search expert.”
  11. Meaning Matters
     “Jane will be joining a team already with some search experience.”
  12. Episode 4: A New Hope
     Diagram labels: Sentence, POS Tagger, Chunker, Rules File, Candidate Parse Terms
  13. Jane and her team
     <Jane will be joining a team already with search experience>
     • Pos Tag
       <Jane_NNP will_MD be_VB joining_VBP a_DT team_NN already_RB with_PP search_JJ experience_NN>
     • Chunk
       <Jane> <will be joining> <a team> <already with search experience>
     • Extract possible links
       Jane => will be joining
       will be joining => a team
       a team => already with search experience
       will be joining => already with search experience
       Jane => already with search experience
  14. Matrices of Meaning
  15. Matrix Math (labels: All noun phrases; All verb phrases)
  16. Now look at how easy it is
     • <Do you want me to get anything else while I go to the store for milk?>
     • Pos tag and chunk it.
       <Do> <you> <want> <me> <to get> <anything else> <while> <I> <go> <to the store> <for milk>
     • Find the possible links.
       do => want
       you => want
       want => me
       you => to get
       want => to get
       me => to get
       to get => anything else
       want => while
       to get => while
       while => go
       I => go
       go => to the store
       I => to the store
       get => to the store
       want => to the store
       to the store => for milk
       go => for milk
       want => for milk
  17. A world of new possibilities
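
Slides 12, 13 and 16 describe a pipeline that POS-tags a sentence, chunks it, and then extracts possible links between chunks. The Python sketch below illustrates that flow on the slide 13 sentence. It is a toy under stated assumptions and not the Salience implementation: the function names (`parse_tagged`, `tag_class`, `chunk`, `candidate_links`), the three coarse tag classes, and the "a preposition/adverb chunk absorbs following noun material" rule are all hypothetical, chosen only so that the slide's example chunks come out. The contents of the rules file are not given in the slides, so the link step simply enumerates every ordered pair of chunks.

```python
from itertools import combinations


def parse_tagged(text):
    """Split 'word_TAG' tokens (the slide 13 notation) into (word, tag) pairs."""
    return [tuple(token.rsplit("_", 1)) for token in text.split()]


def tag_class(tag):
    """Map a POS tag to a coarse class. This grouping is an assumption."""
    if tag == "MD" or tag.startswith("VB"):
        return "V"  # verb group: modals and verbs
    if tag in {"RB", "PP", "IN"}:
        return "P"  # adverb/preposition: opens a modifier chunk
    return "N"      # everything else is treated as noun-phrase material


def chunk(tagged):
    """Group consecutive tokens into chunks with a toy rule set."""
    chunks = []
    current = None  # class of the chunk being built
    for word, tag in tagged:
        cls = tag_class(tag)
        # Continue the chunk if the class matches, or if a "P" chunk
        # is absorbing the noun material that follows it.
        if cls == current or (current == "P" and cls == "N"):
            chunks[-1].append(word)
        else:
            chunks.append([word])
            current = cls
    return [" ".join(words) for words in chunks]


def candidate_links(chunks):
    """Enumerate every ordered (earlier, later) chunk pair.

    A real rules file would prune this list; those rules are not
    specified in the slides, so no pruning is attempted here.
    """
    return [(chunks[i], chunks[j])
            for i, j in combinations(range(len(chunks)), 2)]


tagged = parse_tagged(
    "Jane_NNP will_MD be_VB joining_VBP a_DT team_NN "
    "already_RB with_PP search_JJ experience_NN"
)
chunks = chunk(tagged)
links = candidate_links(chunks)
for left, right in links:
    print(f"{left} => {right}")
```

On this sentence the sketch yields the four chunks shown on slide 13 and six candidate pairs. Slide 13 lists five of those six (it omits `Jane => a team`), which suggests the rules file filters the candidates in some way the slides do not spell out.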
