More Powerful Solr Search with Semaphore - Jeremy Bentley

See conference video - http://www.lucidimagination.com/devzone/events/conferences/ApacheLuceneEurocon2011

Metadata is widely understood to be a critical element of search, discovery and classification. But with the preponderance of unstructured data addressed by search technology, consistent native metadata is often in short supply. Organizations often find that the quality and depth of contextual metadata (what documents are about) can make or break search relevancy, precision and recall.

Published in: Technology

  1. Smartlogic™. Apache Lucene Eurocon. Jeremy Bentley, CEO
  2. 1st degree of order: Filing management • 80% of enterprise information is unstructured • Doubling every 19 months and accelerating [Gartner] • Increasing burden of compliance • Enterprise 2.0 additions
  3. 2nd degree of order: Index management • File plans and metadata schema • Mono-hierarchical standardised taxonomies • Manually applied classification • Low level of consistency and quality
  4. 3rd degree of order: Computerised 1st and 2nd degrees
  5. A 10 year Flatline Expectation Gap • 2001, IDC, “Quantifying Enterprise Search”: Searchers are successful in finding what they seek 50% of the time or less • 2011, MindMetre/SmartLogic: More than half (52%) cannot find the information they need using their Enterprise search system
  6. The explosion of information. Terabytes of data: 4Tb (1993-2001), 80Tb (2001-2009), then “?”. A 20 times increase in information volume. Source: the National Archives
  7. Search Gets Harder as Data sets Grow. Circa 1996
  8. Different vocabulary and ambiguity. Missing results (You Say / I Say): Moon Buggy / Lunar Roving Vehicle, Manned Lunar Surface Vehicle; Swine Flu / Swine Influenza Virus, H1N1; Touchscreen / Touch screen, Multi-touch. Too many results (You Say / What do you mean?): Apple / A fruit? Fiona, a singer/songwriter? An electronics company?; Rights / Employment rights? Equal rights? Right of way?; Ford / Ford Motor? Forward Industrials (ticker=FORD)? A shallow river crossing?
  9. Conventional Search - Ineffective, Frustrating, and Inadequate. Drawbacks. Apparent: 1 Needle in the Haystack; 2 Multiple search terms; 3 Irrelevant results; 4 Out of date results; 5 Multiple media forms; 6 Unrestricted geography; 7 Inappropriate ads. Not So Apparent: 8 Can’t filter, select subset; 9 No related topics; 10 Missing results; 11 No context or guidance; 12 Best resource not clear. ✓ Time consuming ✓ Inefficient ✓ Ineffective
  10. Knowing what you have
  11. Paradox of Effort. Metadata is to search, what pistons are to a petrol engine. Web: metadata effort High, result quality requirement Low. Enterprise: metadata effort Low, result quality requirement High
  12. How do I structure it? Categories: Information, Process, Structural. Attributes: Subject, Creation Date, Location, Modified Date, Project, Author, Function (IT, HR, Finance), Format (PDF, DOC, XLS), Protective Marker, Expiry, Publisher, Expert, Retention, Site
  13. 3rd degree content universe: Enterprise Content Management, Search, Portal Infrastructure, Document Management, Social collaboration, Records Management, Publishing Systems, Process Management & Workflow, Digital Asset Management, eDiscovery
  14. 4th degree of order: Content Intelligence, with Enterprise Content Management, Search, Portal Infrastructure, Document Management, Social collaboration, Records Management, Publishing Systems, Process Management & Workflow, Digital Asset Management, eDiscovery
  15. 4th degree of order: Content Intelligence. Content Intelligence Platform, Solr
  16. Semaphore: Business Vocabulary, Expose, Apply, Classification, User, Decision, Action, Inform. Copyright @ 2011 Smartlogic Semaphore Limited
  17. Semaphore: Business Vocabulary, Semantic models, Expose, Apply, Metadata, Semantic Software, Classification, User, Decision, Action, Inform, Contextual User Experience
  18. Components • Metadata • Semantic Models • Contextual User Experience • Semantic Software
  19. Metadata (Today / With Content Intelligence): Manual process / Automatic process; Single unified ‘one size fits all’ approach / Multiple approaches for various domains/audiences; Long time to craft & build, manually applied / Short time to build & deploy, automatically; Low quality tags / High quality tags; High cost to apply / Low cost to apply
  20. Semantic Models. Organising: Parent topics (Automotive sector, Bond issuers); Preferred term (Agreed Label): Ford Motor Company; Also known as (Ford, Ford Motor, F (Bloomberg), FoMoCo, blue oval); Subsidiaries (Ford Motor Credit Company, Mazda); Products (Focus, Ka, MX5). Contextualising: Content-types available (Flashnotes, Research reports, Trade ideas); Covered by (Bob Smith); Analytics available (Current bond price, Relative bond spreads); Influenced by (Credit ratings on Ford Motor Credit Company, European and US economies, Changes in consumer demand); Location of fundamental data (Earnings estimates, Historic sales and profits); Key competitors (BMW, Daimler Chrysler, General Motors, Toyota, Volkswagen). Harnessing: Automate compliance and distribution tasks (‘Watch list’ lookup, Distribution according to preset rules, Automated mapping to create aggregator metadata); User Experience (Conceptual relevance, Related topics, Links to analytics); Search engine enhancement (Search results, Email alerts); Unstructured content integration (Published reports, Related topics, Links to analytics, Search results, Email alerts)
  21. Contextual User Experience. Key Features: 1 Taxonomy enables discovery, related searches; 2 Related topics and content; 3 Facets enable filtering results by: 4 Source, 5 Numerous topics, 6 Date; 7 Best Bets; 8 Automated doc. tagging; 9 A-Z. ✓ More relevant results ✓ Fewer “bad hits” ✓ Powerful navigation
  22. Content Exploration: Highlighting relationships in a result set greatly improves the user experience.
  23. Semantic Software. Semaphore: Ontology & Metadata Management; Text Analysis & Extraction; Automatic and assisted Content classification; Contextual Navigation Services; Semantic Reasoning & Processing
  24. Semaphore Search Integration. Components shown: Ontology Manager, Classification Rules, Classification Server, Search Enhancement Server, Local Term Index, Text Miner, Web Services API, XML API, Ontology Information, Document “Tags”, Extracted Text, Sample Interface Code, User Requests, Query, Index, Collector/Normalizer, Search Application Framework, Portal, Search Engine, Corpus. Legend: Semaphore core module, Semaphore optional module
  25. 4th degree of order: Content Intelligence, with Enterprise Content Management, Search, Portal Infrastructure, Document Management, Social collaboration, Records Management, Publishing Systems, Process Management & Workflow, Digital Asset Management, eDiscovery
  26. Content Intelligence: Information Manufacturing; Monetisation; Knowledge Metadata Recovery; Data Loss Prevention; Risk & Compliance; Content Analytics
  27. Content Intelligent Solutions: Micro-Targeting & Distribution; Web Self Service; Knowledge Acquisition & Recovery; Governance, Risk, Compliance; Cross Platform Content Integration
  28. www.smartlogic.com
  29. Smartlogic™. Jeremy.Bentley@Smartlogic.com, www.smartlogic.com
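
The vocabulary mismatch on slide 8 and the "Preferred term / Also known as" labels on slide 20 suggest how a business vocabulary could widen a query before it reaches the search engine. The following is a minimal sketch of that idea, not Semaphore's actual implementation: the `Concept` class, the `expand_query` function, and the choice of which label counts as preferred are all assumptions made for illustration. The output is a plain quoted-phrase OR string of the kind a Lucene-style query parser accepts.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Concept:
    """One vocabulary entry: an agreed label plus its alternative labels."""
    preferred: str
    alt_labels: List[str] = field(default_factory=list)


# Hypothetical vocabulary built from the slide 8 examples.
# Which label is "preferred" is an assumption, not stated in the slides.
VOCABULARY = [
    Concept("Lunar Roving Vehicle", ["Moon Buggy", "Manned Lunar Surface Vehicle"]),
    Concept("Swine Influenza Virus", ["Swine Flu", "H1N1"]),
    Concept("Touchscreen", ["Touch screen", "Multi-touch"]),
]


def build_lookup(vocabulary: List[Concept]) -> Dict[str, Concept]:
    """Map every label (lower-cased) to the concept it belongs to."""
    lookup: Dict[str, Concept] = {}
    for concept in vocabulary:
        for label in [concept.preferred, *concept.alt_labels]:
            lookup[label.lower()] = concept
    return lookup


def expand_query(user_query: str, vocabulary: List[Concept] = VOCABULARY) -> str:
    """Return a quoted-phrase OR query covering all labels of the matched concept.

    If the query matches no label, it is returned unchanged as a quoted phrase.
    """
    term = user_query.strip()
    concept = build_lookup(vocabulary).get(term.lower())
    if concept is None:
        return f'"{term}"'
    labels = [concept.preferred, *concept.alt_labels]
    return " OR ".join(f'"{label}"' for label in labels)


if __name__ == "__main__":
    print(expand_query("moon buggy"))
```

A searcher who types "moon buggy" would then also match documents that only say "Lunar Roving Vehicle", which addresses the "missing results" half of slide 8. The ambiguity half (Apple, Rights, Ford) needs a different treatment, such as asking the user which concept was meant, and is not covered by this sketch.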

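Slides 19 and 24 describe classification rules that produce document "tags" automatically before content is indexed. A rough sketch of that step, under stated assumptions, is shown here: the rule format (a concept name mapped to evidence phrases), the `classify` and `to_index_document` functions, and the `topics` field name are all hypothetical and are not taken from Semaphore or Solr. No search engine API is called; the function only builds the dictionary that an indexing client could send.

```python
import re
from typing import Dict, List

# Hypothetical rules: concept name -> phrases that count as evidence for it.
RULES: Dict[str, List[str]] = {
    "Swine Influenza Virus": ["swine flu", "swine influenza", "h1n1"],
    "Lunar Roving Vehicle": ["moon buggy", "lunar roving vehicle"],
}


def classify(text: str, rules: Dict[str, List[str]] = RULES) -> List[str]:
    """Return the sorted list of concepts whose evidence phrases occur in text."""
    lowered = text.lower()
    tags = []
    for concept, phrases in rules.items():
        # Word boundaries avoid matching a phrase inside a longer word.
        if any(re.search(r"\b" + re.escape(p) + r"\b", lowered) for p in phrases):
            tags.append(concept)
    return sorted(tags)


def to_index_document(doc_id: str, text: str) -> dict:
    """Build an index-ready record; 'topics' is an assumed field name."""
    return {"id": doc_id, "text": text, "topics": classify(text)}


if __name__ == "__main__":
    print(to_index_document("d1", "Apollo 15 carried a moon buggy."))
```

Storing the tags in their own field is what would let a search front end offer the topic facets listed on slide 21, since the engine can count and filter on that field independently of the full text.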