PLAT-11 Moving from Lucene to Alfresco FTS

Your questions answered: Why move? What changes? How do I migrate? What else do I get? Plus some tips on date queries, CMIS and Alfresco FTS, and how to extend search in Share.

Published in: Technology, Education

  1. Moving from Lucene to Alfresco FTS
     Andy Hind • Senior Developer • twitter @andy_hind
  2. Agenda
     •  Why move?
     •  What changes?
     •  How do I migrate?
     •  Some more detail …
     •  Some tips along the way …
        o  Dates, CMIS, Share extension, relevancy
  3. Why move?
     •  New functionality
     •  You do not lose anything
        o  FIELDS are common
     •  Consider CMIS SQL
        o  JOIN
     •  Includes Google syntax
  4. Why move? …
     •  Pattern/exact vs FTS/expansion
        o  =name:"doc" ~name:"doc"
        o  Fields tokenised as “both”
     •  Templates (Share query extension)
     •  Less (different) escaping
        o  Escape any character including whitespace
        o  Java unicode escape sequences
  5. Why move? …
     •  Embedded in CMIS QL
        o  contains
     •  Ranges
        o  Google style: 0..5 (0 <= x <= 5)
        o  Asymmetric: <0 TO 5] (0 < x <= 5)
     •  Proximity
        o  TEXT:(big * apple)
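
The range forms on this slide can be read as inclusive or exclusive bounds. The following is a minimal sketch of that reading only, not Alfresco's parser: `Range`, `parse_range` and `contains` are hypothetical names, and the sketch assumes `[`/`]` mark inclusive ends, `<`/`>` mark exclusive ends, and `a..b` is inclusive at both ends, as the slide's examples suggest.

```python
import re
from typing import NamedTuple


class Range(NamedTuple):
    low: str
    high: str
    low_inclusive: bool
    high_inclusive: bool


# Google style: 0..5
_GOOGLE = re.compile(r"^(\S+?)\.\.(\S+)$")
# Bracket style: [0 TO 5], <0 TO 5], [0 TO 5>, <0 TO 5>
_BRACKET = re.compile(r"^([\[<])\s*(\S+)\s+TO\s+(\S+?)\s*([\]>])$")


def parse_range(text: str) -> Range:
    """Turn one range expression into explicit bounds (illustrative only)."""
    text = text.strip()
    m = _GOOGLE.match(text)
    if m:
        # Assumption: the Google form is inclusive at both ends.
        return Range(m.group(1), m.group(2), True, True)
    m = _BRACKET.match(text)
    if m:
        open_, low, high, close = m.groups()
        return Range(low, high, open_ == "[", close == "]")
    raise ValueError(f"not a recognised range: {text!r}")


def contains(r: Range, x: float) -> bool:
    """Check a numeric value against the parsed bounds."""
    low, high = float(r.low), float(r.high)
    above = x >= low if r.low_inclusive else x > low
    below = x <= high if r.high_inclusive else x < high
    return above and below
```

With this reading, `<0 TO 5]` excludes 0 but includes 5, which matches the slide's `0 < x <= 5`.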
  6. Why move? …
     •  Literals
        o  "" – strings
        o  Java Integer and Decimal literals
        o  Variable resolution dates
     •  Future extensions
  7. What changes?
     •  Leading "@" optional
        o  Default namespace
           •  name:docName
     •  Escaping
        o  ":" escaping is not required
           •  cm:name:docName
  8. What changes? …
     •  Escaping – in terms, wildcard, prefix, …
        o  Lucene – anything but whitespace
        o  Unicode – no escaping for
           •  Letters, Marks, Numbers
           •  Currency + Other symbols
        o  All other unicode characters require escaping
           •  Includes punctuation - . , + etc
           •  Escaping is NOT required in phrases
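
The term-escaping rule on this slide can be sketched with Unicode general categories. This is an illustration of the rule as the slide states it, not Alfresco code: `escape_fts_term` is a hypothetical helper, and it assumes "Currency + Other symbols" corresponds to the Unicode categories `Sc` and `So`, and that escaping means prefixing a backslash.

```python
import unicodedata


def escape_fts_term(term: str) -> str:
    """Backslash-escape every character outside the 'safe' categories.

    Safe (per the slide): Letters (L*), Marks (M*), Numbers (N*),
    Currency symbols (Sc) and Other symbols (So). Everything else,
    including whitespace and punctuation, gets a backslash.
    """
    out = []
    for ch in term:
        cat = unicodedata.category(ch)
        if cat[0] in "LMN" or cat in ("Sc", "So"):
            out.append(ch)
        else:
            out.append("\\" + ch)
    return "".join(out)
```

A file name such as `my-file.txt` would have its hyphen and dot escaped, while accented letters and currency signs pass through unchanged. Inside a phrase this escaping would not be needed, as the slide notes.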
  9. What changes? …
     •  Escaping – in phrases
        •  Java \uxxxx . "
        •  Use phrases – extensions to support wildcards
           –  Not quite the same …
     •  Exclusive ranges
        o  Lucene {}
        o  FTS <>
           •  Mixed
     •  Unbounded MIN, MAX, \u0000 \uFFFF
  10. How do I migrate?
     •  Change the query language
        o  "lucene" -> "fts-alfresco"
     •  Fix the stuff that has changed …
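
One of the "things that have changed" is the exclusive range delimiter: Lucene uses `{}` and FTS uses `<>`. The following is a deliberately naive sketch of that single rewrite, assuming ranges are written with ` TO ` between the bounds. `lucene_range_to_fts` is a hypothetical helper, it does not skip quoted phrases or escaped braces, and it does not touch the other differences (escaping, query language name).

```python
import re

# An opening [ or {, two bounds separated by " TO ", and a closing ] or }.
_RANGE = re.compile(r"([\[{])([^\[\]{}]*\sTO\s[^\[\]{}]*)([\]}])")


def lucene_range_to_fts(query: str) -> str:
    """Rewrite Lucene exclusive range braces to FTS angle brackets."""

    def swap(m: re.Match) -> str:
        open_ = "<" if m.group(1) == "{" else "["
        close = ">" if m.group(3) == "}" else "]"
        return f"{open_}{m.group(2)}{close}"

    return _RANGE.sub(swap, query)
```

Inclusive ranges are left untouched, and a mixed range such as `[0 TO 5}` keeps its inclusive end while the exclusive end becomes `>`.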
  11. Some more detail …
     •  Dual tokenisation
        o  Match using FTS
           •  ~name:woof
        o  Match using pattern
           •  =name:woof*
     •  Unbounded ranges
        o  MIN, MAX
        o  \u0000, \uFFFF
  12. Some more detail …
     •  Templates
        o  Share
           •  search.get.config.xml
           •  %(cm:name cm:title cm:description ia:whatEvent ia:descriptionEvent lnk:title lnk:description TEXT TAG)
        o  Now easy to configure
        o  Define on the search API
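
Conceptually, a template such as `%(cm:name cm:title TEXT)` spreads a search term over several fields. The sketch below shows that idea only, assuming the expansion is a simple OR across the listed fields. `expand_template` is a hypothetical helper, and the real expansion inside Alfresco works on the parsed query and may differ in detail.

```python
def expand_template(template: str, term: str) -> str:
    """Expand '%(f1 f2 ...)' into '(f1:term OR f2:term OR ...)'."""
    if not (template.startswith("%(") and template.endswith(")")):
        raise ValueError(f"not a field-list template: {template!r}")
    fields = template[2:-1].split()
    if not fields:
        raise ValueError("template lists no fields")
    return "(" + " OR ".join(f"{field}:{term}" for field in fields) + ")"
```

Under this assumption, adding a field to the Share search only means adding its name to the list in the template.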
  13. Some more detail …
     •  Default namespace
        o  Short fields for your own namespace …
        o  cm:name -> name
        o  TYPE:"cm:content" -> TYPE:"content"
     •  Do not forget …
        o  Date time and variable date resolution
        o  Default field
        o  Java API
           •  Specify multiple locales
  14. Some more detail …
     •  d:datetime
        o  Default – date resolution only
        o  You can have date time + variable resolution
           •  3.4.0 and later
        o  @cm:created:"2011"
           @cm:created:"2011-02"
           @cm:created:"2010-02-01"
           @cm:created:"2010-02-01T11"
           @cm:created:"2010-02-01T11:04"
           @cm:created:"2010-02-01T11:04:31"
           @cm:created:"2010-02-01T11:04:31.000"
  15. Some more detail …
     •  d:datetime
        o  @cm:created:["2010" TO "2011"]
           @cm:created:["2011-02-01T11:03" TO "2011-02-01T11:04"]
        o  See
        o  http://blogs.alfresco.com/wp/andyh/2011/02/01/whats-in-a-date/
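
Variable date resolution can be pictured as each shortened value standing for the whole period it names. The sketch below works under that assumption and is not taken from Alfresco: `date_interval` is a hypothetical helper that returns the resolution it detected plus the start (inclusive) and end (exclusive) of the period.

```python
from datetime import datetime, timedelta

# Resolutions in the order shown on the slide, coarsest first.
_FORMATS = [
    ("year", "%Y"),
    ("month", "%Y-%m"),
    ("day", "%Y-%m-%d"),
    ("hour", "%Y-%m-%dT%H"),
    ("minute", "%Y-%m-%dT%H:%M"),
    ("second", "%Y-%m-%dT%H:%M:%S"),
    ("millisecond", "%Y-%m-%dT%H:%M:%S.%f"),
]

_STEPS = {
    "day": timedelta(days=1),
    "hour": timedelta(hours=1),
    "minute": timedelta(minutes=1),
    "second": timedelta(seconds=1),
    "millisecond": timedelta(milliseconds=1),
}


def date_interval(text: str):
    """Return (resolution, start, end) for a partial ISO-style date."""
    for name, fmt in _FORMATS:
        try:
            start = datetime.strptime(text, fmt)
        except ValueError:
            continue  # Too short or too long for this resolution.
        if name == "year":
            end = start.replace(year=start.year + 1)
        elif name == "month":
            if start.month == 12:
                end = start.replace(year=start.year + 1, month=1)
            else:
                end = start.replace(month=start.month + 1)
        else:
            end = start + _STEPS[name]
        return name, start, end
    raise ValueError(f"unrecognised date value: {text!r}")
```

Read this way, `"2011-02"` covers all of February 2011, while `"2010-02-01T11:04"` covers a single minute.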
  16. Some more detail …
     •  AND/OR +-|
        o  name:(big AND |dog)
        o  Explicit op likely to change
           •  +big OR +dog
           •  +big AND +dog
           •  +big +dog
  17. Some more detail …
     •  Indexing control
        o  cm:indexControl
           •  cm:isIndexed
           •  cm:isContentIndexed
     •  Model – e.g. cm:thumbnail
        o  <includedInSuperTypeQuery>false</includedInSuperTypeQuery>
  18. Some more detail …
     •  PARENT
     •  TAG
     •  Boosts
        o  Relevance
        o  Can use in templates
        o  big or test^2
  19. Demos …
  20. Questions?
