Webinar: Replace Google Search Appliance with Lucidworks Fusion

Lucidworks Senior Search Engineer, Evan Sayer, and Enterprise Content Management and Big Data Architect for the County of Sacramento, Guy Sperry, explore the benefits of replacing Google Search Appliance with Lucidworks Fusion.

  1. Replacing GSA with Lucidworks Fusion
     Evan Sayer, Senior Search Engineer, Lucidworks
     Guy Sperry, Enterprise Content Management & Big Data Architect, County of Sacramento
  2. Introduction
     •  Lucidworks
        –  Founded in 2007
        –  Contributes ~70% of the open-source code committed to the Apache Lucene/Solr project
     •  Lucidworks Fusion: our enterprise search platform built on top of Apache Solr
     •  Apache Solr: the most popular open-source enterprise search engine on Earth
  3. Google Search Appliance (GSA)
     •  Google’s enterprise search solution, offered from 2002 to 2016
     •  One-stop shopping: a complete enterprise-search solution in one box
     •  EoL as of February 2016, support phased out completely by 2018
  4. GSA Strengths
     •  Easy to set up and configure – “plug and play”
        –  Lower start-up cost and lower time-to-value than many other contemporary solutions
        –  Relatively straightforward to operate on an ongoing basis
        –  Achieve a decent search experience quite quickly and easily
     •  Takeaway: GSA minimized necessary investment in technical expertise
  5. Replacing GSA with Fusion
     •  Easy to set up and configure, “plug and play”
        –  Fusion Index Workbench
           •  Quickly connect to and ingest data
           •  Intuitively iterate on improving search results
           •  Easily A/B test tweaks to ETL logic
        –  Dashboards and Log Analytics
        –  Monitoring/alerting APIs that integrate with common tools to ease ongoing maintenance
  6. GSA Strengths
     •  Out-of-box search UI
        –  Highly useful during development, iterating on relevancy improvements, etc.
        –  Customizable enough to use as an end-user search UI
     •  Takeaway: GSA minimized necessary investment in technical expertise
  7. Replacing GSA with Fusion
     •  Out-of-box search UI
        –  Lucidworks View
           •  Highly customizable/“skin-able”
           •  Fully open-source: https:// lucidworks/lucidworks-view
           •  Built on top of a modern stack (AngularJS)
  8. GSA Strengths
     •  Broad support for connecting to, ingesting, and securing data
        –  Many out-of-box connectors to common sources: CRM, wikis, databases, etc.
        –  Extensible connector framework
     •  Takeaway: GSA minimized necessary investment in technical expertise
  9. Replacing GSA with Fusion
     •  Broad support for connecting to, ingesting, and securing data
        –  Fusion ships with ~40 connectors to common sources
           •  JDBC, Web, Alfresco, Box, Dropbox, Drupal, GitHub, Google Drive, Jive, JIRA, SharePoint, MongoDB, Hadoop/HDFS, Salesforce, Slack, lots more…
           •  Fusion connectors’ security-trimming functionality secures content/searches out-of-box
        –  Fusion Index Pipelines enable easily pushing data into the index as well, via a REST API
        –  Custom connector development via Fusion’s Connectors API
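
To make the idea of security trimming concrete, here is a minimal Python sketch. It is not Fusion's implementation or API; the document shape, the `acl` field, and the `security_trim` function are assumptions made up for illustration. It only shows the general principle: a search result is returned when the user's groups overlap the groups allowed to read the document.

```python
from typing import Iterable, List, Set


def security_trim(docs: Iterable[dict], user_groups: Set[str]) -> List[dict]:
    """Keep only documents whose ACL shares at least one group with the user.

    Each document is assumed (hypothetically) to carry an "acl" list of group names.
    Documents without an "acl" field are treated as visible to nobody.
    """
    return [doc for doc in docs if user_groups & set(doc.get("acl", []))]


# Hypothetical search results, each tagged with the groups allowed to see it.
docs = [
    {"id": "hr-1", "acl": ["hr"]},
    {"id": "pub-1", "acl": ["everyone"]},
    {"id": "legal-1", "acl": ["legal", "hr"]},
]

visible = security_trim(docs, {"everyone", "legal"})
print([d["id"] for d in visible])  # ['pub-1', 'legal-1']
```

In a real search engine this check is normally pushed into the query as a filter rather than applied after the fact, so that hit counts and facets also reflect only what the user may see.
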
  10. GSA Weaknesses
     •  Broad theme: insufficient control over the search experience
        –  Relevancy tuning and controls are exceedingly opaque
           •  “Source Biasing”: +/- [strong|medium|weak]
        –  Lack of control over indexing workflow
           •  Custom metadata processing was a chore, if feasible
        –  Often referred to as a “black box” design
     •  Non-trivial to scale
        –  Appliance packaging restricts freedom in scaling up
        –  Per-document pricing model
     •  Incorrect facet counts!?
  11. Fusion – Fine-grained Control over *Everything*
  12. Fusion – Fine-grained Control over *Everything*
     •  Fusion Index Pipelines
        –  True fine-grained control over ETL; as much or as little as desired
           •  For content from source X, I want to redact this set of keywords
           •  For content from source Y, I want to extract the title from this HTML tag
           •  For content from source Z, I want to look up the authorized groups from another database, and add them to a field in each document
     •  Fusion Query Pipelines
        –  True fine-grained control over request/response logic at query-time
           •  For queries containing keyword X, I want to rewrite the query to be something else
           •  For queries in language Y, I want to boost results matching in this separate set of fields
           •  For matching documents containing keyword Z, I want to redact all occurrences of Z before returning the results
        –  Fusion signals: collect users’ queries+clicks and aggregate them over time
           •  Utilize this knowledge to dynamically boost the most commonly-clicked item(s) for a given query
           •  Continually improve relevancy without manual human input
     •  If you’re already familiar with Solr/Lucene, hack away! :)
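
The pipeline and signals ideas on this slide can be sketched in a few lines of Python. This is an illustrative model only, not Fusion's stage API: the `Stage` type, the `redact`, `title_from_tag`, `run_pipeline`, and `boost_by_clicks` names, and the `body`/`raw_html`/`title` fields are all assumptions. It shows an index pipeline as an ordered chain of document-transforming stages, and signals as aggregated click counts used to reorder results for a query.

```python
import re
from collections import Counter
from typing import Callable, List

# A stage takes a document (a dict) and returns a transformed document.
Stage = Callable[[dict], dict]


def redact(keywords: List[str]) -> Stage:
    """Stage factory: replace each keyword in 'body' with [REDACTED]."""
    if not keywords:
        return lambda doc: doc  # nothing to redact
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)

    def stage(doc: dict) -> dict:
        return {**doc, "body": pattern.sub("[REDACTED]", doc.get("body", ""))}

    return stage


def title_from_tag(tag: str) -> Stage:
    """Stage factory: copy the text of the first <tag>...</tag> into 'title'."""
    pattern = re.compile(
        rf"<{re.escape(tag)}\b[^>]*>(.*?)</{re.escape(tag)}>",
        re.IGNORECASE | re.DOTALL,
    )

    def stage(doc: dict) -> dict:
        match = pattern.search(doc.get("raw_html", ""))
        return {**doc, "title": match.group(1).strip()} if match else doc

    return stage


def run_pipeline(doc: dict, stages: List[Stage]) -> dict:
    """Apply each stage in order, feeding the output of one into the next."""
    for stage in stages:
        doc = stage(doc)
    return doc


def boost_by_clicks(query: str, doc_ids: List[str], clicks: Counter) -> List[str]:
    """Reorder results so the most-clicked documents for this query come first.

    sorted() is stable, so documents with equal click counts keep their order.
    """
    return sorted(doc_ids, key=lambda doc_id: -clicks[(query, doc_id)])


doc = {"raw_html": "<h1> Budget Report </h1>", "body": "Project Falcon budget"}
indexed = run_pipeline(doc, [redact(["falcon"]), title_from_tag("h1")])
print(indexed["body"], "|", indexed["title"])

# Aggregated signals: (query, doc_id) -> number of clicks (made-up sample data).
clicks = Counter({("budget", "doc-2"): 5, ("budget", "doc-1"): 1})
print(boost_by_clicks("budget", ["doc-1", "doc-2", "doc-3"], clicks))
```

Modelling each stage as a plain function keeps the per-source rules from the slide (redact for source X, extract a title for source Y) independent of one another, so a pipeline is just a list that can be reordered or extended.
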
  13. Fusion – Fine-grained Control over *Everything*
     •  Scaling
        –  Fusion utilizes best-in-class Apache Solr as the backend search engine
           •  Scale to billions of documents linearly
        –  Fusion services scale independently
           •  As opposed to GSA, which scaled in units of entire appliances
           •  If you want to ingest content faster, add additional connectors nodes
           •  If you want to enable greater query throughput, add additional query-processing nodes
        –  Straightforward APIs/processes for provisioning additional nodes
           •  Just spin up a new node, install Fusion, and point it at the central cluster manager (Apache ZooKeeper)
           •  Easily overlay Fusion on top of any existing Solr cluster
  14. Fusion as a platform
     •  Get started with ease: https://
        1.  Point Fusion at your data
        2.  Set up a simple baseline search app with Lucidworks View
        3.  Iterate on the actual search experience to your heart’s content :)
     •  Delve into the details (or don’t!)
        –  Fusion provides the necessary framework to tackle tough and/or use-case-specific search problems
        –  Anything but a “black box” design
        –  Most components are customizable and extensible
           •  Implement your own Fusion components in Java using our APIs
     •  Scale with minimal effort, maximal flexibility
        –  Scale linearly up to billions of docs with Apache Solr
        –  Self-service APIs for setting up additional nodes to expand capacity
        –  Per-node instead of per-doc pricing means fewer surprises when it’s time to renew licenses
  15. “Fusion gave us the features we needed to replace Google Search Appliance in a matter of weeks. With Fusion’s out-of-the-box capabilities, we skipped months in our dev cycle so we could focus our team where they would have the most impact. We cut our licensing costs by 50% and improved application usability. The Lucidworks professional services team amplified our success even further.” “We’re all Fusion from here on out!”
     Lourduraju Pamishetty, Senior IT Application Architect
  16. Customers Who’ve Made the Switch
  17. Fusion as a platform
     •  Accurate facet counts
        –  What a concept! :)
     •  Take Fusion for a spin: https:// download/
  18. Agenda
     •  Introduction to County of Sacramento
     •  Why Sacramento County is search first for data delivery
     •  How Fusion helps us meet our data delivery challenges
     •  How Fusion has helped us fill gaps left by GSA retirement
  19. Sacramento County
     •  34 departments and affiliated organizations serving 1.5 million people
     •  Commitment to open government and transparency
     •  Citizen engagement
  20. Why Sacramento County is Search First
     •  Enterprise apps, data snackers and LOB apps
        –  ADABAS (Mainframe)
        –  RDBMS
        –  CDH
        –  ECM
     •  Diverse, heterogeneous data environment
     •  Our challenge: securely deliver prompt access to relevant data
  21. Fusion/Solr in Sacramento County
     •  Documents and content
        –  Cross-repository search
        –  Source repository security
     •  GIS
     •  Cross-Source Data Processing and Analytics
        –  Fusion connectors
        –  Spark in Fusion
     •  Log Analysis
     •  NOSQL
        –  Why be MEAN when you can be SANE?
  22. Gaps Left by GSA
     Fusion was our final GSA patch
  23. •  The Brown Act
        –  Make public meetings accessible to citizens
        –  Maintain transparency
     •  AgendaSearch
        –  Search and consume public documents
        –  Integrate with agenda management
        –  Lucidworks View
        –  Has reduced PRAs
  24. Immediate Win with View
     •  County Legal Counsel
     •  ~2 million document archive
     •  Document-level security
     •  Intuitive and feature-rich UI
     •  Search solution delivered before lunch
  25. Q&A
     Resources:
     •  Download Fusion: https://
     •  Lucene/Solr Revolution 2016 – Oct 11-14 – Boston, MA: