Webinar: Fusion for Business Intelligence

Lucidworks Senior Systems Engineer Allan Syiek discusses simple querying vs. data mining and intelligent search, and how Lucidworks Fusion can help you turn raw data into insight.

  1. Fusion for Business Intelligence: Allan Syiek, Senior Sales Engineer, September 14, 2016
  2. Session Objectives: By the end of this session, you will have a high-level awareness of the variety of search and discovery functionality available, select the right product for a particular use case, and know why this baby is so happy.
  3. Agenda: The Beer and Diaper Legend; DIKW Pyramid; What is Enterprise Search; Indexing 101; Statistics vs. Data Mining vs. Machine Learning; What is Business Intelligence; Where does Fusion Fit?
  4. Parable of the Beer and the Diapers: Illustrates the difference between querying and data mining, already firmly enshrined in BI mythology
  5. The DIKW Pyramid
  6. What is Enterprise Search: Q. What do you do with a mountain of data located everywhere? A. Depends... What do you need it for?
  7. Aspects of Enterprise Search: Crawling, Parsing, Indexing, Searching; Advanced Searches; Searching Structured Data; Searching Unstructured Data; Metadata; Ranking; Results; Access Control; UI; Tuning; Reporting; Scale and Performance
  8. Diagram (labels: Documents, Index Pipeline, Search Collection, Query Pipeline, Search UI). Index Pipeline stages: Tika Parser; Exclusion Filter; Field Mapper; HTML Transform Stage; XML Transform Stage; OpenNLP Entity Extraction; Gazetteer Extraction; Regular Expression; Aggregating; JavaScript (custom scripts); …and others… Query Pipeline stages: Search Fields/Parameters; Facets; Landing Pages; Boost Documents; Block Documents; Security Trimming; Recommendation Boosting; Rollup Aggregator; Sub Query; JavaScript (custom scripts); …and others…
  9. Indexing 101: A system used to make finding information easier. Every word is converted into a wordID by using an in-memory hash table, the lexicon. Occurrences in the current document are translated into hit lists and are written into the forward "barrels". Inverted barrels have been sorted.
  10. Indexing 101 - Ranking: Score results for presentation: weighted by Term Frequency-Inverse Document Frequency (TF-IDF); clustering; complex proprietary algorithms
  11. Indexing 101 - Relevance
  12. Statistics vs. Data Mining vs. Machine Learning: Statistics quantifies numbers; Data Mining explains patterns; Machine Learning predicts with models; Artificial Intelligence behaves and reasons
  13. What is Business Intelligence: BI technologies provide historical, current and predictive views of business operations. Business intelligence is made up of an increasing number of components including: multidimensional aggregation and allocation (OLAP, Online Analytical Processing); denormalization, tagging and standardization (relational database); real time reporting with analytical alert; a method of interfacing with unstructured data sources (data mining); group consolidation, budgeting and rolling forecasts; statistical inference and probabilistic simulation; key performance indicators optimization; version control and process management; open item management
  14. Why Fusion for Log Analytics?: Secure access to dashboards; ETL of logs using index pipelines; run analysis models for logs in Spark and leverage them with the ML index pipeline; time series index management
  15. Massive-scale log analytics: Index billions of log events per day, in real time; recent event and historical analysis (analyze logs over time: today, recent, past week, past 30 days, …); easy-to-use dashboards to visualize common questions and allow for ad hoc analysis; ability to scale linearly as business grows … with sub-linear growth in costs!; easy to set up, easy to manage, easy to use
  16. Signals & Recommendations: Fusion can capture, store, and aggregate signals from a variety of sources to drive predictive search capabilities and continuous relevancy tuning. Signals can include: clicks and queries; add-to-cart and purchase behavior; geo-location; user behavior and preferences; user history and past orders; device
  17. Visualization & Insight with SILK: SILK dashboards provide a rich visual interface for users to search, inspect and visualize event/log data. Gives users the power to perform ad hoc search and analysis on massive amounts of multi-structured and time series data. Real-time insights and trends for on-the-fly decision making using the most accurate and up-to-date data. Users can share visualizations and dashboards.
  18. Where Does Fusion Fit? (architecture diagram; labels as extracted): REST API; Admin UI; Lucidworks View; Security Built-in. Core Services: Connectors; Alerting/Messaging; NLP; Pipelines; Blob Storage; Scheduling; Recommenders/Signals; … Apache Spark: Cluster Mgr.; Worker; Worker. Apache Solr: Shards; Shards; HDFS (Optional). Apache ZooKeeper: ZK 1 … ZK N; Shared Config Mgmt; Leader Election; Load Balancing. Data sources: Database; Web; File; Logs; Hadoop; Cloud.
  19. Learn more at -
  20. Thank You: Q & A
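
The indexing and ranking steps outlined on slides 9 and 10 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Fusion or Solr implement indexing: the function names `build_index` and `tf_idf_search`, the whitespace tokenizer, the plain `tf * log(N / df)` weighting and the sample documents are all hypothetical choices made for the example.

```python
import math
from collections import defaultdict


def build_index(docs):
    """Build a lexicon, a forward index and an inverted index.

    docs maps a document id to its text (hypothetical input shape).
    """
    lexicon = {}   # word -> wordID, the in-memory hash table from slide 9
    forward = {}   # doc id -> {wordID: hit list of positions} (forward "barrel")
    for doc_id, text in docs.items():
        hits = defaultdict(list)
        for position, word in enumerate(text.lower().split()):
            word_id = lexicon.setdefault(word, len(lexicon))
            hits[word_id].append(position)
        forward[doc_id] = dict(hits)

    # Invert the forward index: wordID -> sorted list of (doc id, hit list).
    inverted = defaultdict(list)
    for doc_id, hits in forward.items():
        for word_id, positions in hits.items():
            inverted[word_id].append((doc_id, positions))
    for postings in inverted.values():
        postings.sort()
    return lexicon, forward, dict(inverted)


def tf_idf_search(query, lexicon, forward, inverted):
    """Rank documents for a query by summed TF-IDF, highest score first."""
    n_docs = len(forward)
    scores = defaultdict(float)
    for word in query.lower().split():
        word_id = lexicon.get(word)
        if word_id is None:
            continue  # the word never occurred in any indexed document
        postings = inverted[word_id]
        idf = math.log(n_docs / len(postings))
        for doc_id, positions in postings:
            doc_length = sum(len(p) for p in forward[doc_id].values())
            scores[doc_id] += (len(positions) / doc_length) * idf
    # Sort by descending score, breaking ties by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample data, echoing the beer-and-diapers parable.
docs = {
    "d1": "beer and diapers",
    "d2": "beer and chips",
    "d3": "diapers and wipes and diapers",
}
lexicon, forward, inverted = build_index(docs)
for doc_id, score in tf_idf_search("diapers", lexicon, forward, inverted):
    print(doc_id, round(score, 3))
```

A word that occurs in every document gets an inverse document frequency of zero here, which is why common words such as "and" contribute nothing to the score. Production engines use smoothed or more elaborate scoring functions.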