Building specialized applications using Solr: Migration from FAST ESP

Rahul Agarwalla
Head of International Business
Uchida Spectrum Inc.




©2011 Uchida Spectrum, Inc. All rights reserved.
Uchida Spectrum Overview

Software License Business (1995 ~)
•  Software License Sales
•  License Management Reporting
•  License Procurement System
•  License Adjustment Consulting

Network Technology Services (1997 ~)
•  Network System Consulting Services
     - Active Directory Network
     - Exchange Messaging Network
•  License Management System Consulting
     - Software Management Server
•  Portal System Consulting
     - SharePoint Portal Server
     - WebSphere Portal Server

Enterprise Search Business (2002 ~)
•  Enterprise Intelligence Application
     - SMART InSight G2 Enterprise
     - SMART InSight G2 Professional
•  Search Platform Consulting & Support
     - FAST ESP
     - Lucene/Solr
     - LucidWorks Enterprise




Some of Uchida Spectrum's customers




SMART/InSight History

2003: FAST Alliance
2004: Platform for custom solutions
2005: SMART InSight 1.1

Customers in Japan, China & India:
•  2 of top 3 Japanese car manufacturers
•  Top consumer electronics company
•  Large financial institutions
•  China's biggest eCommerce firm


What is today's buzzword?

Smart Phone

•  Extreme scalability
•  Flexibility & Extensibility
•  Feature-rich search
  
What I learnt from the Japan catastrophe




The power of community

Japanese Government [Closed/big brother]
•  Slow, behind the curve
•  Legacy/CYA
•  Confusion

Japanese People [Open community]
•  Quick response
•  Disclose / Share
•  Practical Impact

Power shift
Driver of innovation




Lessons from FAST ESP Migration: advantage LWE/Solr

•  Key Issues:
     1.  Smaller record and index size enable faster index maintenance
     2.  # of records per node: rule of thumb 10m vs. 2m
     3.  Licensing & Maintenance Cost: less than ½

•  Scalability: 5x
•  Cost Performance: 10x
•  High Flexibility
•  Lower Operations Cost
•  Faster Innovation
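
As a rough illustration of the records-per-node rule of thumb above, the following sketch estimates how many nodes a corpus would need. It assumes that the 10m figure applies to LWE/Solr and the 2m figure to FAST ESP; the slide does not say so explicitly, although the 5x scalability figure suggests it. The 100-million-record corpus is a made-up example.

```python
import math

# Assumed reading of the slide's rule of thumb: records one node can hold.
RECORDS_PER_NODE = {
    "LWE/Solr": 10_000_000,  # assumption: the "10m" figure
    "FAST ESP": 2_000_000,   # assumption: the "2m" figure
}


def nodes_needed(total_records: int, records_per_node: int) -> int:
    """Return the number of nodes needed to hold total_records."""
    return math.ceil(total_records / records_per_node)


if __name__ == "__main__":
    corpus = 100_000_000  # hypothetical corpus size
    for engine, per_node in RECORDS_PER_NODE.items():
        print(engine, nodes_needed(corpus, per_node))
```

Under these assumptions the node count differs by the same factor of five that the slide quotes for scalability.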
  




Enterprise Search expectations

•  Big data scale
•  Security is important
•  Disparate data: geography, systems, languages, format, structures
•  KM is good to have, databases are critical
•  Support different users & usage: department, role, tasks
•  High recall




Lessons from FAST ESP Migration: Filling the gaps

•  Security
     •  ACL security: complex requirements
     •  File System: file & folder level control
     •  CRM/ERP…: Keeping ACLs up-to-date

•  Content aggregation
     •  Connectors
     •  Normalization
     •  Open source options for ESP pipeline
          •  OpenPipeline
          •  Pypes
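
One common way to enforce the ACL security mentioned above in Solr is to index the principals allowed to read each document in a multi-valued field and to add a filter query (`fq`) built from the searching user's identities. The sketch below covers only the query-building side. The field name `acl` and the principal names are hypothetical, and this is not necessarily how SMART InSight handles security.

```python
from urllib.parse import urlencode


def escape_term(value: str) -> str:
    """Quote a value as a Solr phrase, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_secure_params(user_query: str, principals: list[str],
                        acl_field: str = "acl") -> dict[str, str]:
    """Build Solr request parameters restricted to the given principals.

    acl_field is a hypothetical multi-valued field listing the users and
    groups allowed to read each document.
    """
    if principals:
        terms = " OR ".join(escape_term(p) for p in principals)
        acl_filter = f"{acl_field}:({terms})"
    else:
        acl_filter = "-*:*"  # no identities: match no documents
    return {"q": user_query, "fq": acl_filter, "wt": "json"}


if __name__ == "__main__":
    params = build_secure_params(
        "brake recall", ["user:alice", "group:engineering"]
    )
    print(urlencode(params))  # query string for a Solr select request
```

With this approach a permission change only takes effect once the affected documents are re-indexed, which is why keeping ACLs up-to-date is listed as a gap to fill.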
  




Building specialized applications: Content fusion

•  Content fusion from disparate data:
     •  Single index ≠ integration
     •  Modeling of content relationships is essential




Virtual integration based on search

•  Application layer: Content sets and inter-relationships
•  Content store: Big table, flat index
•  Search Index (three shown in the diagram)




Virtual integration based on search…2

•  Data transformation:
     - key:key, key:value, field names
•  Query & Result transformation
•  Boosting / Relevancy algorithm
•  Security
•  Multi-Language support
•  Federation & mashups

[Diagram labels: Search Service; Content; Append Pipeline; Tagging Pipeline; Query Pipeline; Result Pipeline; Security; Boosting; Transform; LWE Adapter; SolrAdapter; Other; Search Index; LWE; Solr]
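
The pipeline-and-adapter layout named on this slide can be sketched roughly as follows. This illustrates the general pattern only and is not SMART InSight's code: the names `SearchAdapter`, `InMemoryAdapter`, `SearchService`, `security_stage` and `rename_fields` are hypothetical, and an in-memory adapter stands in for the LWE and Solr adapters.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol

Query = dict    # e.g. {"q": "brake"}
Result = dict   # one hit, e.g. {"id": "1", "title": "..."}
QueryStage = Callable[[Query], Query]
ResultStage = Callable[[list[Result]], list[Result]]


class SearchAdapter(Protocol):
    """Engine-specific backend; an LWE or Solr adapter would implement this."""

    def search(self, query: Query) -> list[Result]: ...


@dataclass
class InMemoryAdapter:
    """Stand-in backend that does a naive substring match on the title."""

    docs: list[Result]

    def search(self, query: Query) -> list[Result]:
        text = query["q"].lower()
        hits = [d for d in self.docs if text in d["title"].lower()]
        allowed = query.get("allowed_groups")
        if allowed is not None:
            hits = [d for d in hits if d["group"] in allowed]
        return hits


@dataclass
class SearchService:
    adapter: SearchAdapter
    query_pipeline: list[QueryStage] = field(default_factory=list)
    result_pipeline: list[ResultStage] = field(default_factory=list)

    def search(self, query: Query) -> list[Result]:
        for stage in self.query_pipeline:     # e.g. security, boosting
            query = stage(query)
        results = self.adapter.search(query)
        for stage in self.result_pipeline:    # e.g. field-name transform
            results = stage(results)
        return results


def security_stage(query: Query) -> Query:
    """Restrict the query to the groups of a (hypothetical) current user."""
    return {**query, "allowed_groups": {"engineering"}}


def rename_fields(results: list[Result]) -> list[Result]:
    """Result transformation: map the engine's field name to the app's."""
    return [{"id": r["id"], "name": r["title"]} for r in results]


if __name__ == "__main__":
    docs = [
        {"id": "1", "title": "Brake claim report", "group": "engineering"},
        {"id": "2", "title": "Brake pricing", "group": "sales"},
    ]
    service = SearchService(InMemoryAdapter(docs), [security_stage], [rename_fields])
    print(service.search({"q": "brake"}))
```

Because the service only talks to the adapter interface, swapping the search engine means writing a new adapter while the pipelines and the application layer stay unchanged.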



Building specialized applications: Personalization

•  Application flow depends on the task
•  Data Personalization increases productivity

•  SMART InSight approach: Task based UI
     •  Schema independent widgets for analytics & visualization
     •  Portalized
     •  Personalized: widgets, functions, content, fields




Knowledge Centre: made possible by Solr

Scalability and low TCO give us the ability to build new features
     •  Knowledge Centre has logs of all user activity in SMART InSight
     •  This would be too costly with a commercial Search Engine and would not be feasible in a database

Using this rich data we can:
     •  Profile users, groups and networks
     •  Personalize Recommendations
     •  Create social ranking algorithms
     •  Perform usage analytics
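
A minimal sketch of how logged user activity could feed a social ranking signal: count how often members of the searcher's own group clicked each document, then blend that count with the engine's relevance score. The log record layout, the `group_weight` parameter and the blending formula are all assumptions made for illustration; the slides do not describe the actual algorithm.

```python
import math
from collections import Counter
from typing import NamedTuple


class ClickEvent(NamedTuple):
    """One logged activity record (hypothetical layout)."""

    user: str
    group: str
    doc_id: str


def group_click_counts(events: list[ClickEvent], group: str) -> Counter:
    """Count clicks per document made by members of one group."""
    return Counter(e.doc_id for e in events if e.group == group)


def social_rerank(hits: list[tuple[str, float]], clicks: Counter,
                  group_weight: float = 0.5) -> list[tuple[str, float]]:
    """Re-rank (doc_id, score) hits, boosting documents the group clicked.

    The boost grows with log(1 + clicks) so heavily clicked documents
    do not swamp the engine's relevance score.
    """
    boosted = [
        (doc_id, score + group_weight * math.log1p(clicks[doc_id]))
        for doc_id, score in hits
    ]
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)
```

The same per-group counts could also serve as a starting point for the profiling and usage analytics items in the list above.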
  




Overview of SMART InSight for Automotive

[Architecture diagram; labels as extracted, grouping approximate:]
•  Sources: NHTSA, Internet, EDR, Repair, Dealers, Internal
•  SA (four instances)
•  Page, Widgets, Ajax Portal, Personalization
•  Virtual Integration Framework: Contents Set, Data Chain, Metadata, Content Model
•  Convergent Knowledge Framework: Knowledge Centre, Recommend, Profiling, Analysis, Knowledge Log
•  Management & Security
•  Other labels: Design, Engineering, Claims, Specs, Parts Catalog, PLM, CAD
•  Task based UIs: Benchmarking, Early Defect Warning, Claim Analysis




Interactive Click Log Analysis System

•  > $50 Billion sales / year
•  > 800 Million Items
•  > 370 Million Users
•  Billions of clicks per day

[Diagram labels: Access Log; Hadoop; Solr]

Solr, Hadoop + SMART/InSight G2
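
The diagram names an access log, Hadoop and Solr. One plausible reading is that Hadoop aggregates the raw logs and Solr serves the aggregates for interactive analysis. The sketch below runs that idea as a map step and a reduce step in plain Python on a handful of lines. The tab-separated log format and the output field names are hypothetical, and at billions of clicks per day the aggregation would run as a distributed job rather than in one process.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_clicks(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Map step: emit (item_id, user_id) for each well-formed log line.

    Assumed line format: timestamp<TAB>user_id<TAB>item_id
    """
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 3:
            _, user_id, item_id = parts
            yield item_id, user_id


def reduce_clicks(pairs: Iterable[tuple[str, str]]) -> list[dict]:
    """Reduce step: build one summary document per item."""
    users_by_item: dict[str, list[str]] = defaultdict(list)
    for item_id, user_id in pairs:
        users_by_item[item_id].append(user_id)
    return [
        {"id": item_id, "clicks": len(users), "unique_users": len(set(users))}
        for item_id, users in sorted(users_by_item.items())
    ]


if __name__ == "__main__":
    sample = ["t1\tu1\ti1\n", "t2\tu2\ti1\n", "t3\tu3\ti2\n"]
    for doc in reduce_clicks(map_clicks(sample)):
        print(doc)
```

Each summary is a flat dictionary with an `id` key, which is the kind of record a search index can take as one document.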




Global Research Community

•  Top Academic Institutes:
     •  Faculty, Research Fellows & Post graduate students
•  Govt. Departments & Corporate R&D
     •  Scientists and researchers

Research Discovery & Collective Intelligence (Knowledge Centre)

•  > 270 content sources: Societies, Associations, Publishers & Open
     •  IEEE, ACM…
     •  Elsevier, Wiley, Springer…

[Diagram labels: Broadcast Search; Dynamic Result Merging; Real time indexing; Solr]
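
Broadcast search sends one query to many sources and merges what comes back. Scores from different sources are not directly comparable, so a merge step usually rescales them first. The sketch below uses per-source min-max normalization, one simple option chosen here for illustration; the slides do not say which merging method the system uses, and the source names and scores in any call are up to the caller.

```python
def normalize(hits: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Rescale one source's (doc_id, score) hits into the range [0, 1]."""
    if not hits:
        return []
    scores = [score for _, score in hits]
    low, high = min(scores), max(scores)
    if high == low:
        # All scores equal: treat every hit as a top hit.
        return [(doc_id, 1.0) for doc_id, _ in hits]
    return [(doc_id, (score - low) / (high - low)) for doc_id, score in hits]


def merge_results(per_source: dict[str, list[tuple[str, float]]],
                  limit: int = 10) -> list[tuple[str, str, float]]:
    """Merge hits from several sources into one ranked list.

    Returns (source, doc_id, normalized_score) tuples, best first.
    """
    merged = [
        (source, doc_id, score)
        for source, hits in per_source.items()
        for doc_id, score in normalize(hits)
    ]
    merged.sort(key=lambda hit: hit[2], reverse=True)
    return merged[:limit]
```

Min-max scaling gives every source's best hit the same weight, which is a design choice: it keeps one source with large raw scores from dominating the merged list, but it also ignores differences in source quality.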

Demonstration




Contact Details

Rahul Agarwalla
Head – International Business
rahul@spectrum.co.jp
www.spectrum.co.jp




Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

Building specialized industry apps using Solr - By Rahul Agarwalla

  • 1. Building specialized applications using Solr; Migration from FAST ESP. Rahul Agarwalla, Head of International Business, Uchida Spectrum Inc. ©2011 Uchida Spectrum, Inc. All rights reserved.
  • 2. Uchida Spectrum Overview
      • Software License Business (1995 ~)
          • Software License Sales
          • License Management Reporting
          • License Procurement System
          • License Adjustment Consulting
      • Network Technology Services (1997 ~)
          • Network System Consulting Services: Active Directory Network, Exchange Messaging Network
          • License Management System Consulting: Software Management Server
          • Portal System Consulting: Share Point Portal Server, Websphere Portal Server
      • Enterprise Search Business (2002 ~)
          • Enterprise Intelligence Application: SMART InSight G2 Enterprise, SMART InSight G2 Professional
          • Search Platform Consulting & Support: FAST ESP, Lucene/Solr, Lucid Works Enterprise
  • 3. Some of Uchida Spectrum’s customers
  • 4. SMART/InSight History
      • Customers in Japan, China & India:
          • 2 of top 3 Japanese car manufacturers
          • Top consumer electronics company
          • Large financial institutions
          • China’s biggest eCommerce firm
      • 2003: FAST Alliance
      • 2004: Platform for custom solutions
      • 2005: SMART InSight 1.1
  • 5. What is today’s buzz word? Smart Phone
      • Extreme scalability
      • Flexibility & Extensibility
      • Feature rich search
  • 6. What I learnt from the Japan catastrophe
  • 7. The power of community
      • Japanese Government [Closed/big brother]
          • Slow, behind the curve
          • Legacy/CYA
          • Confusion
      • Japanese People [Open community]
          • Quick response
          • Disclose / Share
          • Practical Impact
      • Power shift: Driver of innovation
  • 8. Lessons from FAST ESP Migration: advantage LWE/Solr
      • Key Issues:
          1. Smaller record and index size enable faster index maintenance
          2. # of records per node: rule of thumb 10m vs. 2m
          3. Licensing & Maintenance Cost: less than ½
      • Scalability: 5x
      • Cost Performance: 10x
      • High Flexibility
      • Lower Operations Cost
      • Faster Innovation
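The records-per-node rule of thumb on slide 8 turns into a cluster size by simple division. The sketch below assumes the 10m figure applies to LWE/Solr and the 2m figure to FAST ESP; the slide only says "10m vs. 2m", so that assignment is an inference from the slide title. The 100-million-record corpus is a hypothetical value chosen for illustration.

```python
# Rule-of-thumb capacities from slide 8. Which figure belongs to which
# engine is an assumption; the slide only states "10m vs. 2m".
RECORDS_PER_NODE = {"solr": 10_000_000, "fast_esp": 2_000_000}


def nodes_needed(total_records: int, records_per_node: int) -> int:
    """Return the minimum number of index nodes that can hold the corpus."""
    if total_records < 0 or records_per_node <= 0:
        raise ValueError("total_records must be >= 0 and records_per_node > 0")
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_records // records_per_node)


corpus = 100_000_000  # hypothetical corpus size
sizing = {
    engine: nodes_needed(corpus, capacity)
    for engine, capacity in RECORDS_PER_NODE.items()
}
```

Under these assumptions the node counts differ by a factor of five, which is consistent with the "Scalability: 5x" bullet on the same slide.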
  • 9. Enterprise Search expectations
      • Big data scale
      • Security is important
      • Disparate data: geography, systems, languages, format, structures
      • KM is good to have, databases are critical
      • Support different users & usage: department, role, tasks
      • High recall
  • 10. Lessons from FAST ESP Migration: Filling the gaps
      • Security
          • ACL security: complex requirements
          • File System: file & folder level control
          • CRM/ERP… : Keeping ACLs up-to-date
      • Content aggregation
          • Connectors
          • Normalization
      • Open source options for ESP pipeline
          • Openpipeline
          • Pypes
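One common way to fill the ACL gap named on slide 10 is to index each document's allowed principals in a field and restrict every query with a Solr filter query (`fq`). The sketch below only builds the request parameters. The field name `acl` and the group names are hypothetical, and nothing in the slides confirms that this is the approach Uchida Spectrum used.

```python
from urllib.parse import urlencode

# Characters with special meaning in the Lucene/Solr query syntax.
_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')


def escape_term(term: str) -> str:
    """Backslash-escape query-syntax characters and whitespace in a term."""
    return "".join(
        "\\" + ch if ch in _SPECIAL or ch.isspace() else ch for ch in term
    )


def acl_filter(principals, field="acl"):
    """Build a filter query matching documents readable by any principal.

    `field` is a hypothetical multi-valued field holding user/group names.
    """
    if not principals:
        # Refuse to build a filter that could silently match everything.
        raise ValueError("at least one principal is required")
    return f"{field}:({' OR '.join(escape_term(p) for p in principals)})"


def search_params(user_query, principals):
    """Return URL-encoded parameters for a Solr select request."""
    return urlencode({"q": user_query, "fq": acl_filter(principals)})
```

For example, `search_params("brake recall", ["sales", "hr"])` yields a query string whose `fq` decodes to `acl:(sales OR hr)`. Keeping the ACL values current when source-system permissions change, the third security bullet on the slide, is a separate indexing problem that this sketch does not address.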
  • 11. Building specialized applications: Content fusion
      • Content fusion from disparate data:
          • Single index ≠ integration
          • Modeling of content relationships is essential
  • 12. Virtual integration based on search
      [Diagram labels: Application layer; Content sets and inter-relationships; Content store; Big table, flat index; Search Index, Search Index, Search Index]
  • 13. Virtual integration based on search…2
      • Data transformation: key:key, key:value, field names
      • Query & Result transformation
      • Boosting / Relevancy algorithm
      • Security
      • Multi-Language support
      • Federation & mashups
      [Diagram labels: Search Service; Content; Append Pipeline, Tagging Pipeline, Result Pipeline, Query Pipeline; Security, Boosting, Transform; LWE Adapter, SolrAdapter, Other; Search Index: LWE, Solr]
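The layering on slide 13, a search service that runs query and result pipelines in front of engine-specific adapters, can be sketched as below. Every class and function name here is hypothetical and chosen for illustration. The in-memory adapter stands in for the LWE and Solr adapters named on the slide and does not call either product.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


@dataclass
class Query:
    text: str
    filters: list[str] = field(default_factory=list)
    boosts: dict[str, float] = field(default_factory=dict)


class SearchAdapter(Protocol):
    """Engine-specific back end (hypothetical interface)."""

    def search(self, query: Query) -> list[dict]: ...


def security_stage(groups: list[str]) -> Callable[[Query], Query]:
    """Query-pipeline stage: restrict results to the caller's groups."""
    def stage(query: Query) -> Query:
        query.filters.append("acl:(" + " OR ".join(groups) + ")")
        return query
    return stage


def boosting_stage(boosts: dict[str, float]) -> Callable[[Query], Query]:
    """Query-pipeline stage: attach per-field boosts."""
    def stage(query: Query) -> Query:
        query.boosts.update(boosts)
        return query
    return stage


def rename_fields(mapping: dict[str, str]) -> Callable[[list[dict]], list[dict]]:
    """Result-pipeline stage: map engine field names to application names."""
    def stage(hits: list[dict]) -> list[dict]:
        return [{mapping.get(k, k): v for k, v in hit.items()} for hit in hits]
    return stage


class InMemoryAdapter:
    """Toy adapter: substring match on `title`; filters and boosts are ignored."""

    def __init__(self, docs: list[dict]):
        self.docs = docs
        self.last_query = None

    def search(self, query: Query) -> list[dict]:
        self.last_query = query
        needle = query.text.lower()
        return [d for d in self.docs if needle in d["title"].lower()]


class SearchService:
    def __init__(self, adapter: SearchAdapter, query_stages, result_stages):
        self.adapter = adapter
        self.query_stages = query_stages
        self.result_stages = result_stages

    def search(self, text: str) -> list[dict]:
        query = Query(text)
        for stage in self.query_stages:
            query = stage(query)
        hits = self.adapter.search(query)
        for stage in self.result_stages:
            hits = stage(hits)
        return hits
```

Because the service depends only on the adapter interface, swapping engines means swapping one object, which is the property a FAST ESP to Solr migration needs from this layer.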
  • 14. Building specialized applications: Personalization
      • Application flow depends on the task
      • Data Personalization increases productivity
      • SMART InSight approach: Task based UI
          • Schema independent widgets for analytics & visualization
          • Portalized
          • Personalized: widgets, functions, content, fields
  • 15. Knowledge Center: made possible by Solr
      Scalability and low TCO give us the ability to build new features
      • Knowledge Centre has logs of all user activity in SMART InSight
      • This would be too costly with a commercial Search Engine and would not be feasible in a database
      Using this rich data we can:
      • Profile users, groups and networks
      • Personalize Recommendations
      • Create social ranking algorithms
      • Usage analytics
  • 16. Overview of SMART InSight for Automotive
      [Architecture diagram; recoverable labels: Task based UIs (Page Widgets, Ajax Portal, Personalization); Virtual Integration Framework; Convergent Knowledge Framework; Knowledge Centre (Recommend, Profiling, Analysis, Knowledge Log); Contents Set; Data Chain; Metadata; Content Model; Management & Security; SA (four instances); Design, Engineering, Claims, Specs; sources: NHTSA, Internet, Benchmarking, EDR, Repair, Dealers, Claims, Parts Catalog, PLM, CAD, Internal; Early Defect Warning; Claim Analysis]
  • 17. Interactive Click Log Analysis System
      • > $50 Billion sales / year
      • > 800 Million Items
      • > 370 Million Users
      • Billions of clicks per day
      [Diagram labels: Access Log; Solr; Hadoop; Solr, Hadoop + SMART/InSight G2]
  • 18. Global Research Community
      • Top Academic Institutes:
          • Faculty, Research Fellows & Post graduate students
      • Govt. Departments & Corporate R&D
          • Scientists and researchers
      • > 270 content sources: Societies, Associations, Publishers & Open
          • IEEE, ACM…
          • Elsevier, Wiley, Springer…
      [Diagram labels: Research Discovery & Collective Intelligence (Knowledge Centre); Broadcast Search; Dynamic Result Merging; Real time indexing; Solr]
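Slide 18 names "Dynamic Result Merging" across many content sources but does not say how the merge is done. The sketch below uses min-max score normalization per source followed by a single sort, which is one simple technique swapped in for illustration and not necessarily the method used in SMART InSight. The hit dictionaries and the `score` key are hypothetical.

```python
def normalize(hits):
    """Rescale one source's scores to the range [0, 1] (min-max)."""
    scores = [hit["score"] for hit in hits]
    low, high = min(scores), max(scores)
    span = high - low
    return [
        # A source whose hits all share one score gets 1.0 for each hit.
        {**hit, "norm": 1.0 if span == 0 else (hit["score"] - low) / span}
        for hit in hits
    ]


def merge_results(per_source, limit=10):
    """Merge hits from several sources into one ranked list.

    `per_source` maps a source name to its hits; raw scores from different
    engines are not comparable, hence the per-source normalization.
    """
    merged = []
    for source, hits in per_source.items():
        if not hits:
            continue
        for hit in normalize(hits):
            merged.append({**hit, "source": source})
    # Python's sort is stable, so ties keep their per-source insertion order.
    merged.sort(key=lambda hit: hit["norm"], reverse=True)
    return merged[:limit]
```

Min-max normalization is sensitive to outliers and treats every source as equally trustworthy, so a production merge would likely weight sources or re-score the combined set.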
  • 19. Demonstration
  • 20. Contact Details
      Rahul Agarwalla
      Head – International Business
      rahul@spectrum.co.jp
      www.spectrum.co.jp