SEARCH LIKE %SQL%
INFIX SEARCH IN LUCENE / SOLR / ELASTIC
SEP 15, 2017
SEARCH LIKE %SQL%
Mikhail Khludnev
EPAM
3
•  work in Search for 6 years
•  Apache Lucene/Solr committer for 2 years
•  speak at LuceneRevolution, BerlinBuzzwords
•  chief search engineer at EPAM
ABOUT ME
4
WE ARE
5
ESTABLISHED & EXPANDING GLOBAL VERTICALS
Award-winning Wealth Management Platform
Deep Expertise in Current and
Emerging FinTech
Working with 5 of the 10 Largest
Investment Banks
Leading Digital Transformation for
Global Retailers
Working with largest online travel association (OTA)
& largest global hospitality company
Recognized M&E Leader by
Independent Research Analysts
Working with 4 out of the 4 Top Broadcast Networks
and 14 out of the top 30 TV Networks to transform
consumer-driven media
R&D Domain Experts with 700+ Complex
Solutions & Services Supporting the Entire
Drug Discovery Workflow
Working with 9 of the 10 Top
Pharma Companies
24-Year History of Leading
Product Development
Working with 30+ of the top 100 ISVs
FINANCIAL SERVICES
TRAVEL & CONSUMER
SOFTWARE & HI-TECH
LIFE SCIENCES AND HEALTHCARE
MEDIA & ENTERTAINMENT
EMERGING
Deep Expertise Offers
Innovative Solutions
Working with industries ranging from
Energy and Utilities to Telecom and Automotive
6
•  Term and boolean query
•  Prefix* query
•  *suffix query
•  *infix* query
•  Approaching Suggester
•  Derivative Terms
AGENDA
7
•  Endeca
•  MarkLogic
•  FAST, Google Search Appliance
•  Sphinx
•  Apache Lucene
•  Apache Solr
•  Elastic
SEARCH ENGINES
8
9
10
CUSTOMER PROFILE
Any comprehensive text
search service
•  Patent
•  Legal
•  Chemistry
•  Bioinformatics
•  SQL legacy
11
WHERE LIKE %infix% *infix*
12
Business Problem/Opportunity
•  Ill searches for *infix*
CHALLENGE
Bank of England
14
…at all?
Or what is fast in comparison?
WHY IS IT A PROBLEM?
16
text:foo OR text:bar
text:foo AND text:bar

O(r) << O(Dall)

r – results
Dall – all docs
THESE SEARCHES ARE (CONSIDERED AS) FAST
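A minimal sketch of why such queries count as fast, assuming a toy in-memory inverted index. The `index` dict, its doc ids, and both helper names are illustrative and not Lucene API. Each query touches only the postings of its own terms, never the whole collection.

```python
# Toy inverted index: term -> sorted list of doc ids (illustrative data).
index = {
    "foo": [1, 4, 7, 9],
    "bar": [2, 4, 9, 12],
}

def search_or(index, a, b):
    """Union of two postings lists; work is proportional to their lengths."""
    return sorted(set(index.get(a, [])) | set(index.get(b, [])))

def search_and(index, a, b):
    """Intersect two sorted postings lists with a two-pointer merge."""
    pa, pb = index.get(a, []), index.get(b, [])
    i = j = 0
    hits = []
    while i < len(pa) and j < len(pb):
        if pa[i] == pb[j]:
            hits.append(pa[i])
            i += 1
            j += 1
        elif pa[i] < pb[j]:
            i += 1
        else:
            j += 1
    return hits

print(search_or(index, "foo", "bar"))
print(search_and(index, "foo", "bar"))
```

Neither function looks at documents that contain none of the query terms, which is the intuition behind O(r) << O(Dall).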
17
• text:[sci TO scj]
• text:sci*
WHY ARE THESE STILL FAST?
18
TERM EXPANSION
•  discipline
•  luscious
•  science
•  scilla
•  scissors
text:[sci TO scj]
text:sci*
text:(science OR scilla OR scissors)

O(t)+O(r)
t – query terms
r - results
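The expansion step can be sketched as a seek into a sorted term dictionary. This is a simplified illustration that uses a plain Python list in place of Lucene's term dictionary; `expand_prefix` is a hypothetical helper name.

```python
from bisect import bisect_left

# Sorted term dictionary from the slide.
terms = ["discipline", "luscious", "science", "scilla", "scissors"]

def expand_prefix(sorted_terms, prefix):
    """Return the terms starting with `prefix` by seeking into the sorted list."""
    start = bisect_left(sorted_terms, prefix)
    matched = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # terms are sorted, so no later term can match
        matched.append(term)
    return matched

expanded = expand_prefix(terms, "sci")
query = "text:(" + " OR ".join(expanded) + ")"
print(query)
```

Only the matching terms are visited after the initial seek, which corresponds to the O(t) part of the cost; the O(r) part comes from reading their postings.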
20
PREFIX* SEARCH
24 ms
sci*
23
WHAT THEN?
text:*sci

O(Tall)+O(r)
•  asci
•  disci
•  discipline
•  lemnisci
•  luscious
•  menisci
Tall – all terms
r - results
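A small sketch of why a leading wildcard is expensive, again using a plain sorted list as a stand-in for the term dictionary (`expand_suffix` is a hypothetical helper). Sort order gives no hint about how a term ends, so every term has to be checked.

```python
# Sorted term dictionary from the slide.
terms = ["asci", "disci", "discipline", "lemnisci", "luscious", "menisci"]

def expand_suffix(sorted_terms, suffix):
    """Check every term: there is no place to seek to for a suffix."""
    visited = 0
    matched = []
    for term in sorted_terms:
        visited += 1
        if term.endswith(suffix):
            matched.append(term)
    return matched, visited

matched, visited = expand_suffix(terms, "sci")
print(matched, visited)
```

The `visited` counter always equals the size of the dictionary, which is the O(Tall) term in the cost above.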
24
*SUFFIX SEARCH
4948 ms
*sci
26
[Bar chart: RESPONSE TIME, ms (axis 0 to 6000) for prefix* and *suffix]
27
text:*sci
ReversedWildcardFilterFactory
0enilpicsid
0icsa
0icsid
0icsinem
0icsinmel
0suoicsul
asci
disci
discipline
lemnisci
luscious
menisci
text:0ics*
WHAT THEN? – REVERSE!
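The reversal trick can be sketched as follows. This is an illustration of the idea only, not the ReversedWildcardFilterFactory implementation; the `0` marker is copied from the slide, and the helper names are hypothetical.

```python
from bisect import bisect_left

MARKER = "0"  # marker prefix as shown on the slide

terms = ["asci", "disci", "discipline", "lemnisci", "luscious", "menisci"]

def reverse_term(term):
    """Reverse a term and prepend the marker, e.g. asci -> 0icsa."""
    return MARKER + term[::-1]

# Index both the original and the reversed forms in one sorted dictionary.
dictionary = sorted(terms + [reverse_term(t) for t in terms])

def expand_suffix(dictionary, suffix):
    """Rewrite *suffix into a prefix query over the reversed terms."""
    prefix = reverse_term(suffix)  # "*sci" becomes "0ics*"
    start = bisect_left(dictionary, prefix)
    matched = []
    for entry in dictionary[start:]:
        if not entry.startswith(prefix):
            break
        matched.append(entry[len(MARKER):][::-1])  # back to the original term
    return matched

print(expand_suffix(dictionary, "sci"))
```

The suffix query now costs the same as a prefix query, at the price of a dictionary that holds every term twice.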
28
ReversedWildcardFilterFactory
29
30
31
WHAT THEN? – REVERSE!
text:*sci
ReversedWildcardFilterFactory
0enilpicsid/0
0icsa/10
0icsid/20
0icsinem/30
0icsinmel/40
0suoicsul/50
asci/60
disci/70
discipline/80
lemnisci/90
luscious/100
menisci/110
32
ReversedWildcardFilterFactory
33
Well.. Postings
asci/0 8, 9, 10, 14, 18, 23, 24, 26, 31, 35
disci/10 8, 11, 14, 18, 18, 18, 21, 23, 25, 27
discipline/20 4, 5, 6, 6, 9, 13, 13, 14, 18, 22
lemnisci/30 3, 4, 7, 9, 9, 9, 12, 13, 17, 20
luscious/40 3, 3, 5, 9, 9, 12, 14, 19, 23, 28
menisci/50 0, 2, 5, 6, 11, 13, 17, 22, 27
34
Well.. Postings .. ah yeah..
0enilpicsid
0icsa
0icsid
0icsinem
0icsinmel
0suoicsul
asci/0 8, 9, 10, 14, 18, 23, 24, 26, 31, 35
disci/10 8, 11, 14, 18, 18, 18, 21, 23, 25, 27
discipline/20 4, 5, 6, 6, 9, 13, 13, 14, 18, 22
lemnisci/30 3, 4, 7, 9, 9, 9, 12, 13, 17, 20
luscious/40 3, 3, 5, 9, 9, 12, 14, 19, 23, 28
menisci/50 0, 2, 5, 6, 11, 13, 17, 22, 27
35
Well.. Postings .. ah yeah.. (and positions!)
0enilpicsid/0 4, 5, 6, 6, 9, 13, 13, 14, 18, 22
0icsa/10 8, 9, 10, 14, 18, 23, 24, 26, 31, 35
0icsid/20 8, 11, 14, 18, 18, 18, 21, 23, 25, 27
0icsinem/30 0, 2, 5, 6, 11, 13, 17, 22, 27
0icsinmel/40 3, 4, 7, 9, 9, 9, 12, 13, 17, 20
0suoicsul/50 3, 3, 5, 9, 9, 12, 14, 19, 23, 28
asci/60 8, 9, 10, 14, 18, 23, 24, 26, 31, 35
disci/70 8, 11, 14, 18, 18, 18, 21, 23, 25, 27
discipline/80 4, 5, 6, 6, 9, 13, 13, 14, 18, 22
lemnisci/90 3, 4, 7, 9, 9, 9, 12, 13, 17, 20
luscious/100 3, 3, 5, 9, 9, 12, 14, 19, 23, 28
menisci/110 0, 2, 5, 6, 11, 13, 17, 22, 27
36
benchmark khludnevm$ ant run-task -Dtask.alg=conf/index-5m.alg -Dtask.mem=1000m
…
     [java] --> Round 0-->1:   solr.server:org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient-->org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient
     [java]
     [java] ------------> starting task: StopSolrServer
     [java]
     [java] ------------> Report sum by Prefix (AddDocs) and Round (1 about 1 out of 13)
     [java] Operation   round   recsPerRun   elapsedSec    avgUsedMem    avgTotalMem
     [java] AddDocs         0      5000001     1,100.41   102,215,792    257,425,408
     [java]
     [java] Reopen Times:
     [java]  1166
     [java] ####################
     [java] ###  D O N E !!! ###
     [java] ####################

BUILD SUCCESSFUL
Total time: 19 minutes 28 seconds

$ ant run-task -Dtask.alg=conf/index-5m-reverse.alg -Dtask.mem=1000m
    [java] ------------> starting task: Rounds
 …....

     [java] ------------> Report sum by Prefix (AddDocs) and Round (1 about 1 out of 13)
     [java] Operation   round   recsPerRun      rec/s   elapsedSec    avgUsedMem    avgTotalMem
     [java] AddDocs         0      5000001   3,556.96     1,405.69    75,075,400    257,425,408
     [java]
     [java] Reopen Times:
     [java]  3114
     [java] ####################
     [java] ###  D O N E !!! ###
     [java] ####################
BUILD SUCCESSFUL
Total time: 26 minutes 50 seconds

$ du -hs ../example/schemaless/solr/gettingstarted/data/*
 28G ../example/schemaless/solr/gettingstarted/data/index-reverse
 13G ../example/schemaless/solr/gettingstarted/data/index-simple

37
[Bar chart: INDEX SIZE, GB (Main Index). Baseline (5M en wiki): 13G; Reversed: 28G]
38
39
40
text:*sci*
AND THEN
41
discipline
EdgeNGramFilter + ReversedWildcardFilter
EdgeNGram     Sort
discipline    cipline
iscipline     discipline
scipline      e
cipline       ine
ipline        ipline
pline         iscipline
line          line
ine           ne
ne            pline
e             scipline

*sci* -> sci*
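The table above can be reproduced with a short sketch: every trailing substring of every term is indexed, and an infix query becomes a prefix query over the sorted suffixes. The dict-based `suffix_index` and the helper names are assumptions made for illustration, not the analyzer chain itself.

```python
from bisect import bisect_left

terms = ["asci", "disci", "discipline", "lemnisci", "luscious", "menisci"]

def suffixes(term):
    """All trailing substrings of a term: discipline, iscipline, ..., e."""
    return [term[i:] for i in range(len(term))]

# Map every suffix to the set of original terms that end with it.
suffix_index = {}
for term in terms:
    for s in suffixes(term):
        suffix_index.setdefault(s, set()).add(term)

sorted_suffixes = sorted(suffix_index)

def expand_infix(infix):
    """*infix* over terms becomes infix* over the sorted suffixes."""
    start = bisect_left(sorted_suffixes, infix)
    matched = set()
    for s in sorted_suffixes[start:]:
        if not s.startswith(infix):
            break
        matched |= suffix_index[s]
    return sorted(matched)

print(suffixes("discipline"))
print(len(sorted_suffixes))
print(expand_infix("sci"))
```

Six terms already produce 30 distinct suffixes here, which hints at why indexing all of them with full postings inflates the index so much.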
42
asci
ci
cious
cipline
disci
discipline
e
emnisci
enisci
i
ine
ious
ipline
isci
iscipline
lemnisci
line
luscious
menisci
mnisci
ne
nisci
ous
pline
s
sci
scious
scipline
us
uscious
43
[Bar chart: INDEX SIZE, GB (Main Index). Baseline (5M en wiki): 13G; Reversed: 28G; EdgeNGram: ~60G]
44
https://discuss.codechef.com/questions/21385/a-tutorial-on-suffix-arrays
https://issues.apache.org/jira/browse/SOLR-9974
http://labs.carrotsearch.com/jsuffixarrays.html
SUFFIX ARRAY
45
SUGGESTER
46
AnalyzingInfixSuggester
LUCENE-3922: Add Japanese Kanji number normalization to Kuromoji
SOLR-4945: Japanese Autocomplete and Highlighter broken
4945
autocomplete
broken
highlighter
japanese
solr
47
AnalyzingInfixSuggester TO THE RESCUE!
48
49
AnalyzingInfixSuggester FOR infix SEARCH
• feed AnalyzingInfixSuggester with main index’s terms
• enable EdgeNGramFilter for AnalyzingInfixSuggester
discipline
iscipline
scipline
cipline
ipline
pline
line
ine
ne
e
50
• 14 M terms -> 79 M EdgeNGrams
• 10 min
• 3.3 G (25%)
BUILDING SUGGESTER INDEX
discipline
iscipline
scipline
cipline
ipline
pline
line
ine
ne
e
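The figures on this slide can be cross-checked with simple arithmetic, taking the 13G baseline index size reported earlier in the deck. The variable names are illustrative.

```python
# Numbers from the slides: 14 M terms, 79 M edge n-grams,
# 3.3 G suggester index, 13 G baseline main index.
terms_millions = 14
ngrams_millions = 79
suggester_gb = 3.3
baseline_gb = 13

ngrams_per_term = ngrams_millions / terms_millions
relative_size = suggester_gb / baseline_gb

print(round(ngrams_per_term, 1), round(relative_size * 100))
```

That is roughly 5.6 n-grams per term and a suggester index of about 25% of the baseline, consistent with the "(25%)" on the slide.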
52
[Bar chart: INDEX SIZE, GB (Main Index + Suggester Index). Baseline (5M en wiki): 13G; Reversed: 28G; EdgeNGram: ~60G; Suggester: 13G+3.3G]
53
http://localhost:8901/solr/gettingstarted/suggest?suggest.dictionary=body_txt_en&suggest.q=sci
<response>
<lst name="responseHeader"><int name="QTime">4</int>
</lst>
<lst name="suggest">
  <lst name="body_txt_en">
    <lst name="sci">
<int name="numFound">1000</int>
      <arr name="suggestions">
          <str>scienc</str>
          <str>scientif</str>
          <str>scientist</str>
          <str>disciplin</str>
          <str>sci</str>
          <str>conscious</str>
          <str>category:sci</str>
          <str>fascin</str>
          <str>discipl</str>
          <str>consciou</str>
          <str>unconsci</str>
          <str>conscienc</str>
          <str>oscil</str>
          <str>neurosci</str>
          <str>interdisciplinari</str>
          <str>disciplinari</str>
          <str>scissor</str>
          <str>ascii</str>
          <str>scientolog</str>
          <str>scimitar</str>
          <str>conscienti</str>
          <str>pseudosci</str>
          <str>rescind</str>
          <str>priscilla</str>
          <str>subconsci</str>
          <str>brescia</str>
          <str>scion</str>
          <str>category:scientif</str>
          <str>infobox_scientist</str>
  <str>www.newscientist.com</str>
          <str>resuscit</str>
          <str>plebiscit</str>
          <str>user:scimitar</str>
          <str>multidisciplinari</str>
          <str>fascia</str>
          <str>scifi</str>
          <str>geoscienc</str>
          <str>www.scifi.com</str>
<str>www.sciencemag.org</str>
          <str>omnisci</str>
          <str>scipio</str>
          <str>neuroscientist</str>
          <str>scientologist</str>
….
<int name="QTime">4</int>
54
AnalyzingInfixSuggester FOR infix SEARCH
• feed AnalyzingInfixSuggester with main index’s terms
• enable EdgeNGramFilter for AnalyzingInfixSuggester
• override wildcard expansion by calling AnalyzingInfixSuggester
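The three steps above can be sketched in miniature. Everything here is a stand-in: `postings` plays the role of the main index, `suggest` plays the role of the suggester lookup (it scans the terms for simplicity, which the real suggester index avoids), and `rewrite_wildcard` is a hypothetical helper showing the overridden expansion.

```python
# Hypothetical main index: term -> doc ids (illustrative data).
postings = {
    "asci": [8, 9, 10],
    "disci": [8, 11, 14],
    "discipline": [4, 5, 6],
    "luscious": [3, 5, 9],
}

def suggest(infix):
    """Stand-in for the suggester: main-index terms that contain `infix`."""
    return sorted(t for t in postings if infix in t)

def rewrite_wildcard(field, infix):
    """Turn field:*infix* into an explicit OR over the suggested terms."""
    return f"{field}:(" + " OR ".join(suggest(infix)) + ")"

def search_infix(infix):
    """Collect docs from the postings of the suggested terms only."""
    docs = set()
    for term in suggest(infix):
        docs.update(postings[term])
    return sorted(docs)

print(rewrite_wildcard("text", "isci"))
print(search_infix("isci"))
```

The main index keeps its original terms and postings; only the expansion step is delegated to the side index.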
55
*infix* SEARCH
3834 ms
*sci*
142 ms
*sci*
56
57
58
59
[Bar chart: RESPONSE TIME, ms (axis 0 to 6000) for prefix*, *suffix, *substr* and *suggester*]
60
AnalyzingInfixSuggester FOR infix SEARCH
• existing scalable algorithm
• minor customization
• no postings explosion
• potentially supports NRT
61
discipline
discipline
Suggester
asci
ci
cious
cipline
..
sci
scious
scipline
us
uscious
discipline
asci
disci
discipline
lemnisci
luscious
menisci
School discipline
.. is a required set of
actions by a teacher
towards a student …
Main Index
62
discipline
discipline
Derivative Terms
asci
ci
cious
cipline
..
sci
scious
scipline
us
uscious
discipline
asci
disci
discipline
lemnisci
luscious
menisci
School discipline
.. is a required set of
actions by a teacher
towards a student …
63
• A slight index format change
• Many terms refer to the same postings list
• API:
•  indexWriter.deriveTerms("name", "name_edge", new EdgeNGramTokenFilter());
•  search: name_edge:sci*
• Hijacking and injecting codecs: LUCENE-7863
• Promising for deep taxonomies
Derivative terms
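The core idea, many derived terms referring to one postings list, can be sketched with plain Python objects. This is a conceptual illustration under assumed data, not the index format change itself; `derived` and `search_derived_prefix` are hypothetical names, and the dictionary is scanned rather than seeked to keep the sketch short.

```python
# Main-index terms with their postings lists (illustrative doc ids).
postings = {
    "disci": [8, 11, 14],
    "discipline": [4, 5, 6],
}

def suffixes(term):
    """All trailing substrings of a term."""
    return [term[i:] for i in range(len(term))]

# Derived field: each suffix refers to the postings of its source terms
# by reference, so no doc ids are copied.
derived = {}
for term, plist in postings.items():
    for s in suffixes(term):
        derived.setdefault(s, []).append(plist)

def search_derived_prefix(prefix):
    """Prefix search over derived terms, reading the shared postings."""
    docs = set()
    for s, lists in derived.items():
        if s.startswith(prefix):
            for plist in lists:
                docs.update(plist)
    return sorted(docs)

print(search_derived_prefix("sci"))
# The derived entry points at the very same list object as the main term.
print(derived["scipline"][0] is postings["discipline"])
```

Because the derived terms only add dictionary entries, the postings are stored once, which is the contrast with the reversed and n-gram indexes shown earlier.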
64
*INFIX* SEARCH WITH DERIVATIVE TERMS
127 ms
*sci*
66
[Bar chart: RESPONSE TIME, ms (axis 0 to 6000) for prefix*, *suffix, *substr*, *suggester* and *derived*]
67
[Bar chart: INDEX SIZE, GB (Main Index + Suggester Index). Baseline (5M en wiki): 13G; Reversed: 28G; EdgeNGram: ~60G; Suggester: 13G+3.3G; Derived Terms: 17G]
70
ANYWAY
text:*a*
71
CONTACTS
Mikhail_Khludnev@EPAM.COM
mkhl@apache.org
https://plus.google.com/+MikhailKhludnev
72
Thank You

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 

Recently uploaded (20)

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 

Search LIKE %SQL% - Mikhail Khludnev, EPAM

  • 1. SEARCH LIKE %SQL% INFIX SEARCH IN LUCENE / SOLR / ELASTIC SEP 15, 2017
  • 2. SEARCH LIKE %SQL%. Mikhail Khludnev, EPAM
  • 3. ABOUT ME: working in search for 6 years; Apache Lucene/Solr committer for 2 years; speaker at LuceneRevolution and BerlinBuzzwords; chief search engineer at EPAM
  • 5. ESTABLISHED & EXPANDING GLOBAL VERTICALS
      FINANCIAL SERVICES: Award-winning Wealth Management Platform; Deep Expertise in Current and Emerging FinTech; Working with 5 of the 10 Largest Investment Banks
      TRAVEL & CONSUMER: Leading Digital Transformation for Global Retailers; Working with the largest online travel agency (OTA) & the largest global hospitality company
      MEDIA & ENTERTAINMENT: Recognized M&E Leader by Independent Research Analysts; Working with 4 out of the 4 Top Broadcast Networks and 14 out of the top 30 TV Networks to transform consumer-driven media
      LIFE SCIENCES AND HEALTHCARE: R&D Domain Experts with 700+ Complex Solutions & Services Supporting the Entire Drug Discovery Workflow; Working with 9 of the 10 Top Pharma Companies
      SOFTWARE & HI-TECH: 24-Year History of Leading Product Development; Working with 30+ of the top 100 ISVs
      EMERGING: Deep Expertise Offers Innovative Solutions; Working with industries ranging from Energy and Utilities to Telecom and Automotive
  • 6. AGENDA: Term and boolean query; Prefix* query; *suffix query; *infix* query; Approaching Suggester; Derivative Terms
  • 7. SEARCH ENGINES: Endeca; MarkLogic; FAST, Google Search Appliance; Sphinx; Apache Lucene; Apache Solr; Elastic
  • 10. CUSTOMER PROFILE. Any comprehensive text search service: patent, legal, chemistry, bioinformatics, SQL legacy
  • 12. CHALLENGE. Business problem/opportunity: ill-performing searches for *infix*. Bank of England
  • 13. CHALLENGE. Business problem/opportunity: ill-performing searches for *infix*
  • 14. WHY IS IT A PROBLEM …at all? Or what is fast in comparison to it?
  • 15. THESE SEARCHES ARE (CONSIDERED) FAST: text:foo OR text:bar; text:foo AND text:bar
  • 16. THESE SEARCHES ARE (CONSIDERED) FAST: text:foo OR text:bar; text:foo AND text:bar. O(r) << O(D_all), where r is the number of results and D_all is the number of all docs
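
The O(r) << O(D_all) claim on slide 16 can be illustrated with a toy inverted index. This is only a sketch: the `index` dictionary, its document ids, and the helper names are made up for illustration and are not Lucene's data structures. The point is that both queries touch only the postings of `foo` and `bar`, never the whole collection.

```python
# Toy inverted index: term -> sorted list of document ids (postings).
# The ids are invented for the example.
index = {
    "foo": [1, 4, 7, 9],
    "bar": [2, 4, 9, 12],
}

def search_or(a, b):
    """text:a OR text:b as a union of two postings lists."""
    return sorted(set(index.get(a, [])) | set(index.get(b, [])))

def search_and(a, b):
    """text:a AND text:b as a two-pointer merge of sorted postings."""
    pa, pb = index.get(a, []), index.get(b, [])
    i = j = 0
    hits = []
    while i < len(pa) and j < len(pb):
        if pa[i] == pb[j]:
            hits.append(pa[i])
            i += 1
            j += 1
        elif pa[i] < pb[j]:
            i += 1
        else:
            j += 1
    return hits

print(search_or("foo", "bar"))   # [1, 2, 4, 7, 9, 12]
print(search_and("foo", "bar"))  # [4, 9]
```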
  • 17. WHY ARE THESE STILL FAST? text:[sci TO scj]; text:sci*
  • 18. TERM EXPANSION. Terms: discipline, luscious, science, scilla, scissors. text:[sci TO scj] and text:sci* expand to text:(science OR scilla OR scissors). Cost: O(t)+O(r), where t is the number of query terms and r is the number of results
  • 19. TERM EXPANSION. Terms: discipline, luscious, science, scilla, scissors. text:[sci TO scj] and text:sci* expand to text:(science OR scilla OR scissors). Cost: O(t)+O(r)
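
Term expansion for a prefix can be sketched as two binary searches over a sorted term dictionary, which is why the matching terms come out as one contiguous slice. The code below uses the five terms from the slide; the function names are hypothetical, and the half-open range [sci, scj) is a simplification of the range query shown above.

```python
from bisect import bisect_left

# Sorted term dictionary, taken from the slide.
terms = ["discipline", "luscious", "science", "scilla", "scissors"]

def expand_range(lower, upper):
    """Terms t with lower <= t < upper, located by two binary searches."""
    return terms[bisect_left(terms, lower):bisect_left(terms, upper)]

def expand_prefix(prefix):
    """sci* is the range [sci, scj): bump the last character of the prefix."""
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return expand_range(prefix, upper)

expanded = expand_prefix("sci")
print("text:(" + " OR ".join(expanded) + ")")
# text:(science OR scilla OR scissors)
```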
  • 21. [Chart; values in ms]
  • 22. WHAT THEN? text:*sci. Terms: asci, disci, discipline, lemnisci, luscious, menisci
  • 23. WHAT THEN? text:*sci. Terms: asci, disci, discipline, lemnisci, luscious, menisci. Cost: O(T_all)+O(r), where T_all is the number of all terms and r is the number of results
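
The O(T_all) part can be seen in a naive sketch of suffix expansion: matching terms are scattered across the sorted dictionary, so every term has to be inspected. The `visited` counter is an illustrative addition, not something a real engine reports.

```python
# Sorted term dictionary, taken from the slide.
terms = ["asci", "disci", "discipline", "lemnisci", "luscious", "menisci"]

def expand_suffix(suffix):
    """Return (matching terms, number of terms inspected)."""
    visited = 0
    matches = []
    for term in terms:
        visited += 1  # no shortcut: sort order does not help for suffixes
        if term.endswith(suffix):
            matches.append(term)
    return matches, visited

matches, visited = expand_suffix("sci")
print(matches, visited)
```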
  • 25. [Chart; values in ms]
  • 26. [Bar chart: RESPONSE TIME, ms, for prefix* and *suffix; axis from 0 to 6000]
  • 31. WHAT THEN? REVERSE! text:*sci with ReversedWildcardFilterFactory. Terms: 0enilpicsid/0, 0icsa/10, 0icsid/20, 0icsinem/30, 0icsinmel/40, 0suoicsul/50, asci/60, disci/70, discipline/80, lemnisci/90, luscious/100, menisci/110
  • 33. Well.. Postings
      asci/0           8, 9, 10, 14, 18, 23, 24, 26, 31, 35
      disci/10         8, 11, 14, 18, 18, 18, 21, 23, 25, 27
      discipline/20    4, 5, 6, 6, 9, 13, 13, 14, 18, 22
      lemnisci/30      3, 4, 7, 9, 9, 9, 12, 13, 17, 20
      luscious/40      3, 3, 5, 9, 9, 12, 14, 19, 23, 28
      menisci/50       0, 2, 5, 6, 11, 13, 17, 22, 27
  • 34. Well.. Postings .. ah yeah..
      0enilpicsid
      0icsa
      0icsid
      0icsinem
      0icsinmel
      0suoicsul
      asci/0           8, 9, 10, 14, 18, 23, 24, 26, 31, 35
      disci/10         8, 11, 14, 18, 18, 18, 21, 23, 25, 27
      discipline/20    4, 5, 6, 6, 9, 13, 13, 14, 18, 22
      lemnisci/30      3, 4, 7, 9, 9, 9, 12, 13, 17, 20
      luscious/40      3, 3, 5, 9, 9, 12, 14, 19, 23, 28
      menisci/50       0, 2, 5, 6, 11, 13, 17, 22, 27
  • 35. Well.. Postings .. ah yeah.. (and positions!)
      0enilpicsid/0    4, 5, 6, 6, 9, 13, 13, 14, 18, 22
      0icsa/10         8, 9, 10, 14, 18, 23, 24, 26, 31, 35
      0icsid/20        8, 11, 14, 18, 18, 18, 21, 23, 25, 27
      0icsinem/30      0, 2, 5, 6, 11, 13, 17, 22, 27
      0icsinmel/40     3, 4, 7, 9, 9, 9, 12, 13, 17, 20
      0suoicsul/50     3, 3, 5, 9, 9, 12, 14, 19, 23, 28
      asci/60          8, 9, 10, 14, 18, 23, 24, 26, 31, 35
      disci/70         8, 11, 14, 18, 18, 18, 21, 23, 25, 27
      discipline/80    4, 5, 6, 6, 9, 13, 13, 14, 18, 22
      lemnisci/90      3, 4, 7, 9, 9, 9, 12, 13, 17, 20
      luscious/100     3, 3, 5, 9, 9, 12, 14, 19, 23, 28
      menisci/110      0, 2, 5, 6, 11, 13, 17, 22, 27
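
The reversal trick and its cost can be sketched in a few lines. This is a toy model, not the ReversedWildcardFilterFactory implementation: the marker character is assumed to be `\u0001` (the slides render it as a leading "0"), the document ids are invented, and `build` and `suffix_query` are hypothetical helpers. A suffix query becomes a prefix query over the reversed terms, and every postings list is stored twice.

```python
from bisect import bisect_left

MARKER = "\u0001"  # assumed marker that keeps reversed terms apart from originals

# Invented postings (term -> doc ids) for the terms on the slides.
postings = {
    "asci": [1, 5], "disci": [2], "discipline": [3, 5],
    "lemnisci": [4], "luscious": [6], "menisci": [2, 7],
}

def build(postings):
    """Index every term twice: as is, and reversed behind the marker."""
    index = dict(postings)
    for term, docs in postings.items():
        index[MARKER + term[::-1]] = list(docs)  # postings are copied, not shared
    return index, sorted(index)

def suffix_query(index, keys, suffix):
    """*suffix rewritten as a prefix range over the reversed terms."""
    prefix = MARKER + suffix[::-1]
    lo = bisect_left(keys, prefix)
    hi = bisect_left(keys, prefix + "\uffff")
    docs = set()
    for key in keys[lo:hi]:
        docs.update(index[key])
    return sorted(docs)

index, keys = build(postings)
print(suffix_query(index, keys, "sci"))
print(len(postings), "terms became", len(index))
```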
  • 36. benchmark khludnevm$ ant run-task -Dtask.alg=conf/index-5m.alg -Dtask.mem=1000m
      …
           [java] --> Round 0-->1:   solr.server:org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient-->org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient
           [java]
           [java] ------------> starting task: StopSolrServer
           [java]
           [java] ------------> Report sum by Prefix (AddDocs) and Round (1 about 1 out of 13)
           [java] Operation   round   recsPerRun   elapsedSec    avgUsedMem   avgTotalMem
           [java] AddDocs         0      5000001     1,100.41   102,215,792   257,425,408
           [java]
           [java] Reopen Times:
           [java]  1166
           [java] ####################
           [java] ###  D O N E !!! ###
           [java] ####################
      BUILD SUCCESSFUL
      Total time: 19 minutes 28 seconds

      $ ant run-task -Dtask.alg=conf/index-5m-reverse.alg -Dtask.mem=1000m
           [java] ------------> starting task: Rounds
      …....
           [java] ------------> Report sum by Prefix (AddDocs) and Round (1 about 1 out of 13)
           [java] Operation   round   recsPerRun      rec/s   elapsedSec   avgUsedMem   avgTotalMem
           [java] AddDocs         0      5000001   3,556.96     1,405.69   75,075,400   257,425,408
           [java]
           [java] Reopen Times:
           [java]  3114
           [java] ####################
           [java] ###  D O N E !!! ###
           [java] ####################
      BUILD SUCCESSFUL
      Total time: 26 minutes 50 seconds

      $ du -hs ../example/schemaless/solr/gettingstarted/data/*
      28G ../example/schemaless/solr/gettingstarted/data/index-reverse
      13G ../example/schemaless/solr/gettingstarted/data/index-simple
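
The overhead of the reversed index follows from the numbers in the console output above. The short calculation below only restates that arithmetic; the variable names are illustrative.

```python
def seconds(minutes, secs):
    return minutes * 60 + secs

plain_build = seconds(19, 28)    # total time for conf/index-5m.alg
reverse_build = seconds(26, 50)  # total time for conf/index-5m-reverse.alg

time_ratio = reverse_build / plain_build
size_ratio = 28 / 13             # index-reverse vs index-simple, in GB

print(f"build time x{time_ratio:.2f}, index size x{size_ratio:.2f}")
# build time x1.38, index size x2.15
```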
  • 41. EdgeNGramFilter + ReversedWildcardFilter. Input term: discipline. *sci* -> sci*
      EdgeNGram     Sort
      discipline    cipline
      iscipline     discipline
      scipline      e
      cipline       ine
      ipline        ipline
      pline         iscipline
      line          line
      ine           ne
      ne            pline
      e             scipline
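
The table above can be reproduced with a small sketch: emit every suffix of a term, sort the suffixes, and answer *sci* as the prefix query sci* over them. This is a toy model under stated assumptions, not the Lucene filter chain; `suffixes`, `build_suffix_index`, and `infix_terms` are hypothetical names, and the extra terms are borrowed from earlier slides.

```python
from bisect import bisect_left
from collections import defaultdict

def suffixes(term):
    """All suffixes of a term, longest first (the EdgeNGram column)."""
    return [term[i:] for i in range(len(term))]

def build_suffix_index(terms):
    """Map every suffix to the original terms that contain it."""
    index = defaultdict(set)
    for term in terms:
        for s in suffixes(term):
            index[s].add(term)
    return index, sorted(index)

def infix_terms(index, keys, infix):
    """*infix* answered as the prefix range infix* over the sorted suffixes."""
    lo = bisect_left(keys, infix)
    hi = bisect_left(keys, infix + "\uffff")
    found = set()
    for key in keys[lo:hi]:
        found |= index[key]
    return sorted(found)

print(sorted(suffixes("discipline")))  # the Sort column
index, keys = build_suffix_index(["discipline", "luscious", "science"])
print(infix_terms(index, keys, "sci"))
```

The price is visible in the same sketch: a term of length n contributes n entries, which is the postings explosion the index size chart reports.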
  • 43. [Bar chart: INDEX SIZE, GB (main index). Baseline (5M en wiki): 13G; Reversed: 28G; EdgeNGram: ~60G]
  • 46. AnalyzingInfixSuggester. Example entries: "LUCENE-3922: Add Japanese Kanji number normalization to Kuromoji"; "SOLR-4945: Japanese Autocomplete and Highlighter broken". Terms: 4945, autocomplete, broken, highlighter, japanese, solr
  • 49. AnalyzingInfixSuggester FOR infix SEARCH: feed AnalyzingInfixSuggester with the main index's terms; enable EdgeNGramFilter for AnalyzingInfixSuggester. Example: discipline, iscipline, scipline, cipline, ipline, pline, line, ine, ne, e
  • 50. BUILDING SUGGESTER INDEX: 14 M terms -> 79 M edge n-grams; 10 min; 3.3 G (25%). Example: discipline, iscipline, scipline, cipline, ipline, pline, line, ine, ne, e
  • 51. [Chart; values in ms]
  • 52. [Bar chart: INDEX SIZE, GB (main index plus suggester index). Baseline (5M en wiki): 13G; Reversed: 28G; EdgeNGram: ~60G; Suggester: 13G+3.3G]
  • 53. http://localhost:8901/solr/gettingstarted/suggest?suggest.dictionary=body_txt_en&suggest.q=sci
      <response>
        <lst name="responseHeader"><int name="QTime">4</int></lst>
        <lst name="suggest">
          <lst name="body_txt_en">
            <lst name="sci">
              <int name="numFound">1000</int>
              <arr name="suggestions">
                <str>scienc</str> <str>scientif</str> <str>scientist</str> <str>disciplin</str>
                <str>sci</str> <str>conscious</str> <str>category:sci</str> <str>fascin</str>
                <str>discipl</str> <str>consciou</str> <str>unconsci</str> <str>conscienc</str>
                <str>oscil</str> <str>neurosci</str> <str>interdisciplinari</str> <str>disciplinari</str>
                <str>scissor</str> <str>ascii</str> <str>scientolog</str> <str>scimitar</str>
                <str>conscienti</str> <str>pseudosci</str> <str>rescind</str> <str>priscilla</str>
                <str>subconsci</str> <str>brescia</str> <str>scion</str> <str>category:scientif</str>
                <str>infobox_scientist</str> <str>www.newscientist.com</str> <str>resuscit</str> <str>plebiscit</str>
                <str>user:scimitar</str> <str>multidisciplinari</str> <str>fascia</str> <str>scifi</str>
                <str>geoscienc</str> <str>www.scifi.com</str> <str>www.sciencemag.org</str> <str>omnisci</str>
                <str>scipio</str> <str>neuroscientist</str> <str>scientologist</str>
                ….
      Callout: <int name="QTime">4</int>
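
A client for the request above might look like the following sketch. The host, port, collection, and the two `suggest.*` parameters are taken from the slide; the response shape is the simplified one shown on the slide, and the abridged `sample` document (with closing tags added) is an assumption made so the parser can be demonstrated without a running server. `parse_suggestions` is a hypothetical helper.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

base = "http://localhost:8901/solr/gettingstarted/suggest"
params = {"suggest.dictionary": "body_txt_en", "suggest.q": "sci"}
url = base + "?" + urlencode(params)

# Abridged, well-formed stand-in for the response shown on the slide.
sample = """<response>
  <lst name="responseHeader"><int name="QTime">4</int></lst>
  <lst name="suggest">
    <lst name="body_txt_en">
      <lst name="sci">
        <int name="numFound">1000</int>
        <arr name="suggestions">
          <str>scienc</str>
          <str>scientif</str>
          <str>scientist</str>
        </arr>
      </lst>
    </lst>
  </lst>
</response>"""

def parse_suggestions(xml_text):
    """Collect the <str> entries of the 'suggestions' array."""
    root = ET.fromstring(xml_text)
    arr = root.find(".//arr[@name='suggestions']")
    return [] if arr is None else [s.text for s in arr.findall("str")]

print(url)
print(parse_suggestions(sample))
```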
  • 54. AnalyzingInfixSuggester FOR infix SEARCH: feed AnalyzingInfixSuggester with the main index's terms; enable EdgeNGramFilter for AnalyzingInfixSuggester; override wildcard expansion by calling AnalyzingInfixSuggester
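
The third step, overriding wildcard expansion, can be sketched as a query rewrite: the infix pattern is handed to a lookup, and the returned terms become a plain OR query against the main index. Everything here is illustrative. `toy_infix_lookup` is a linear scan standing in for the suggester call, the term list is borrowed from the response on the previous slide, and `rewrite_infix` is a hypothetical helper, not a Solr or Lucene API.

```python
def toy_infix_lookup(terms, infix, limit=1000):
    """Stand-in for the suggester: returns terms containing the infix."""
    return [t for t in terms if infix in t][:limit]

def rewrite_infix(field, pattern, lookup):
    """Rewrite field:*infix* into field:(t1 OR t2 ...) using the lookup.

    Patterns that are not of the form *infix* are passed through unchanged;
    None is returned when the lookup finds no terms.
    """
    is_infix = len(pattern) > 2 and pattern.startswith("*") and pattern.endswith("*")
    if not is_infix:
        return f"{field}:{pattern}"
    expanded = lookup(pattern[1:-1])
    if not expanded:
        return None
    return f"{field}:(" + " OR ".join(expanded) + ")"

terms = ["scienc", "disciplin", "fascin", "oscil", "lucen"]
print(rewrite_infix("text", "*sci*", lambda q: toy_infix_lookup(terms, q)))
# text:(scienc OR disciplin OR fascin OR oscil)
```

The rewritten query touches only the main index's existing postings, which is why this approach avoids duplicating them.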
  • 59. [Bar chart: RESPONSE TIME, ms, for prefix*, *suffix, *substr*, *suggester*; axis from 0 to 6000]
  • 60. AnalyzingInfixSuggester FOR infix SEARCH: existing scalable algorithm; minor customization; no postings explosion; potentially supports NRT
  • 63. Derivative terms: a slight index format change; many terms refer to the same postings list; API is: indexWriter.deriveTerms("name", "name_edge", new EdgeNGramTokenFilter()); search: name_edge:sci*; hijacking and injecting codecs, LUCENE-7863; promising for deep taxonomies
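
The idea that many terms refer to the same postings list can be modelled with shared references. This is a toy sketch and not the LUCENE-7863 implementation: `derive_terms` and `prefix_search` are hypothetical, the field contents are invented, and suffixes stand in for the edge n-grams. Each derived term stores a reference to the source term's postings rather than a copy.

```python
def suffixes(term):
    return [term[i:] for i in range(len(term))]

def derive_terms(source_field, derive):
    """Build a derived field whose terms point at the source postings."""
    derived = {}
    for term, postings in source_field.items():
        for d in derive(term):
            derived.setdefault(d, []).append(postings)  # reference, not a copy
    return derived

def prefix_search(derived_field, prefix):
    """name_edge:prefix* over the derived terms (linear scan for brevity)."""
    docs = set()
    for term, lists in derived_field.items():
        if term.startswith(prefix):
            for postings in lists:
                docs.update(postings)
    return sorted(docs)

name = {"discipline": [1, 3], "science": [2]}   # invented postings
name_edge = derive_terms(name, suffixes)

print(prefix_search(name_edge, "sci"))                  # [1, 2, 3]
print(name_edge["scipline"][0] is name["discipline"])   # True
```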
  • 64. *INFIX* SEARCH WITH DERIVATIVE TERMS: *sci* in 127 ms
  • 65. [Chart; values in ms]
  • 66. [Bar chart: RESPONSE TIME, ms, for prefix*, *suffix, *substr*, *suggester*, *derived*; axis from 0 to 6000]
  • 67. [Bar chart: INDEX SIZE, GB (main index plus suggester index). Baseline (5M en wiki): 13G; Reversed: 28G; EdgeNGram: ~60G; Suggester: 13G+3.3G; Derived Terms: 17G]
  • 68. REFERENCES
      What is in a Lucene index? Adrien Grand. https://www.youtube.com/watch?v=T5RmMNDR5XI
      Automata Invasion. Robert Muir, Michael McCandless. https://www.youtube.com/watch?v=pd2jvy2IbJE
      Lucene Search Essentials: Scorers, Collectors and Custom Queries. Mikhail Khludnev. https://www.youtube.com/watch?v=X9YovpYj6uo
      A new Lucene suggester based on infix matches. http://blog.mikemccandless.com/2013/06/a-new-lucene-suggester-based-on-infix.html
  • 69. REFERENCES
      What is in a Lucene index? Adrien Grand. https://www.youtube.com/watch?v=T5RmMNDR5XI
      Automata Invasion. Robert Muir, Michael McCandless. https://www.youtube.com/watch?v=pd2jvy2IbJE
      In Search of Tommy Hilfiger (in Russian). Mikhail Khludnev. https://www.youtube.com/watch?v=Azf4oUL-Dqc
      A new Lucene suggester based on infix matches. http://blog.mikemccandless.com/2013/06/a-new-lucene-suggester-based-on-infix.html