A NOOBS LESSON ON SOLR
(CONFIGURATION)
STEVE, STOP ME IF I’M WRONG
at any point
not exactly a full secret, but a disclaimer here:
I don’t completely know everything there is to
know about Solr or its configuration
EASIEST WAY I CAN EXPLAIN SOLR.
how would you find all the pages a term
or phrase appears on in a book?
EASIEST WAY I CAN EXPLAIN SOLR.
so we can think of Solr like an
index in the back of a book
we use our brains to find the
words or terms in the index
Solr’s brain is schema.xml
the words or terms refer to
documents (text streams)
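to make the book-index picture concrete, here is a tiny sketch in plain Java. It is not Solr code: the class and method names (BookIndexSketch, add, lookup) are made up for illustration, and it assumes terms are just lowercased, whitespace-separated words. Each term points at the ids of the documents it shows up in, the way an index entry points at page numbers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration of a back-of-the-book index; not Solr's real data structure.
public class BookIndexSketch {
    // term -> ids of the documents (think: page numbers) that contain it
    private final Map<String, Set<Integer>> index = new TreeMap<>();

    // Split the text on whitespace, lowercase each word, and record where it appeared.
    public void add(int docId, String text) {
        for (String word : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!word.isEmpty()) {
                index.computeIfAbsent(word, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Look a term up the way a reader would in the back of the book.
    public List<Integer> lookup(String term) {
        Set<Integer> hits = index.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
        return new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        BookIndexSketch idx = new BookIndexSketch();
        idx.add(1, "Solr is a search server");
        idx.add(2, "the schema tells Solr how to index");
        System.out.println(idx.lookup("solr"));   // [1, 2]
        System.out.println(idx.lookup("schema")); // [2]
    }
}
```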
HOW DOES THE INDEX GET POPULATED?
schema.xml!
HOW DOES THE INDEX GET SEARCHED?
schema.xml!
SO, SCHEMA.XML IS THE BRAIN
an index contains one or more documents
documents are the unit of search and indexing
documents contain fields
so, index = tons of documents,
and each document has field(s)
make sense yet?
SO, SCHEMA.XML IS THE BRAIN
<field name="html" type="example" indexed="true"
       stored="true" multiValued="true" />
<fieldType name="example" class="solr.TextField" positionIncrementGap="100"
           sortMissingLast="true">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
and schema.xml is where it’s at!
it defines the fields and how to
index and search each field
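to get a feel for what an analyzer chain like the one in the "example" fieldType does to a value, here is a rough stand-in in plain Java. It is an assumption-heavy sketch, not Solr's implementation: the regex tag stripper and the split-on-non-alphanumerics tokenizer only approximate HTMLStripCharFilterFactory and StandardTokenizerFactory, and every name in it (AnalyzerChainSketch, stripHtml, tokenize, lowerCase, analyze) is hypothetical. The point is the order: char filter first, then tokenizer, then token filters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical approximation of a charFilter -> tokenizer -> filter chain; not Solr code.
public class AnalyzerChainSketch {
    // Step 1 (char filter): replace anything that looks like a tag with a space.
    static String stripHtml(String input) {
        return input.replaceAll("<[^>]*>", " ");
    }

    // Step 2 (tokenizer): split on runs of characters that are not letters or digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Step 3 (token filter): lowercase every token.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Run the three steps in the same order they are listed inside <analyzer>.
    static List<String> analyze(String raw) {
        return lowerCase(tokenize(stripHtml(raw)));
    }

    public static void main(String[] args) {
        System.out.println(analyze("<p>Hello <b>Solr</b> World</p>")); // [hello, solr, world]
    }
}
```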
FIELD? FIELDTYPE? HALP PLS.
@Test
public void sslCertsHostNameField() throws SolrServerException
{
    testExpectations("sslcerts-hostname", "d-128-100-108.bootp.virginia.edu",
        hit("VIRGINIA.EDU"),
        hit("bootp.virginia.edu"),
        hit("\"d-128-100-108.bootp.virginia.edu\""),
        miss("mail.virginia.edu"));
}
THIS TEST FAILS :-(
So where do we look?
FIELD? FIELDTYPE? HALP PLS.
<field name="sslcerts-hostname" type="text_general" indexed="true"
stored="true" multiValued="true" />
FIELD? FIELDTYPE? HALP PLS.
<fieldType name="text_general" class="solr.TextField"
           positionIncrementGap="100" sortMissingLast="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
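a whitespace-only analyzer is a plausible reason the test fails, and a small plain-Java sketch shows why. Big assumption: a query matches only if it is exactly equal to one of the indexed tokens. That is a simplification of what Solr really does, and the names (WhitespaceOnlySketch, indexTerms, matches) are made up for illustration. The hostname has no spaces in it, so it ends up as one big token.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a whitespace-only analyzer; not Solr code.
public class WhitespaceOnlySketch {
    // Split on whitespace only: dots and dashes stay inside the token.
    static List<String> indexTerms(String value) {
        return Arrays.asList(value.trim().split("\\s+"));
    }

    // Simplified matching: the query must equal an indexed token exactly (case-sensitive).
    static boolean matches(String value, String query) {
        return indexTerms(value).contains(query);
    }

    public static void main(String[] args) {
        String host = "d-128-100-108.bootp.virginia.edu";
        System.out.println(indexTerms(host).size());             // 1
        System.out.println(matches(host, "VIRGINIA.EDU"));       // false
        System.out.println(matches(host, "bootp.virginia.edu")); // false
        System.out.println(matches(host, host));                 // true
    }
}
```

under that model, only the full hostname can hit, so the first two hit(...) expectations cannot be met.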
FIELD? FIELDTYPE? HALP PLS.
<field name="sslcerts-hostname" type="sslcerts_hostname" indexed="true"
stored="true" multiValued="true" />
<fieldType name="text_general" class="solr.TextField"
           positionIncrementGap="100" sortMissingLast="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
<fieldType name="sslcerts_hostname" class="solr.TextField"
           positionIncrementGap="100" sortMissingLast="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.NGramFilterFactory" maxGramSize="25" minGramSize="3"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
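here is a rough plain-Java sketch of what the n-gram and lowercase filters add on top of the whitespace tokenizer. It is not Solr's implementation, and the names (NGramSketch, indexTerms, hasTerm) are hypothetical. It assumes the n-gram filter emits every substring between minGramSize and maxGramSize characters long, and it only checks whether a lowercased query is one of those indexed terms. Real Solr also runs the query text through an analyzer, which this sketch leaves out.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of whitespace tokenizer + n-gram filter + lowercase filter; not Solr code.
public class NGramSketch {
    // Emit every substring of each whitespace token whose length is in [min, max], lowercased.
    static Set<String> indexTerms(String value, int min, int max) {
        Set<String> terms = new HashSet<>();
        for (String token : value.trim().split("\\s+")) {
            for (int size = min; size <= Math.min(max, token.length()); size++) {
                for (int start = 0; start + size <= token.length(); start++) {
                    terms.add(token.substring(start, start + size).toLowerCase(Locale.ROOT));
                }
            }
        }
        return terms;
    }

    // Simplified matching: is the lowercased query one of the indexed grams?
    static boolean hasTerm(String value, String query) {
        return indexTerms(value, 3, 25).contains(query.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String host = "d-128-100-108.bootp.virginia.edu";
        System.out.println(hasTerm(host, "VIRGINIA.EDU"));       // true
        System.out.println(hasTerm(host, "bootp.virginia.edu")); // true
        System.out.println(hasTerm(host, "mail.virginia.edu"));  // false
    }
}
```

one thing this toy model does not capture: the full hostname is longer than 25 characters, so it is never a single gram here. Matching it in real Solr depends on how the query side is analyzed.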
FIELD? FIELDTYPE? HALP PLS.
@Test
public void sslCertsHostNameField() throws SolrServerException
{
    testExpectations("sslcerts-hostname", "d-128-100-108.bootp.virginia.edu",
        hit("VIRGINIA.EDU"),
        hit("bootp.virginia.edu"),
        hit("\"d-128-100-108.bootp.virginia.edu\""),
        miss("mail.virginia.edu"));
}
THIS TEST PASSES :-D
SUMMARY OF WHAT WE LEARNED.
A Solr index is made up of a bunch of documents (token streams)
– think of the index-in-the-back-of-a-book example
schema.xml holds the brains, the power, the rules
– for how data gets stored as documents and how
documents are returned for matching queries
thanks to Steve’s exercises, I was able to look at the
schema.xml file and… for the most part, understand it
Hopefully you can look at it now and understand it too
QUESTIONS?
