Search in the Biblical Domain - BibleTech: 2011

Covers techniques for searching the Bible using multiple translations and searching extra-biblical content like commentaries and journals.


Published in: Technology

  • Transcript

    • 1. Search in the Biblical Domain
      • Brian Seagraves (Bible.org)
    • 8. What is “Search”?
      • Information/Document Retrieval
      • Basic Definition:
        • Finding previously seen documents that are related to some user-supplied terms.
      • Advanced Definition:
        • Finding relevant content for some query by understanding the contextual meaning of terms in the search index and query.
        • Semantic Search
    • 12. Types and Sources of Content
      • The Bible and its verses
      • Articles, Journals, and other extra-biblical content
      • The web
    • 17. Information Retrieval Engines
      • Sphinx - http://sphinxsearch.com
      • Lucene - http://lucene.apache.org/
        • Solr - http://lucene.apache.org/solr/
      • MySQL Fulltext Search - kinda
    • 24. Solr
      • Open Source
      • Full-text search
      • Hit Highlighting
      • Facets
      • Java
      • REST-like HTTP/XML and JSON APIs
    • 30. Solr Documents
      • A document represents a distinct piece of content that can be stored/retrieved
        • Bible Verse
        • Journal Article
        • Commentary Chapter/Section
        • Web Page
    • 39. Solr Documents
      • Documents have one or more Fields
      • Fields have types
        • Integer
        • Float
        • String
        • Text
        • Date
        • and More!
    • 45. Solr Fields
      • Field Types can have:
        • Filters
          • Remove parts of the content
        • Tokenizers
          • Split content into chunks/tokens
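The split between tokenizers and filters can be sketched outside Solr. The following is a minimal Python illustration, not Solr's implementation: the stop-word list and the function names are assumptions made up for this example, and the chain only mimics the idea of a tokenizer followed by filters.

```python
import re

# Hypothetical stop-word list for illustration only; Solr reads its own list.
STOP_WORDS = {"the", "and", "in", "a", "of"}


def whitespace_tokenize(text):
    """Tokenizer: split the content into chunks (tokens) on whitespace."""
    return text.split()


def strip_punctuation(tokens):
    """Filter: remove leading and trailing punctuation from each token."""
    stripped = (re.sub(r"^\W+|\W+$", "", token) for token in tokens)
    return [token for token in stripped if token]


def lowercase(tokens):
    """Filter: normalize case so 'God' and 'god' become the same term."""
    return [token.lower() for token in tokens]


def remove_stop_words(tokens):
    """Filter: drop very common words that carry little meaning."""
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text):
    """Run the tokenizer first, then each filter in order."""
    tokens = whitespace_tokenize(text)
    for token_filter in (strip_punctuation, lowercase, remove_stop_words):
        tokens = token_filter(tokens)
    return tokens


print(analyze("In the beginning God created the heavens and the earth."))
```

The order of the filters matters: lowercasing before the stop-word check is what lets "In" match the stop word "in".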
    • 50. Solr Fields
      • The “String” Field Type
      • <fieldType name="string" class="solr.StrField" />
      • No Filter; No Tokenizer
        • Field content won’t be split or changed
    • 51. Sample Schema
          <fieldtype name="html_text" class="solr.TextField">
            <analyzer type="index">
              <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
              <filter class="solr.StopFilterFactory"/>
              <filter class="solr.WordDelimiterFilterFactory"/>
              <filter class="solr.LowerCaseFilterFactory"/>
              <filter class="solr.EnglishPorterFilterFactory"/>
              <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
            </analyzer>
            <analyzer type="query">
              <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
              <filter class="solr.SynonymFilterFactory"/>
              <filter class="solr.StopFilterFactory"/>
              <filter class="solr.WordDelimiterFilterFactory"/>
              <filter class="solr.LowerCaseFilterFactory"/>
              <filter class="solr.EnglishPorterFilterFactory"/>
              <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
            </analyzer>
          </fieldtype>
    • 52. Sample Schema (cont.)
          <fieldtype name="sint" class="solr.SortableIntField" omitNorms="true" />
          <fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    • 53. Sample Schema (cont.)
          <fields>
            <field name="id" type="sint" indexed="true" stored="true" multiValued="false" />
            <field name="abbr" type="string" indexed="true" stored="true" multiValued="false" />
            <field name="name" type="string" indexed="true" stored="true" multiValued="false" />
            <field name="book" type="sint" indexed="true" stored="true" multiValued="false" />
            <field name="chapter" type="sint" indexed="true" stored="true" multiValued="false" />
            <field name="verse" type="sint" indexed="true" stored="true" multiValued="false" />
            <field name="ot_nt" type="string" indexed="true" stored="true" multiValued="false" />
            <field name="net" type="text" indexed="false" stored="true" multiValued="false" />
            <field name="all_index" type="html_text" indexed="true" stored="false" />
          </fields>
          <copyField source="net" dest="all_index" />
          <uniqueKey>id</uniqueKey>
          <defaultSearchField>all_index</defaultSearchField>
          <solrQueryParser defaultOperator="OR" />
    • 58. Put Data in Solr
      • Remember, Solr communicates using XML over HTTP
      • No concept of updating a document - delete, then add
      • To add, POST XML to update handler
        • http://localhost:8080/solr/bible/update
    • 59. Add XML
          <add>
            <doc>
              <field name="id">1</field>
              <field name="net">In the beginning God created the heavens and the earth.</field>
            </doc>
          </add>
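Solr's XML update format wraps each value in a field element that carries the field's name as an attribute. As a hedged sketch, the add message can be built with Python's standard library and prepared as a POST to the update handler. The helper names build_add_xml and build_update_request are made up for this example, and nothing is sent unless the request is explicitly opened.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_xml(fields):
    """Build an <add><doc>...</doc></add> message from a dict of field values."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_update_request(update_url, xml_body):
    """Prepare (but do not send) an HTTP POST carrying the XML body."""
    return urllib.request.Request(
        update_url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


xml_body = build_add_xml({
    "id": 1,
    "net": "In the beginning God created the heavens and the earth.",
})
request = build_update_request("http://localhost:8080/solr/bible/update", xml_body)
print(xml_body)

# To actually send it to a running Solr instance (assumed, not started here):
# urllib.request.urlopen(request)
```

A separate commit message is normally posted to the same handler before newly added documents become searchable.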
    • 60. PHP API
      • No XML!
          $client = new SolrClient($options);
          $doc = new SolrInputDocument();
          $doc->addField('id', 1); // Must be an integer
          $doc->addField('net', 'In the beginning God created the heavens and the earth.');
          $client->addDocument($doc);
    • 66. Querying Solr
      • HTTP GET Request
      • http://localhost:8080/solr/bible3/select?q=god
      • URL parts: path to Solr (http://localhost:8080/solr), core (bible3), handler (select), query (q=god)
      • Returns XML By Default
      • Can return JSON and more
    • 73. Querying Solr
      • Queries the defaultSearchField by default
        • <defaultSearchField>all_index</defaultSearchField>
      • Can query other fields by using the syntax field:value
        • http://localhost:8080/solr/bible3/select?q=id:27974
      • Multiple queries / Booleans
        • http://localhost:8080/solr/bible3/select?q=god AND book:40
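When these query URLs are built in code rather than typed into a browser, the spaces and colons in the q parameter need to be URL-encoded. Below is a minimal sketch using Python's standard library. The select_url helper is made up for this example, and the host, port and core name are taken from the slides.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8080/solr"  # path to Solr, as used in the slides


def select_url(core, query, **params):
    """Build a URL for the select handler of one core, encoding the query."""
    query_string = urlencode({"q": query, **params})
    return f"{SOLR_BASE}/{core}/select?{query_string}"


# Default search field
print(select_url("bible3", "god"))
# A specific field, using field:value
print(select_url("bible3", "id:27974"))
# A Boolean query, asking for JSON instead of the default XML
print(select_url("bible3", "god AND book:40", wt="json"))
```

urlencode turns the space into "+" and the colon into "%3A", which Solr decodes back into the original query text.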
    • 79. Search Multiple Translations (Fields)
      • Let’s add some fields: kjv and kjv_index
      • Add some copy field directives:
          <copyField source="kjv" dest="all_index" />
          <copyField source="kjv" dest="kjv_index" />
      • Query: “Shew Thyself”
        • 0 Results in the NET http://localhost:8080/solr/bible3/select?q=shew%20theyself
        • 360 Results in the Combined index/field http://localhost:8080/solr/bible4/select?q=shew%20theyself
    • 85. Search Multiple Translations
      • + Quasi Synonym term/phrase injection
      • + Less variation across translations leads to stronger possible matches
      • + Matches verses when the source translation isn’t known
      • - No control over which translation gets more weight
      • - No control over scoring of matches
    • 86. Search Multiple Translations
      • Another way: Dismax
      • Can score a document (verse) match based on scores/matches from multiple fields.
      • net_index^1 kjv_index^1
        • Not exponents - weights
        • We’re searching the net_index and kjv_index fields, each with a boost/weight of 1.
      • net_index^6 kjv_index^.5
      • http://localhost:8080/solr/bible4/select?q=respect%20for%20god&defType=dismax&tie=.1&qf=net_index^1%20kjv_index^1&fl=score
      • http://localhost:8080/solr/bible4/select?q=respect%20for%20god&defType=dismax&tie=.1&qf=net_index^6%20kjv_index^.5&fl=score
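The effect of the qf weights and the tie parameter can be sketched with a simplified model of the Dismax combination: the best weighted field score wins, and the other fields contribute only a tie fraction of their scores. The per-field scores below are made-up numbers, and the function names are assumptions for this example. Real Solr scores also depend on query normalization and on how multi-term queries are combined, so this shows the idea only.

```python
def parse_qf(qf):
    """Parse a qf string such as 'net_index^6 kjv_index^.5' into field weights."""
    boosts = {}
    for part in qf.split():
        field, _, boost = part.partition("^")
        boosts[field] = float(boost) if boost else 1.0
    return boosts


def dismax_score(field_scores, qf, tie=0.1):
    """Combine per-field scores: best weighted score plus tie * the rest."""
    boosts = parse_qf(qf)
    weighted = [field_scores.get(field, 0.0) * boost for field, boost in boosts.items()]
    best = max(weighted)
    return best + tie * (sum(weighted) - best)


# Hypothetical raw scores for one verse in each translation's index field.
verse_scores = {"net_index": 0.5, "kjv_index": 2.0}

print(dismax_score(verse_scores, "net_index^1 kjv_index^1"))
print(dismax_score(verse_scores, "net_index^6 kjv_index^.5"))
```

With equal weights the KJV match dominates this verse's score. With net_index^6 and kjv_index^.5 the NET match dominates instead, which is the kind of control the single combined field cannot offer.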
    • 95. Scoring
      • $\mathrm{score}(q,d) = \mathrm{coord}(q,d) \cdot \mathrm{queryNorm}(q) \cdot \sum_{t \in q} \big( \mathrm{tf}(t \in d) \cdot \mathrm{idf}(t)^2 \cdot \mathrm{norm}(t,d) \big)$
      • Basic Factors
        • Term Frequency in a document (↑ is better)
        • Term Frequency in Corpus (↓ is Better)
        • Length of matching document (↓ is Better)
        • “Jesus Wept” - John 11:35
        • http://localhost:8080/solr/bible3/select?q=wept
      • http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html
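The three basic factors can be tried out on a toy corpus. The sketch below uses the formulas from Lucene's classic default similarity (square-root term frequency, logarithmic inverse document frequency, and a length norm of one over the square root of the field length). It leaves out queryNorm, which is the same for every document in a given query, and it ignores boosts. The corpus is made-up toy text, not quotations from any translation.

```python
import math


def tf(freq):
    """Term frequency factor: more occurrences in the document score higher."""
    return math.sqrt(freq)


def idf(doc_freq, num_docs):
    """Inverse document frequency: terms that are rare in the corpus score higher."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def length_norm(num_terms):
    """Length norm: shorter documents score higher."""
    return 1.0 / math.sqrt(num_terms)


def score(query_terms, doc_tokens, corpus):
    """Simplified score(q, d) = coord * sum of tf * idf^2 * norm over query terms."""
    num_docs = len(corpus)
    matched = 0
    total = 0.0
    for term in query_terms:
        freq = doc_tokens.count(term)
        if freq == 0:
            continue
        matched += 1
        doc_freq = sum(1 for doc in corpus if term in doc)
        total += tf(freq) * idf(doc_freq, num_docs) ** 2 * length_norm(len(doc_tokens))
    coord = matched / len(query_terms)  # fraction of query terms found in the document
    return coord * total


# Toy corpus (illustrative token lists only).
short_verse = ["jesus", "wept"]
long_verse = ["then", "he", "wept", "aloud", "before", "all", "the", "people"]
other_verse = ["in", "the", "beginning"]
corpus = [short_verse, long_verse, other_verse]

print(score(["wept"], short_verse, corpus))
print(score(["wept"], long_verse, corpus))
```

Both matching documents contain "wept" once, so the only difference is the length norm, and the two-word verse ranks first.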
    • 110. Topic Tagging
      • Use a topically-tagged Bible/concordance to mark up each verse, or just key verses
      • Helpful for “theme” based queries.
        • “Social Justice” - no good matches
        • “Satan” - Many Names
        • Name Tagging in general can be very helpful
    • 116. Searching Strong’s
      • Add a field for Strong’s: strongs_index
      • 1473 1510 2316 11 2316 2464 2532 2316 2384 1510 3756 2316 3498 235 2198
      • Most of the benefits of text searching
        • “Word” frequency
        • Document vs. corpus frequency of search terms
    • 122. Searching Articles
      • Similar approach to text-based queries
        • Stem words
        • Use Synonyms
        • Remove Stop Words
      • Without manual tagging, there’s no automatic way to index/search by Bible Reference
    • 126. Searching Articles
      • Article contains reference: “John 3”
      • User searches for “John 3:16” or “John 2-4”
      • Results: no meaningful matches at best (unless the documents match the query “John”)
    • 132. Searching Articles
      • Solr-based Solutions:
        • Identify and index references and their composite verses using a grammar.
        • John 1:1-3 -> John 1:1; John 1:2; John 1:3
        • Store in a multivalued field - each reference is a “term”
        • Must also parse and expand references in queries in order to match
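The expansion step can be sketched with a small parser. This example uses a single regular expression as a stand-in for a full reference grammar, and the function name expand_reference is made up for illustration. It handles only the "Book chapter:verse" and "Book chapter:verse-verse" forms. Chapter-only references such as "John 3" or "John 2-4" would also need a table of verse counts per chapter, which is not included here.

```python
import re

# Matches "John 3:16", "John 1:1-3" and "1 John 2:1-2" (book, chapter, verse range).
REFERENCE = re.compile(
    r"^\s*(?P<book>.+?)\s+(?P<chapter>\d+):(?P<first>\d+)(?:-(?P<last>\d+))?\s*$"
)


def expand_reference(reference):
    """Expand a verse or verse range into one string per verse."""
    match = REFERENCE.match(reference)
    if match is None:
        raise ValueError(f"Unsupported reference: {reference!r}")
    book = match["book"]
    chapter = int(match["chapter"])
    first = int(match["first"])
    last = int(match["last"] or first)
    if last < first:
        raise ValueError(f"Verse range runs backwards: {reference!r}")
    return [f"{book} {chapter}:{verse}" for verse in range(first, last + 1)]


# Each expanded reference would become one value of the multivalued field.
print(expand_reference("John 1:1-3"))
print(expand_reference("John 3:16"))
```

Running the same expansion on the user's query produces terms in the same form as the indexed values, so that a query for "John 1:2" can match an article that cites "John 1:1-3".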
    • 139. Searching Articles
      • Relational database-based solution:
        • Assign an id to every verse
        • Store: id, articleId, verseId
        • Parse user query to ids.
        • SELECT COUNT(id) WHERE verseId IN (ID_LIST) GROUP BY articleId
        • Higher count -> the article is more likely to be about that reference than articles with a lower count
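The relational approach can be tried end to end with an in-memory SQLite database. This is a sketch under assumptions: the table name article_verse, the sample rows and the helper rank_articles are made up for illustration, and the query adds the FROM clause and ordering that the slide's shorthand leaves out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per verse occurrence in an article.
conn.execute(
    "CREATE TABLE article_verse ("
    " id INTEGER PRIMARY KEY,"
    " articleId INTEGER NOT NULL,"
    " verseId INTEGER NOT NULL)"
)
# Made-up sample data: (articleId, verseId)
occurrences = [(10, 101), (10, 102), (10, 102), (20, 102), (30, 500)]
conn.executemany(
    "INSERT INTO article_verse (articleId, verseId) VALUES (?, ?)", occurrences
)


def rank_articles(connection, verse_ids):
    """Count verse occurrences per article for the verse ids parsed from the query."""
    placeholders = ", ".join("?" for _ in verse_ids)
    sql = (
        "SELECT articleId, COUNT(id) AS hits FROM article_verse "
        f"WHERE verseId IN ({placeholders}) "
        "GROUP BY articleId ORDER BY hits DESC, articleId"
    )
    return connection.execute(sql, list(verse_ids)).fetchall()


# Suppose the user's reference was parsed to verse ids 101, 102 and 103.
ranking = rank_articles(conn, [101, 102, 103])
print(ranking)
```

Article 10 cites the requested verses three times and article 20 once, so article 10 is ranked first. Article 30 cites none of them and is absent from the result.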
    • 147. Searching Articles
      • Relational database-based solution:
        • Large number of rows.
        • 15,000 Journal articles have > 9,000,000 rows (verse occurrences)
        • Can store id, articleId, verseId, count
        • Then SUM() the counts for each articleId.
        • Negligibly faster.
        • Only approx. 3,000,000 rows
    • 153. Heterogeneous Indexes
      • Not all content is created equal.
      • Content quality and its effect on the quality of your results become a factor when you move from one resource to more than one
        • One Bible, One website, One Journal
      • Apply a field or document boost to help normalize results
      • Some content gets bumped up and some down
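The effect of a boost on a mixed result list can be illustrated with a small re-ranking sketch. The source names, boost values and raw scores below are all made-up assumptions. In Solr itself the boost would be applied at index or query time rather than in application code, so this only shows how a multiplier moves some content up and some down.

```python
# Hypothetical per-source boosts; the values would be tuned for a real collection.
SOURCE_BOOSTS = {"bible": 2.0, "journal": 1.0, "web": 0.5}


def apply_boosts(results, boosts=SOURCE_BOOSTS):
    """Multiply each raw score by its source's boost and sort best-first."""
    boosted = [
        (doc_id, source, raw_score * boosts.get(source, 1.0))
        for doc_id, source, raw_score in results
    ]
    return sorted(boosted, key=lambda result: result[2], reverse=True)


# Made-up raw results: (document id, source, raw score)
raw_results = [
    ("web-1", "web", 3.0),
    ("bible-1", "bible", 1.2),
    ("journal-1", "journal", 2.0),
]

for doc_id, source, boosted_score in apply_boosts(raw_results):
    print(doc_id, source, boosted_score)
```

Before boosting, the web page ranks first on raw score. After boosting, the Bible verse moves to the top and the web page drops to the bottom.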