Lessons from the Field: Common Mistakes and Things Overlooked When Deploying a Search Application Using the Lucidworks Fusion Stack - Josh Goldstein & Michael Hunn, Lucidworks
1. Lessons From the Field
Josh Goldstein and Michael Hunn
Solutions Architects, Lucidworks
@joshsgoldstein and @calispqr
#Activate18 #ActivateSearch
2. Agenda
• What do the numbers tell you?
• Getting to know your Data
• The User Experience
• Countdown to Launch
• Q & A
8. Get to Know Your Data: Facets
• How many facets per page?
• How does my data need to be set up?
• How do I choose my facet type?
• Single Select
• Multi Select
• Visualization
• Range
• Search within
12. The User Experience: Typeahead
• What content to use?
• Value vs Speed?
• Deciding what experience you want for the user
• Where would you like the user to go?
13. The User Experience: Branding
• Navigation/Page Types
• Dashboard, Result, Detail Page
• What do I need to properly brand?
• Can I use templates?
21. Thank you!
Josh Goldstein and Michael Hunn
Solutions Architects, Lucidworks
@joshsgoldstein and @calispqr
#Activate18 #ActivateSearch
Editor's Notes
This is all in the context of a simple movie search app.
Architecture:
How we get to the right infrastructure for your application
Fusion
App Studio
How we deploy
Get to know your data
Index Data (Michael)
Query Pipelines (Michael)
Using Facets (Josh and Michael)
Visualization Decision (Josh)
The User Experience:
Sign on experience
Typeahead
Branding
Getting Ready for Production
Performance on the query side
Performance on the app studio side
The WAR
Introduce the topics we will be covering
Architecture:
How we get to the right infrastructure for your application
Fusion
App Studio
How we deploy
Fusion/Solr
Sizing
SOLR
1 shard 1 server N Documents for a baseline
QPS and DPS
CPU
MEMORY
Disk I/O
Network Latency
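The baseline idea above (one shard on one server, loaded with N documents) can be turned into a rough sizing calculation. The sketch below is only an illustration under stated assumptions, not a Lucidworks formula: docs_per_shard_baseline is the document count at which your own single-shard test still met its targets, qps_per_replica_baseline is the query rate that same test sustained, and both function names are hypothetical.

```python
import math


def estimate_shards(total_docs: int, docs_per_shard_baseline: int) -> int:
    """Estimate shard count from a measured single-shard baseline."""
    if total_docs < 0 or docs_per_shard_baseline <= 0:
        raise ValueError("document counts must be positive")
    return max(1, math.ceil(total_docs / docs_per_shard_baseline))


def estimate_replicas(target_qps: float, qps_per_replica_baseline: float) -> int:
    """Estimate replicas per shard, assuming query load spreads evenly."""
    if target_qps < 0 or qps_per_replica_baseline <= 0:
        raise ValueError("QPS values must be positive")
    return max(1, math.ceil(target_qps / qps_per_replica_baseline))


if __name__ == "__main__":
    # Assumed numbers for illustration only.
    print(estimate_shards(250_000_000, 100_000_000))
    print(estimate_replicas(450, 200))
```

Treat the result as a starting point for experiments; CPU, memory, disk I/O, and network latency on real hardware decide the final numbers.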
Configuring Datasource
For large datasets the best practice would be to distribute the crawl across multiple datasource configurations in order to increase performance and decrease crawl completion time
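One way to picture distributing a crawl is to split the start paths evenly among several datasource configurations. This is a minimal sketch with hypothetical names (partition_crawl and the example paths are not part of Fusion); it only shows the partitioning step, and each resulting group would still have to be configured as its own datasource.

```python
def partition_crawl(start_paths: list[str], num_datasources: int) -> list[list[str]]:
    """Round-robin start paths into at most num_datasources groups."""
    if num_datasources < 1:
        raise ValueError("need at least one datasource")
    buckets: list[list[str]] = [[] for _ in range(num_datasources)]
    for index, path in enumerate(sorted(start_paths)):
        buckets[index % num_datasources].append(path)
    # Drop empty groups when there are fewer paths than datasources.
    return [bucket for bucket in buckets if bucket]


if __name__ == "__main__":
    paths = ["/movies/a", "/movies/b", "/movies/c", "/movies/d", "/movies/e"]
    for number, group in enumerate(partition_crawl(paths, 2), start=1):
        print(f"datasource {number}: {group}")
```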
App Studio App
How many users?
Determines the number of nodes you need
What does it mean when we say complexity?
One App Studio server can confidently handle a couple of thousand concurrent sessions. That means that at any given moment, 1,000 to 2,000 users can be using the app.
What you put into a collection determines what is going to come out and could have an effect on query performance.
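The per-server figure in the note above gives a simple way to estimate node count from expected concurrency. The sketch below assumes the conservative end of that range (1000 sessions per node) as a default; the function name and the default are assumptions to be replaced by your own load-test results.

```python
import math


def app_nodes_needed(peak_concurrent_users: int, sessions_per_node: int = 1000) -> int:
    """Estimate App Studio nodes for a given peak of concurrent sessions."""
    if peak_concurrent_users < 0 or sessions_per_node <= 0:
        raise ValueError("inputs must be positive")
    return max(1, math.ceil(peak_concurrent_users / sessions_per_node))


if __name__ == "__main__":
    print(app_nodes_needed(4500))                          # conservative estimate
    print(app_nodes_needed(4500, sessions_per_node=2000))  # optimistic estimate
```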
Solr field mapping stage for development
By the time you go to production you should know your exact fields and be able to explicitly define your schema
Use the Solr Dynamic Field Mapping Stage during development but it should
How Many Facets per page? (Josh)
Depends on use case
Performance as number of facets grow
Also keep in mind real estate on the page
Collapsed vs expanded
Ecommerce
Depending on the complexity of the products you sell, anywhere from 5 to 10
Enterprise Search
To save page real estate, limit to 5
What does my data need to look like? (Michael)
String, date, and number fields for the actual faceting
Text Fields for search
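As an illustration of an explicit schema for the movie app, the sketch below builds a request body in the shape accepted by Solr's Schema API add-field command. The field names are hypothetical, and the field types (text_general, string, pdate, pint) are those of Solr's default configset, so they are an assumption about your collection. The code only builds the JSON; it does not send it.

```python
import json

# Hypothetical fields: (name, Solr field type, used for faceting?)
MOVIE_FIELDS = [
    ("title", "text_general", False),
    ("overview", "text_general", False),
    ("genre", "string", True),
    ("release_date", "pdate", True),
    ("runtime_minutes", "pint", True),
]


def build_add_field_payload(fields):
    """Build a Schema API body that defines every field explicitly."""
    commands = []
    for name, field_type, facetable in fields:
        commands.append({
            "name": name,
            "type": field_type,
            "indexed": True,
            "stored": True,
            # docValues only on the string/date/number fields used for facets.
            "docValues": facetable,
        })
    return {"add-field": commands}


if __name__ == "__main__":
    print(json.dumps(build_add_field_payload(MOVIE_FIELDS), indent=2))
```

Text fields carry the search, while the string, date, and number fields carry the facets, which matches the split described in the notes.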
How do I choose my Facet Type?
Single Select
Multi Select
Visualization
For large data sets (time series, say), you can also leverage the built-in capability to have each chart efficiently fetch its own data.
Range
Search within
Performance impact
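To make two of the facet types above concrete, here is a sketch of the Solr request parameters for a multi-select facet and a range facet. It uses Solr's tag/exclude local-params syntax so that a multi-select facet keeps showing counts for unselected values. The field names genre and year, the tag name, and the range bounds are assumptions for the movie example; a Fusion query pipeline may build these parameters for you.

```python
from urllib.parse import urlencode


def facet_params(query, selected_genres):
    """Build Solr params for a multi-select genre facet and a year range facet."""
    params = [
        ("q", query),
        ("facet", "true"),
        # Exclude the tagged filter so other genre counts stay visible.
        ("facet.field", "{!ex=genre_tag}genre"),
        ("facet.range", "year"),
        ("facet.range.start", "1950"),
        ("facet.range.end", "2020"),
        ("facet.range.gap", "10"),
    ]
    if selected_genres:
        quoted = " OR ".join('"%s"' % g.replace('"', '\\"') for g in selected_genres)
        params.append(("fq", "{!tag=genre_tag}genre:(%s)" % quoted))
    return params


if __name__ == "__main__":
    print(urlencode(facet_params("star wars", ["Comedy", "Sci-Fi"])))
```

A single-select facet would use the same filter query without the tag and exclusion, so the remaining counts narrow with each selection.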
The User Experience:
Sign on experience
Typeahead
Branding
Start with the question: "Raise your hands if you like signing in to things."
That’s what I thought
The best experience is no experience; you should never even have to see
If you have SAML, use it
Talk about anonymous vs. signed-in access: you have the option to sign in.
It needs to be fast
What content should we use?
Titles
Queries
Recommendations
More queries like this
Tradeoff between value vs speed
Grouping vs non grouping
Example: title and recent queries
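The grouped example (titles and recent queries) can be sketched as a small in-memory lookup. This is a toy illustration with hypothetical names and data; a production typeahead would query an index rather than scan lists, and the per_group cap is the knob that trades value against speed.

```python
def typeahead(prefix, titles, recent_queries, per_group=3):
    """Return prefix matches grouped by suggestion type."""
    needle = prefix.strip().lower()
    if not needle:
        return {}
    groups = {}
    for label, candidates in (("Titles", titles), ("Recent queries", recent_queries)):
        matches = [c for c in candidates if c.lower().startswith(needle)][:per_group]
        if matches:
            groups[label] = matches
    return groups


if __name__ == "__main__":
    titles = ["Star Wars", "Stand by Me", "Alien"]
    recent = ["star trek", "alien"]
    print(typeahead("st", titles, recent))
```

A non-grouped variant would simply merge the two lists and return the top few matches, which is simpler to render but gives the user less context about where a suggestion leads.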
Enterprise Search Use Case:
Past queries only
what would you like to see
Helping the user frame their search rather than you get the user to his e
Where you would like the user to go
Search vs going to end result
When to group?
go beyond your defaults
Load balancing
QPS
High usage threshold
ulimit defaults
perform experiments
A number of things we can do to increase the performance of
Testing
Open endpoints
Proper fields being returned
Reduce log level
Encrypt passwords
AppKit Pre-flight Checklist
Environment
Application
User Experience
Performance
Environment
Ensure appropriate memory allocation is in place.
Ensure that the search engine is not directly available to the internet but suitably protected with only the AppKit application able to access it.
Ensure that the application server is configured to support UTF-8 characters in GET requests.
The application server should be using the Oracle Java Runtime.
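The UTF-8 item in the Environment list is easy to verify with a non-ASCII movie title. The sketch below shows what a correctly encoded GET query string looks like and what the server would see if it decoded the same bytes as ISO-8859-1 instead. The /search path and function name are hypothetical.

```python
from urllib.parse import unquote, urlencode


def search_url(base, query):
    """Build a GET URL with the query percent-encoded as UTF-8."""
    return base + "?" + urlencode({"q": query}, encoding="utf-8")


if __name__ == "__main__":
    url = search_url("/search", "Amélie")
    encoded = url.split("q=", 1)[1]
    print(url)
    print(unquote(encoded, encoding="utf-8"))    # what a UTF-8 server decodes
    print(unquote(encoded, encoding="latin-1"))  # what a misconfigured server decodes
```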
Application
Remove all commented out code snippets.
Remove all pages and configuration that are not in use.
Load test for the estimated maximum number of users with an acceptable buffer for peak load.
Ensure that the application has a production license and not a development or evaluation one.
If field names are unconventional use platform configuration aliasing to refer to them in the user interface. This allows you to change field names in the index without affecting the view layer.
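The aliasing idea can be shown with a plain mapping between view-layer names and index field names. This is not AppKit's configuration syntax; the alias table, field names, and helper functions are all hypothetical and only illustrate why the view layer stays stable when the index changes.

```python
# View-layer alias -> index field name (hypothetical names).
FIELD_ALIASES = {
    "title": "mv_ttl_s",
    "genre": "mv_gnr_ss",
}


def resolve(alias):
    """Translate a view-layer name into the index field name."""
    return FIELD_ALIASES.get(alias, alias)


def to_view(doc):
    """Rename index fields in a result document to their view-layer aliases."""
    reverse = {field: alias for alias, field in FIELD_ALIASES.items()}
    return {reverse.get(field, field): value for field, value in doc.items()}


if __name__ == "__main__":
    print(resolve("title"))
    print(to_view({"mv_ttl_s": "Alien", "mv_gnr_ss": ["Horror"], "id": "m1"}))
```

If the index field is later renamed, only the alias table changes; templates that refer to title and genre are untouched.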
User Experience
Keyword search should work from any page and redirect via action as appropriate.
All metadata that makes sense for filtering or discovery should be clickable and linked to the appropriate search page.
Empty facet values should be replaced with a more appropriate label.
Breadcrumbs should indicate which filters have been applied to the user’s query.
Use spelling suggestions when supported by the platform to automatically correct misspellings or suggest alternatives.
Performance
Only request fields and facets from the search engine that are being displayed on the page.
When using bookmarks or other social widgets in a result list, make sure to use the social response processor whenever possible. This processor loads all the bookmarks, etc. in one database hit. Without this, each bookmark tag has no knowledge of the others in the result list, and will make a separate database query. This results in N queries to the database, where N is the number of results.
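The difference between N queries and one database hit can be demonstrated with a small in-memory SQLite table. This sketch does not show the social response processor itself; the table layout, column names, and function names are assumptions chosen to illustrate the batching pattern it relies on.

```python
import sqlite3


def make_db():
    """Create a throwaway bookmarks table with two rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bookmarks (doc_id TEXT, owner TEXT)")
    conn.executemany("INSERT INTO bookmarks VALUES (?, ?)",
                     [("m1", "ann"), ("m3", "ann")])
    return conn


def bookmarked_one_by_one(conn, owner, doc_ids):
    """One query per result: N round trips for N results."""
    found = set()
    for doc_id in doc_ids:
        row = conn.execute(
            "SELECT 1 FROM bookmarks WHERE owner = ? AND doc_id = ?",
            (owner, doc_id)).fetchone()
        if row:
            found.add(doc_id)
    return found


def bookmarked_batched(conn, owner, doc_ids):
    """A single query for the whole result list."""
    if not doc_ids:
        return set()
    placeholders = ",".join("?" for _ in doc_ids)
    rows = conn.execute(
        f"SELECT doc_id FROM bookmarks WHERE owner = ? AND doc_id IN ({placeholders})",
        (owner, *doc_ids))
    return {row[0] for row in rows}


if __name__ == "__main__":
    conn = make_db()
    print(bookmarked_batched(conn, "ann", ["m1", "m2", "m3"]))
```

Both functions return the same set; the batched version simply does it in one round trip regardless of page size.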
Detail & Topic Pages
The detail page should have a link back to the previous search page when appropriate.
Use short sluggified identifiers when possible for topic pages.
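A short slug for a topic page can be derived from a title roughly as follows. This is a generic sketch, not AppKit's own slug routine; the function name and the 40-character limit are assumptions.

```python
import re
import unicodedata


def slugify(title, max_length=40):
    """Lowercase, strip accents, and join words with hyphens."""
    ascii_text = (unicodedata.normalize("NFKD", title)
                  .encode("ascii", "ignore")
                  .decode("ascii"))
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
    # Trim to the limit without leaving a trailing hyphen.
    return slug[:max_length].rstrip("-")


if __name__ == "__main__":
    print(slugify("The Good, the Bad and the Ugly"))
    print(slugify("Amélie (2001)"))
```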