Do Some SOLR Searching in .NET
DAVID HOERSTER
About Me
C# MVP (Since April 2011)
Director of Web Solutions at RGP
Co-Founder of BrainCredits (braincredits.com)
Conference Director for Pittsburgh TechFest
Past President of Pittsburgh .NET Users Group
Organizer of recent Pittsburgh Code Camps and other Tech Events
Twitter - @DavidHoerster
Blog – http://geekswithblogs.net/DavidHoerster
Email – david@agileways.com
Take Aways
What is SOLR
When You May Use SOLR
How to Integrate SOLR in a .NET Application
Strategies for Managing RDBMS and SOLR transactions
Agenda
Searching in Apps
Hello SOLR
Installing and Running SOLR
Admin Interface
Using SOLR in .NET
◦ Retrieving Data
◦ Modifying Collections
◦ Interesting Features
◦ Highlighting, Snippets, Facets
Searching in Applications
How do we accomplish these?
◦ Stored Procs?
◦ Bunch of LIKE’s?
◦ SQL Server Full-Text?
◦ Something else?
Lots of solutions
SOLR could be one
SOLR
Open Source
Search Service Platform
Built on Lucene
Provides a number of features, such as
◦ Full Text Indexing
◦ Hit Highlighting
◦ Faceted Searches
◦ Clustering and Replication
HTTP REST-like interface, providing results in JSON, XML, CSV, and other formats
Written in Java and runs within the JVM
Why Use SOLR?
Small application or prototype environment
Mixed environment or maybe non-SQL Server environment
NoSQL usage that doesn’t have full-text indexing
Features required such as faceted search, highlighting, more-like-this
Extensible search features and data types
SOLR Deployment (Basic)
[Diagram: Client, Application (e.g. web server – port 80), SOLR Service (port 8983); connection label: HTTP]
Application may or may not connect directly with SOLR
SOLR Service runs within JVM
Usually not best to publicly expose SOLR
Some Things to Remember
SOLR does not have authentication built in
◦ Treat it as a service to your application
◦ Do not expose externally unless you want the world to search
SOLR is not a document database in the league of MongoDB
◦ Some NoSQL features
◦ Flat structures (MongoDB has some depth)
◦ Some examples use SOLR like a DB…
◦ More for expedience and simplicity
◦ Not a recommendation
My Implementation of SOLR
[Diagram components: Web Client, Web Server (PHP), SOLR Instance, Content Database (postgreSQL / SQL, SOLR Indexer (.NET), GIT Repository, Remote Repo]
[Diagram arrow labels: Fetch, Get, Create / Update, Search, Get, HTTP]
[Diagram boundary: Internal Network | Public Internet]
Installing SOLR
Very simple to quickly get up and running
Assumes you have JRE installed
Download SOLR from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html?
Extract ZIP file to a directory of your choice
◦ I chose C:\SOLR as my SOLR root
From a command prompt, navigate to the examples directory and start the Jetty server
◦ cd c:\solr\4.4.0\examples
◦ java -jar start.jar
That’s it – SOLR is ready to go!
Default “collection1” core is set up (but you’ll probably want to delete it)
SOLR Administration Interface
Admin UI available out of the box
Check status
Add/Remove Cores
Issue Queries
Check Logs
Modify Schemas
Lots more!
SOLR Collections, Schemas and Documents
Collection is a group of similar items
◦ Like a table in SQL
Document is a single item in a collection
◦ Defines an item to be searched
◦ Contains fields
◦ Document is like a SQL row
Fields are individual properties of a document
◦ Like a SQL column
◦ Has a type and a value
Schema defines the structure of documents in a collection
◦ Defines fields, types, keys, dynamic fields and copy rules
Schema basic structure:
<schema>
    <types>
    <fields>
    <uniqueKey>
    <defaultSearchField>
    <solrQueryParser defaultOperator>
    <copyField>
</schema>
Document Fields
Area in schema most likely to alter
Various data types available built-in
◦ int, float, string, date, …
Fields have a number of properties
◦ can be single or multi-valued
◦ fields like ‘text’ are great for concatenating fields together for aggregated searching
◦ you can choose to index a field, store the field value, or both
<field name="lahmanId" type="int" indexed="true" stored="true" required="true"
multiValued="false" />
Querying
Let’s Get Some Data!!
SOLR is based on Inverted Index concept
◦ Instead of IDs mapped to entries, words are mapped to IDs.
◦ Analyzers break text into terms; a query looks those terms up in the inverted index and matching documents are scored for relevance
Admin UI provides a quick and dirty interface to retrieve data
Most query options available
Can also specify format
Once parameters issued, URL is available as reference
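The inverted index idea above can be sketched in a few lines of plain C#. This is a toy illustration only, not how Lucene or SOLR store data: the class name, the sample documents, and the crude lower-case-and-split "analyzer" are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy inverted index: maps each lower-cased word to the IDs of the documents containing it.
// Illustrative only; Lucene/SOLR add analysis chains, scoring, and on-disk structures.
public static class InvertedIndexSketch
{
    public static Dictionary<string, SortedSet<int>> Build(IDictionary<int, string> docs)
    {
        var index = new Dictionary<string, SortedSet<int>>();
        foreach (var doc in docs)
        {
            // Crude stand-in for an analyzer: lower-case, then split on anything
            // that is not a letter or digit.
            var normalized = new string(doc.Value.ToLowerInvariant()
                .Select(c => char.IsLetterOrDigit(c) ? c : ' ')
                .ToArray());
            var words = normalized.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

            foreach (var word in words)
            {
                if (!index.TryGetValue(word, out var ids))
                {
                    ids = new SortedSet<int>();
                    index[word] = ids;
                }
                ids.Add(doc.Key); // word -> document ID, the reverse of a normal lookup
            }
        }
        return index;
    }

    public static void Main()
    {
        // Hypothetical sample documents keyed by ID.
        var docs = new Dictionary<int, string>
        {
            { 1, "Four score and seven years ago" },
            { 2, "Seven days without search" },
            { 3, "Search is a feature" }
        };

        var index = Build(docs);
        Console.WriteLine(string.Join(",", index["seven"]));  // prints 1,2
        Console.WriteLine(string.Join(",", index["search"])); // prints 2,3
    }
}
```

Looking up a word is a single dictionary hit that returns the matching document IDs, which is why term searches stay fast as the collection grows.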
Querying
Basic parameter is `q`
◦ http://localhost:8983/solr/<collection>/select?q=<field>:<value>
Other basic parameters include:
◦ Field List (fl) – selects the fields to return
◦ Sorting (sort) – specifies the fields to sort on and direction
◦ Row Offset (start) – which row to start with when returning results (default is 0)
◦ Caching (cache) – tells SOLR whether to cache the results (default is true); set as a local parameter, e.g. fq={!cache=false}<field>:<value>
◦ Rows to return (rows) – how many rows to return in the call (default is 10)
Apart from cache, these are all query string parameters.
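Because the parameters travel on the query string, a request can be assembled without any client library. The sketch below assumes the standard /select request handler and uses the historicalQuotes core from later in the deck. The helper name BuildSelectUrl and the sample query value title:liberty are made up for illustration; wt is the SOLR parameter that selects the response format.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SolrUrlSketch
{
    // Hypothetical helper: builds a /select URL from a core URL and query string parameters.
    public static string BuildSelectUrl(
        string coreUrl,
        IEnumerable<KeyValuePair<string, string>> parameters)
    {
        // Escape keys and values so characters such as ':' and spaces survive the trip.
        var query = string.Join("&", parameters.Select(p =>
            Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));
        return coreUrl.TrimEnd('/') + "/select?" + query;
    }

    public static void Main()
    {
        var url = BuildSelectUrl("http://localhost:8983/solr/historicalQuotes", new[]
        {
            new KeyValuePair<string, string>("q", "title:liberty"), // sample query, assumed
            new KeyValuePair<string, string>("sort", "year desc"),
            new KeyValuePair<string, string>("start", "0"),
            new KeyValuePair<string, string>("rows", "10"),
            new KeyValuePair<string, string>("wt", "json")
        });

        // prints http://localhost:8983/solr/historicalQuotes/select?q=title%3Aliberty&sort=year%20desc&start=0&rows=10&wt=json
        Console.WriteLine(url);
    }
}
```

The Admin UI shows an equivalent URL after you run a query there, so it is a handy way to check what a hand-built request should look like.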
Demo
USING THE ADMIN INTERFACE
Working with SOLR in .NET
solrnet library
◦ https://code.google.com/p/solrnet/
◦ Source: https://github.com/mausch/SolrNet/tree/master/SolrNet
WARNING: If you’re using SOLR 4+
◦ Committing in solrnet will throw an error
◦ Need to download latest code from GitHub and compile
◦ Or download a package’s code and remove the initialization of the waitFlush property from solr/commands/parameters/CommitOptions.cs
Set Up Typed Entities
using System;
using SolrNet.Attributes;

// Maps a SOLR document to a typed .NET entity; attribute names are the SOLR field names
public class Quote {
    [SolrUniqueKey("id")]
    public String Id { get; set; }
    [SolrField("title")]
    public String Title { get; set; }
    [SolrField("articleBody")]
    public String ArticleBody { get; set; }
    [SolrField("year")]
    public Int32 Year { get; set; }
    [SolrField("abstract")]
    public String Abstract { get; set; }
    [SolrField("source")]
    public String Source { get; set; }
}
Initializing solrnet
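// Requires: using SolrNet; using Microsoft.Practices.ServiceLocation;
// Register each mapped type once, at application start-up
Startup.Init<Quote>("http://localhost:8983/solr/historicalQuotes");
Startup.Init<Hitter>("http://localhost:8983/solr/baseball");
// Resolve the operations interface for a mapped type
ISolrOperations<Quote> _solr =
    ServiceLocator.Current.GetInstance<ISolrOperations<Quote>>();
Uses Microsoft p&p’s ServiceLocator class to get SOLR instance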
Issuing a Query
Basic query, as it selects everything:
var quotes = _solr.Query(new SolrQuery("*:*"));
Returns just those records with an id of 12345:
var quotes = _solr.Query(new SolrQuery("id:12345"));
Searches for specific text, and only returns 3 fields:
var query = new SolrQuery("text:" + searchText);
var options = new QueryOptions() {
    Fields = new[] { "id", "title", "source" }
};
var results = _solr.Query(query, options);
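The same QueryOptions object also carries the paging and sorting parameters. The following is a sketch, not a definitive implementation. It assumes a SolrNet build from the era of this talk (with the Microsoft.Practices.ServiceLocation namespace), a SOLR instance running locally with the historicalQuotes core, a trimmed-down Quote entity, and a made-up search term.

```csharp
using System;
using Microsoft.Practices.ServiceLocation;
using SolrNet;
using SolrNet.Attributes;
using SolrNet.Commands.Parameters;

// Trimmed-down version of the Quote entity, for illustration only.
public class Quote
{
    [SolrUniqueKey("id")] public string Id { get; set; }
    [SolrField("title")] public string Title { get; set; }
    [SolrField("year")] public int Year { get; set; }
}

public static class PagedQuerySketch
{
    public static void Main()
    {
        // Assumes SOLR is listening locally and the core exists.
        Startup.Init<Quote>("http://localhost:8983/solr/historicalQuotes");
        var solr = ServiceLocator.Current.GetInstance<ISolrOperations<Quote>>();

        var options = new QueryOptions
        {
            Start = 20, // row offset: skip the first two pages of ten
            Rows = 10,  // page size
            OrderBy = new[] { new SortOrder("year", Order.DESC) }
        };

        // "liberty" is a hypothetical search term.
        var results = solr.Query(new SolrQuery("text:liberty"), options);

        // NumFound is the total match count; Count is the size of this page.
        Console.WriteLine("Showing {0} of {1} matches", results.Count, results.NumFound);
        foreach (var quote in results)
        {
            Console.WriteLine("{0} ({1})", quote.Title, quote.Year);
        }
    }
}
```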
Filter Queries
‘fq’ parameter
Runs the filter against the entire index and caches the results
Can help speed up searching if you know of common, recurring searches
In solrnet, use the FilterQueries QueryOption
_solr.Query("*:*", new QueryOptions {
    FilterQueries = new ISolrQuery[] {
        // Raw range query; by default SolrQueryByField quotes and escapes its value
        new SolrQuery("HR:[50 TO *]"),
        …
    }
});
Modifying Data in SOLR
Using the existing SOLR instance to perform an insert…
_solr.Add(theQuote);
_solr.Commit();
Use the same instance to perform an update…
_solr.Add(theQuote);
_solr.Commit();
_solr.Optimize();
Commit writes your changes to SOLR’s index
Optimize rebuilds the index
◦ More expensive
◦ Be mindful when called
Search Features (Query Options)
Highlighting
Highlight = new HighlightingParameters() {
    Fields = new[] { "articleBody", "abstract" },
    Fragsize = 200,
    AfterTerm = "</em></strong>",
    BeforeTerm = "<em><strong>",
    UsePhraseHighlighter = true
    //, AlternateField = "source"
}
More Like This
MoreLikeThis = new MoreLikeThisParameters(
    new[] { "articleBody", "source" })
    { MinDocFreq = 1, MinTermFreq = 1 }
Search Features (Query Options)
Faceted Search
Facet = new FacetParameters() {
    Queries = FacetQueryCategories(minHomeRuns)
}
private SolrFacetQuery[] FacetQueryCategories(Int32 minHomeRuns) {
    // Wrap the range query in a SolrFacetQuery so it matches the declared return type
    var salaryFacet1 = new SolrFacetQuery(
        new SolrQueryByRange<Int32>("salary", 0, 1000000));
    ...
    return new[] { salaryFacet1 };
}
Demo
USING SOLR FEATURES TO ENHANCE YOUR SEARCH EXPERIENCE
Handling the Distribution for Mods
[Diagram: Client, Server, SOLR, RDBMS]
Send the modification to the RDBMS and to SOLR and hope for the best.
Pretty optimistic!
Handling the Distribution for Mods
[Diagram: Client, Server, SOLR, RDBMS; labels: Check for error, Rollback if SOLR error]
Wrap the RDBMS call in a System.Transactions TransactionScope and roll back if SOLR throws an exception.
More cautious
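A minimal sketch of this more cautious approach, assuming the database provider enlists in the ambient transaction. IQuoteRepository, ISearchIndex, QuoteWriter, and the fake implementations are hypothetical names invented for the example; in a real application ISearchIndex would wrap the solrnet Add and Commit calls.

```csharp
using System;
using System.Transactions;

// Hypothetical abstractions, named here only for illustration.
public interface IQuoteRepository { void Save(string id, string title); }     // RDBMS write
public interface ISearchIndex { void AddAndCommit(string id, string title); } // SOLR write

public class QuoteWriter
{
    private readonly IQuoteRepository _db;
    private readonly ISearchIndex _index;

    public QuoteWriter(IQuoteRepository db, ISearchIndex index)
    {
        _db = db;
        _index = index;
    }

    // Returns true when both writes succeeded, false when the database work was rolled back.
    public bool Save(string id, string title)
    {
        try
        {
            using (var scope = new TransactionScope())
            {
                _db.Save(id, title);            // enlists in the ambient transaction if the provider supports it
                _index.AddAndCommit(id, title); // throws if SOLR rejects the update or is unreachable
                scope.Complete();               // only reached when both calls succeeded
            }
            return true;
        }
        catch (Exception)
        {
            // Leaving the using block without Complete() rolls the database work back.
            // A real application would log the exception rather than swallow it.
            return false;
        }
    }
}

// In-memory fakes so the sketch runs without a database or SOLR.
public class FakeDb : IQuoteRepository
{
    public void Save(string id, string title) { }
}

public class FailingIndex : ISearchIndex
{
    public void AddAndCommit(string id, string title)
    {
        throw new InvalidOperationException("SOLR unavailable");
    }
}

public static class Program
{
    public static void Main()
    {
        var writer = new QuoteWriter(new FakeDb(), new FailingIndex());
        Console.WriteLine(writer.Save("1", "Gettysburg Address")); // prints False
    }
}
```

SOLR itself does not take part in the transaction. If the database commit fails after SOLR has already accepted the document, the two stores can still drift apart, which is part of the motivation for the queue-based approach on the next slide.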
Handling the Distribution for Mods
[Diagram: Client, Server, Queue, Command Handler, SOLR, RDBMS; label: Persist Command]
Drop a command into a queue for a Command Handler to pick up.
Command Handler/Domain processes and raises Event which can end up in SOLR.
More complicated, but more reliable.
More Message Oriented (CQRS???)
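The queue-based flow can be sketched with in-memory stand-ins. Everything here is an assumption made for illustration: the command and event types, the BlockingCollection in place of a durable queue, and the dictionaries in place of the RDBMS and SOLR. A production version would use a persistent queue and real stores.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical command and event types for illustration.
public class SaveQuoteCommand { public string Id; public string Title; }
public class QuoteSavedEvent { public string Id; public string Title; }

public static class QueueSketch
{
    public static void Main()
    {
        var commands = new BlockingCollection<SaveQuoteCommand>(); // stands in for a durable queue
        var database = new Dictionary<string, string>();           // stands in for the RDBMS
        var searchIndex = new Dictionary<string, string>();        // stands in for SOLR

        // Event subscriber: projects the event into the search index.
        Action<QuoteSavedEvent> onQuoteSaved = e => searchIndex[e.Id] = e.Title;

        // Command handler: persists the command's data, then raises the event.
        var handler = Task.Run(() =>
        {
            foreach (var cmd in commands.GetConsumingEnumerable())
            {
                database[cmd.Id] = cmd.Title;
                onQuoteSaved(new QuoteSavedEvent { Id = cmd.Id, Title = cmd.Title });
            }
        });

        // The server only enqueues the command and returns to the client.
        commands.Add(new SaveQuoteCommand { Id = "1", Title = "Gettysburg Address" });

        commands.CompleteAdding(); // no more commands; lets the handler loop finish
        handler.Wait();

        Console.WriteLine(searchIndex["1"]); // prints Gettysburg Address
    }
}
```

The server's only write is to the queue. With a durable queue, a SOLR outage delays indexing instead of failing the client's request, and the handler can retry the command later.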
SOLR as a Windows Service
NSSM can install SOLR quickly
◦ Non Sucking Service Manager
◦ http://nssm.cc/
◦ Version 2.16
◦ Hasn’t been updated in a little while
Launch NSSM as administrator
◦ nssm install SOLR
Java.exe is the executable
Command Line args are (specific to my install directory):
◦ -Djetty.logs=C:/solr/logs/ -Djetty.home=C:/solr/ -Dsolr.solr.home=C:/solr/solr/ -cp C:/solr/lib/*.jar;C:/solr/start.jar -jar C:/solr/start.jar
Name the service and hit install. Done!
What’s Next
Other query techniques
◦ Boosting
◦ http://localhost:8983/solr/historicalQuotes/select/?defType=dismax&q=text&qf=source^20.0+text^0.3
◦ Spatial
◦ Sounds like
SOLR Cloud
◦ SOLR replication and sharding
◦ Moving to the enterprise space
Extending SOLR Behaviors and Using Other Parsers
Using Dynamic Properties
Using SOLR in a full NoSQL Environment
Resources
SOLR Reference Guide
https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide
SOLR Tutorial
http://lucene.apache.org/solr/4_4_0/tutorial.html
Nice SOLR Walk-Through
http://www.solrtutorial.com/
Books
Apache Solr 4 Cookbook (Packt Publishing)
Apache Solr 4 In Action (Manning – MEAP)

Do Some Solr Searching

  • 1. Do Some SOLR Searching in .NET DAVID HOERSTER
  • 2. About Me C# MVP (Since April 2011) Director of Web Solutions at RGP Co-Founder of BrainCredits (braincredits.com) Conference Director for Pittsburgh TechFest Past President of Pittsburgh .NET Users Group Organizer of recent Pittsburgh Code Camps and other Tech Events Twitter - @DavidHoerster Blog – http://geekswithblogs.net/DavidHoerster Email – david@agileways.com
  • 3. Take Aways What is SOLR When You May Use SOLR How to Integrate SOLR in a .NET Application Strategies for Managing RDBMS and SOLR transactions
  • 4. Agenda Searching in Apps Hello SOLR Installing and Running SOLR Admin Interface Using SOLR in .NET ◦ Retrieving Data ◦ Modifying Collections ◦ Interesting Features ◦ Highlighting, Snippets, Facets
  • 6. Searching in Applications How do we accomplish these? ◦ Stored Procs? ◦ Bunch of LIKE’s? ◦ SQL Server Full-Text? ◦ Something else? Lots of solutions SOLR could be one
  • 7. SOLR Open Source Search Service Platform Built on Lucene Provides a number of features, such as ◦ Full Text Indexing ◦ Hit Highlighting ◦ Faceted Searches ◦ Clustering and Replication HTTP REST-like interface, providing results in JSON, XML, CSV, and other formats Written in Java and runs within the JVM
  • 8. Why Use SOLR? Small application or prototype environment Mixed environment or maybe non-SQL Server environment NoSQL usage that doesn’t have full-text indexing Features required such as faceted search, highlighting, more-like-this Extensible search features and data types
  • 9. SOLR Deployment (Basic) Application (e.g. web server – port 80) SOLR Service (port 8983) Client Application may or may not connect directly with SOLR SOLR Service runs within JVM Usually not best to publicly expose SOLR HTTP
  • 10. Some Things to Remember SOLR does not have authentication built in ◦ Treat it as a service to your application ◦ Do not expose externally unless you want the world to search SOLR is not a document database in the league of MongoDB ◦ Some NoSQL features ◦ Flat structures (MongoDB has some depth) ◦ Some examples use SOLR like a DB… ◦ More for expedience and simplicity ◦ Not a recommendation
  • 11. My Implementation of SOLR Web Client Web Server (PHP) SOLR Instance Content Database (postgreSQL / SQL SOLR Indexer (.NET) GIT Repository Fetch Get Create / Update Search Get Internal NetworkPublic Internet HTTP Remote Repo
  • 12. Installing SOLR Very simple to quickly get up and running Assumes you have JRE installed Download SOLR from http://lucene.apache.org/solr/mirrors-solr-latest-redir.html? Extract ZIP file to a directory of your choice ◦ I chose C:SOLR as my SOLR root From a command prompt, navigate to the examples directory and start the Jetty server ◦ cd c:solr4.4.0examples ◦ java -jar start.jar That’s it – SOLR is ready to go! Default “collection1” core is set up (but you’ll probably want to delete it)
  • 13. SOLR Administration Interface Admin UI available out of the box Check status Add/Remove Cores Issue Queries Check Logs Modify Schemas Lots more!
  • 14. SOLR Collections, Schemas and Documents Collection is a group of similar items ◦ Like a table in SQL Document is a single item in a collection ◦ Defines an item to be searched ◦ Contains fields ◦ Document is like a SQL row Fields are individual properties of a document ◦ Like a SQL column ◦ Has a type and a value Schema defines the structure of documents in a collection ◦ Defines fields, types, keys, dynamic fields and copy rules Schema basic structure: <schema> <types> <fields> <uniqueKey> <defaultSearchField> <solrQueryParser defaultOperator> <copyField> </schema>
  • 15. Document Fields Area in schema most likely to alter Various data types available built-in ◦ int, float, string, date, … Fields have a number of properties ◦ can be single or multi-valued ◦ fields like ‘text’ are great for concatenating fields together for aggregated searching ◦ you can choose to index a field, store the field value, or both <field name="lahmanId" type="int" indexed="true" stored="true" required="true" multiValued="false" />
  • 16. Querying Let’s Get Some Data!! SOLR is based on Inverted Index concept ◦ Instead of ID’s mapped to entries, words are mapped to ID’s. ◦ Analyzers then traverse inverted index and evaluates relevance Admin UI provides a quick and dirty interface to retrieve data Most query options available Can also specify format Once parameters issued, URL is available as reference
  • 17. Querying Basic parameter is `q` ◦ http://localhost:8983/solr/<collection>?q=<field>:<value> Other basic parameters include: ◦ Query Fields (qf) – selects the fields to return ◦ Sorting (sort) – specifies the fields to sort on and direction ◦ Row Offset (start) – which row to start with when returning results (default is 0) ◦ Caching (cache) – tells SOLR whether to cache the results (default is true) ◦ Rows to return (rows) – how many rows to return in the call (default is 10) These are all query string parameters.
  • 18. Demo USING THE ADMIN INTERFACE
  • 19. Working with SOLR in .NET solrnet library ◦ https://code.google.com/p/solrnet/ ◦ Source: https://github.com/mausch/SolrNet/tree/master/SolrNet WARNING: If you’re using SOLR 4+ ◦ Committing in solrnet will throw an error ◦ Need to download latest code from GitHub and compile ◦ Or download a package’s code and remove the initialization of the waitFlush property from solr/commands/parameters/CommitOptions.cs
  • 20. Set Up Typed Entities public class Quote{ [SolrUniqueKey("id")] public String Id { get; set; } [SolrField("title")] public String Title { get; set; } [SolrField("articleBody")] public String ArticleBody { get; set; } [SolrField("year")] public Int32 Year { get; set; } [SolrField("abstract")] public String Abstract { get; set; } [SolrField("source")] public String Source { get; set; } }
  • 21. Initializing solrnet Startup.Init<Quote>("http://localhost:8983/solr/historicalQuotes"); Startup.Init<Hitter>("http://localhost:8983/solr/baseball"); ISolrOperations<Quote> _solr = ServiceLocator.Current.GetInstance<ISolrOperations<Quote>>(); Uses Microsoft p&p’s ServiceLocator class to get SOLR instance
  • 22. Issuing a Query Basic query, as it selects everything: var quotes = _solr.Query(new SolrQuery("*:*")); Returns just those records with an id of 12345: var quotes = _solr.Query(new SolrQuery("id:12345")); Searches for specific text, and only returns 3 fields: var query = new SolrQuery("text:" + id); var options = new QueryOptions() { Fields = new[] { "id", "title", "source" } }; var results = _solr.Query(query, options);
  • 23. Filter Queries ‘fq’ parameter Runs the filter against the entire index and caches the results Can help speed up searching if you know of common, recurring searches In solrnet, use the FilterQueries QueryOption _solr.Query("*:*", new QueryOptions { FilterQueries = new ISolrQuery[] { new SolrQueryByField("HR", "[50 TO *]"), … } });
  • 24. Modifying Data in SOLR Using the existing SOLR instance to perform an insert… _solr.Add(theQuote); _solr.Commit(); Use the same instance to perform an update (adding a document whose unique key already exists replaces the old one)… _solr.Add(theQuote); _solr.Commit(); _solr.Optimize(); Commit writes your changes to SOLR’s index Optimize rebuilds the index ◦ More expensive ◦ Be mindful of when you call it
  • 25. Search Features (Query Options) Highlighting Highlight = new HighlightingParameters() { Fields = new[] { "articleBody", "abstract" }, Fragsize = 200, AfterTerm = "</em></strong>", BeforeTerm = "<em><strong>", UsePhraseHighlighter = true //, AlternateField = "source" } More Like This MoreLikeThis = new MoreLikeThisParameters( new[] { "articleBody", "source" }) { MinDocFreq = 1, MinTermFreq = 1 }
  • 26. Search Features (Query Options) Faceted Search Facet = new FacetParameters() { Queries = FacetQueryCategories(minHomeRuns) } private SolrFacetQuery[] FacetQueryCategories(Int32 minHomeRuns) { var salaryFacet1 = new SolrQueryByRange<Int32>("salary", 0, 1000000); ... return new[] { salaryFacet1 }; }
  • 27. Demo USING SOLR FEATURES TO ENHANCE YOUR SEARCH EXPERIENCE
  • 28. Handling the Distribution for Mods Send the modification to the RDBMS and to SOLR and hope for the best. Pretty optimistic! [Diagram: Client, Server, SOLR, RDBMS]
  • 29. Handling the Distribution for Mods Wrap the RDBMS call in a System.Transactions transaction, check for an error from SOLR, and roll back if SOLR throws an exception. More cautious. [Diagram: Client, Server, SOLR, RDBMS]
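A minimal sketch of this cautious approach, assuming `TransactionScope` from System.Transactions is the wrapper in question. The `DualWriteSketch.Save` helper and its two delegates are hypothetical stand-ins for the real RDBMS call and the solrnet `Add`/`Commit` calls.

```csharp
using System;
using System.Transactions;

public static class DualWriteSketch
{
    // Runs the database write inside an ambient transaction, then the SOLR write.
    // If either delegate throws, Complete() is never reached, so disposing the scope
    // rolls back any work that enlisted in the transaction (the RDBMS side).
    // SOLR does not enlist in the transaction; it is simply the last step before commit.
    public static void Save(Action writeToDatabase, Action writeToSolr)
    {
        using (var scope = new TransactionScope())
        {
            writeToDatabase();   // e.g. an ADO.NET or ORM call that enlists in the scope
            writeToSolr();       // e.g. _solr.Add(theQuote); _solr.Commit();
            scope.Complete();    // only reached when both writes succeeded
        }
    }
}
```

One gap remains with this ordering: if the database commit itself fails after SOLR has accepted the document, the index is left ahead of the database, so the caller still needs some way to detect and repair that case.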
  • 30. Handling the Distribution for Mods Drop a command into a queue for a Command Handler to pick up. The Command Handler/Domain processes it and raises an Event, which can end up in SOLR. More complicated, but more reliable. More message oriented (CQRS???) [Diagram: Client, Server, Queue, Command Handler, Persist Command, RDBMS, SOLR]
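The queue-based flow can be sketched in-process as follows. Everything here is hypothetical: the `UpdateQuoteCommand` and `QuoteUpdatedEvent` types, the handler name, and the use of `ConcurrentQueue` in place of a real message broker are assumptions chosen to keep the example self-contained.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical message types for illustration.
public sealed class UpdateQuoteCommand
{
    public string Id { get; set; }
    public string Title { get; set; }
}

public sealed class QuoteUpdatedEvent
{
    public string Id { get; set; }
    public string Title { get; set; }
}

public sealed class QuoteCommandHandler
{
    private readonly Action<UpdateQuoteCommand> _persist;   // writes to the RDBMS
    private readonly Action<QuoteUpdatedEvent> _publish;    // e.g. a subscriber that indexes into SOLR

    public QuoteCommandHandler(
        Action<UpdateQuoteCommand> persist,
        Action<QuoteUpdatedEvent> publish)
    {
        _persist = persist;
        _publish = publish;
    }

    // Drains the queue: persist each command first, then raise the event for indexing.
    // Returns the number of commands processed.
    public int ProcessAll(ConcurrentQueue<UpdateQuoteCommand> queue)
    {
        var processed = 0;
        UpdateQuoteCommand command;
        while (queue.TryDequeue(out command))
        {
            _persist(command);
            _publish(new QuoteUpdatedEvent { Id = command.Id, Title = command.Title });
            processed++;
        }
        return processed;
    }
}
```

With a durable queue in place of the in-memory one, a failed SOLR update can be retried from the event rather than lost, which is where the extra reliability comes from.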
  • 31. SOLR as a Windows Service NSSM can install SOLR quickly ◦ Non Sucking Service Manager ◦ http://nssm.cc/ ◦ Version 2.16 ◦ Hasn’t been updated in a little while Launch NSSM as administrator ◦ nssm install SOLR Java.exe is the executable Command Line args are (specific to my install directory): ◦ -Djetty.logs=C:/solr/logs/ -Djetty.home=C:/solr/ -Dsolr.solr.home=C:/solr/solr/ -cp C:/solr/lib/*.jar;C:/solr/start.jar -jar C:/solr/start.jar Name the service and hit install. Done!
  • 32. What’s Next Other query techniques ◦ Boosting ◦ http://localhost:8983/solr/historicalQuotes/select/?defType=dismax&q=text&qf=source^20.0+text^0.3 ◦ Spatial ◦ Sounds like SOLR Cloud ◦ SOLR replication and sharding ◦ Moving to the enterprise space Extending SOLR Behaviors and Using Other Parsers Using Dynamic Properties Using SOLR in a full NoSQL Environment
  • 33. Resources SOLR Reference Guide https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide SOLR Tutorial http://lucene.apache.org/solr/4_4_0/tutorial.html Nice SOLR Walk-Through http://www.solrtutorial.com/ Books Apache Solr 4 Cookbook (Packt Publishing) Apache Solr 4 In Action (Manning – MEAP)