Full Text Search
Django + Postgres
Search is everywhere
Search expectations
● FAST
● Full Text search
● Linguistic support (“craziness | crazy”)
● Ranking
● Fuzzy Searching
● More like this
Django
● SLOW
● `icontains` is a dumbed-down version of search
● Searching across tables is a pain
● No relevancy, ranking, or similar-word matching unless done manually
● No easy way to do fuzzy searching
Other Alternatives
● Solr
● ElasticSearch
● AWS CloudSearch
● Sphinx
● etc*
If you’re using any of the above, use Haystack
Postgres Search
● FAST
● Simple to implement
● Supports search features like full text, ranking, boosting, fuzzy matching, etc.
Django
Live Example
● Search Students by name or by course
● Use South migration to create tsvector
column
● Store title in Search table
● Update Search table via Celery on Save of
Student data
https://github.com/Syerram/postgres_search
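GIN, GiST
● GiST stores a lossy, hash-based signature for each document; GIN is an inverted index (a B-tree of lexemes, each pointing to the rows that contain it)
● GIN lookups are about 3x faster than GiST
● GIN indexes take about 3x longer to build than GiST and are moderately slower to update
● GIN indexes are about 2-3x larger than GiST
A GIN index
CREATE INDEX student_index ON students USING gin(to_tsvector('english', name));
Source http://www.postgresql.org/docs/9.2/static/textsearch-indexes.html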
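Why an inverted index makes lookups fast can be sketched in plain Python. This is only an illustration of the idea, not how Postgres implements GIN; the `students` mapping and both function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(rows):
    """Map each lower-cased word to the set of row ids containing it.

    `rows` is a hypothetical {row_id: name} mapping standing in for a table.
    """
    index = defaultdict(set)
    for row_id, name in rows.items():
        for word in name.lower().split():
            index[word].add(row_id)
    return index


def lookup(index, word):
    """Return matching row ids with one dictionary access, no table scan."""
    return sorted(index.get(word.lower(), set()))


students = {1: "James Kirk", 2: "Leonard McCoy", 3: "George Kirk"}
index = build_inverted_index(students)
print(lookup(index, "Kirk"))  # [1, 3]
```

The price is paid at write time: every inserted or changed row has to update one entry per word, which is the same trade-off the build and update figures above describe.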
Full Text Search
● All text should be preprocessed with to_tsvector and queried with to_tsquery
● Both reduce the text to lexemes
SELECT to_tsvector('How much wood would a woodchuck chuck If a woodchuck could chuck wood?')
"'chuck':7,12 'could':11 'much':2 'wood':3,13 'woodchuck':6,10 'would':4"
● Both are required for searching to work on normal text: a plain string cast to tsquery is not reduced to lexemes
SELECT to_tsvector('How much wood would a woodchucks chucks If a woodchucks could chucks woods?') @@ 'chucks' -- False
SELECT to_tsvector('How much wood would a woodchucks chucks If a woodchucks could chucks woods?') @@ to_tsquery('chucks') -- True
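The reduction to lexemes can be imitated in a few lines of Python. This is a toy sketch, not Postgres's parser or stemmer: the stop-word set and the "strip a trailing s" rule are assumptions chosen only to reproduce the woodchuck example, and all function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a tiny stop-word list; Postgres ships a much longer one.
STOP_WORDS = {"how", "a", "if"}


def toy_lexeme(word):
    """Lower-case a word and strip a plural 's' (a crude stand-in for stemming)."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s"):
        word = word[:-1]
    return word


def toy_tsvector(text):
    """Map each lexeme to its 1-based word positions, skipping stop words."""
    vector = defaultdict(list)
    for position, word in enumerate(re.findall(r"[A-Za-z]+", text), start=1):
        if word.lower() in STOP_WORDS:
            continue  # a skipped stop word still uses up its position
        vector[toy_lexeme(word)].append(position)
    return dict(vector)


def toy_match(vector, raw_term, normalize=True):
    """Imitate @@: with normalize=False the term is compared exactly as typed."""
    term = toy_lexeme(raw_term) if normalize else raw_term
    return term in vector


doc = toy_tsvector(
    "How much wood would a woodchucks chucks If a woodchucks could chucks woods?"
)
print(toy_match(doc, "chucks", normalize=False))  # False: term was not normalized
print(toy_match(doc, "chucks"))                   # True: term reduced to 'chuck'
```

The two calls mirror the two queries above: the document side is always normalized, so the query side has to be normalized the same way before the comparison can succeed.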
Full Text Search (Contd.)
● Technically you don’t need an index, but on large tables the search will be slow
SELECT * FROM students WHERE to_tsvector('english', name) @@ to_tsquery('english', 'Kirk')
● GIN or GiST index on a tsvector column
CREATE INDEX <index_name> ON <table_name> USING gin(<col_name>);
● Expression based (the text search config must be given explicitly in an index expression)
CREATE INDEX <index_name> ON <table_name> USING gin(to_tsvector('english', COALESCE(<col_a>,'') || ' ' || COALESCE(<col_b>,'')));
Boosting
● Boost certain results over others
● Results must still match the query; boosting only changes their order
● Use ts_rank to boost results
e.g.
… ORDER BY ts_rank(document, to_tsquery('python')) DESC
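A ranked query of this shape could be assembled from Python roughly as follows. This is a sketch under assumptions: the function name, the `students` table, and the `document` tsvector column are hypothetical, and the named placeholders follow the `%(name)s` style that psycopg2 accepts.

```python
def build_ranked_search(term, table="students", vector_column="document",
                        config="english"):
    """Return (sql, params) for a full text search ordered by ts_rank.

    Table and column names are interpolated directly, so they must come
    from trusted code, never from user input. The search term and the
    text search config are passed as bind parameters.
    """
    sql = (
        "SELECT *, ts_rank({col}, query) AS rank "
        "FROM {table}, to_tsquery(%(config)s::regconfig, %(term)s) AS query "
        "WHERE {col} @@ query "
        "ORDER BY rank DESC"
    ).format(col=vector_column, table=table)
    return sql, {"config": config, "term": term}


sql, params = build_ranked_search("python")
print(sql)
print(params)
```

With psycopg2 the pair can be passed straight to `cursor.execute(sql, params)`. The WHERE clause keeps only matching rows, and ts_rank then decides their order.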
Ranking
● Importance of the search term within the document
e.g.
Search term found in title > description > tag
● Use setweight to assign importance to each field when preparing the document
e.g.
setweight(to_tsvector('english', post.title), 'A') ||
setweight(to_tsvector('english', post.description), 'B') ||
setweight(to_tsvector('english', post.tags), 'C')
...
-- In the search query, use ts_rank to order by ranking
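How the weight labels influence ordering can be shown with a deliberately simplified score in Python. The weights are the defaults ts_rank applies to labels A to D; the scoring rule itself (add the weight of every field containing the term) is an assumption made for illustration and is not ts_rank's actual formula. The function name and sample posts are hypothetical.

```python
# Default per-label weights used by ts_rank when none are supplied.
DEFAULT_WEIGHTS = {"A": 1.0, "B": 0.4, "C": 0.2, "D": 0.1}


def toy_weighted_score(fields, term):
    """Sum the label weight of every field whose words contain `term`.

    `fields` maps a weight label to that field's text, mirroring the
    setweight(...) || setweight(...) document. Real ts_rank also takes
    lexeme frequency into account, which this sketch ignores.
    """
    term = term.lower()
    score = 0.0
    for label, text in fields.items():
        if term in text.lower().split():
            score += DEFAULT_WEIGHTS[label]
    return score


title_hit = {"A": "Python search", "B": "Indexing tips", "C": "postgres"}
tag_hit = {"A": "Search basics", "B": "Indexing tips", "C": "python postgres"}
print(toy_weighted_score(title_hit, "python"))  # 1.0, matched in the title
print(toy_weighted_score(tag_hit, "python"))    # 0.2, matched only in the tags
```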
Trigram
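● A group of 3 consecutive characters from a string
● Similarity between strings is measured by the number of trigrams they share
e.g. "hello": "h", "he", "hel", "ell", "llo", "lo", and "o"
"hallo": "h", "ha", "hal", "all", "llo", "lo", and "o"
Number of matches: 4 ("h", "llo", "lo", "o")
(pg_trgm itself pads each word with two leading spaces and one trailing space, so its trigram sets differ slightly)
● Use similarity() from the pg_trgm extension to find related terms. It returns a value from 0 to 1, where 0 is no match and 1 is an exact match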
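The trigram comparison is easy to try in Python. This sketch assumes pg_trgm's documented padding (two spaces before each word, one after) and scores similarity as shared trigrams divided by all distinct trigrams; both function names are hypothetical and the result is only meant to approximate what the extension returns.

```python
import re


def trigrams(text):
    """Return the set of trigrams of each word, padded '  word ' style."""
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def similarity(a, b):
    """Shared trigrams divided by the size of the union of both sets."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(sorted(trigrams("hello")))
print(sorted(trigrams("hello") & trigrams("hallo")))
print(round(similarity("hello", "hallo"), 3))
```

With this padding each word yields six trigrams and the two words share three of them, so the score comes out at one third. A fuzzy search then keeps rows whose score is above a chosen threshold.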
Soundex/Metaphone
● Oldest; only good for English names
● Soundex converts a name to a string of length 4
e.g. “Anthony” and “Anthoney” both become “A535”
● Create the index on the Soundex or Metaphone expression itself (both functions come from the fuzzystrmatch extension)
e.g. CREATE INDEX idx_name ON tb_name (soundex(col_name));
SELECT ... FROM tb_name WHERE soundex(col_name) = soundex('...')
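The four-character code can be reproduced with a short Python function. This is a simplified Soundex sketch (it only compares each letter with the one directly before it and omits the special "h"/"w" rule of the full algorithm), so it is not guaranteed to agree with the database function on every input.

```python
# Letter groups that share a Soundex digit; all other letters code as "0".
CODES = {}
for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                       "4": "l", "5": "mn", "6": "r"}.items():
    for letter in letters:
        CODES[letter] = digit


def soundex(name):
    """Return the first letter plus up to three digits, zero-padded to length 4."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "0")
    for ch in letters[1:]:
        code = CODES.get(ch, "0")
        # Skip uncoded letters and repeats of the preceding letter's code.
        if code != "0" and code != previous:
            result += code
            if len(result) == 4:
                break
        previous = code
    return result.ljust(4, "0")


print(soundex("Anthony"), soundex("Anthoney"))  # A535 A535
```

Because both spellings collapse to the same code, an equality comparison on the code finds either one.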
Pro & Con
Pros
● Quick implementation
● Much easier to change the document format and refresh the index
● Speed comparable to other search engines
● Cost effective
Cons
● Not as flexible as pure search engines, like Solr
● Not as fast as Solr, though pretty fast for humans
● Tied to Postgres
● Indexes can get pretty large, but so can search engine indexes
Django ORM
● Implements Full text Search
from django.db import models
from djorm_pgfulltext.fields import VectorField
from djorm_pgfulltext.models import SearchManager

class StudentCourse(models.Model):
    ...
    search_index = VectorField()
    objects = SearchManager(
        fields=('student__user__name', 'course__name'),
        config='pg_catalog.english',  # this is default
        search_field='search_index',  # this is default
        auto_update_search_field=True
    )
● StudentCourse.objects.search("David")
https://github.com/djangonauts/djorm-ext-pgfulltext
Next Steps
● Add Ranking, Boosting, Fuzzy Search to
djorm pgfulltext
e.g. StudentCourse.objects.search("David & Python").rank("Python")
StudentCourse.objects.fuzzy_search("Jython").rank("Python")
StudentCourse.objects.soundex("Davad").rank("Java") & More
● Continue to add examples to
postgres_search
Tips
● Use a separate DB if necessary, or use materialized views
● Don’t index everything; limit your searchable data
● Analyze using EXPLAIN and ts_stat
● Create indexes on the fly using CREATE INDEX CONCURRENTLY
● Don’t pull foreign key objects in search
Code
• https://github.com/Syerram/postgres_search
• Stack
• AngularJS, Django, Celery, Postgres
• Feel free to Fork, Pull Request
@agileseeker, github/syerram,
syerram.silvrback.com/
Sai
