Semantic Search in SQL Server 2012
• Semantic search seeks to improve search
accuracy by understanding searcher intent
and the contextual meaning of terms as they
appear in the searchable dataspace.
What is Semantic Search
• Built on top of Full-Text Search
• Requires a predefined external database
• That database must be attached to the SQL Server instance
• Semantic Search must be configured to use that database
Semantic Search in SQL Server 2012
• Exists in all Commercial editions of SQL Server
2012
• Also in SQL Server 2012 Express Advanced
Services Edition
Supported in SQL Server Editions
Semantic Search Installation 1/3
Semantic Search Installation 2/3
Semantic Search Installation 3/3
-- do not use sp_attach_db stored procedure
-- it is obsolete
CREATE DATABASE SemanticsDB
ON (FILENAME = N'C:\Program Files\Microsoft Semantic Language Database\semanticsDB.mdf')
LOG ON (FILENAME = 'C:\Program Files\Microsoft Semantic Language Database\semanticsdb_log.ldf')
FOR ATTACH;
GO
Attach Semantics DB
-- Register Semantics Languages Database
-- required once
EXEC sp_fulltext_semantic_register_language_statistics_db
    @dbname = N'SemanticsDB';
GO
Register Semantics DB
-- Verify that the registration succeeded
SELECT * FROM
sys.fulltext_semantic_language_statistics_database;
GO
Verify Registration
-- Check available languages for statistical semantic extraction
SELECT * FROM sys.fulltext_semantic_languages;
GO
Supported Languages
Demo
How to Enable On Table
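The demo itself is not captured in the slides. A minimal sketch of what enabling semantic indexing on a table might look like is shown below. It assumes the dbo.Documents FileTable and the FullTextCatalog catalog used in the later queries, English content (LCID 1033), and a hypothetical key index name PK_Documents on path_locator; substitute the real unique index name of your table.

```sql
-- Sketch only: PK_Documents is a hypothetical index name, and 1033 (English)
-- is an assumed language. Adjust both to match your table and data.

-- Create the catalog that the status queries in this deck refer to.
CREATE FULLTEXT CATALOG FullTextCatalog AS DEFAULT;
GO

-- STATISTICAL_SEMANTICS on a column enables key phrase and similarity
-- extraction in addition to regular full-text indexing.
CREATE FULLTEXT INDEX ON dbo.Documents
(
    name LANGUAGE 1033 STATISTICAL_SEMANTICS,
    file_stream TYPE COLUMN file_type LANGUAGE 1033 STATISTICAL_SEMANTICS
)
KEY INDEX PK_Documents      -- hypothetical name of the unique key index
ON FullTextCatalog
WITH CHANGE_TRACKING AUTO;
GO
```

The key index column is what the semantic functions return as document_key, which is why the later queries join on path_locator.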
-- Reload filters (iFilter) and restart fulltext
-- host process if needed
EXEC sp_fulltext_service 'load_os_resources', 1;
EXEC sp_fulltext_service 'restart_all_fdhosts';
GO
Restart Processes
Full-Text Search
• Supports columns of these data types:
1. char
2. varchar
3. nchar
4. nvarchar
5. text
6. ntext
7. image
8. xml
9. varbinary(max)
10. varbinary(max) with the FILESTREAM attribute
Full-Text Queries Specifics
• Full-text queries are not case-sensitive: searching for
"Aluminum" or "aluminum" returns the same results
• Transact-SQL predicates:
– CONTAINS
– FREETEXT
• Transact-SQL functions:
– CONTAINSTABLE
– FREETEXTTABLE
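As a rough illustration of the predicates and functions listed above, the queries below search the same dbo.Documents table used elsewhere in this deck. The search term is only an example, and the join on path_locator assumes that column is the full-text key.

```sql
-- CONTAINS: precise match on the word or phrase (case-insensitive).
SELECT D.name
FROM dbo.Documents AS D
WHERE CONTAINS(D.file_stream, N'"aluminum"');

-- FREETEXT: matches the meaning rather than the exact wording
-- (inflectional forms and thesaurus expansions are considered).
SELECT D.name
FROM dbo.Documents AS D
WHERE FREETEXT(D.file_stream, N'aluminum');

-- CONTAINSTABLE: same matching as CONTAINS, but returns a table with
-- [KEY] and [RANK] columns so the results can be ordered by relevance.
SELECT D.name, R.[RANK]
FROM CONTAINSTABLE(dbo.Documents, file_stream, N'"aluminum"') AS R
INNER JOIN dbo.Documents AS D
    ON D.path_locator = R.[KEY]   -- assumes path_locator is the full-text key
ORDER BY R.[RANK] DESC;
GO
```

FREETEXTTABLE follows the same pattern as CONTAINSTABLE, with FREETEXT matching rules.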
SELECT * FROM sys.fulltext_document_types;
File types supported by iFilters
Three Tabular Functions:
• SemanticKeyPhraseTable - returns the statistically
significant phrases in each document
• SemanticSimilarityTable – returns documents or
rows that are similar or related, based on the key
phrases in each document
• SemanticSimilarityDetailsTable – returns the key
phrases that explain why two documents were
identified as similar
Semantic Search Functions
-- select Full-Text Catalog items count
SELECT FulltextCatalogProperty
('FullTextCatalog', 'itemcount');
GO
Full-Text Catalog Items Count
-- check Population progress
SELECT fulltextcatalogproperty('FullTextCatalog', 'populatestatus');
GO
• 0 = Idle
• 1 = Full population in progress
• 2 = Paused
• 3 = Throttled
• 4 = Recovering
• 5 = Shutdown
• 6 = Incremental population in progress
• 7 = Building index
• 8 = Disk is full. Paused.
• 9 = Change tracking
Full-Text Catalog Population Status
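The numeric status can be decoded in the query itself. The sketch below maps the values listed above to their descriptions and assumes the catalog is named FullTextCatalog, as in the previous slides.

```sql
-- Read the population status once, then translate it to a description.
DECLARE @status INT =
    FULLTEXTCATALOGPROPERTY('FullTextCatalog', 'PopulateStatus');

SELECT
    @status AS populate_status,
    CASE @status
        WHEN 0 THEN 'Idle'
        WHEN 1 THEN 'Full population in progress'
        WHEN 2 THEN 'Paused'
        WHEN 3 THEN 'Throttled'
        WHEN 4 THEN 'Recovering'
        WHEN 5 THEN 'Shutdown'
        WHEN 6 THEN 'Incremental population in progress'
        WHEN 7 THEN 'Building index'
        WHEN 8 THEN 'Disk is full. Paused.'
        WHEN 9 THEN 'Change tracking'
        ELSE 'Unknown'   -- also covers NULL, e.g. when the catalog name is wrong
    END AS populate_status_text;
GO
```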
-- Get all key phrases in the entire corpus
SELECT
K.score, K.keyphrase, COUNT(D.stream_id) AS Occurrences
FROM SemanticKeyPhraseTable
(dbo.Documents, (name, file_stream)) AS K
INNER JOIN dbo.Documents AS D
ON D.path_locator = K.document_key
GROUP BY K.score, K.keyphrase
ORDER BY K.score DESC, K.keyphrase ASC;
GO
Get all Key Phrases
-- Find documents by key phrase ('sql' in the case below)
SELECT
K.score, K.keyphrase,
D.stream_id, D.name, D.file_type, D.cached_file_size,
D.creation_time, D.last_write_time, D.last_access_time
FROM dbo.Documents D
INNER JOIN semantickeyphrasetable (
dbo.Documents,
(name, file_stream)
) AS K
ON D.path_locator = K.document_key
WHERE K.keyphrase = N'sql'
ORDER BY K.score DESC;
Find Documents by Key phrase
-- find similar documents
DECLARE @Title NVARCHAR(1000) = N'Gurevich Vladimir.docx';
DECLARE @DocID HIERARCHYID =
(SELECT path_locator FROM dbo.Documents WHERE name = @Title);
SELECT
@Title AS source_title, D.name AS matched_title,
D.stream_id, K.score
FROM SemanticSimilarityTable(dbo.Documents, *, @DocID) AS K
INNER JOIN dbo.Documents AS D
ON D.path_locator = K.matched_document_key
ORDER BY K.score DESC;
GO
Find Similar Documents
-- find out Key Phrases that make two documents match
DECLARE @SourceTitle NVARCHAR(1000) = N'source.docx';
DECLARE @MatchedTitle NVARCHAR(1000) = N'target.docx';
DECLARE @SourceDocID HIERARCHYID =
(SELECT path_locator FROM dbo.Documents WHERE name = @SourceTitle);
DECLARE @MatchedDocID HIERARCHYID =
(SELECT path_locator FROM dbo.Documents WHERE name = @MatchedTitle);
SELECT
K.keyphrase, K.score, @SourceTitle AS source_title, @MatchedTitle AS matched_title
FROM SemanticSimilarityDetailsTable(dbo.Documents, file_stream, @SourceDocID,
file_stream, @MatchedDocID) AS K
ORDER BY K.score DESC;
GO
Why 2 Documents Are Similar
• The generic NEAR operator is deprecated in SQL Server 2012
• The custom NEAR operator is a new operator, not an extension of the
existing NEAR operator
• It lets you query with 2 optional requirements that you could not
previously specify:
1. The maximum gap between the search terms
2. The order of the search terms - for example, "John" must appear
before "Smith"
• Stopwords or noise words are included in the gap count.
CONTAINSTABLE(Documents, Content, 'NEAR((John, Smith), 4, TRUE)');
Full-Text Search NEAR Operator 1/2
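The CONTAINSTABLE call above is only a fragment. A complete query might look like the sketch below, which assumes the dbo.Documents table and file_stream column from the other slides instead of the Documents/Content names in the fragment, and assumes path_locator is the full-text key.

```sql
-- Rank documents where "John" appears before "Smith" with at most
-- 4 non-search terms between them (stopwords count toward the gap).
SELECT D.name, R.[RANK]
FROM CONTAINSTABLE(
        dbo.Documents,
        file_stream,
        N'NEAR((John, Smith), 4, TRUE)'   -- max gap 4, TRUE = order matters
     ) AS R
INNER JOIN dbo.Documents AS D
    ON D.path_locator = R.[KEY]           -- assumes path_locator is the key
ORDER BY R.[RANK] DESC;
GO
```

Passing FALSE as the last argument would accept the terms in either order.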
-- get documents that contain keywords "sql" and "server" nearby
SELECT D.name,
file_stream.GetFileNamespacePath() AS relative_path
FROM dbo.Documents D
WHERE CONTAINS(file_stream, 'NEAR(("sql", "server"), 1, FALSE)');
GO
Full-Text Search NEAR Operator 2/2
-- get documents that contain keywords "sql" and "server" nearby
SELECT D.name,
file_stream.GetFileNamespacePath() AS
relative_path
FROM dbo.Documents D
WHERE CONTAINS
(file_stream, 'NEAR(("sql", "server"), 1, FALSE)');
GO
Full-Text Search in Documents
• The full-text catalog depends on the language selected
Problems
