SlideShare a Scribd company logo
Applied Enterprise
Semantic Mining
Mark Tabladillo, Ph.D. (MVP, MCAD .NET, MCITP, MCT)
PASS SQL Saturday #198 Vancouver BC
February 16, 2013
Photos © 2013 Mark Tabladillo, All Rights Reserved
Photos © 2013 Mark Tabladillo, All Rights Reserved
Networking
Interactive
About MarkTab
Training and Consulting with        Ph.D. – Industrial Engineering,
http://marktab.com                  Georgia Tech
Data Mining Resources and Blog at   Training and consulting
http://marktab.net                  internationally across many
                                    industries – SAS and Microsoft
                                    Contributed to peer-reviewed
                                    research and legislation
                                      Mentoring doctoral dissertations at the
                                      accredited University of Phoenix
                                    Presenter
Quick Look
My Semantic Search
Interactive
Name three things you want from enterprise text
mining
Introduction
SQL Server 2012 has new Programmability Enhancements
  Statistical Semantic Search
  File Tables
  Full-Text Search Improvements
These combined technologies make SQL Server 2012 a strong contender in text
mining
Outline
Why Microsoft is competitive for data mining
Definitions: what is text mining?
History: how Microsoft’s semantic search was born
What is inside semantic search
 Logical model
 Demos
 Performance
Microsoft Resources
Why Microsoft is
Competitive for Data
Mining
Based on 2012 and 2013 Surveys
Gartner 2013
           Magic Quadrant for
           Business Intelligence
           and Analytics
           Platforms




  Retrieved from http://www.gartner.com/technology/reprints.do?id=1-1DZLPEH&ct=130207&st=sb
  – February 5, 2013
Gartner 2013
           Magic Quadrant for
           Data Warehouse
           Database
           Management
           Systems




  Retrieved from http://www.gartner.com/technology/reprints.do?id=1-1DU2VD4&ct=130131&st=sb
  – January 31, 2013
KDNuggets 2012
http://marktab.net/datamining/2012/06/15/excel-number-
commercial-tool-analytics-data-mining-big-data/
Definitions
What is text mining?
Definition
Data mining is the automated or semi-automated process of
discovering patterns in data
  Text mining is the automated or semi-automated process of
  discovering patterns from textual data
Machine learning is the development and optimization of
algorithms for automated or semi-automated pattern discovery
Purposes
    Phrase          Goal

    “Data Mining”   Inform actionable decisions
    “Text Mining”


    “Machine        Determine best performing
    Learning”       algorithm
MarkTab Decision Cycle
                             GO




           Synthesis                 Analysis
               (art)                (science)


         Science needs science fiction -- MarkTab
MarkTab Decision Cycle
                      GO




          Synthesis        Analysis
            (art)          (science)
History
How Microsoft’s semantic search came to be
History
July 2008
  Microsoft purchases Powerset for US$100 Million
  Google Dismisses Semantic Search
  http://venturebeat.com/2008/06/26/microsoft-to-buy-semantic-search-engine-
  powerset-for-100m-plus/
  http://www.forbes.com/2008/07/01/powerset-msft-search-tech-intel-
  cx_ag_0701powerset.html
History
March 2009
 Google announces “snippets” as relevant to search
 The media picks this story up as “semantic search”
 http://googleblog.blogspot.com/2009/03/two-new-improvements-to-google-
 results.html#!/2009/03/two-new-improvements-to-google-results.html
History
February 2012
  Google announces Knowledge Graph, an explicit application of semantic search
  http://mashable.com/2012/02/13/google-knowledge-graph-change-search/
History
April 2012
  Microsoft purchases 800+ patents from AOL for US$1 Billion
  Among the patents are semantic search and metadata querying – older than
  Google
  http://www.theregister.co.uk/2012/04/09/aol_microsoft_patent_deal/
What is inside Semantic
Search
Text Mining introduced for SQL Server 2012
Future: Most data is Text
Two Research Types
• Quantitative research = data mining
• Qualitative research = text mining
The future is combining both
Statistical Semantic Search
Comprises some aspects of text mining
Identifies statistically relevant key phrases
Based on these phrases, can identify (by score) similar documents
FileTables
Built on existing SQL Server FILESTREAM technology
Files and documents
   Stored in special tables in SQL Server
   Accessed if they were stored in the file system
Full-Text Search Enhancements
Property search: search on tagged properties (such as author or title)
Customizable NEAR: find words or phrases close to one another
New Word Breakers and Stemmers (for many languages)
Logical Model
How semantic search works
From Documents to Output
                    Office
         Varchar
                                 PDF
        NVarchar
                     Rowset
                     Output
                   with Scores
(iFilter Required)
                                  iFilters   Full-Text
       Documents                             Keyword
                                              Index
                                               “FTI”



                                              Semantic
                                             Key Phrase
                                  Semantic     Index –
         Semantic Document        Database    Tag Index
         Similarity Index “DSI”                  “TI”
Languages Currently Supported
Traditional Chinese   Simplified Chinese
German                British English
English               Portuguese
French                Chinese (Hong Kong SAR, PRC)
Italian               Spanish
Brazilian             Chinese (Singapore)
Russian               Chinese (Macau SAR)
Swedish
Phases of Semantic Indexing
      Full Text Keyword Index “FTI”

                                                 Semantic Document Similarity
                                                         Index “DSI”
      Semantic Key Phrase Index –
            Tag Index “TI”




     http://msdn.microsoft.com/en-us/library/gg492085.aspx#SemanticIndexing
Interactive Demo
SQL Server Management Studio
Semantic Search and
SQL Server Data Mining
SQL Server Data Tools: data mining plus text mining
Performance
The Million-Dollar Edge
Integrated Full Text Search (iFTS)
Improved Performance and Scale:
  Scale-up to 350M documents for storage and search
  iFTS query performance 7-10 times faster than in SQL Server 2008
  Worst-case iFTS query response times less than 3 sec for corpus
  Similar or better than main database search competitors
(2012, Michael Rys, Microsoft)
Linear Scale of FTI/TI/DSI
First known linearly scaling end-to-end Search and Semantic product in the industry




            Time in Seconds vs. Number of Documents
            (2011 – K. Mukerjee, T. Porter, S. Gherman – Microsoft)
Text Mining References
Video
  http://channel9.msdn.com/Shows/DataBound/DataBound-Episode-2-Semantic-
  Search
  http://www.microsoftpdc.com/2009/SVR32
Semantic Search (Books Online) – explains the demo
  http://msdn.microsoft.com/en-us/library/gg492075.aspx
Paper
  http://users.cis.fiu.edu/~lzhen001/activities/KDD2011Program/docs/p213.pdf
Microsoft Resources
Links
Software
SQL Server 2012 Enterprise
(includes database engine, Analysis Services, SSMS and SSDT)
 http://www.microsoft.com/sqlserver/en/us/get-sql-server/try-it.aspx
Microsoft Office 2012 Professional
 http://office.microsoft.com/en-us/try
Organizations
 Professional Association for SQL Server http://www.sqlpass.org
   Atlanta MDF http://www.atlantamdf.com/
   Atlanta Microsoft BI Users Group http://www.meetup.com/Atlanta-Microsoft-
   Business-Intelligence-Users/
PASS Business Analytics Conference http://www.passbaconference.com
Microsoft TechEd North America http://northamerica.msteched.com/
Interactive
Takeaways
Conclusion
SQL Server Data Mining 2012 provides data mining and semantic search
The core technology allows document similarity matching
The results can be combined with SQL Server Data Mining (such as
Association Analysis)
Connect
Data Mining Resources and blog http://marktab.net
Data Mining Training and Consulting (especially Microsoft and SAS)
http://marktab.com

More Related Content

What's hot

Elasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseElasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational database
Kristijan Duvnjak
 
Spsvb Developer Intro to SharePoint Search
Spsvb   Developer Intro to SharePoint SearchSpsvb   Developer Intro to SharePoint Search
Spsvb Developer Intro to SharePoint Search
Michael Oryszak
 
Intro to Elasticsearch
Intro to ElasticsearchIntro to Elasticsearch
Intro to Elasticsearch
Clifford James
 
Csci12 report aug18
Csci12 report aug18Csci12 report aug18
Csci12 report aug18
karenostil
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Trey Grainger
 
Apache Lucene intro - Breizhcamp 2015
Apache Lucene intro - Breizhcamp 2015Apache Lucene intro - Breizhcamp 2015
Apache Lucene intro - Breizhcamp 2015
Adrien Grand
 
3. ADO.NET
3. ADO.NET3. ADO.NET
3. ADO.NET
Rohit Rao
 
OData and SharePoint
OData and SharePointOData and SharePoint
OData and SharePointSanjay Patel
 
OData Services
OData ServicesOData Services
OData Services
Jovan Popovic
 
Lucene BootCamp
Lucene BootCampLucene BootCamp
Lucene BootCampGokulD
 
Tagging search solution design Advanced edition
Tagging search solution design Advanced editionTagging search solution design Advanced edition
Tagging search solution design Advanced edition
Alexander Tokarev
 
Apache tika
Apache tikaApache tika
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
lucenerevolution
 
ADO CONTROLS - Database usage
ADO CONTROLS - Database usageADO CONTROLS - Database usage
ADO CONTROLS - Database usage
Muralidharan Radhakrishnan
 
Solr Architecture
Solr ArchitectureSolr Architecture
Solr Architecture
Ramez Al-Fayez
 
Database programming in vb net
Database programming in vb netDatabase programming in vb net
Database programming in vb netZishan yousaf
 
Open Data Protocol (OData)
Open Data Protocol (OData)Open Data Protocol (OData)
Open Data Protocol (OData)
Pistoia Alliance
 

What's hot (19)

Elasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseElasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational database
 
Oracle by Muhammad Iqbal
Oracle by Muhammad IqbalOracle by Muhammad Iqbal
Oracle by Muhammad Iqbal
 
Spsvb Developer Intro to SharePoint Search
Spsvb   Developer Intro to SharePoint SearchSpsvb   Developer Intro to SharePoint Search
Spsvb Developer Intro to SharePoint Search
 
Intro to Elasticsearch
Intro to ElasticsearchIntro to Elasticsearch
Intro to Elasticsearch
 
Csci12 report aug18
Csci12 report aug18Csci12 report aug18
Csci12 report aug18
 
Lucene basics
Lucene basicsLucene basics
Lucene basics
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solr
 
Apache Lucene intro - Breizhcamp 2015
Apache Lucene intro - Breizhcamp 2015Apache Lucene intro - Breizhcamp 2015
Apache Lucene intro - Breizhcamp 2015
 
3. ADO.NET
3. ADO.NET3. ADO.NET
3. ADO.NET
 
OData and SharePoint
OData and SharePointOData and SharePoint
OData and SharePoint
 
OData Services
OData ServicesOData Services
OData Services
 
Lucene BootCamp
Lucene BootCampLucene BootCamp
Lucene BootCamp
 
Tagging search solution design Advanced edition
Tagging search solution design Advanced editionTagging search solution design Advanced edition
Tagging search solution design Advanced edition
 
Apache tika
Apache tikaApache tika
Apache tika
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
 
ADO CONTROLS - Database usage
ADO CONTROLS - Database usageADO CONTROLS - Database usage
ADO CONTROLS - Database usage
 
Solr Architecture
Solr ArchitectureSolr Architecture
Solr Architecture
 
Database programming in vb net
Database programming in vb netDatabase programming in vb net
Database programming in vb net
 
Open Data Protocol (OData)
Open Data Protocol (OData)Open Data Protocol (OData)
Open Data Protocol (OData)
 

Viewers also liked

Understanding indices
Understanding indicesUnderstanding indices
Understanding indices
Richard Douglas
 
Secrets of Enterprise Data Mining 201310
Secrets of Enterprise Data Mining 201310Secrets of Enterprise Data Mining 201310
Secrets of Enterprise Data Mining 201310
Mark Tabladillo
 
Sql Saturday 111 Atlanta applied enterprise semantic mining
Sql Saturday 111 Atlanta applied enterprise semantic miningSql Saturday 111 Atlanta applied enterprise semantic mining
Sql Saturday 111 Atlanta applied enterprise semantic mining
Mark Tabladillo
 
SQL Server - Full text search
SQL Server - Full text searchSQL Server - Full text search
SQL Server - Full text searchPeter Gfader
 
FileTable and Semantic Search in SQL Server 2012
FileTable and Semantic Search in SQL Server 2012FileTable and Semantic Search in SQL Server 2012
FileTable and Semantic Search in SQL Server 2012
Michael Rys
 
Sql 2012 development and programming
Sql 2012  development and programmingSql 2012  development and programming
Sql 2012 development and programming
LearnNowOnline
 
Effective Usage of SQL Server 2005 Database Mirroring
Effective Usage of SQL Server 2005 Database MirroringEffective Usage of SQL Server 2005 Database Mirroring
Effective Usage of SQL Server 2005 Database Mirroringwebhostingguy
 
SQL Server Performance Tuning Baseline
SQL Server Performance Tuning BaselineSQL Server Performance Tuning Baseline
SQL Server Performance Tuning Baseline
► Supreme Mandal ◄
 
Sql Server Performance Tuning
Sql Server Performance TuningSql Server Performance Tuning
Sql Server Performance Tuning
Bala Subra
 
SQL Server - Querying and Managing XML Data
SQL Server - Querying and Managing XML DataSQL Server - Querying and Managing XML Data
SQL Server - Querying and Managing XML Data
Marek Maśko
 
Always on in SQL Server 2012
Always on in SQL Server 2012Always on in SQL Server 2012
Always on in SQL Server 2012
Fadi Abdulwahab
 
What's new in SQL Server 2016
What's new in SQL Server 2016What's new in SQL Server 2016
What's new in SQL Server 2016
James Serra
 
Implementing Full Text in SQL Server
Implementing Full Text in SQL ServerImplementing Full Text in SQL Server
Implementing Full Text in SQL Server
Microsoft TechNet - Belgium and Luxembourg
 

Viewers also liked (14)

Understanding indices
Understanding indicesUnderstanding indices
Understanding indices
 
Secrets of Enterprise Data Mining 201310
Secrets of Enterprise Data Mining 201310Secrets of Enterprise Data Mining 201310
Secrets of Enterprise Data Mining 201310
 
Sql Saturday 111 Atlanta applied enterprise semantic mining
Sql Saturday 111 Atlanta applied enterprise semantic miningSql Saturday 111 Atlanta applied enterprise semantic mining
Sql Saturday 111 Atlanta applied enterprise semantic mining
 
SQL Server - Full text search
SQL Server - Full text searchSQL Server - Full text search
SQL Server - Full text search
 
FileTable and Semantic Search in SQL Server 2012
FileTable and Semantic Search in SQL Server 2012FileTable and Semantic Search in SQL Server 2012
FileTable and Semantic Search in SQL Server 2012
 
Sql 2012 development and programming
Sql 2012  development and programmingSql 2012  development and programming
Sql 2012 development and programming
 
Effective Usage of SQL Server 2005 Database Mirroring
Effective Usage of SQL Server 2005 Database MirroringEffective Usage of SQL Server 2005 Database Mirroring
Effective Usage of SQL Server 2005 Database Mirroring
 
SQL Server Performance Tuning Baseline
SQL Server Performance Tuning BaselineSQL Server Performance Tuning Baseline
SQL Server Performance Tuning Baseline
 
Sql Server Performance Tuning
Sql Server Performance TuningSql Server Performance Tuning
Sql Server Performance Tuning
 
SQL Server - Querying and Managing XML Data
SQL Server - Querying and Managing XML DataSQL Server - Querying and Managing XML Data
SQL Server - Querying and Managing XML Data
 
Always on in SQL Server 2012
Always on in SQL Server 2012Always on in SQL Server 2012
Always on in SQL Server 2012
 
File Upload
File UploadFile Upload
File Upload
 
What's new in SQL Server 2016
What's new in SQL Server 2016What's new in SQL Server 2016
What's new in SQL Server 2016
 
Implementing Full Text in SQL Server
Implementing Full Text in SQL ServerImplementing Full Text in SQL Server
Implementing Full Text in SQL Server
 

Similar to Applied Semantic Search with Microsoft SQL Server

Applied Enterprise Semantic Search 201305
Applied Enterprise Semantic Search 201305Applied Enterprise Semantic Search 201305
Applied Enterprise Semantic Search 201305
Mark Tabladillo
 
Applied Semantic Search 201306
Applied Semantic Search 201306Applied Semantic Search 201306
Applied Semantic Search 201306
Mark Tabladillo
 
Applied Enterprise Semantic Mining -- Charlotte 201410
Applied Enterprise Semantic Mining -- Charlotte 201410Applied Enterprise Semantic Mining -- Charlotte 201410
Applied Enterprise Semantic Mining -- Charlotte 201410
Mark Tabladillo
 
Secrets of Enterprise Data Mining: SQL Saturday 328 Birmingham AL
Secrets of Enterprise Data Mining: SQL Saturday 328 Birmingham ALSecrets of Enterprise Data Mining: SQL Saturday 328 Birmingham AL
Secrets of Enterprise Data Mining: SQL Saturday 328 Birmingham AL
Mark Tabladillo
 
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411
Mark Tabladillo
 
Secrets of Enterprise Data Mining 201305
Secrets of Enterprise Data Mining 201305Secrets of Enterprise Data Mining 201305
Secrets of Enterprise Data Mining 201305
Mark Tabladillo
 
INFOGOV14 - Trusting Your KM & ECM Strategy to SharePoint
INFOGOV14 - Trusting Your KM & ECM Strategy to SharePointINFOGOV14 - Trusting Your KM & ECM Strategy to SharePoint
INFOGOV14 - Trusting Your KM & ECM Strategy to SharePoint
Jonathan Ralton
 
MongoDB Schema Design by Examples
MongoDB Schema Design by ExamplesMongoDB Schema Design by Examples
MongoDB Schema Design by Examples
Hadi Ariawan
 
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITYSEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
Amit Sheth
 
Use O365 and Azure Cognitive Services for intelligent search
Use O365 and Azure Cognitive Services for intelligent searchUse O365 and Azure Cognitive Services for intelligent search
Use O365 and Azure Cognitive Services for intelligent search
Jeff Fried
 
Wouldn't it be nice if... an introduction to Enterprise Data Mashups
Wouldn't it be nice if... an introduction to Enterprise Data MashupsWouldn't it be nice if... an introduction to Enterprise Data Mashups
Wouldn't it be nice if... an introduction to Enterprise Data Mashups
Justo Hidalgo
 
Using metadata repositories with search
Using metadata repositories with searchUsing metadata repositories with search
Using metadata repositories with search
Jean Graef
 
Getting Ready for Project Cortex and SharePoint Syntex
Getting Ready for Project Cortex and SharePoint SyntexGetting Ready for Project Cortex and SharePoint Syntex
Getting Ready for Project Cortex and SharePoint Syntex
Chris Bortlik
 
You Don't Know SEO
You Don't Know SEOYou Don't Know SEO
You Don't Know SEO
Michael King
 
The Next-Generation SharePoint: Powered by Text Analytics
The Next-Generation SharePoint: Powered by Text Analytics The Next-Generation SharePoint: Powered by Text Analytics
The Next-Generation SharePoint: Powered by Text Analytics Peter Wren-Hilton
 
The Next Generation SharePoint: Powered by Text Analytics
The Next Generation SharePoint: Powered by Text AnalyticsThe Next Generation SharePoint: Powered by Text Analytics
The Next Generation SharePoint: Powered by Text AnalyticsAlyona Medelyan
 
Structured Document Search and Retrieval
Structured Document Search and RetrievalStructured Document Search and Retrieval
Structured Document Search and Retrieval
Optum
 
Python for Data Science - TDC 2015
Python for Data Science - TDC 2015Python for Data Science - TDC 2015
Python for Data Science - TDC 2015
Gabriel Moreira
 
Getting Ready for Project Cortex and SharePoint Syntex
Getting Ready for Project Cortex and SharePoint SyntexGetting Ready for Project Cortex and SharePoint Syntex
Getting Ready for Project Cortex and SharePoint Syntex
Chris Bortlik
 

Similar to Applied Semantic Search with Microsoft SQL Server (20)

Applied Enterprise Semantic Search 201305
Applied Enterprise Semantic Search 201305Applied Enterprise Semantic Search 201305
Applied Enterprise Semantic Search 201305
 
Applied Semantic Search 201306
Applied Semantic Search 201306Applied Semantic Search 201306
Applied Semantic Search 201306
 
Applied Enterprise Semantic Mining -- Charlotte 201410
Applied Enterprise Semantic Mining -- Charlotte 201410Applied Enterprise Semantic Mining -- Charlotte 201410
Applied Enterprise Semantic Mining -- Charlotte 201410
 
Secrets of Enterprise Data Mining: SQL Saturday 328 Birmingham AL
Secrets of Enterprise Data Mining: SQL Saturday 328 Birmingham ALSecrets of Enterprise Data Mining: SQL Saturday 328 Birmingham AL
Secrets of Enterprise Data Mining: SQL Saturday 328 Birmingham AL
 
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411
Secrets of Enterprise Data Mining: SQL Saturday Oregon 201411
 
Secrets of Enterprise Data Mining 201305
Secrets of Enterprise Data Mining 201305Secrets of Enterprise Data Mining 201305
Secrets of Enterprise Data Mining 201305
 
Document repositories-and-metadata
Document repositories-and-metadataDocument repositories-and-metadata
Document repositories-and-metadata
 
INFOGOV14 - Trusting Your KM & ECM Strategy to SharePoint
INFOGOV14 - Trusting Your KM & ECM Strategy to SharePointINFOGOV14 - Trusting Your KM & ECM Strategy to SharePoint
INFOGOV14 - Trusting Your KM & ECM Strategy to SharePoint
 
MongoDB Schema Design by Examples
MongoDB Schema Design by ExamplesMongoDB Schema Design by Examples
MongoDB Schema Design by Examples
 
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITYSEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
SEMANTIC CONTENT MANAGEMENT FOR ENTERPRISES AND NATIONAL SECURITY
 
Use O365 and Azure Cognitive Services for intelligent search
Use O365 and Azure Cognitive Services for intelligent searchUse O365 and Azure Cognitive Services for intelligent search
Use O365 and Azure Cognitive Services for intelligent search
 
Wouldn't it be nice if... an introduction to Enterprise Data Mashups
Wouldn't it be nice if... an introduction to Enterprise Data MashupsWouldn't it be nice if... an introduction to Enterprise Data Mashups
Wouldn't it be nice if... an introduction to Enterprise Data Mashups
 
Using metadata repositories with search
Using metadata repositories with searchUsing metadata repositories with search
Using metadata repositories with search
 
Getting Ready for Project Cortex and SharePoint Syntex
Getting Ready for Project Cortex and SharePoint SyntexGetting Ready for Project Cortex and SharePoint Syntex
Getting Ready for Project Cortex and SharePoint Syntex
 
You Don't Know SEO
You Don't Know SEOYou Don't Know SEO
You Don't Know SEO
 
The Next-Generation SharePoint: Powered by Text Analytics
The Next-Generation SharePoint: Powered by Text Analytics The Next-Generation SharePoint: Powered by Text Analytics
The Next-Generation SharePoint: Powered by Text Analytics
 
The Next Generation SharePoint: Powered by Text Analytics
The Next Generation SharePoint: Powered by Text AnalyticsThe Next Generation SharePoint: Powered by Text Analytics
The Next Generation SharePoint: Powered by Text Analytics
 
Structured Document Search and Retrieval
Structured Document Search and RetrievalStructured Document Search and Retrieval
Structured Document Search and Retrieval
 
Python for Data Science - TDC 2015
Python for Data Science - TDC 2015Python for Data Science - TDC 2015
Python for Data Science - TDC 2015
 
Getting Ready for Project Cortex and SharePoint Syntex
Getting Ready for Project Cortex and SharePoint SyntexGetting Ready for Project Cortex and SharePoint Syntex
Getting Ready for Project Cortex and SharePoint Syntex
 

More from Mark Tabladillo

How to find low-cost or free data science resources 202006
How to find low-cost or free data science resources 202006How to find low-cost or free data science resources 202006
How to find low-cost or free data science resources 202006
Mark Tabladillo
 
Microsoft Build 2020: Data Science Recap
Microsoft Build 2020: Data Science RecapMicrosoft Build 2020: Data Science Recap
Microsoft Build 2020: Data Science Recap
Mark Tabladillo
 
201909 Automated ML for Developers
201909 Automated ML for Developers201909 Automated ML for Developers
201909 Automated ML for Developers
Mark Tabladillo
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated ML
Mark Tabladillo
 
201906 01 Introduction to ML.NET 1.0
201906 01 Introduction to ML.NET 1.0201906 01 Introduction to ML.NET 1.0
201906 01 Introduction to ML.NET 1.0
Mark Tabladillo
 
201906 04 Overview of Automated ML June 2019
201906 04 Overview of Automated ML June 2019201906 04 Overview of Automated ML June 2019
201906 04 Overview of Automated ML June 2019
Mark Tabladillo
 
201906 03 Introduction to NimbusML
201906 03 Introduction to NimbusML201906 03 Introduction to NimbusML
201906 03 Introduction to NimbusML
Mark Tabladillo
 
201906 02 Introduction to AutoML with ML.NET 1.0
201906 02 Introduction to AutoML with ML.NET 1.0201906 02 Introduction to AutoML with ML.NET 1.0
201906 02 Introduction to AutoML with ML.NET 1.0
Mark Tabladillo
 
201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine Learning201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine Learning
Mark Tabladillo
 
201905 Azure Certification DP-100: Designing and Implementing a Data Science ...
201905 Azure Certification DP-100: Designing and Implementing a Data Science ...201905 Azure Certification DP-100: Designing and Implementing a Data Science ...
201905 Azure Certification DP-100: Designing and Implementing a Data Science ...
Mark Tabladillo
 
Big Data Advanced Analytics on Microsoft Azure 201904
Big Data Advanced Analytics on Microsoft Azure 201904Big Data Advanced Analytics on Microsoft Azure 201904
Big Data Advanced Analytics on Microsoft Azure 201904
Mark Tabladillo
 
Managing Enterprise Data Science 201904
Managing Enterprise Data Science 201904Managing Enterprise Data Science 201904
Managing Enterprise Data Science 201904
Mark Tabladillo
 
Training of Python scikit-learn models on Azure
Training of Python scikit-learn models on AzureTraining of Python scikit-learn models on Azure
Training of Python scikit-learn models on Azure
Mark Tabladillo
 
Big Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureBig Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft Azure
Mark Tabladillo
 
Advanced Analytics with Power BI 201808
Advanced Analytics with Power BI 201808Advanced Analytics with Power BI 201808
Advanced Analytics with Power BI 201808
Mark Tabladillo
 
Microsoft Cognitive Toolkit (Atlanta Code Camp 2017)
Microsoft Cognitive Toolkit (Atlanta Code Camp 2017)Microsoft Cognitive Toolkit (Atlanta Code Camp 2017)
Microsoft Cognitive Toolkit (Atlanta Code Camp 2017)
Mark Tabladillo
 
Machine learning services with SQL Server 2017
Machine learning services with SQL Server 2017Machine learning services with SQL Server 2017
Machine learning services with SQL Server 2017
Mark Tabladillo
 
Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612
Mark Tabladillo
 
How Big Companies plan to use Our Big Data 201610
How Big Companies plan to use Our Big Data 201610How Big Companies plan to use Our Big Data 201610
How Big Companies plan to use Our Big Data 201610
Mark Tabladillo
 
Georgia Tech Data Science Hackathon September 2016
Georgia Tech Data Science Hackathon September 2016Georgia Tech Data Science Hackathon September 2016
Georgia Tech Data Science Hackathon September 2016
Mark Tabladillo
 

More from Mark Tabladillo (20)

How to find low-cost or free data science resources 202006
How to find low-cost or free data science resources 202006How to find low-cost or free data science resources 202006
How to find low-cost or free data science resources 202006
 
Microsoft Build 2020: Data Science Recap
Microsoft Build 2020: Data Science RecapMicrosoft Build 2020: Data Science Recap
Microsoft Build 2020: Data Science Recap
 
201909 Automated ML for Developers
201909 Automated ML for Developers201909 Automated ML for Developers
201909 Automated ML for Developers
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated ML
 
201906 01 Introduction to ML.NET 1.0
201906 01 Introduction to ML.NET 1.0201906 01 Introduction to ML.NET 1.0
201906 01 Introduction to ML.NET 1.0
 
201906 04 Overview of Automated ML June 2019
201906 04 Overview of Automated ML June 2019201906 04 Overview of Automated ML June 2019
201906 04 Overview of Automated ML June 2019
 
201906 03 Introduction to NimbusML
201906 03 Introduction to NimbusML201906 03 Introduction to NimbusML
201906 03 Introduction to NimbusML
 
201906 02 Introduction to AutoML with ML.NET 1.0
201906 02 Introduction to AutoML with ML.NET 1.0201906 02 Introduction to AutoML with ML.NET 1.0
201906 02 Introduction to AutoML with ML.NET 1.0
 
201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine Learning201905 Azure Databricks for Machine Learning
201905 Azure Databricks for Machine Learning
 
201905 Azure Certification DP-100: Designing and Implementing a Data Science ...
201905 Azure Certification DP-100: Designing and Implementing a Data Science ...201905 Azure Certification DP-100: Designing and Implementing a Data Science ...
201905 Azure Certification DP-100: Designing and Implementing a Data Science ...
 
Big Data Advanced Analytics on Microsoft Azure 201904
Big Data Advanced Analytics on Microsoft Azure 201904Big Data Advanced Analytics on Microsoft Azure 201904
Big Data Advanced Analytics on Microsoft Azure 201904
 
Managing Enterprise Data Science 201904
Managing Enterprise Data Science 201904Managing Enterprise Data Science 201904
Managing Enterprise Data Science 201904
 
Training of Python scikit-learn models on Azure
Training of Python scikit-learn models on AzureTraining of Python scikit-learn models on Azure
Training of Python scikit-learn models on Azure
 
Big Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureBig Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft Azure
 
Advanced Analytics with Power BI 201808
Advanced Analytics with Power BI 201808Advanced Analytics with Power BI 201808
Advanced Analytics with Power BI 201808
 
Microsoft Cognitive Toolkit (Atlanta Code Camp 2017)
Microsoft Cognitive Toolkit (Atlanta Code Camp 2017)Microsoft Cognitive Toolkit (Atlanta Code Camp 2017)
Microsoft Cognitive Toolkit (Atlanta Code Camp 2017)
 
Machine learning services with SQL Server 2017
Machine learning services with SQL Server 2017Machine learning services with SQL Server 2017
Machine learning services with SQL Server 2017
 
Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612Microsoft Technologies for Data Science 201612
Microsoft Technologies for Data Science 201612
 
How Big Companies plan to use Our Big Data 201610
How Big Companies plan to use Our Big Data 201610How Big Companies plan to use Our Big Data 201610
How Big Companies plan to use Our Big Data 201610
 
Georgia Tech Data Science Hackathon September 2016
Georgia Tech Data Science Hackathon September 2016Georgia Tech Data Science Hackathon September 2016
Georgia Tech Data Science Hackathon September 2016
 

Recently uploaded

5 Things You Need To Know Before Hiring a Videographer
5 Things You Need To Know Before Hiring a Videographer5 Things You Need To Know Before Hiring a Videographer
5 Things You Need To Know Before Hiring a Videographer
ofm712785
 
Affordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n PrintAffordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n Print
Navpack & Print
 
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdfMeas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
dylandmeas
 
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBdCree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
creerey
 
India Orthopedic Devices Market: Unlocking Growth Secrets, Trends and Develop...
India Orthopedic Devices Market: Unlocking Growth Secrets, Trends and Develop...India Orthopedic Devices Market: Unlocking Growth Secrets, Trends and Develop...
India Orthopedic Devices Market: Unlocking Growth Secrets, Trends and Develop...
Kumar Satyam
 
Buy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star ReviewsBuy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star Reviews
usawebmarket
 
Business Valuation Principles for Entrepreneurs
Business Valuation Principles for EntrepreneursBusiness Valuation Principles for Entrepreneurs
Business Valuation Principles for Entrepreneurs
Ben Wann
 
Unveiling the Secrets How Does Generative AI Work.pdf
Unveiling the Secrets How Does Generative AI Work.pdfUnveiling the Secrets How Does Generative AI Work.pdf
Unveiling the Secrets How Does Generative AI Work.pdf
Sam H
 
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
taqyed
 
Enterprise Excellence is Inclusive Excellence.pdf
Enterprise Excellence is Inclusive Excellence.pdfEnterprise Excellence is Inclusive Excellence.pdf
Enterprise Excellence is Inclusive Excellence.pdf
KaiNexus
 
Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...
dylandmeas
 
Brand Analysis for an artist named Struan
Brand Analysis for an artist named StruanBrand Analysis for an artist named Struan
Brand Analysis for an artist named Struan
sarahvanessa51503
 
anas about venice for grade 6f about venice
anas about venice for grade 6f about veniceanas about venice for grade 6f about venice
anas about venice for grade 6f about venice
anasabutalha2013
 
chapter 10 - excise tax of transfer and business taxation
chapter 10 - excise tax of transfer and business taxationchapter 10 - excise tax of transfer and business taxation
chapter 10 - excise tax of transfer and business taxation
AUDIJEAngelo
 
Digital Transformation in PLM - WHAT and HOW - for distribution.pdf
Digital Transformation in PLM - WHAT and HOW - for distribution.pdfDigital Transformation in PLM - WHAT and HOW - for distribution.pdf
Digital Transformation in PLM - WHAT and HOW - for distribution.pdf
Jos Voskuil
 
April 2024 Nostalgia Products Newsletter
April 2024 Nostalgia Products NewsletterApril 2024 Nostalgia Products Newsletter
April 2024 Nostalgia Products Newsletter
NathanBaughman3
 
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
BBPMedia1
 
Premium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern BusinessesPremium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern Businesses
SynapseIndia
 
Exploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social DreamingExploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social Dreaming
Nicola Wreford-Howard
 
The Parable of the Pipeline a book every new businessman or business student ...
The Parable of the Pipeline a book every new businessman or business student ...The Parable of the Pipeline a book every new businessman or business student ...
The Parable of the Pipeline a book every new businessman or business student ...
awaisafdar
 

Recently uploaded (20)

5 Things You Need To Know Before Hiring a Videographer
5 Things You Need To Know Before Hiring a Videographer5 Things You Need To Know Before Hiring a Videographer
5 Things You Need To Know Before Hiring a Videographer
 
Affordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n PrintAffordable Stationery Printing Services in Jaipur | Navpack n Print
Affordable Stationery Printing Services in Jaipur | Navpack n Print
 
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdfMeas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
 
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBdCree_Rey_BrandIdentityKit.PDF_PersonalBd
Cree_Rey_BrandIdentityKit.PDF_PersonalBd
 
India Orthopedic Devices Market: Unlocking Growth Secrets, Trends and Develop...
India Orthopedic Devices Market: Unlocking Growth Secrets, Trends and Develop...India Orthopedic Devices Market: Unlocking Growth Secrets, Trends and Develop...
India Orthopedic Devices Market: Unlocking Growth Secrets, Trends and Develop...
 
Buy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star ReviewsBuy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star Reviews
 
Business Valuation Principles for Entrepreneurs
Business Valuation Principles for EntrepreneursBusiness Valuation Principles for Entrepreneurs
Business Valuation Principles for Entrepreneurs
 
Unveiling the Secrets How Does Generative AI Work.pdf
Unveiling the Secrets How Does Generative AI Work.pdfUnveiling the Secrets How Does Generative AI Work.pdf
Unveiling the Secrets How Does Generative AI Work.pdf
 
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
一比一原版加拿大渥太华大学毕业证(uottawa毕业证书)如何办理
 
Enterprise Excellence is Inclusive Excellence.pdf
Enterprise Excellence is Inclusive Excellence.pdfEnterprise Excellence is Inclusive Excellence.pdf
Enterprise Excellence is Inclusive Excellence.pdf
 
Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...
 
Brand Analysis for an artist named Struan
Brand Analysis for an artist named StruanBrand Analysis for an artist named Struan
Brand Analysis for an artist named Struan
 
anas about venice for grade 6f about venice
anas about venice for grade 6f about veniceanas about venice for grade 6f about venice
anas about venice for grade 6f about venice
 
chapter 10 - excise tax of transfer and business taxation
chapter 10 - excise tax of transfer and business taxationchapter 10 - excise tax of transfer and business taxation
chapter 10 - excise tax of transfer and business taxation
 
Digital Transformation in PLM - WHAT and HOW - for distribution.pdf
Digital Transformation in PLM - WHAT and HOW - for distribution.pdfDigital Transformation in PLM - WHAT and HOW - for distribution.pdf
Digital Transformation in PLM - WHAT and HOW - for distribution.pdf
 
April 2024 Nostalgia Products Newsletter
April 2024 Nostalgia Products NewsletterApril 2024 Nostalgia Products Newsletter
April 2024 Nostalgia Products Newsletter
 
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
RMD24 | Debunking the non-endemic revenue myth Marvin Vacquier Droop | First ...
 
Premium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern BusinessesPremium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern Businesses
 
Exploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social DreamingExploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social Dreaming
 
The Parable of the Pipeline a book every new businessman or business student ...
The Parable of the Pipeline a book every new businessman or business student ...The Parable of the Pipeline a book every new businessman or business student ...
The Parable of the Pipeline a book every new businessman or business student ...
 

Applied Semantic Search with Microsoft SQL Server

  • 1. Applied Enterprise Semantic Mining Mark Tabladillo, Ph.D. (MVP, MCAD .NET, MCITP, MCT) PASS SQL Saturday #198 Vancouver BC February 16, 2013
  • 2. Photos © 2013 Mark Tabladillo, All Rights Reserved
  • 3. Photos © 2013 Mark Tabladillo, All Rights Reserved
  • 5. About MarkTab Training and Consulting with Ph.D. – Industrial Engineering, http://marktab.com Georgia Tech Data Mining Resources and Blog at Training and consulting http://marktab.net internationally across many industries – SAS and Microsoft Contributed to peer-reviewed research and legislation Mentoring doctoral dissertations at the accredited University of Phoenix Presenter
  • 7. Interactive Name three things you want from enterprise text mining
  • 8. Introduction SQL Server 2012 has new Programmability Enhancements Statistical Semantic Search File Tables Full-Text Search Improvements These combined technologies make SQL Server 2012 a strong contender in text mining
  • 9. Outline Why Microsoft is competitive for data mining Definitions: what is text mining? History: how Microsoft’s semantic search was born What is inside semantic search Logical model Demos Performance Microsoft Resources
  • 10. Why Microsoft is Competitive for Data Mining Based on 2012 and 2013 Surveys
  • 11. Gartner 2013 Magic Quadrant for Business Intelligence and Analytics Platforms Retrieved from http://www.gartner.com/technology/reprints.do?id=1-1DZLPEH&ct=130207&st=sb – February 5, 2013
  • 12. Gartner 2013 Magic Quadrant for Data Warehouse Database Management Systems Retrieved from http://www.gartner.com/technology/reprints.do?id=1-1DU2VD4&ct=130131&st=sb – January 31, 2013
  • 15. Definition Data mining is the automated or semi-automated process of discovering patterns in data Text mining is the automated or semi-automated process of discovering patterns from textual data Machine learning is the development and optimization of algorithms for automated or semi-automated pattern discovery
  • 16. Purposes Phrase Goal “Data Mining” Inform actionable decisions “Text Mining” “Machine Determine best performing Learning” algorithm
  • 17. MarkTab Decision Cycle GO Synthesis Analysis (art) (science) Science needs science fiction -- MarkTab
  • 18. MarkTab Decision Cycle GO Synthesis Analysis (art) (science)
  • 20. History July 2008 Microsoft purchases Powerset for US$100 Million Google Dismisses Semantic Search http://venturebeat.com/2008/06/26/microsoft-to-buy-semantic-search-engine- powerset-for-100m-plus/ http://www.forbes.com/2008/07/01/powerset-msft-search-tech-intel- cx_ag_0701powerset.html
  • 21. History March 2009 Google announces “snippets” as relevant to search The media picks this story up as “semantic search” http://googleblog.blogspot.com/2009/03/two-new-improvements-to-google- results.html#!/2009/03/two-new-improvements-to-google-results.html
  • 22. History February 2012 Google announces Knowledge Graph, an explicit application of semantic search http://mashable.com/2012/02/13/google-knowledge-graph-change-search/
  • 23. History April 2012 Microsoft purchases 800+ patents from AOL for US$1 Billion Among the patents are semantic search and metadata querying – older than Google http://www.theregister.co.uk/2012/04/09/aol_microsoft_patent_deal/
  • 24. What is inside Semantic Search Text Mining introduced for SQL Server 2012
  • 25. Future: Most data is Text Two Research Types • Quantitative research = data mining • Qualitative research = text mining The future is combining both
  • 26. Statistical Semantic Search Comprises some aspects of text mining Identifies statistically relevant key phrases Based on these phrases, can identify (by score) similar documents
  • 27. FileTables Built on existing SQL Server FILESTREAM technology Files and documents Stored in special tables in SQL Server Accessed if they were stored in the file system
  • 28. Full-Text Search Enhancements Property search: search on tagged properties (such as author or title) Customizable NEAR: find words or phrases close to one another New Word Breakers and Stemmers (for many languages)
  • 30. From Documents to Output Office Varchar PDF NVarchar Rowset Output with Scores
  • 31. (iFilter Required) iFilters Full-Text Documents Keyword Index “FTI” Semantic Key Phrase Semantic Index – Semantic Document Database Tag Index Similarity Index “DSI” “TI”
  • 32. Languages Currently Supported Traditional Chinese Simplified Chinese German British English English Portuguese French Chinese (Hong Kong SAR, PRC) Italian Spanish Brazilian Chinese (Singapore) Russian Chinese (Macau SAR) Swedish
  • 33. Phases of Semantic Indexing Full Text Keyword Index “FTI” Semantic Document Similarity Index “DSI” Semantic Key Phrase Index – Tag Index “TI” http://msdn.microsoft.com/en-us/library/gg492085.aspx#SemanticIndexing
  • 34. Interactive Demo SQL Server Management Studio
  • 35. Semantic Search and SQL Server Data Mining SQL Server Data Tools: data mining plus text mining
  • 37. Integrated Full Text Search (iFTS) Improved Performance and Scale: Scale-up to 350M documents for storage and search iFTS query performance 7-10 times faster than in SQL Server 2008 Worst-case iFTS query response times less than 3 sec for corpus Similar or better than main database search competitors (2012, Michael Rys, Microsoft)
  • 38. Linear Scale of FTI/TI/DSI First known linearly scaling end-to-end Search and Semantic product in the industry Time in Seconds vs. Number of Documents (2011 – K. Mukerjee, T. Porter, S. Gherman – Microsoft)
  • 39. Text Mining References Video http://channel9.msdn.com/Shows/DataBound/DataBound-Episode-2-Semantic- Search http://www.microsoftpdc.com/2009/SVR32 Semantic Search (Books Online) – explains the demo http://msdn.microsoft.com/en-us/library/gg492075.aspx Paper http://users.cis.fiu.edu/~lzhen001/activities/KDD2011Program/docs/p213.pdf
  • 41. Software SQL Server 2012 Enterprise (includes database engine, Analysis Services, SSMS and SSDT) http://www.microsoft.com/sqlserver/en/us/get-sql-server/try-it.aspx Microsoft Office 2012 Professional http://office.microsoft.com/en-us/try
  • 42. Organizations Professional Association for SQL Server http://www.sqlpass.org Atlanta MDF http://www.atlantamdf.com/ Atlanta Microsoft BI Users Group http://www.meetup.com/Atlanta-Microsoft- Business-Intelligence-Users/ PASS Business Analytics Conference http://www.passbaconference.com Microsoft TechEd North America http://northamerica.msteched.com/
  • 44. Conclusion SQL Server Data Mining 2012 provides data mining and semantic search The core technology allows document similarity matching The results can be combined with SQL Server Data Mining (such as Association Analysis)
  • 45. Connect Data Mining Resources and blog http://marktab.net Data Mining Training and Consulting (especially Microsoft and SAS) http://marktab.com