Smarter SharePoint KC User Group FAST Presentation, March 2015
1. FAST Search for SharePoint 2010
March 2015
Kyle Bodenstab
MCITP SharePoint 2010
Database Administrator
Jack Henry & Associates
2. SharePoint 2010 Search Products
• SharePoint Foundation
• SharePoint Server 2010
• FAST Search Server 2010 for SharePoint
3. SharePoint 2010 Search Products
• SharePoint Foundation
− SharePoint site search within a single farm
4. SharePoint 2010 Search Products
• SharePoint Server 2010
− All the features of Foundation, plus
− Shallow refinement
− Taxonomy tags
− Crawl external farms, Windows File Shares, Exchange Public
Folders, LOB apps, structured content in DBs, etc.
− 100 million item index capability
5. SharePoint 2010 Search Products
• FAST Search Server 2010 for SharePoint
− All the features of Foundation and Server, plus
− Contextual search
− Deep refinement
− Thumbnails, Previews, and Visual Search
− Advanced linguistics
− Social tags and people search
− Document promotion/demotion
− Search driven applications
− 500 million to 1 billion item index capable
6. What does this mean to users?
• FAST can:
− Deliver results that are relevant
− Search in the language of the business
− Tune results to improve accuracy
− Provide a single platform for indexing and presenting all
content in the enterprise, not just SharePoint content
8. FAST Terms
• Metadata
− Is essential to the success of SharePoint search whether it
be FAST or Server
− Manual metadata is unreliable and costly
− Poor metadata leads to poor findability
9. FAST Terms
• FAST Content Processing
− Is designed as a pipeline that performs:
− Format conversion
− Language encoding and detection
− Tokenization
− Lemmatization
− Property extraction
− Vectorization
− Date/Time Normalization
− Custom processing
− Property mapping
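As a rough illustration of the pipeline idea above, the sketch below chains two of the listed stages (tokenization and lemmatization) in Python. It is not FAST's implementation: the `Document` dict, the stage functions, and the tiny `LEMMAS` table are all hypothetical stand-ins chosen to show how each stage consumes the output of the previous one.

```python
import re
from typing import Callable, Dict, List

# A document is modelled as a plain dict that each stage enriches (assumption).
Document = Dict[str, object]
Stage = Callable[[Document], Document]

# Hypothetical lookup table standing in for real linguistic dictionaries.
LEMMAS = {"mice": "mouse", "servers": "server", "indexes": "index"}


def tokenize(doc: Document) -> Document:
    # Lowercase the body and split on runs of non-word characters.
    doc["tokens"] = [t for t in re.split(r"\W+", str(doc["body"]).lower()) if t]
    return doc


def lemmatize(doc: Document) -> Document:
    # Map each token to its canonical form when the toy table knows it.
    doc["lemmas"] = [LEMMAS.get(t, t) for t in doc["tokens"]]
    return doc


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    # Apply the stages in order; each one sees the output of the previous one.
    for stage in stages:
        doc = stage(doc)
    return doc


result = run_pipeline({"body": "Mice chewed the servers."}, [tokenize, lemmatize])
print(result["lemmas"])
```

Modelling stages as plain functions keeps the order explicit, so a custom processing step would simply be one more function in the list.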
10. FAST Terms
• FAST Content Extraction
− Recognize and deliver entities from unstructured content
such as:
− People, companies, locations (shallow refiners)
− Modified date, result type, language (deep refiners)
− Dictionaries (custom deep refiners):
− Business and industry specific concepts
− Customer names, competitor names
− Employee titles and expertise
− Product names
− Project names
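To make dictionary-based extraction concrete, here is a minimal Python sketch that scans unstructured text for terms from custom dictionaries and groups the hits by refiner name. The dictionary contents, the refiner names, and the `extract_entities` helper are invented for illustration; FAST's own extractors are configured differently and are not shown here.

```python
import re
from typing import Dict, List

# Hypothetical custom dictionaries: refiner name -> known terms.
DICTIONARIES: Dict[str, List[str]] = {
    "Product": ["Contoso Ledger", "Contoso Vault"],
    "Project": ["Phoenix"],
}


def extract_entities(text: str, dictionaries: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return the dictionary terms found in text, grouped by refiner name."""
    found: Dict[str, List[str]] = {}
    for refiner, terms in dictionaries.items():
        for term in terms:
            # Whole-word, case-insensitive match of the dictionary term.
            if re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE):
                found.setdefault(refiner, []).append(term)
    return found


text = "Project Phoenix will migrate Contoso Ledger data next quarter."
entities = extract_entities(text, DICTIONARIES)
print(entities)
```

The extracted values could then be offered as refiners, so a user can narrow results to a given product or project.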
11. FAST Terms
• FAST Content Processing term definitions
− Language encoding and detection – identifies the encoding and language of the content so that the
appropriate dictionaries can be applied downstream
− Tokenization – breaks text into tokens according to rules for punctuation, diacritics, accents,
compound words, phrases, etc.
− Lemmatization – applies linguistic normalization to content so that user queries match
documents that contain words and phrases in either canonical or inflected forms
(singular/plural, masculine/feminine), e.g., a query for mice would also find mouse.
− Property extraction – recognizes entities such as companies, people, and locations within
content
− Vectorization – creates document vectors that weight phrases and terms by their
frequency of occurrence; this supports "find documents similar to this one" on a result
− Date/Time Normalization – converts dates and times to a standard representation, e.g., 24-Mar-11
is treated the same as March 24, 2011
− Custom processing – extends content processing with custom dictionaries
− Property mapping – maps the metadata discovered in the pipeline to the index's
managed properties
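Two of the definitions above lend themselves to a short worked example: date/time normalization and vectorization. The Python sketch below is an assumption-laden illustration, not FAST's algorithm: the accepted date formats, the raw term-frequency weighting, and the cosine similarity measure are all choices made here for clarity.

```python
import math
from collections import Counter
from datetime import date, datetime
from typing import Optional

# Assumed input formats; a real pipeline would recognize many more.
# Month names are parsed per the current locale (English assumed).
DATE_FORMATS = ["%d-%b-%y", "%B %d, %Y"]


def normalize_date(text: str) -> Optional[date]:
    """Convert a date string in any known format to one standard value."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None  # Unrecognized format.


def term_vector(text: str) -> Counter:
    """Weight each term by its frequency of occurrence in the text."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Similarity of two term vectors: 0.0 (no shared terms) up to 1.0."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


print(normalize_date("24-Mar-11") == normalize_date("March 24, 2011"))
doc_a = term_vector("fast search server for sharepoint")
doc_b = term_vector("sharepoint search server")
doc_c = term_vector("quarterly budget report")
print(cosine_similarity(doc_a, doc_b) > cosine_similarity(doc_a, doc_c))
```

Ranking other documents by their similarity to the one a user selected is one plausible way to back a "find similar" link.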
12. SharePoint Search
• Default Ranking
− URL Depth – Higher ranking based on shorter URL.
− Doc Rank – Higher ranking based on the number and
relative importance of links pointing to an item.
− Site Rank – Higher ranking based on the number and
relative importance of links pointing to the items on a site.
− HW Boost – Placeholder used for generic usage of static
rank points
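The URL Depth factor above can be pictured with a small Python sketch that counts path segments and awards more points to shallower URLs. The `url_depth_score` formula and the `max_points` parameter are invented for illustration; the actual static rank weights are not given in this deck.

```python
from urllib.parse import urlparse


def url_depth(url: str) -> int:
    """Count the non-empty path segments of a URL."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])


def url_depth_score(url: str, max_points: float = 10.0) -> float:
    """Hypothetical scoring: the shorter the URL path, the higher the score."""
    return max_points / (1 + url_depth(url))


shallow = "http://intranet/sites/hr"
deep = "http://intranet/sites/hr/policies/2015/leave.docx"
print(url_depth(shallow), url_depth(deep))
print(url_depth_score(shallow) > url_depth_score(deep))
```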
13. Search Results
• Dynamic Ranking
− Freshness – Higher ranking for newer content. Content just
added is given more points than older content.
− Context – Higher ranking based on the search word hits in the content.
− Proximity – Higher ranking based on a short distance between query
terms in the content.
− Managed Property – Higher ranking based on content of a specific
item type defined by a managed property.
− Authority – Higher ranking when the query terms are included in the
link text.
− Query Authority (Click-through) – Higher ranking when query terms
are associated with previous query results and clicked search results.
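The Proximity factor above depends on how close the query terms sit to each other in the content. As a hedged illustration, the Python sketch below finds the smallest distance, in token positions, between two query terms. The `min_term_distance` helper is hypothetical and only shows the kind of measurement a proximity boost could be built on.

```python
from typing import List, Optional


def min_term_distance(tokens: List[str], term_a: str, term_b: str) -> Optional[int]:
    """Smallest gap in token positions between term_a and term_b, or None."""
    last_a: Optional[int] = None
    last_b: Optional[int] = None
    best: Optional[int] = None
    for position, token in enumerate(tokens):
        # Remember the most recent position of each query term.
        if token == term_a:
            last_a = position
        if token == term_b:
            last_b = position
        # Once both have been seen, check whether this pair is the closest yet.
        if last_a is not None and last_b is not None:
            distance = abs(last_a - last_b)
            if best is None or distance < best:
                best = distance
    return best


tokens = "fast search server for sharepoint search".split()
print(min_term_distance(tokens, "fast", "sharepoint"))
print(min_term_distance(tokens, "search", "sharepoint"))
```

A smaller distance would translate into more dynamic rank points, so a document containing the query terms side by side would outrank one where they are far apart.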
14. How Do Users Find Content?
• Site Structure
• Library Structure
• SharePoint Search
16. Site Structure
• Plan your site collection and sub-sites
• Consider splitting off projects to their own sites
• Keep things clean!
17. Library Structure
• Plan your libraries
• Consider using multiple shallow libraries vs a single deep
library
• Plan and use metadata tagging
• Keep things clean!
18. SharePoint 2013 Enterprise Search
• New search capabilities in SP2013:
− Single search result center
− Search user interface improvements
− Hover preview of document results
− Results grouped by type – document, people, sites, etc.
− Results block of similar content
− Accurate query suggestions
− Relevance improvements
− New ranking models
− Query rules
− Changes in crawling
− Continuous crawls
− Results removal from crawl logs
19. SharePoint 2013 Enterprise Search
• New search capabilities in SP2013 (continued):
− Discovering structure and entities in unstructured content
− Configure the crawler to look for entities such as product names
within the body or title of content.
− Create custom dictionaries as an entity
− Removal of redundant information – menu, headers, boilerplate
content
− More flexible search schema
− Refinable and sortable managed properties
− Multiple search schemas
− Search health reports
20. SharePoint 2013 Enterprise Search
• New search capabilities in SP2013 (continued):
− New search architecture
21. Questions
Contact information:
Kyle Bodenstab, MCITP
kbodenstab@jackhenry.com
LinkedIn - www.linkedin.com/in/KyleBodenstab
Twitter - @jackson_curve
For the lighter side of life – jacksoncurve.blogspot.com