I included this slide to give some background on the backend changes, but for this presentation I'm going to focus on the search administration component.
The biggest thing to take away from this presentation is that Microsoft has divided search administration between Search farm admins and site collection admins. Going forward there is a parent-child relationship in search, with the ability to manage search at the site and site collection level.

Today we are going to touch on what's new in the following areas:
1. Connectors - not many changes, but a few
2. Crawling and content sources
3. Result sources, which replace federated sources and scopes
4. Document parsing
5. Entity extraction
6. Schema management - the emphasis of this presentation

Schema management is, in my opinion, where the bulk of SharePoint 2013 search customization will take place - more by site collection admins now than by search administrators. Microsoft is making an effort to compartmentalize as many services in SharePoint as possible, possibly to distribute the workload, and maybe even to let companies build an entire organizational structure around SharePoint roles.
For the most part, connectors have not seen any major improvements. Most of these were available in 2010, except the Documentum connector, which I believe was added in SP1. The Documentum connector enables SharePoint to index content stored in the EMC Documentum system. The BDC connector was used to build the other connectors.
Continuous crawl only applies to SharePoint content sources. It runs every 15 minutes by default, and you can change the frequency using a PowerShell command. Documents can now appear in search results in seconds; previously it took 5 to 10 minutes before an item could be crawled. The 15-minute interval is the default out of the box.
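As a sketch, changing that interval looks like this in the SharePoint 2013 Management Shell (the 5-minute value is just an example; run it on a farm server):

```powershell
# Load the SharePoint snap-in if running outside the Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Get the Search service application and set the continuous crawl interval (in minutes).
$ssa = Get-SPEnterpriseSearchServiceApplication
$ssa.SetProperty("ContinuousCrawlInterval", 5)   # default is 15
```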
Remote SharePoint Index: if you have multiple farms, you previously had to pick one as the default crawl index. Now you can use the index in the farm where the content lives without crawling all farms (which chews up network bandwidth), and each farm can have its own enterprise search. Remote SharePoint result sources don't need Kerberos; they use a server-to-server (OAuth) trust between the two farms and the search application, and credentials are passed along - no more double hop. Results are still security trimmed.
No more going to the farm admin to create federated searches - a search admin is no longer required for this. Exchange is now a data source; a site collection admin can configure it, but only for users' own mailboxes, and results are still security trimmed. You can also apply query transformations to a result source - for example, that the author equals a certain value, or that the content type is a certain value. The transformation is applied in ADDITION to whatever query you submit.
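As a sketch (the author and content type values here are made up), a result source's query transform is a KQL template where {searchTerms} stands in for whatever the user typed, with fixed restrictions appended:

```
{searchTerms} author:"Ann Beebe" ContentType:"Sales Report"
```

A user query of "quarterly revenue" against this result source would effectively run `quarterly revenue author:"Ann Beebe" ContentType:"Sales Report"`.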
Demo:
- Show enabling continuous crawls.
- Show crawling a different site (Microsoft.com); crawl rules allow anonymous access.
- Show result sources for crawling an external SharePoint farm.
- For crawling local SharePoint sites the authentication options are still the same, but if you select Remote SharePoint they change: Default authentication, or an SSO ID for single sign-on, which passes a SAML token to the other farm.
Deep link extraction for Word and PowerPoint creates links in the search results that go deep into the document itself - for example, to a specific slide or a specific section of the document.

Content authors don't change much of the metadata properties. For example, in 2010, when you searched for PowerPoint decks the results would show the title as "Slide 1", or a Word document would show "Table of Contents". (This can be mitigated in 2010 by turning off EnableOptimisticTitleOverride in the registry.) Visual metadata extraction uses the same visual cues as a user to determine the title: it looks at things like font size, whether the text is centered, and where that text is located. You get much better metadata results this way.

Search does this by using high-performance format handlers for HTML, DOCX, PPTX, TXT, image, XML, and other formats. IFilters from previous versions are still supported.
You can configure the search system to look for "entities" in unstructured content, such as in the body text or the title of a document. These entities can be words or phrases, such as product names. To specify which entities to look for, you can create and deploy your own dictionaries.

The extracted entities are stored in the search index as separate managed properties, which are automatically configured to be searchable, queryable, retrievable, sortable, and refinable. You can use those properties in search refiners, for example, to help users filter their search results. For companies, you can use the pre-populated company extraction dictionary that SharePoint 2013 provides.

In addition, you can deploy several types of custom entity extractors in the form of custom entity extraction dictionaries. You deploy these dictionaries using Windows PowerShell. The entries in these dictionaries (single or multiple words) will be matched on words or parts of words in the content, in a case-sensitive or case-insensitive way. For more information, see Create and deploy custom entity extractors in SharePoint 2013.

In earlier versions this required a lot of involvement from the search service admins; it has now all moved into the term store.
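Deploying one of these dictionaries can be sketched like this (the file path is an example; the dictionary name must be one of the fixed names the cmdlet accepts, here the case-insensitive whole-word one):

```powershell
# Load the SharePoint snap-in if running outside the Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Import a custom word extraction dictionary (a CSV file of entries to match).
$ssa = Get-SPEnterpriseSearchServiceApplication
Import-SPEnterpriseSearchCustomExtractionDictionary -SearchApplication $ssa `
    -FileName "\\fileshare\dictionaries\products.csv" `
    -DictionaryName Microsoft.UserDictionaries.EntityExtraction.Custom.Word.1
```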
Full crawls are now managed at the site collection level. Site admins can request a full crawl of just a site collection or even a list, and the crawl they request is scoped to only that site collection or list.
Step 1 – Create a site column of type Choice named Application Type.
Step 2 – Go to Site Collection Settings > Search Schema and search for RefinableString00.
Step 3 – Edit it and add the alias ApplicationType - NO spaces.
Step 4 – Click on Add a Mapping.
Step 5 – Search for ows_application%20type.
Step 6 – Refiners show up after a full crawl, but you don't need a full crawl: go to Document Library Settings > Advanced Settings > Reindex.
Step 7 – Applause…
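The same crawled-to-managed property mapping can also be done farm-wide in PowerShell - a sketch only; the crawled property name below is an assumption based on the "Application Type" column, so use the exact name shown in your search schema:

```powershell
# Load the SharePoint snap-in if running outside the Management Shell.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication

# The reusable RefinableString00 managed property and the SharePoint crawled-property category.
$mp  = Get-SPEnterpriseSearchMetadataManagedProperty -SearchApplication $ssa -Identity RefinableString00
$cat = Get-SPEnterpriseSearchMetadataCategory -SearchApplication $ssa -Identity SharePoint

# Find the crawled property for the "Application Type" site column (name is an assumption).
$cp  = Get-SPEnterpriseSearchMetadataCrawledProperty -SearchApplication $ssa -Category $cat `
    -Name "ows_Application_x0020_Type"

# Map the crawled property to the managed property.
New-SPEnterpriseSearchMetadataMapping -SearchApplication $ssa -ManagedProperty $mp -CrawledProperty $cp
```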
Look at ranking models and what to do with them. Query rules are a significant investment for IT pros and end users.
A custom Search Results Web Part is used to apply a custom ranking model.
The term store is easier to customize.
The hope is that you reverse engineer the out-of-the-box query rules and create your own.