CREATE SEARCH DRIVEN BUSINESS INTELLIGENCE APPLICATION USING FAST SEARCH FOR SHAREPOINT


Understand the Importance of Search Based Applications in today’s enterprise and how to integrate Business Intelligence and Search for business benefit.

Role of Microsoft FAST Search in an enterprise for building a Search based Business Intelligence Application.

Demonstration of a FAST Search based BI application.


Transcript

  • 1. CREATE SEARCH DRIVEN BUSINESS INTELLIGENCE APPLICATION USING FAST SEARCH FOR SHAREPOINT • Pankaj Bose • Niraj Tenany
  • 2. AGENDA • Session Overview • Presenters' Bio • Introduction to Netwoven • Industry Facts (Business Intelligence and Search) • Business Intelligence Challenges • Benefits of integrating Business Intelligence and Search • Search Market • FAST Search features and functions • Steps To Build Search Centric Application • Demo • Wrap up and Next Steps
  • 3. SESSION OVERVIEW • Understand the Importance of Search Based Applications in today’s enterprise and how to integrate Business Intelligence and Search for business benefit • Role of Microsoft FAST Search in an enterprise for building a Search based Business Intelligence Application • Demonstration of a FAST Search based BI application
  • 4. NIRAJ TENANY – PRESIDENT AND CEO, NETWOVEN, INC. • Based in USA • Formerly Microsoft Consulting Services Head of Enterprise Applications Practice • Frequent speaker in Enterprise Content Management and Search events • Works with Fortune 1000 companies to define and implement ECM, BI, and Search strategies
  • 5. PANKAJ BOSE – ECM AND SEARCH PRACTICE HEAD, NETWOVEN, INC. • Based in India • Architect and implementer of large scale Enterprise Content Management and Search based applications for large and medium sized companies • Formerly, Architect at Lockheed Martin Corp in USA as Technical Lead for ECM and Search implementations • Extensive experience with different ECM and Search platforms
  • 6. NETWOVEN BACKGROUND • Founded in 2001 by former Microsoft executives • Top talent from industry • Firm leadership comprised of Microsoft, Accenture, Oracle and Intel talent • Former senior executives of Wipro, Infosys, McKinsey on our board • US headquartered company with development center in India • Save the Children
  • 7. NETWOVEN SERVICES • Industry Verticals: Life Sciences, Financial Services, Energy, Manufacturing, Not For Profit, Software • Netwoven Technology Services • Enterprise Content Management • Business Intelligence • Business Process Management • Netwoven Solution Practices
  • 8. NETWOVEN SHAREPOINT SERVICES (Solution Area: Description)
    • Out-Tasking Your SharePoint 2010: SharePoint managed services with L1, L2 and L3 support
    • Upgrading to SharePoint 2010: Upgrade intranet, extranet or internet sites
    • Social Networking with SharePoint 2010: Build communities with SharePoint
    • Document Management with SharePoint 2010: Develop document management systems on SharePoint or migrate them to it
    • Business Intelligence with SharePoint 2010: Develop reports, dashboards and map based drill downs with SharePoint
    • Portal and Collaboration with SharePoint 2010: Develop intranet and collaboration sites using SharePoint
    • Web Content Management with SharePoint 2010: Develop intranet and extranet sites using SharePoint
    • Enterprise Search with SharePoint 2010: Develop Search based Applications using SharePoint 2010
  • 9. BUSINESS INTELLIGENCE FACTS • Every 2 days we generate more data than we did from the dawn of time through 2003 • Worldwide volume of data is growing at 59% per year • Between 75% and 85% of data is unstructured • In 5 years the majority of analytic data will come from unstructured sources - Gartner Blog
  • 10. SEARCH FACTS - IDC • Time spent searching for information averages 8.8 hours per week for a cost of $14,209 per worker per year • Analyzing information soaks up 8.1 hours per week, costing an organization $13,078 annually
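
The two IDC figures above can be cross-checked with a little arithmetic. The sketch below derives the hourly cost each figure implies, assuming a 52-week year; the slide does not state the number of working weeks, so that constant is an assumption.

```python
# Assumption: costs are spread over 52 weeks; the slide does not say so.
WEEKS_PER_YEAR = 52


def implied_hourly_cost(hours_per_week: float, annual_cost: float) -> float:
    """Return the hourly cost implied by a weekly time budget and a yearly cost."""
    return annual_cost / (hours_per_week * WEEKS_PER_YEAR)


searching = implied_hourly_cost(8.8, 14209)  # time spent searching
analyzing = implied_hourly_cost(8.1, 13078)  # time spent analyzing

print(f"searching: ${searching:.2f}/hour, analyzing: ${analyzing:.2f}/hour")
```

Under that assumption both figures work out to roughly the same hourly rate, which suggests they come from a single loaded labor cost.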
  • 11. BUSINESS INTELLIGENCE CHALLENGES • With data growing exponentially, businesses need better tools to get information faster • Complexity of integrating a large number of disparate data sources • Difficulty in integrating structured and unstructured data • End users spend a great deal of time trying to find information, reinventing the wheel, and not having the right information to make decisions
  • 12. BENEFITS OF INTEGRATING BUSINESS INTELLIGENCE AND SEARCH • Reduces the time lost searching for information • Simplifies integration of disparate data sources • Improves integration of structured and unstructured data, thereby providing better insights • Reduces the time lost reinventing the wheel • Improves decision making by having the right information available in a timely manner
  • 13. BENEFITS OF INTEGRATING BUSINESS INTELLIGENCE AND SEARCH • Integration of search and other types of applications creates a new category of applications called Search Based Applications • Integration of BI and search is one form of search based application
  • 14. BENEFITS OF INTEGRATING BUSINESS INTELLIGENCE USING SEARCH • Easy to use interface that end users understand • Enables the integration and search of any data source • Searches across multiple sources • Easily integrates structured and unstructured data sources • Indexes the sources in real time • Provides assisted navigation to filter the search results, thereby reducing the time it takes to find information • Ability to display results in a highly visual and interactive form
  • 15. INFORMATION ACCESS COMPLEXITY
  • 16. SIMPLIFIED INFORMATION ACCESS
  • 17. WHAT IS A SEARCH BASED APPLICATION? • Search-based applications (SBA) are software applications in which a search engine platform is used as the core infrastructure for information access and reporting. SBAs use semantic technologies to aggregate, normalize and classify unstructured, semi-structured and/or structured content across multiple repositories, and employ natural language technologies for accessing the aggregated information. - Wikipedia
  • 18. ENTERPRISE SEARCH MARKET
  • 19. COMPONENTS OF FAST SEARCH
    • Advanced content processing
      • Extraction of entities, properties, key phrases
      • Content classification
      • Sentiment analysis
    • Connectors
      • Out of the box (from SharePoint interface)
      • Out of the box JDBC connectors
      • Content API to create custom connectors
    • Query and Federation Object Model
      • FOM to search repositories by native search process
      • FOM to create core results XML and populate refiners
      • Query object model to execute complex queries using FAST Query Language
  • 20. STEPS TO BUILD A SEARCH CENTRIC APPLICATION - I
    • Identify your content source (possibly a mix)
      • Structured (database fields with traditional field types)
      • Non-structured (database fields: text, documents, web pages)
    • Configure connectors to crawl content sources
      • Use filters to crawl only the specific type(s) of content you would like to crawl
    • Review generated crawled properties
      • Use SharePoint Central Admin UI or FAST PowerShell cmdlets
      • Use SPY processor stage to review contents of crawled properties
      • Add additional crawled properties if needed
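
The idea of filtering what gets crawled can be illustrated with a minimal sketch. It is not the connector configuration itself: the pattern lists and the `should_crawl` helper are hypothetical, and a real connector has its own include/exclude settings.

```python
from fnmatch import fnmatchcase

# Hypothetical rules for illustration only.
INCLUDE_PATTERNS = ["*.docx", "*.pdf", "*.aspx"]
EXCLUDE_PATTERNS = ["*/archive/*", "*/_layouts/*"]


def should_crawl(url: str) -> bool:
    """Exclude rules win; otherwise the URL must match an include rule."""
    candidate = url.lower()
    if any(fnmatchcase(candidate, pattern) for pattern in EXCLUDE_PATTERNS):
        return False
    return any(fnmatchcase(candidate, pattern) for pattern in INCLUDE_PATTERNS)


urls = [
    "http://intranet/surveys/Report.PDF",
    "http://intranet/archive/old.pdf",
    "http://intranet/img/logo.png",
]
to_crawl = [url for url in urls if should_crawl(url)]
print(to_crawl)
```

Evaluating exclusions first keeps the rule set easy to reason about: a path that is excluded can never be pulled back in by a broad include pattern.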
  • 21. STEPS TO BUILD A SEARCH CENTRIC APPLICATION - II
    • Review and update content processing pipeline
      • Extract entities
        • Persons / Locations / Companies / Key phrases / Any other custom entities
        • Use the entity extraction framework of FAST for SharePoint, Service Pack 1
        • Use out of the box or custom dictionaries
      • Configure custom property extraction stage
        • Create / Update etc\config_data\DocumentProcessor\CustomPropertyExtractors.xml
    • Create new crawled properties if needed
    • Create managed properties and make them searchable and refinable
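
Dictionary-based property extraction, as described on this slide, boils down to matching known phrases in the text and emitting a normalized value for each hit. The sketch below shows that idea in plain Python; it does not use the FAST extraction framework, and the dictionary contents and the `extract_properties` helper are hypothetical.

```python
import re

# Hypothetical dictionary: surface phrase (lowercase) -> normalized entity.
DICTIONARY = {
    "dirty": "Cleanliness",
    "toilets": "Cleanliness",
    "rude": "Staff attitude",
    "cold food": "Meal quality",
}


def extract_properties(text: str, dictionary: dict) -> list:
    """Return the normalized entities whose phrases occur as whole words in text."""
    lowered = text.lower()
    found = []
    for phrase, normalized in dictionary.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, lowered) and normalized not in found:
            found.append(normalized)
    return found


comment = "The toilets were dirty and the staff was rude."
print(extract_properties(comment, DICTIONARY))
```

The returned values are what would be written into a crawled property, which can then be mapped to a refinable managed property.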
  • 22. STEPS TO BUILD A SEARCH CENTRIC APPLICATION - III
    • Review and update content processing pipeline
      • Extend pipeline with custom processing stages
      • Why?
      • Mechanism
        • Create an executable that takes some inputs and produces some outputs
        • The executable can be any command (exe, java class, scripts etc.)
        • Update etc\pipelineextensibility.xml to add a RUN section that uses the command.
        • Provide a set of crawled properties that act as input.
        • Provide a set of crawled properties that get populated with the output.
        • Reset the document processor service
          o psctrl reset
        • Feed a document
        • Map crawled and managed properties
        • Do a full crawl
    • Classification / Geo Search / Sentiments
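
A custom stage of the kind described above is just a command that reads input properties and writes output properties. The sketch below is one way such a command could look as a Python script. The XML layout (a `Document` element containing `CrawledProperty` elements), the `sentiment` property name, and the word lists are assumptions made for illustration and should be checked against the product documentation.

```python
import re
import sys
import xml.etree.ElementTree as ET

# Tiny illustrative word lists; a real stage would use a proper dictionary or model.
POSITIVE = {"clean", "friendly", "excellent", "helpful"}
NEGATIVE = {"dirty", "rude", "cold", "horrible"}


def classify(text: str) -> str:
    """Label text by comparing counts of distinct positive and negative words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def process(input_xml: str) -> str:
    """Read the input properties and return an output document (assumed layout)."""
    root = ET.fromstring(input_xml)
    text = " ".join(el.text or "" for el in root.iter("CrawledProperty"))
    out = ET.Element("Document")
    prop = ET.SubElement(out, "CrawledProperty", {"propertyName": "sentiment"})
    prop.text = classify(text)
    return ET.tostring(out, encoding="unicode")


def main(input_path: str, output_path: str) -> None:
    """File-based entry point: read the input file, write the output file."""
    with open(input_path, encoding="utf-8") as src:
        result = process(src.read())
    with open(output_path, "w", encoding="utf-8") as dst:
        dst.write(result)


# Only run in file mode when exactly two paths are supplied on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Keeping the logic in `process` separate from the file handling makes the stage easy to test before it is wired into the pipeline.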
  • 23. STEPS TO BUILD A SEARCH CENTRIC APPLICATION - IV
    • Develop Search Interface
      • Refinement panel makes great Dimensions
        • Refiners sorted by frequency: indicates importance of a refiner
        • Exact counts / percentage: helps in deep analysis of content
        • Applying a refiner filters the result set: leads to further granular analysis while exposing new dimensions
      • Create visual refiners
        • Extend the Refinement Panel web part
        • Override the GetXPathNavigator method
        • Get the refinement XML from base.GetXPathNavigator
        • Use the XML as data source for Chart controls
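
Turning refiner data into chart input is mostly a matter of reading counts, sorting by frequency and computing percentages. The sketch below shows this on a simplified, hypothetical XML shape (`Refiners` / `Refiner` / `Entry`); the XML the Refinement Panel actually produces is structured differently, so only the transformation carries over.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified refiner XML used only for this illustration.
REFINER_XML = """
<Refiners>
  <Refiner name="Speciality">
    <Entry value="Cardiology" count="40"/>
    <Entry value="Orthopedics" count="25"/>
    <Entry value="Pediatrics" count="35"/>
  </Refiner>
</Refiners>
"""


def chart_series(refiner_xml: str, refiner_name: str) -> list:
    """Return (value, count, percentage) tuples sorted by descending count."""
    root = ET.fromstring(refiner_xml)
    entries = []
    for refiner in root.iter("Refiner"):
        if refiner.get("name") == refiner_name:
            entries = [
                (entry.get("value"), int(entry.get("count")))
                for entry in refiner.iter("Entry")
            ]
            break
    total = sum(count for _, count in entries)
    if total == 0:
        return []
    entries.sort(key=lambda item: item[1], reverse=True)
    return [(value, count, round(100.0 * count / total, 1)) for value, count in entries]


series = chart_series(REFINER_XML, "Speciality")
print(series)
```

Each tuple maps directly to one bar or pie slice, with the percentage available as a label.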
  • 24. STEPS TO BUILD A SEARCH CENTRIC APPLICATION - V
    • Customize Search Result Web parts
      • Extend SearchCoreResults web part
        • Add additional sources
        • Override CreateDataSource and ConfigureDataSourceProperties to create / configure the data source
        • Override GetXPathNavigator to mix results from data sources
      • Change XSLT to display specific metadata
      • Roll up numbers by result collapsing
      • Display previews
    • Aggregate Search Results from Federation
      • Create a new LocationRunTime class inheriting from ILocationRuntime and Irefinable
      • Execute queries in native format
      • Create Core Results XML
      • Fill up the refiner
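
Rolling up numbers by result collapsing means keeping one representative result per key and counting how many results were folded into it. The sketch below shows the idea on plain dictionaries; the field names and the `collapse` helper are hypothetical and unrelated to the web part API.

```python
def collapse(results: list, key: str) -> list:
    """Keep the first result per key value and count how many were collapsed into it."""
    collapsed = {}
    for result in results:
        group = result[key]
        if group not in collapsed:
            # Copy the first hit so the caller's data is not modified.
            collapsed[group] = {**result, "collapsed_count": 1}
        else:
            collapsed[group]["collapsed_count"] += 1
    return list(collapsed.values())


results = [
    {"title": "Survey A", "speciality": "Cardiology"},
    {"title": "Survey B", "speciality": "Pediatrics"},
    {"title": "Survey C", "speciality": "Cardiology"},
]
rolled_up = collapse(results, "speciality")
print(rolled_up)
```

The `collapsed_count` value is the rolled-up number that the result template can display next to each representative hit.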
  • 25. USE CASE
    Overview of the scenario
    A US based hospital chain conducts patient surveys for all of its locations to
      • Improve patient loyalty
      • Increase referrals
      • Evaluate healthcare provider performance
      • Identify areas of improvement
    It targets all of its in-house patients for surveys at the time of their discharge. The survey responses are stored in a database. The hospital typically uses SQL Server SSAS and SSRS to produce BI dashboards and reports. While this works to a great extent, there are some shortfalls:
      • The reports only consider the specific answers to objective questions like “How did you like the meal?”, with the options being Excellent, Good, Not so good, Horrible. However, survey respondents can express their true sentiments in one or more sentences. As traditional BI cannot make use of non-structured content, these are left out.
      • BI reports tell us precisely WHAT. However, they often stop short of telling us WHY.
      • The BI reports have no provision for answering flexible user questions such as: Cleanliness of hospital toilets.
      • Important attributes / entities hidden within the comments text are ignored, although they could be crucial business dimensions.
    Hospital management decided to deploy search to extract the information discussed above while retaining BI capabilities.
  • 26. DEMO
  • 27. HOW WE DID IT
    • Survey data is available in a database
      • Comments is a text field that is used for key phrase extraction
      • Other fields used are of regular data types: string, integer, etc.
    • For key phrase extraction and normalization, an external application was used (FAST ESP does have a key phrase extraction processor, but FS4SP does not have that yet)
    • A dictionary was created from the key phrases. The dictionary is used in a custom property extraction processor
    • The processor fills in the crawled properties for sentiments during indexing
    • Database indexing is done using the JDBC connector (BDC also works)
    • Generated crawled properties are mapped to managed properties that need to be searched or used in refiners, such as Overall Experience, Speciality, No of days in hospital, etc.
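
Building a dictionary from extracted key phrases can be sketched as normalize, count, and keep the frequent ones. The external application used in the project is not named, so the code below is only a stand-in for that step; the `min_count` threshold and the sample phrases are assumptions.

```python
import re
from collections import Counter


def normalize(phrase: str) -> str:
    """Lowercase a phrase and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", phrase.strip().lower())


def build_dictionary(phrases: list, min_count: int = 2) -> list:
    """Return the normalized phrases that occur at least min_count times, sorted."""
    counts = Counter(normalize(phrase) for phrase in phrases)
    return sorted(phrase for phrase, count in counts.items() if count >= min_count)


extracted = ["Cold food", "cold  food ", "Rude staff", "long wait", "Long Wait", "clean room"]
print(build_dictionary(extracted))
```

The frequency threshold keeps one-off phrases out of the dictionary so that the resulting refiners stay short and meaningful.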
  • 28. HOW WE DID IT - II
    • Using Federation Object Model
    • Visual refiners are created using the existing RefinementManager object in the search page. This can also be done by extending the RefinementPanel web part.
    • RefinementManager provides the refiner XML
    • The MSCharts control uses the refiner XML
    • Selected refiners are used to construct the breadcrumb
    • KeywordQuery objects are also used to collect data points for multiple timeframes
    • The SearchCoreResults web part XSLT has been updated to display patient comments
    • Sentiments are extracted key phrases represented as refiners
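
Two of the points above, the breadcrumb built from selected refiners and the data points collected per timeframe, can be sketched without any search API. In the code below `count_hits` is a stand-in for one query restricted to a date range; it counts in-memory dates instead of calling a search service, and all names and sample values are hypothetical.

```python
from datetime import date


def breadcrumb(selected: list) -> str:
    """Join the selected (refiner, value) pairs into a breadcrumb trail."""
    return " > ".join(f"{name}: {value}" for name, value in selected)


def count_hits(dates: list, start: date, end: date) -> int:
    """Stand-in for one search query restricted to the range [start, end)."""
    return sum(1 for d in dates if start <= d < end)


def trend(dates: list, timeframes: list) -> list:
    """Run one count per timeframe and return (label, count) data points."""
    return [(label, count_hits(dates, start, end)) for label, start, end in timeframes]


discharges = [date(2011, 1, 15), date(2011, 2, 3), date(2011, 5, 9)]
quarters = [
    ("Q1", date(2011, 1, 1), date(2011, 4, 1)),
    ("Q2", date(2011, 4, 1), date(2011, 7, 1)),
]
print(breadcrumb([("Speciality", "Cardiology"), ("Overall Experience", "Good")]))
print(trend(discharges, quarters))
```

Issuing one count per timeframe mirrors the approach on the slide, where a separate query object supplies each data point of the trend chart.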
  • 29. COMMON SEARCH APPLICATION CATEGORIES • Extended search platforms • Search engines • Question-answering applications • Categorization/metadata tagging tools • Categorizers and clustering engines • Visualization tools for information navigation and analysis • Filtering and alerting tools and text analytics • Translation and globalization software
  • 30. CONTACTS • Niraj Tenany • President and CEO, Netwoven, Inc. • ntenany@netwoven.com • Pankaj Bose • ECM Practice Head, Netwoven, Inc. • pbose@netwoven.com • Rashi Bajaj • Business Development Manager • rbajaj@netwoven.com