- The document discusses the evolution of text analytics technologies from early keyword indexing to more advanced mathematical approaches like latent semantic indexing (LSI).
- It explains that early keyword indexing focused only on word frequencies and occurrences, which could lead to false positives and did not capture the conceptual meaning of documents.
- More advanced approaches like LSI use linear algebraic calculations to analyze word co-occurrences across large document sets and derive the conceptual relationships between terms and topics in a way that better mirrors human understanding.
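A minimal sketch of the LSI idea in Python, using scikit-learn (the toy corpus, the two-dimensional concept space, and the library choice are illustrative assumptions, not anything prescribed by the document):

# LSI sketch: factor a TF-IDF term-document matrix with truncated SVD so that
# documents sharing co-occurring vocabulary land near each other in a latent
# "concept" space, even without exact keyword overlap.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "the car is driven on the road",
    "the truck is driven on the highway",
    "a sedan travels down the street",
]

tfidf = TfidfVectorizer().fit_transform(docs)        # word-occurrence weights
lsi = TruncatedSVD(n_components=2, random_state=0)   # latent concept space
doc_vectors = lsi.fit_transform(tfidf)

# Conceptually related documents score high even with little shared vocabulary.
print(cosine_similarity(doc_vectors))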
Recent analysis of litigation outcomes suggests that nearly half of the patents litigated to judgment were held invalid. Commonly available patent search software is predominantly keyword based and takes a "one-size-fits-all" approach, leaving much to be desired from a practitioner's perspective. We discuss opportunities for using text mining and information retrieval in the domain of patent litigation. We focus on the post-grant inter partes review process, in which a company can challenge the validity of an issued patent, for example, to protect its product from being viewed as infringing on the patent in question. We discuss both the possibilities and the obstacles to assisting with such a challenge using a text analytic solution. A range of issues must be overcome for semantic search and analytic solutions to be of value, ranging from text normalization and support for semantic and faceted search to predictive analytics. In this context, we evaluate our novel and top-performing semantic search solution. For experiments, we use data from the USPTO database of Final Decisions of the Patent Trial and Appeal Board. Our experiments and analysis point to limitations of generic semantic search and text analysis tools. We conclude by presenting some research ideas that might help overcome these deficiencies, such as interactive semantic search, support for a multi-stage approach that distinguishes between divergent and convergent modes of operation, and textual entailment.
Semantic search helps business people find answers to pressing questions by wading through oceans of information to surface meaningful nuggets. In this presentation we'll discuss how semantic search and content analysis technologies are starting to appear in the marketplace today. We'll recap what semantic search is and what its key benefits are, then answer the following questions:
• Is semantic search a feature, an application, or an enterprise system?
• How can I add semantic search to my existing work processes?
• Will I need to replace my existing content technologies?
• What will I need to do to prepare my content for semantic search?
• Is semantic search just for documents or can I search my data too?
• Can I use semantic search to find information on the internet and other public data sources?
• Are there standards to consider?
If you think you need a search application, there are some useful first steps to take:
* validating that full-text search is the right technology
* producing sets of ideal results you'd like to return for a range of queries
* considering the value of supplementing a basic search result list with document clustering (a minimal sketch follows this list)
* producing more specific requirements and investigating technology options
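A hedged sketch of the clustering idea from the list above, using scikit-learn's k-means over TF-IDF vectors (the result snippets and cluster count are invented for illustration):

# Group a flat result list into clusters so related hits can be presented
# together in the UI. Snippets and k=2 are toy assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

results = [
    "reset your account password",
    "password recovery steps",
    "quarterly sales report 2023",
    "annual sales figures and forecast",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(results)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for label, snippet in sorted(zip(labels, results)):
    print(label, snippet)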
SearchInFocus: Exploratory Study on Query Logs and Actionable Intelligence - Marina Santini
Query logs are an important source of information for surmising users' intents. Although Karlgren (2010) points out that "There are several reasons to be cautious in drawing too far-reaching conclusions: we cannot say for sure what the users were after; [...]", some linguistic problems could be sorted out by applying more advanced text/content analytics, such as register/sublanguage identification and terminology classification (see Friberg Heppin, 2011). In this presentation, I will argue that query logs can be considered a digital textual genre like emails, blogs, chats, tweets, and so forth. All these genres contain unstructured information that, still today, is difficult to leverage satisfactorily. The hypothesis that I would like to put forward in this workshop is that query logs might be easier to exploit for useful information and actionable intelligence than other digital genres.
What IA, UX and SEO Can Learn from Each Other - Ian Lurie
Google has become the arbiter of how users experience a website. Its data-driven determinants of what constitutes good UX directly influence how a site is found. This is wrong because people, not machines, should determine experience; Google does not tell the SEO or UX community what data is used to measure experience, and many elements of experience cannot be measured. This presentation reveals why Google uses UX signals to determine placement in search results, and how to create a customer-pleasing and highly visible user experience for your website.
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb... - Trey Grainger
Search engines frequently miss the mark when it comes to understanding user intent. This talk will walk through some of the key building blocks necessary to turn a search engine into a dynamically-learning "intent engine", able to interpret and search on meaning, not just keywords. We will walk through CareerBuilder's semantic search architecture, including semantic autocomplete, query and document interpretation, probabilistic query parsing, automatic taxonomy discovery, keyword disambiguation, and personalization based upon user context/behavior. We will also see how to leverage an inverted index (Lucene/Solr) as a knowledge graph that can be used as a dynamic ontology to extract phrases, understand and weight the semantic relationships between those phrases and known entities, and expand the query to include those additional conceptual relationships.
As an example, most search engines completely miss the mark at parsing a query like (Senior Java Developer Portland, OR Hadoop). We will show how to dynamically understand that "senior" designates an experience level, that "java developer" is a job title related to "software engineering", that "portland, or" is a city with a specific geographical boundary (as opposed to a keyword followed by a boolean operator), and that "hadoop" is the skill "Apache Hadoop", which is also related to other terms like "hbase", "hive", and "map/reduce". We will discuss how to train the search engine to parse the query into this intended understanding and how to reflect this understanding to the end user to provide an insightful, augmented search experience.
Topics: Semantic Search, Apache Solr, Finite State Transducers, Probabilistic Query Parsing, Bayes Theorem, Augmented Search, Recommendations, Query Disambiguation, NLP, Knowledge Graphs
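As a rough illustration of the query interpretation described above, here is a toy greedy longest-match tagger over a hand-built lexicon. It only gestures at the idea; the talk's actual parser is probabilistic and backed by learned taxonomies, and every lexicon entry below is an assumption:

# Tag spans of a job-search query against a toy entity lexicon,
# preferring the longest dictionary match at each position.
LEXICON = {
    ("senior",): "EXPERIENCE_LEVEL",
    ("java", "developer"): "JOB_TITLE: software engineering",
    ("portland,", "or"): "CITY: Portland, OR",
    ("hadoop",): "SKILL: Apache Hadoop",
}

def tag_query(query):
    tokens = query.lower().split()
    i, tags = 0, []
    while i < len(tokens):
        for span in (3, 2, 1):                   # longest match wins
            key = tuple(tokens[i:i + span])
            if key in LEXICON:
                tags.append((" ".join(key), LEXICON[key]))
                i += span
                break
        else:
            tags.append((tokens[i], "KEYWORD"))  # fall back to plain keyword
            i += 1
    return tags

print(tag_query("Senior Java Developer Portland, OR Hadoop"))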
Natural Language Search with Knowledge Graphs (Haystack 2019) - Trey Grainger
To optimally interpret most natural language queries, it is necessary to understand the phrases, entities, commands, and relationships represented or implied within the search. Knowledge graphs serve as useful instantiations of ontologies which can help represent this kind of knowledge within a domain.
In this talk, we'll walk through techniques to build knowledge graphs automatically from your own domain-specific content, how you can update and edit the nodes and relationships, and how you can seamlessly integrate them into your search solution for enhanced query interpretation and semantic search. We'll have some fun with some of the more search-centric use cases of knowledge graphs, such as entity extraction, query expansion, disambiguation, and pattern identification within our queries: for example, transforming the query "bbq near haystack" into
{
  "filter": ["doc_type:restaurant"],
  "query": {
    "boost": {
      "b": "recip(geodist(38.034780,-78.486790),1,1000,1000)",
      "query": "bbq OR barbeque OR barbecue"
    }
  }
}
We'll also specifically cover use of the Semantic Knowledge Graph, a particularly interesting knowledge graph implementation available within Apache Solr that can be auto-generated from your own domain-specific content and which provides highly-nuanced, contextual interpretation of all of the terms, phrases and entities within your domain. We'll see a live demo with real world data demonstrating how you can build and apply your own knowledge graphs to power much more relevant query understanding within your search engine.
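To make the "bbq near haystack" example concrete, here is a hedged sketch that assembles the rewritten query from a toy place dictionary and synonym list; the structure mirrors the JSON shown above, but the helper names and data are assumptions:

# Build the rewritten Solr-style request for "bbq near haystack".
import json

KNOWN_PLACES = {"haystack": (38.034780, -78.486790)}   # venue coordinates
SYNONYMS = {"bbq": ["bbq", "barbeque", "barbecue"]}    # from the graph

def rewrite(term, place):
    lat, lon = KNOWN_PLACES[place]
    return {
        "filter": ["doc_type:restaurant"],
        "query": {
            "boost": {
                "b": f"recip(geodist({lat:.6f},{lon:.6f}),1,1000,1000)",
                "query": " OR ".join(SYNONYMS[term]),
            }
        },
    }

print(json.dumps(rewrite("bbq", "haystack"), indent=2))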
Reflected Intelligence: Lucene/Solr as a self-learning data system - Trey Grainger
What if your search engine could automatically tune its own domain-specific relevancy model? What if it could learn the important phrases and topics within your domain, automatically identify alternate spellings (synonyms, acronyms, and related phrases) and disambiguate multiple meanings of those phrases, learn the conceptual relationships embedded within your documents, and even use machine-learned ranking to discover the relative importance of different features and then automatically optimize its own ranking algorithms for your domain?
In this presentation, you'll learn how to do just that: how to evolve Lucene/Solr implementations into self-learning data systems that are able to accept user queries, deliver relevance-ranked results, and automatically learn from your users' subsequent interactions to continually deliver a more relevant experience for each keyword, category, and group of users.
Such a self-learning system leverages reflected intelligence to consistently improve its understanding of the content (documents and queries), the context of specific users, and the relevance signals present in the collective feedback from every prior user interaction with the system. Come learn how to move beyond manual relevancy tuning and toward a closed-loop system leveraging both the embedded meaning within your content and the wisdom of the crowds to automatically generate search relevancy algorithms optimized for your domain.
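One small piece of such a feedback loop, sketched under stated assumptions (the log format, field names, and smoothing constant are all invented for illustration): aggregate prior click-through behavior per (query, document) pair into a relevance boost.

# Turn collective click feedback into a smoothed per-query document boost.
from collections import defaultdict

click_log = [                        # (query, doc_id, clicked?)
    ("laptop", "doc1", 1), ("laptop", "doc2", 0),
    ("laptop", "doc1", 1), ("laptop", "doc2", 1),
]

shown = defaultdict(int)
clicked = defaultdict(int)
for query, doc, click in click_log:
    shown[(query, doc)] += 1
    clicked[(query, doc)] += click

def boost(query, doc, smoothing=10.0):
    # Smoothed click-through rate, usable as a multiplicative ranking boost.
    key = (query, doc)
    return 1.0 + clicked[key] / (shown[key] + smoothing)

print(boost("laptop", "doc1"), boost("laptop", "doc2"))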
Intelligent Semantic Web Search Engines: A Brief Survey - dannyijwest
The World Wide Web (WWW) allows people to share information (data) from large database repositories globally. The amount of information has grown to billions of databases, and searching it requires specialized tools known generically as search engines. Although many search engines are available today, retrieving meaningful information remains difficult. To overcome this problem and let search engines retrieve meaningful information intelligently, semantic web technologies are playing a major role. In this paper we present a survey of search engine generations and the role of search engines in the intelligent web and semantic search technologies.
An Empirical Characterization of Touch-Gesture Input-Force on Mobile Devices - University of Sussex
Publication: Faisal Taher, Jason Alexander, John Hardy, and Eduardo Velloso. 2014. An Empirical Characterization of Touch-Gesture Input-Force on Mobile Devices. In Proceedings of the Ninth ACM International Conference on Interactive Tabletops and Surfaces (ITS '14). ACM, New York, NY, USA, 195-204. DOI=http://dx.doi.org/10.1145/2669485.2669515
Abstract:
Designers of force-sensitive user interfaces lack a ground-truth characterization of input force while performing common touch gestures (zooming, panning, tapping, and rotating). This paper provides such a characterization firstly by deriving baseline force profiles in a tightly-controlled user study; then by examining how these profiles vary in different conditions such as form factor (mobile phone and tablet), interaction position (walking and sitting) and urgency (timed tasks and untimed tasks). We conducted two user studies with 14 and 24 participants respectively and report: (1) force profile graphs that depict the force variations of common touch gestures, (2) the effect of the different conditions on force exerted and gesture completion time, (3) the most common forces that users apply, and the time taken to complete the gestures. This characterization is intended to aid the design of interactive devices that integrate force-input with common touch gestures in different conditions.
Big Data and Harvesting Data from Social Media - R A Akerkar
Social media data is only as valuable as the information and insights we can extract from it; those insights help us make better decisions and give us a competitive edge.
This schema represents a general view of the demand management framework. Developed using lean, kanban, project management, and software engineering concepts, the framework spans from the business side to IT.
The whitepaper addresses challenges in data-driven organizations, medical research, and health care. It summarizes how context-enabled semantic enrichment can transform the traditional approach to finding the best data. 3RDi offers advanced content enrichment with named entity recognition, semantic similarity, content classification, and content summarization, helping medical researchers and health care practitioners get the right data at the right time.
Technical Whitepaper: A Knowledge Correlation Search Engine - s0P5a41b
For the technically oriented reader, this brief paper describes the technical foundation of the Knowledge Correlation Search Engine - patented by Make Sence, Inc.
Extracting and Reducing the Semantic Information Content of Web Documents to ... - ijsrd.com
Ranking and optimization of web service compositions represent challenging areas of research with significant implications for realizing the "Web of Services" vision. On the semantic web, semantic information is described using machine-processable languages such as the Web Ontology Language (OWL). "Semantic web services" use formal semantic descriptions of web service functionality and enable automated reasoning over web service compositions; such services can then be automatically discovered, composed into more complex services, and executed. Automating web service composition through semantic technologies involves calculating the semantic similarities between the outputs and inputs of connected constituent services and aggregating these values into a measure of semantic quality for the composition. The paper proposes a novel and extensible model balancing this new dimension of semantic quality (as a functional quality metric) with QoS metrics, using them together as ranking and optimization criteria. It also demonstrates the utility of genetic algorithms for optimization over the large number of services foreseen by the "Web of Services" vision, and it reduces the semantic information content of web documents to support semantic document retrieval using the Network Ontology Language (NOL) and to improve QoS as a ranking and optimization criterion.
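A loose sketch of the scoring idea in the abstract, with toy data: measure the semantic match between each service's outputs and the next service's inputs along a composition chain, then blend that with a QoS term (the Jaccard measure, the weights, and the concept labels are all assumptions for illustration):

# Score a two-service composition by semantic input/output fit plus QoS.
def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

chain = [
    {"inputs": {"city"}, "outputs": {"coordinates"}, "qos": 0.9},
    {"inputs": {"coordinates"}, "outputs": {"forecast"}, "qos": 0.7},
]

links = len(chain) - 1
semantic = sum(
    jaccard(chain[i]["outputs"], chain[i + 1]["inputs"]) for i in range(links)
) / max(links, 1)
qos = sum(s["qos"] for s in chain) / len(chain)

score = 0.5 * semantic + 0.5 * qos    # equal weighting is an assumption
print(round(score, 3))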
Metaphic or the art of looking another way - Suresh Manian
For all intents and purposes, we are our words, and verbs and adjectives capture actions and sentiments better than any other tool. Metaphic is premised on the belief that a grammar book and a calculator are all you really need to make sense of web search and social media chatter, and of text in general.
16 Decision Support and Business Intelligence Systems (9th E.docx - RAJU852744
16 Decision Support and Business Intelligence Systems (9th Edition) Instructor’s Manual
Chapter 7:
Text Analytics, Text Mining, and Sentiment Analysis
Learning Objectives for Chapter 7
1. Describe text mining and understand the need for text mining
2. Differentiate among text analytics, text mining, and data mining
3. Understand the different application areas for text mining
4. Know the process of carrying out a text mining project
5. Appreciate the different methods to introduce structure to text-based data
6. Describe sentiment analysis
7. Develop familiarity with popular applications of sentiment analysis
8. Learn the common methods for sentiment analysis
9. Become familiar with speech analytics as it relates to sentiment analysis
10. Learn three facets of Web analytics—content, structure, and usage mining
11. Know social analytics including social media and social network analyses
CHAPTER OVERVIEW
This chapter provides a comprehensive overview of text analytics/mining and Web analytics/mining along with their popular application areas such as search engines, sentiment analysis, and social network/media analytics. As we have been witnessing in recent years, the unstructured data generated over the Internet of Things (IoT) (Web, sensor networks, radio-frequency identification [RFID]–enabled supply chain systems, surveillance networks, etc.) are increasing at an exponential pace, and there is no indication of its slowing down. This changing nature of data is forcing organizations to make text and Web analytics a critical part of their business intelligence/analytics infrastructure.
CHAPTER OUTLINE
7.1 Opening Vignette: Amadori Group Converts Consumer Sentiments into
Near-Real-Time Sales
7.2 Text Analytics and Text Mining Overview
7.3 Natural Language Processing (NLP)
7.4 Text Mining Applications
7.5 Text Mining Process
7.6 Sentiment Analysis
7.7 Web Mining Overview
7.8 Search Engines
7.9 Web Usage Mining
7.10 Social Analytics
ANSWERS TO END OF SECTION REVIEW QUESTIONS
Section 7.1 Review Questions
1. According to the vignette and based on your opinion, what are the challenges that the food industry is facing today?
Student perceptions may vary, but some common themes related to the challenges faced by the food industry could include the changing nature and role of food in people’s lifestyles, the shift towards pre-prepared or easily prepared food, and the growing importance of marketing to keep customers interested in brands.
2. How can analytics help businesses in the food industry to survive and thrive in this competitive marketplace?
Analytics can serve dual purposes by both tracking customer interest in the brand and providing valuable feedback on customer preferences. An analytics system can be used to evaluate the traffic to various brand marketing campaigns (website or social) that play a pivotal role in ensuring that products are being shown to new potential customers.
This is an introduction to text analytics for advanced business users and IT professionals with limited programming expertise. The presentation will go through different areas of text analytics and provide some real-world examples that help make the subject matter a little more relatable. We will cover topics like search engine building, categorization (supervised and unsupervised), clustering, NLP, and social media analysis.
TEXT MINING - TAPPING HIDDEN KERNELS OF WISDOM - ITC Infotech
This paper discusses how automatic document classification, information retrieval, word frequency calculation, sentiment analysis, topic modelling and trend analysis can be utilized for root cause analysis, devising competitive strategies, enhancing customer experience and so on.
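To give a flavor of two of the techniques listed (word-frequency calculation and lexicon-based sentiment analysis), here is a minimal sketch; the ticket snippets and the tiny sentiment lexicon are invented stand-ins:

# Count word frequencies across feedback tickets and score each ticket
# against a toy sentiment lexicon.
import re
from collections import Counter

LEXICON = {"great": 1, "love": 1, "slow": -1, "broken": -1}

tickets = [
    "Love the new UI but search is slow",
    "Checkout is broken again, support was great though",
]

tokens = [w for t in tickets for w in re.findall(r"[a-z]+", t.lower())]
print(Counter(tokens).most_common(5))

for t in tickets:
    score = sum(LEXICON.get(w, 0) for w in re.findall(r"[a-z]+", t.lower()))
    print(score, t)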
Search Solutions 2011: Successful Enterprise Search By Design - Marianne Sweeny
When your colleagues say they want Google, they don’t mean the Google Search Appliance. They mean the Google Search user experience: pervasive, expedient and delivering the information that they need. Successful enterprise search does not start with the application features, is not part of the information architecture, does not come from a controlled vocabulary and does not emerge on its own from the developers. It requires enterprise-specific data mining, enterprise-specific user-centered design and fine tuning to turn “search sucks” into search success within the firewall. This presentation looks at action items, tools and deliverables for Discovery, Planning, Design and Post Launch phases of an enterprise search deployment.
Information Retrieval on Text using Concept Similarity - rahulmonikasharma
Retrieving relevant information from the internet is a huge task due to the sheer amount of information available, and identifying the individual concepts behind queries is time consuming. Traditionally, keyword-based methods were used to retrieve documents; with this type of searching, relationships between associated keywords cannot be identified, and if the same concept is described by different keywords, inaccurate and improper results are retrieved. Concept-based retrieval methods are the solution to this scenario: they bring the benefit of using semantic relationships among concepts to find relevant documents, and irrelevant documents can be eliminated by detecting conceptual mismatches. The main challenge identified is the ambiguity that arises when multiple words express the same concept. Semantic analysis can reveal the conceptual relationships among words in a given document. In this paper, the potential of concept-based information access via semantic analysis is explored with the help of a lexical database called WordNet. The mechanism is applied to selected text documents, extracting the synonyms, hyponyms, and hypernyms of each word from WordNet. A ranking is calculated from the frequency of each word in the input documents, and a hierarchy model is generated according to the ranking.
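A small sketch of the WordNet expansion step the paper describes, using NLTK's WordNet interface (the paper does not specify its implementation; this assumes nltk is installed and the wordnet corpus has been downloaded):

# Collect synonyms, hypernyms, and hyponyms of a term from WordNet.
# Requires: pip install nltk, then nltk.download("wordnet") once.
from nltk.corpus import wordnet as wn

def expand(term):
    related = set()
    for synset in wn.synsets(term):
        related.update(l.name() for l in synset.lemmas())        # synonyms
        for hyper in synset.hypernyms():
            related.update(l.name() for l in hyper.lemmas())     # hypernyms
        for hypo in synset.hyponyms():
            related.update(l.name() for l in hypo.lemmas())      # hyponyms
    return related - {term}

print(sorted(expand("car"))[:10])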