Semantic Filtering as an example of Semantic technologies for real-time analysis. This presentation emphasizes the value of semantics for social data filtering, specifically for the challenges faced during dynamically evolving event analysis.
Personalized and Adaptive Semantic Information Filtering for Social Media - Pavan Kapanipathi
Social media has experienced immense growth in recent times. These platforms are becoming increasingly common for information seeking and consumption, and with their growing popularity, information overload poses a significant challenge to users. For instance, Twitter alone generates around 500 million tweets per day, and it is impractical for users to parse such an enormous stream to find information that is interesting to them. This situation necessitates efficient personalized filtering mechanisms that let users consume relevant, interesting information from social media.
Building a personalized filtering system involves understanding users' interests and utilizing these interests to deliver relevant information to users. These tasks primarily involve analyzing and processing social media text, which is challenging because posts are short and the medium is real-time. The challenges include: (1) Lack of semantic context: social media posts are short on average, which provides limited semantic context for textual analysis. This is particularly detrimental for topic identification, a necessary task for mining users' interests; (2) Dynamically changing vocabulary: most social media websites, such as Twitter and Facebook, generate posts that are of current (timely) interest to users. Due to this real-time nature, information relevant to dynamic topics of interest evolves, reflecting changes in the real world. This in turn changes the vocabulary associated with these topics, making it harder to filter relevant information; (3) Scalability: the number of users on social media platforms is very large, which makes it difficult for centralized systems to scale to deliver relevant information to users. This dissertation is devoted to exploring semantic techniques and Semantic Web technologies to address the above-mentioned challenges in building a personalized information filtering system for social media. In particular, the necessary semantics (knowledge) is derived from crowd-sourced knowledge bases such as Wikipedia to improve context for understanding short text and dynamic topics on social media.
SCA2013 Presentation: A Web-Based Content Analysis Tool - Xin Chen
This is a presentation at SCA2013, Karlsruhe, Germany. This shows the design (architecture, database, sketch, wireframe, prototype) of a simple web-based tool that supports asynchronous collaboration among researchers when conducting content analysis on qualitative social media data.
The main objective of this paper is to compare the sentiments that prevailed before and after the presidential elections held in both the US and France in 2012. To achieve this objective, we extracted content from a social medium, Twitter, and used the tweets from electoral candidates and public users (voters), collected by crawling during the course of the elections. To gain useful insights about the US election, we scored the sentiment of each tweet using different metrics and performed a time series analysis for candidates and different topics (identified by specific keywords). In addition, we compared some of our insights from the US election with what we observed for the French election. This deep-dive analysis was done to understand the inherent nature of elections and to bring out the influence of social media on elections.
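The per-tweet scoring and time series step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the word lists, the `score` function, and the sample tweets are made up for the example, and the abstract does not say which sentiment metrics the paper actually used.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy lexicon; the paper's actual sentiment metrics are not specified.
POSITIVE = {"win", "great", "hope", "support"}
NEGATIVE = {"lose", "bad", "fail", "scandal"}


def score(text):
    """Return (#positive - #negative) / #tokens, or 0.0 for empty text."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def daily_sentiment(tweets, keywords):
    """Average sentiment per (keyword, day) over tweets that mention the keyword."""
    buckets = defaultdict(list)
    for day, text in tweets:
        lowered = text.lower()
        for kw in keywords:
            if kw in lowered:
                buckets[(kw, day)].append(score(text))
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Made-up sample tweets, for illustration only.
tweets = [
    (date(2012, 11, 5), "obama will win great hope"),
    (date(2012, 11, 5), "obama scandal bad"),
    (date(2012, 11, 7), "romney lose"),
]
series = daily_sentiment(tweets, ["obama", "romney"])
```

Sorting the keys of `series` by date for a given keyword yields the time series for that candidate or topic; a real analysis would substitute a proper sentiment model for the toy lexicon.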
Twitter Based Outcome Predictions of 2019 Indian General Elections Using Deci... - Ferdin Joe John Joseph PhD
Presented at the 4th International Conference on Information Technology InCIT 2019 organised by Thai-Nichi Institute of Technology and Council of IT Deans in Thailand (CITT)
Once again, Simplify 360 has come up with an interesting analysis in a report: “Indian Election 2014, Social Media Buzz Analysis Report”. The report analyses and ranks top politicians and political parties based on social media buzz and the Simplify 360 Social Index. It also analyses the social media buzz of political parties and leaders before and after the Delhi election.
Knowledge discovery in social media mining for market analysis - Senuri Wijenayake
Independent Study on how Social Media Mining can be used by organizations to optimize market analysis in terms of better understanding their customer requirements, influence propagation etc.
This surgery focuses on content marketing and link-building and aims to take you from good to awesome. Using cross-sport examples, along with examples from the 2018 World Cup, this session could be the most profitable hour of your life.
Description of the DaCENA approach to the contextual exploration of knowledge graphs. We use machine learning to learn user preferences using a limited number of user inputs. Through these inputs, we learn a personalized ranking function over semantic associations (semi-paths in a knowledge graph) that best fit users' interests. References for the presentation are:
Bianchi & al.: Actively Learning to Rank Semantic Associations for Personalized Contextual Exploration of Knowledge Graphs. ESWC (1) 2017: 120-135.
Palmonari & al.: DaCENA: Serendipitous News Reading with Data Contexts. ESWC (Satellite Events) 2015: 133-137
Gruzd, A., Jacobson, J., Dubois, E. (2017). You’re Hired: Examining Acceptance of Social Media Screening of Job Applicants. In Proceedings of the 23rd Americas Conference on Information Systems (AMCIS), August 10-12, 2017, Boston, MA, USA.
Available at http://aisel.aisnet.org/amcis2017/DataScience/Presentations/28/
Abstract:
The paper examines attitudes towards employers using social media to screen job applicants. In an online survey of 454 participants, we compare the comfort level with this practice in relation to different types of information that can be gathered from publicly accessible social media. The results revealed a nuanced nature of people’s information privacy expectations in the context of hiring practices. People’s perceptions of employers using social media to screen job applicants depend on (1) whether or not they are currently seeking employment (or plan to), (2) the type of information that is being accessed by a prospective employer (if they are on the job market), and (3) their cultural background, but not gender. The findings emphasize the need for employers and recruiters who rely on social media to screen job applicants to be aware of the types of information that applicants may perceive as more sensitive, such as social network-related information.
Pavan Kapanipathi's talk at IBM's Frontiers of Cloud Computing and Big Data Workshop 2014. http://researcher.ibm.com/researcher/view_group_subpage.php?id=5565
Due to the increased adoption of the social web, users, specifically Twitter users, are facing information overload. Unless a user is willing to restrict the sources (e.g., the number of accounts followed), important information relevant to the user's interests often goes unnoticed. The reasons include: (1) posts may appear at a time when the user is not looking; (2) the user may be unaware of, and hence not following, the information source; and (3) the information may arrive at a rate faster than the user can consume. Furthermore, some temporally relevant information may be of no use if discovered late.
My research addresses these challenges by:
(1) Generating user profiles of interests from Twitter using Wikipedia. The interests gleaned from users' Twitter data can be leveraged by personalization and recommendation systems to reduce information overload (volume) for users.
(2) Filtering Twitter data relevant to dynamically evolving entities. In addition to volume, this addresses the velocity challenge of delivering relevant information in real time. The approach is deployed on Twitris to crawl for dynamic event-relevant tweets for analysis. The prominent aspect of both approaches is the use of a crowd-sourced knowledge base such as Wikipedia.
This presentation examines public opinion in India, including views of national conditions, issues affecting the country, Prime Minister Modi and national institutions. It is based on 2,452 face-to-face interviews with adults 18 and older conducted from April 6 to May 19, 2015.
Causal data mining: Identifying causal effects at scale - Amit Sharma
Identifying causal effects is an integral part of scientific inquiry, spanning a wide range of questions such as understanding behavior in online systems, effect of social policies, or risk factors for diseases. In the absence of a randomized experiment, however, traditional methods such as matching or instrumental variables fail to provide robust estimates because they depend on strong assumptions that are never tested.
My research shows that many of the strong assumptions are testable. This leads to a data mining framework for causal inference from observed data: instead of relying on untestable assumptions, we develop tests for valid experiment-like data---a "natural" experiment---and estimate causal effects only from subsets of data that pass those tests. Two such methods are presented. The first utilizes auxiliary data from large-scale systems to automate the search for natural experiments. Applying it to estimate the additional activity caused by Amazon's recommendation system, I find over 20,000 natural experiments, an order of magnitude more than those in past work. These experiments indicate that less than half of the click-throughs typically attributed to the recommendation system are causal; the rest would have happened anyways. The second method proposes a general Bayesian test that can be used for validating natural experiments in any dataset. For instance, I find that a majority of natural experiments used in recent studies in a premier economics journal are likely invalid. More generally, the proposed framework presents a viable way of doing causal inference in large-scale datasets with minimal assumptions.
As we enter 2016, SEO specialists tend to be great at what they do, but many need to brush up on their blogger outreach and influencer marketing. In this PowerPoint presentation, Andy Crestodina, web strategist and co-founder of Orbit Media Studios, shows how to combine the “dynamic duo” of social media and search to create the ultimate social marketing strategy. Click here to watch the full recording: bit.ly/1Ocweid (If you are going to watch one ppt or slideshow/talk about SEO or social media marketing, this is the one!)
Twitter Based Sentiment Analysis of Each Presidential Candidate Using Long Sh... - CSCJournals
In the era of technology and the internet, people use online social media services such as Twitter, Instagram, Facebook, and Reddit to express their emotions. The idea behind this paper is to understand people’s emotions on Twitter and their opinions about the 2020 US Presidential Election. We collected 1.2 million tweets in total with keywords such as “RealDonaldTrump”, “JoeBiden”, “Election2020”, and other election-related keywords using the Twitter API, and then processed them with a natural language processing toolkit. We trained a Bidirectional Long Short-Term Memory (BiLSTM) model and achieved 93.45% accuracy on our test dataset. We then used the trained model to perform sentiment analysis on the rest of our dataset. Using the sentiment analysis results and a comparison with the 2016 Presidential Election, we made predictions about who could win the 2020 US Presidential Election from pre-election Twitter data. We also analyzed the impact of COVID-19 on people’s sentiment about the election.
How to Find Your Site's True Ranking Factors - Botify
Ranking factor studies rely on third-party data and don't segment by page type or site type, which is why we recommend using first-party data to find what factors correlate with higher or lower rankings on your unique website.
Ph.D. Defense Video: https://www.youtube.com/watch?v=gpuhqjKNnDg
Thesis Statement:
Knowledge-infused Learning (KiL) is a class of Neuro-Symbolic AI techniques that incorporate broader forms of knowledge (lexical, domain-specific, common-sense, and constraint-based) to address limitations of either symbolic or statistical AI approaches, such as model interpretations and user-level explanations. Compared to powerful statistical AI that exploits data, KiL benefits from data as well as knowledge.
Manas Gaur's Ph.D. defense talk investigates the knowledge-infusion strategy in two important ways. The first is to infuse knowledge to make any classification task explainable. The second is to achieve explainability in any natural language generation task. The defense demonstrated effective knowledge-infusion strategies that bring five characteristic properties to any statistical AI model, including (1) Context Sensitivity, (2) Handling Uncertainty and Risk, (3) Interpretability in learning, and (4) User-level Explainability, across natural language understanding (NLU) tasks. Along with its methodological contributions to AI, the dissertation also introduces Knowledge-intensive Language Understanding tasks, a variant of the General Language Understanding Evaluation (GLUE) tasks that challenges AI and NLU research on explainability and interpretability.
Furthermore, the defense showcased the utility of incorporating diverse forms of knowledge: linguistic, commonsense, broad-based, and domain-specific. While the defense illustrated success in various domains, state-of-the-art results in specific applications, and significant contributions towards improving the state of machine intelligence, Manas also described careful steps to prevent errors arising from knowledge infusion. The defense also described Manas's future research direction towards Deep Knowledge Infusion, which would be pivotal in propelling machine understanding.
Improving Natural Language Inference Using External Knowledge in the Science ... - Pavan Kapanipathi
Natural Language Inference (NLI) is fundamental to many Natural Language Processing (NLP) applications, including semantic search and question answering. The NLI problem has gained significant attention due to the release of large-scale, challenging datasets. Present approaches to the problem largely focus on learning-based methods that use only textual information to classify whether a given premise entails, contradicts, or is neutral with respect to a given hypothesis. Surprisingly, the use of methods based on structured knowledge, a central topic in artificial intelligence, has not received much attention vis-a-vis the NLI problem. While there are many open knowledge bases that contain various types of reasoning information, their use for NLI has not been well explored. To address this, we present a combination of techniques that harness external knowledge to improve performance on the NLI problem in the science questions domain. We present the results of applying our techniques to text, graph, and text-and-graph based models, and discuss the implications of using external knowledge to solve the NLI problem. Our model achieves close to state-of-the-art performance for NLI on the SciTail science questions dataset.
User Interests Identification From Twitter using Hierarchical Knowledge Base - Pavan Kapanipathi
Twitter, due to its massive growth as a social networking platform, has been in focus for the analysis of its user-generated content for personalization and recommendation tasks. A common challenge across these tasks is identifying user interests from tweets. Semantic enrichment of Twitter posts to determine user interests has been an active area of research in the recent past. These approaches typically use publicly available knowledge bases (such as Wikipedia) to spot entities and create entity-based user profiles. However, exploitation of such knowledge bases to create richer user profiles is yet to be explored. In this work, we leverage hierarchical relationships present in knowledge bases to infer user interests expressed as a Hierarchical Interest Graph. We argue that the hierarchical semantics of concepts can enhance existing systems to personalize or recommend items at varying levels of conceptual abstraction. We demonstrate the effectiveness of our approach through a user study, which shows that on average approximately eight of the top ten weighted hierarchical interests in the graph are relevant to a user's interests.
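To make the idea of inferring hierarchical interests more concrete, here is a minimal sketch that propagates entity weights upward through a category hierarchy with a decay factor. It is an illustration only: the `PARENTS` hierarchy, the `decay` parameter, and the propagation rule are assumptions made for this example, not the algorithm or data used in the paper.

```python
from collections import defaultdict

# Hypothetical child -> parents hierarchy, loosely imitating a category graph.
PARENTS = {
    "Lionel Messi": ["Footballers"],
    "FC Barcelona": ["Football clubs"],
    "Footballers": ["Football"],
    "Football clubs": ["Football"],
    "Football": ["Sports"],
}


def hierarchical_interests(entity_weights, parents, decay=0.5):
    """Propagate each entity's weight to its ancestors, decaying once per level."""
    scores = defaultdict(float)
    for entity, weight in entity_weights.items():
        frontier = [(entity, weight)]
        seen = {entity}  # visit each ancestor once per source entity
        while frontier:
            next_frontier = []
            for node, w in frontier:
                scores[node] += w
                for parent in parents.get(node, []):
                    if parent not in seen:
                        seen.add(parent)
                        next_frontier.append((parent, w * decay))
            frontier = next_frontier
    return dict(scores)


# Entity weights might come from how often each entity was spotted in a user's tweets.
profile = hierarchical_interests({"Lionel Messi": 2.0, "FC Barcelona": 1.0}, PARENTS)
top = sorted(profile.items(), key=lambda kv: -kv[1])
```

In this toy run the shared ancestor "Football" accumulates weight from both entities, which is the kind of more abstract interest that a flat entity-based profile would miss.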
Presented "Random Walk on Graphs" at the Knoesis reading group, specifically in the context of recommendation.
Reference: Purnamrita Sarkar, Random Walks on Graphs: An Overview
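One common way random walks are used for recommendation is the random walk with restart (personalized PageRank). The sketch below is a plain power-iteration version under assumed parameters: the `restart` probability, the iteration count, and the toy user-item graph are illustrative choices, not taken from the talk.

```python
def random_walk_with_restart(graph, start, restart=0.15, iterations=50):
    """Approximate visit probabilities of a walk that jumps back to `start`
    with probability `restart` at every step (power iteration)."""
    rank = {node: 1.0 if node == start else 0.0 for node in graph}
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, mass in rank.items():
            neighbors = graph[node]
            if not neighbors:
                # Dangling node: return its walking mass to the start node.
                nxt[start] += (1 - restart) * mass
                continue
            share = (1 - restart) * mass / len(neighbors)
            for nb in neighbors:
                nxt[nb] += share
        # Total mass is 1, so the restart step contributes exactly `restart`.
        nxt[start] += restart
        rank = nxt
    return rank


# Toy undirected user-item graph, for illustration only.
graph = {
    "alice": ["item_a", "item_b"],
    "bob": ["item_b", "item_c"],
    "item_a": ["alice"],
    "item_b": ["alice", "bob"],
    "item_c": ["bob"],
}
scores = random_walk_with_restart(graph, "alice")
```

Items that "alice" is not yet linked to (here `item_c`) can then be ranked by their score; nodes closer to the start node in the graph receive more probability mass.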
Improving Natural Language Inference Using External Knowledge in the Science ...Pavan Kapanipathi
Natural Language Inference (NLI) is fundamental to many Natural Language Processing (NLP) applications including semantic search and question answering. The NLI problem has gained significant attention due to the release of large scale, challenging datasets. Present approaches to the problem largely focus on learning-based methods that use only textual information in order to classify whether a given premise entails, contradicts, or is neutral with respect to a given hypothesis. Surprisingly, the use of methods based on structured knowledge – a central topic in artificial intelligence – has not received much attention vis-a-vis the NLI problem. While there are many open knowledge bases that contain various types of reasoning information, their use for NLI has not
been well explored. To address this, we present a combination of techniques that harness external knowledge to improve
performance on the NLI problem in the science questions domain. We present the results of applying our techniques on
text, graph, and text-and-graph based models; and discuss the
implications of using external knowledge to solve the NLI
problem. Our model achieves close to state-of-the-art performance for NLI on the SciTail science questions dataset.
User Interests Identification From Twitter using Hierarchical Knowledge Base (Pavan Kapanipathi)
Twitter, due to its massive growth as a social networking
platform, has been in focus for the analysis of its user generated content for personalization and recommendation tasks. A common challenge across these tasks is identifying user interests from tweets. Semantic enrichment of Twitter posts, to determine user interests, has been an active area of research in the recent past. These approaches typically use available public knowledge-bases (such as Wikipedia) to spot entities and create entity-based user profiles. However, exploitation of such knowledgebases to create richer user profiles is yet to be explored. In this work, we leverage hierarchical relationships present in knowledge-bases to infer user interests expressed as a Hierarchical Interest Graph. We argue that the hierarchical semantics of concepts can enhance existing systems to personalize or recommend items based on a varied level of conceptual abstractness. We demonstrate the effectiveness of our approach through a user study which shows an average of approximately eight of the top ten weighted hierarchical interests in the graph being relevant to a user's interests.
Presented "Random Walk on Graphs" in the reading group for Knoesis. Specifically for Recommendation Context.
Referred: Purnamrita Sarkar, Random Walks on Graphs: An Overview
P Kapanipathi, J Anaya, A Passant. SemPuSH: Privacy-Aware and Scalable Broadcasting for Semantic Microblogging (Demo) at International Semantic Web Conference 2011
Centralized social networking websites raise scalability issues (due to the growing number of participants) and policy concerns (such as control, privacy and ownership of users’ data). Distributed Social Networks aim to solve those by enabling architectures where people own their data and share it whenever and with whomever they wish. However, the privacy and scalability challenges are still to be tackled. Here, we present a privacy-aware extension to Google’s PubSubHubbub protocol, using Semantic Web technologies, solving both the scalability and the privacy issues in Distributed Social Networks. We enhanced the traditional features of PubSubHubbub in order to allow content publishers to decide whom they want to share their information with, using semantic and dynamic group-based definition. We also present the application of this extension to SMOB (our Semantic Microblogging framework). Yet, our proposal is application agnostic, and can be adopted by any system requiring scalable and privacy-aware content broadcasting.
With the rapid growth in users on social networks, there is a corresponding increase in user-generated content, in turn resulting in information overload. On Twitter, for example, users tend to receive uninteresting information because their interests do not fully overlap with those of the people whom they follow. In this paper we present a Semantic Web approach to filter public tweets matching interests from personalized user profiles. Our approach includes automatic generation of multi-domain and personalized user profiles, filtering the Twitter stream based on the generated profiles, and delivering the matching tweets in real time. Given that users' interests and personalization needs change with time, we also discuss how our application can adapt to these changes.
14. Dynamic Topics
Manually updating keywords to get topic-relevant tweets is not feasible
“indianelection” “modi” “bjp” “congress”
“jan25” “egypt” “tunisia” “arabspring”
“sandy” “newyork” “redcross” “fema”
“swineflu” “ebola”
15. Problem
How can we automatically update the filters to track a dynamically evolving topic on Twitter?
16. Hashtags as Filters
• Identify a topic on Twitter
• Tweets with hashtags are more informative
• Users have a lot of freedom to create them
• Some get popular, most die
28. Clustering Co-efficient of Hashtag Co-occurrence network (1%)
Clustering co-efficient
The top hashtags co-occur with each other the most
29. Determining Relevancy of Co-occurring Hashtags
#indianelection2015
#modikisarkar
Co-occurring: Threshold δ
Preferably a prominent hashtag
30. Does Hashtag Co-occurrence Work?
o No. Co-occurrence alone does not work
o Many noisy or unrelated hashtags co-occur
o Determine the “dynamic” relevance of the top co-occurring hashtags to the dynamic topic
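The plain co-occurrence baseline that this slide argues is insufficient on its own can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `top_cooccurring`, the regular expression, and the sample tweets are all assumptions made for the example; only the seed hashtags and the "top 25" cut-off come from the slides.

```python
import re
from collections import Counter

# Naive hashtag pattern (assumption): '#' followed by word characters.
HASHTAG = re.compile(r"#\w+")

def top_cooccurring(tweets, seed, n=25):
    """Count hashtags that share a tweet with `seed` and return the n most frequent."""
    seed = seed.lower()
    counts = Counter()
    for text in tweets:
        tags = {t.lower() for t in HASHTAG.findall(text)}
        if seed in tags:
            counts.update(tags - {seed})  # every other hashtag in the same tweet
    return counts.most_common(n)

# Made-up tweets, for illustration only.
sample_tweets = [
    "Rally today #indianelection2015 #modikisarkar",
    "#modikisarkar trending #indianelection2015 #bjp",
    "Great match! #cricket #indianelection2015",
    "No seed hashtag here #bjp #cricket",
]
ranked = top_cooccurring(sample_tweets, "#indianelection2015")
```

Note that `#cricket` enters the ranking just because it shares one tweet with the seed, which is the kind of noise the relevance check on the following slides is meant to filter out.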
31. Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold δ
Latest K (200, 500)
Entity Extraction and Scoring (Normalized Frequency Scoring):
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
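One way to read "normalized frequency scoring" over the latest K tweets of a hashtag is sketched below. The slides do not say how the frequency is normalized or how entities are spotted, so two things here are assumptions: scores are the fraction of the latest K tweets that mention an entity, and entity spotting is a naive substring lookup in a hypothetical `surface_forms` dictionary (a real system would use an entity linker).

```python
from collections import Counter

def score_entities(tweets, surface_forms, k=200):
    """Score each entity by the fraction of the latest k tweets that mention it."""
    latest = tweets[-k:]  # the most recent k tweets (all of them if fewer exist)
    counts = Counter()
    for text in latest:
        lowered = text.lower()
        # Naive spotting (assumption): a surface form appearing anywhere in the text.
        mentioned = {entity for form, entity in surface_forms.items() if form in lowered}
        counts.update(mentioned)  # count each entity at most once per tweet
    total = len(latest)
    return {entity: c / total for entity, c in counts.items()} if total else {}

# Hypothetical surface forms and made-up tweets, for illustration only.
surface_forms = {"modi": "Narendra Modi", "bjp": "BJP", "nda": "NDA"}
tweets = ["Modi wins", "BJP and Modi rally", "NDA meeting today", "hello world"]
scores = score_entities(tweets, surface_forms, k=200)
```

The resulting dictionary plays the role of the hashtag's entity vector in the later similarity check.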
32. Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold δ
Latest K (200, 500)
Entity Extraction and Scoring:
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
Indian General Election, 2014: Dynamically Updated Background Knowledge
35. Event Relevant Background Knowledge
o Entities mentioned on the Event page of Wikipedia are relevant to the Event
36. Event Relevant Background Knowledge – Graph Structure
o Wikipedia’s hyperlink structure is very rich
o Page-Page (Wikipedia) links
Pages shown: Indian General Election, 2014; Narendra Modi; Rahul Gandhi; NDA (India); UPA (India); BJP; Indian National Congress
37. Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold δ
Latest K (200, 500)
Entity Extraction and Scoring:
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
Indian General Election, 2014
Extract, Periodically Update Hyperlink structure (One hop from Event Page)
38. Event Relevant Background Knowledge
o Hyperlink structure is dynamically updated
Pages shown: Indian General Election, 2014; Narendra Modi; Rahul Gandhi; NDA (India); UPA (India); BJP; Indian National Congress
Dates shown: 10 May 2010
39. Event Relevant Background Knowledge
o Hyperlink structure is dynamically updated
Pages shown: Indian General Election, 2014; Narendra Modi; Rahul Gandhi; NDA (India); UPA (India); BJP; Indian National Congress
Dates shown: 10 May 2010; 29 March 2013 (four times)
40. Event Relevant Background Knowledge
o Hyperlink structure is dynamically updated
Pages shown: Indian General Election, 2014; Narendra Modi; Rahul Gandhi; NDA (India); UPA (India); BJP; Indian National Congress
Dates shown: 10 May 2010; 29 March 2013 (four times); 20 May 2013 (twice)
41. Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold δ
Latest K (200, 500)
Entity Extraction and Scoring:
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
Indian General Election, 2014
Extract, Periodically Update Hyperlink structure (One hop from Event Page)
Entity scoring based on relevance to the Event
42. Hyperlink Entity Scoring
o Edge Based Measure
o Link Overlap Measure: Jaccard similarity
o Out(c) are the links in Wikipedia page “c”
o Final Score: r(c,E) = ed(c,E) + oco(c,E)
Pages shown: India General Election, 2014; Narendra Modi; India General Election, 2014; India General Election, 2009
Labels shown: 1; Mutually Important; ed(c,E) = 1; ed(c,E) = 2
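The scoring on this slide, r(c,E) = ed(c,E) + oco(c,E), can be sketched over an in-memory link graph. The slide does not define ed(c,E) beyond the values 1 and 2 and the label "Mutually Important", so the reading used here is an assumption: 1 when only one of the two pages links to the other, 2 when they link to each other. The `out_links` dictionary is a hypothetical toy graph, not real Wikipedia link data.

```python
def edge_score(c, event, out_links):
    """ed(c, E): 1 for a link in one direction, 2 for mutual links (assumed reading)."""
    forward = c in out_links.get(event, set())   # event page links to c
    backward = event in out_links.get(c, set())  # c links back to the event page
    return int(forward) + int(backward)

def overlap_score(c, event, out_links):
    """oco(c, E): Jaccard similarity of Out(c) and Out(E)."""
    a, b = out_links.get(c, set()), out_links.get(event, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def relevance(c, event, out_links):
    """r(c, E) = ed(c, E) + oco(c, E), as stated on the slide."""
    return edge_score(c, event, out_links) + overlap_score(c, event, out_links)

# Hypothetical outgoing links per page, for illustration only.
EVENT = "Indian general election, 2014"
out_links = {
    EVENT: {"Narendra Modi", "BJP", "Indian National Congress",
            "Indian general election, 2009"},
    "Narendra Modi": {EVENT, "BJP"},
    "Indian general election, 2009": {"Indian National Congress"},
}
modi_score = relevance("Narendra Modi", EVENT, out_links)
prev_election_score = relevance("Indian general election, 2009", EVENT, out_links)
```

In this toy graph the mutually linked page scores higher than the page that is only linked from the event page, which matches the intent of the "Mutually Important" label.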
43. Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold δ
Latest K (200, 500)
Entity Extraction and Scoring:
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
Indian General Election, 2014
Extract, Periodically Update Hyperlink structure (One hop from Event Page)
Entity scoring based on relevance to the Event:
Indian General Elec: 1.0
India: 0.9
Elections: 0.7
UPA: 0.6
BJP: 0.3
NDA: 0.3
Narendra Modi: 0.3
44. Determining Relevancy of Co-occurring Hashtags (Vector Space Model)
#indianelection2015
#modikisarkar
Co-occurring: Threshold δ
Latest K (200, 500)
Entity Extraction and Scoring:
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
Indian General Election, 2014
Extract, Periodically Update Hyperlink structure (One hop from Event Page)
Entity scoring based on relevance to the Event:
Indian General Elec: 1.0
India: 0.9
Elections: 0.7
UPA: 0.6
BJP: 0.3
NDA: 0.3
Narendra Modi: 0.3
Similarity Check
Relevance Score: 0.6
45. Similarity Check
o Set Based
   o Jaccard Similarity
   o Considers the entities without the scores
o Vector Based
   o Symmetric: Cosine Similarity
   o Asymmetric: Subsumption Similarity
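The set-based and symmetric vector-based checks can be sketched on the two example entity vectors from the preceding slides. This is an illustrative sketch only: the asymmetric subsumption measure is left out because the slides do not define it, and the slides' illustrative "Relevance Score: 0.6" is not claimed to come from either function below.

```python
import math

def jaccard(a, b):
    """Set-based: compares which entities appear, ignoring their scores."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0

def cosine(a, b):
    """Vector-based and symmetric: cosine of the angle between the weighted vectors."""
    dot = sum(w * b[e] for e, w in a.items() if e in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Entity vectors copied from the slides' running example.
hashtag_vec = {"Narendra Modi": 0.9, "BJP": 0.7, "NDA": 0.6, "India": 0.4,
               "Elections": 0.2, "Rahul Gandhi": 0.2, "Congress": 0.2}
event_vec = {"Indian General Elec": 1.0, "India": 0.9, "Elections": 0.7,
             "UPA": 0.6, "BJP": 0.3, "NDA": 0.3, "Narendra Modi": 0.3}

set_similarity = jaccard(hashtag_vec, event_vec)
vector_similarity = cosine(hashtag_vec, event_vec)
```

A co-occurring hashtag would then be kept as a filter when its similarity to the event vector reaches the threshold δ; the slides do not give a value for δ.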
48. Evaluation – Dataset
o 2 events
o US Presidential Elections (#election2012)
o Hurricane Sandy (#sandy)
o Top 25 co-occurring hashtags
49. Evaluation – Strategy
o Ranking Problem
o Rank the Top 25 hashtags based on the relevancy of tweets to the event
o Experiment with all the similarity metrics
o Manually annotated the tweets of these hashtags as relevant/irrelevant (Gold Standard)
o Ranking Evaluation Metrics
o Mean Average Precision
o NDCG
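The two ranking metrics named above can be sketched for a ranked list of binary gold-standard labels (1 = relevant, 0 = irrelevant). These are the textbook definitions, not the author's evaluation script. Average precision is normalized here by the number of relevant items in the ranked list, which assumes every relevant item appears somewhere in the ranking; the sample ranking is made up.

```python
import math

def average_precision(relevance):
    """Mean of precision@rank taken at each rank that holds a relevant item."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(rankings):
    """MAP: average precision averaged over several rankings (e.g. one per event)."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

def ndcg(relevance):
    """Normalized discounted cumulative gain with a log2 rank discount."""
    def dcg(rels):
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))
    ideal = dcg(sorted(relevance, reverse=True))
    return dcg(relevance) / ideal if ideal else 0.0

# Made-up labels for a ranking of four hashtags, best-ranked first.
example_ranking = [1, 0, 1, 0]
ap = average_precision(example_ranking)
gain = ndcg(example_ranking)
```

Both metrics reward rankings that place relevant hashtags near the top, which is why they suit the "rank the top 25 hashtags" framing of the evaluation.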
56. User Modeling
o User Interest Identification on Twitter
o Content-based (Only Tweets)
o Term-based (semantic, web, #semanticweb)
o Entity-based (semantic web <same as> #semanticweb)
o Interest Graphs derived from knowledge-base (Hierarchical Interest Graphs)
o Collaborative (Users’ Friends)
o Hybrid
60. What is in your mind? (Next concept/term)
Fruit
61. What is in your mind? (Next concept/term)
Fruit
Other Fruit Names
62. Cognitive Science
o Human memory has been argued to be structured as a hierarchy of concepts (Semantic Network)
o Spreading activation theory has been utilized to simulate search on a semantic network
o This theory has not been well explored for user interest modeling
63. Hierarchical Interest Graphs
o Extending user profiles from Twitter to comprise a hierarchy of concepts
o The hierarchy of concepts is derived from the Wikipedia Category Structure
o Each concept in the hierarchy is scored based on the user's extent of interest
76. Example
Activation Function: determines the extent of spreading
Nodes shown: M S Dhoni, Virat Kohli, Sachin Tendulkar, Indian Cricketers, Indian Cricket, Cricket, Sports
Scores shown: 0.8, 0.2, 0.6, 0.5, 0.4, 0.25, 0.1
87. Boosting Common Ancestors
Nodes shown: M S Dhoni, Virat Kohli, Sachin Tendulkar, Indian Cricketers, Indian Cricket, Michael Clarke, Shane Watson, Australian Cricketers, Australian Cricket, Cricket, Sports
Values shown: 3, 3, 5, 5, 2, 2
89. Activation Functions
o Bell
$A_j = \sum_{i=0}^{n} A_i \times F_j$
o Bell Log
$A_j = \sum_{i=0}^{n} A_i \times FL_j$
o Priority Intersect
$A_j = \sum_{i=0}^{n} A_i \times FL_j \times P_{ji} \times B_j$
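The first activation function (a node's activation is the sum of its children's activations, scaled by a per-node factor F_j) can be sketched as a bottom-up pass over the category hierarchy. This is a minimal sketch under stated assumptions: the slides do not define F_j, so the `factor` values are hypothetical placeholders, the function name `spread_activation` is invented for the example, and the output is not meant to reproduce the scores on the example slide. It relies on `graphlib` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

def spread_activation(leaf_scores, children_of, factor):
    """Bottom-up pass: A_j = sum of A_i over children i of j, times F_j."""
    activation = dict(leaf_scores)
    # static_order() yields every child before any of its parents.
    for node in TopologicalSorter(children_of).static_order():
        children = children_of.get(node)
        if not children:
            continue  # leaf entity: keeps its initial score
        incoming = sum(activation.get(child, 0.0) for child in children)
        activation[node] = incoming * factor.get(node, 1.0)
    return activation

# Category hierarchy from the example slide: parent -> set of children.
children_of = {
    "Indian Cricketers": {"M S Dhoni", "Virat Kohli", "Sachin Tendulkar"},
    "Indian Cricket": {"Indian Cricketers"},
    "Cricket": {"Indian Cricket"},
    "Sports": {"Cricket"},
}
leaf_scores = {"M S Dhoni": 0.8, "Virat Kohli": 0.2, "Sachin Tendulkar": 0.6}
# Hypothetical per-node factors F_j; the slides give no values for them.
factor = {name: 0.5 for name in children_of}

interest_scores = spread_activation(leaf_scores, children_of, factor)
```

Swapping in a different `factor` dictionary changes how quickly activation decays toward broad categories such as Sports, which is the role the slides assign to the activation function.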
90. Evaluation
User Study
• 37 Users
• 30K Tweets
Evaluated the top-10 categories of interests derived from the hierarchy
• 76% Mean Average Precision
• 98% Mean Reciprocal Recall
• 70% are not mentioned in tweets
91. Tweet Recommendation using Hierarchical Interest Graph
o Working on a Tweet recommendation system that utilizes Hierarchical Interest Graph
o Preliminary results are “interesting”
92. Conclusion
o Focus on “Information” overload instead of “Data” overload.
o Personalized Information Filtering
o Knowledge-base enabled solutions for challenges in Tweets filtering
o Wikipedia hyperlink structure and category graph leveraged for Twitter data filtering
o More Research on User Specific Attribute Extraction (Personalization) from Twitter Data
o Activity Estimation
o Location Prediction
95. kHealth: knowledge-enabled healthcare
Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information
canary in a coal mine
Empowering Individuals (who are not Larry Smarr!) for their own health
97. Motivational Scenario
A diabetic patient is interested in keeping himself up to date with new information about diabetes
Manually going through news articles, diabetes forums, blogs, etc.
- Time consuming
- Relevant? Interesting? Informative? Useful?
How about all the relevant and important health information aggregated at one platform?
98. Search and Explore
X Controls Cancer, where X = diet, treatment, exercise (Pattern-based Approach leveraging domain semantics)
Top Health News: Informative news about selected disease
Faceted search (by health topics)
Learn about disease (Source: Wikipedia)
Navigation: Home, Search & Explore, Top Health News, Tweet Traffic, Learn about Disease