Twitter has brought a paradigm shift in the way we produce and curate information about real-life events. Huge volumes of user-generated tweets related to events are produced on Twitter, but not all of them are useful and informative. A sizable share of tweets are spam and colloquial personal status updates, which do not provide any useful information about an event. Thus, it is necessary to identify, rank and segregate event-specific informative content from the tweet streams. In this paper, we develop a novel generic framework, based on the principle of mutual reinforcement, for identifying event-specific informative content from Twitter. Mutually reinforcing relationships between tweets, hashtags, text units, URLs and users are defined and represented using TwitterEventInfoGraph. We propose an algorithm, TwitterEventInfoRank, that simultaneously ranks tweets, hashtags, text units, URLs and the users producing them in terms of event-specific informativeness by leveraging the semantics of the relationships between them as represented by TwitterEventInfoGraph. Experiments and observations are reported on approximately four million tweets collected for five real-life events and evaluated against popular baseline techniques, showing significant improvement in performance.
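The principle of mutual reinforcement mentioned in the abstract can be illustrated with a small sketch. The code below is not TwitterEventInfoRank: it is a generic HITS-style iteration on a toy graph, under the assumption (made here only for illustration) that a tweet's score is the sum of the scores of the hashtags, URLs and users attached to it, and that each of those features in turn sums the scores of its tweets. The node names, the update rule and the function `mutual_reinforcement` are all hypothetical.

```python
import math

def _normalise(scores):
    """Scale scores to unit L2 norm so values stay bounded across iterations."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}

def mutual_reinforcement(edges, iterations=50):
    """Score tweets and their attached features so that each reinforces the other.

    edges: list of (tweet, feature) pairs; a feature may be a hashtag, URL or user.
    This is an illustrative HITS-style update, not the paper's algorithm.
    """
    tweets = sorted({t for t, _ in edges})
    features = sorted({f for _, f in edges})
    tweet_score = {t: 1.0 for t in tweets}
    feature_score = {f: 1.0 for f in features}

    for _ in range(iterations):
        # A feature gains score from the tweets that contain it.
        new_feature = {f: 0.0 for f in features}
        for t, f in edges:
            new_feature[f] += tweet_score[t]
        # A tweet gains score from the features it contains.
        new_tweet = {t: 0.0 for t in tweets}
        for t, f in edges:
            new_tweet[t] += new_feature[f]
        feature_score = _normalise(new_feature)
        tweet_score = _normalise(new_tweet)
    return tweet_score, feature_score

# Toy data (made up): two event-related tweets and one personal status update.
edges = [
    ("tweet1", "#event"), ("tweet1", "url:news"), ("tweet1", "user:reporter"),
    ("tweet2", "#event"), ("tweet2", "user:reporter"),
    ("tweet3", "#lunch"), ("tweet3", "user:casual"),
]
tweet_score, feature_score = mutual_reinforcement(edges)
ranking = sorted(tweet_score, key=tweet_score.get, reverse=True)
print(ranking)
```

In this toy graph the tweets that share well-connected features end up ranked above the isolated one, which is the intuition behind ranking tweets and their attributes jointly rather than one at a time.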
A Framework for Collecting, Extracting and Managing Event Identity Informatio... - Debanjan Mahata
With the popularity of Twitter, there has been voluminous growth in the digital footprints of real-life events on the Internet. The references to different types of events on Twitter have the potential to provide extremely valuable information to researchers and organizations, which could be mined and analyzed for making major decisions. There are many applications in the areas of real-life event analysis, opinion mining, reference tracking, online advertising, recommendation engines, cyber security, event management and enterprise data integration, among others. Thus, there is a need for a generic framework that can collect different event references, extract identity information of the events from them, and maintain that information persistently for resolving new references to the events and providing updated analytics. The presented research establishes the design and implementation of such a framework from the perspective of Event Identity Information Management (EIIM) in the domain of Twitter. The paper introduces the problem of EIIM in Twitter, discusses the prevalent challenges and proposes the design of a framework capable of managing persistent identity information for a pre-specified set of events. We explore the applications of the research, validate the different components of the framework and conclude with our comments on various criteria showing the high efficacy and practical utility of our proposed framework.
Chung-Jui LAI - Polarization of Political Opinion by News Media - REVULN
In the 2016 US election, social media played a vital role in shaping public opinion as expressed by the news media, which has created the phenomenon of polarization in the United States. Because social media gives people the ability to follow, share, post and comment on everything, political opinions spread easily and quickly on social media by news agencies are producing a significantly polarized populace.
Consequently, it’s very important to understand the language differences on Twitter and to figure out how propaganda spread by different political parties influences or perhaps misleads public opinion. This talk introduces the relationship among social media, public opinion and news media, then suggests a method to collect tweets from Twitter and conduct sentiment and logistic regression analysis on them. Furthermore, the talk points out the relationship between polarization and the topic of this conference (fake news, disinformation and propaganda).
Main points:
- situation in Taiwan
- research on fake news
- methods for fighting fake news
This presentation gives an overview of open data. A number of case studies are given on the spatio-temporal analysis and visualization of social media data (Twitter). The presentation also explains how to create a heatmap visualisation using R.
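The presentation itself uses R for the heatmap. As a language-neutral illustration of the step that underlies any such heatmap, the sketch below (in Python, with made-up coordinates and a hypothetical `bin_points` helper) counts geotagged points per grid cell; a plotting library would then colour each cell by its count.

```python
from collections import Counter

def bin_points(points, cell_size):
    """Count (lat, lon) points per square grid cell of cell_size degrees."""
    counts = Counter()
    for lat, lon in points:
        # Floor division maps each coordinate to the index of its grid cell.
        cell = (int(lat // cell_size), int(lon // cell_size))
        counts[cell] += 1
    return counts

# Made-up tweet locations for illustration only.
points = [(53.34, -6.26), (53.35, -6.25), (53.80, -6.10), (51.90, -8.47)]
grid = bin_points(points, cell_size=0.5)
for cell, count in sorted(grid.items()):
    print(cell, count)
```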
Legacy 2.0: the democratization of history and the future of stories - Tara Hunt
This is the presentation I gave in San Antonio on March 12, 2010 on the role of the crematory, cemetery and funeral association in preserving, navigating and helping us discover our digital legacies.
Twitter for Irish Archives, Archivists & Records Managers - learnaboutarchives
Is Twitter for the birds?
A brief look at Twitter and its usage by @archivesireland to promote the work of www.learnaboutarchives.ie and the Archives & Records Association Ireland.
News Diffusion on Twitter: Comparing the Dissemination Careers for Mainstream... - Axel Bruns
Paper by Axel Bruns and Tobias Keller, presented at the Social Media & Society 2020 conference, 22 July 2020. A video of the presentation is here: https://www.youtube.com/watch?v=pCKpDkC8iqI.
Butterfly Hunt: On Collecting #mla14 Tweets (#mla15 #s398) - Dr Ernesto Priego
Presentation for the panel "The MLA and its Data: Remix, Reuse and Research", 5:15 - 6:30pm, Modern Language Association Convention 2015, Vancouver Conference Center, 121, VCC West.
Twitter is now an established and widely popular news medium. Be it normal banter or a discussion of high-impact events like the Boston marathon blasts, the February 2014 US Icestorm, etc., people use Twitter to get updates and to broadcast their thoughts and views. Twitter bots have become very common and accepted. People use them to get updates about emergencies like natural disasters and terrorist strikes, as well as updates about different places and events, both local and global. Twitter bots give these users a means to perform tasks on Twitter that are both simple and structurally repetitive, at a much higher rate than would be possible for a human alone. During high-impact events these bots tend to provide a time-critical and comprehensive information source, with information aggregated from various sources. In this study, we present how these bots participate in discussions and augment them during high-impact events. We identify bots in 5 high-impact events for 2013: Boston blasts, February 2014 US Icestorm, Washington Navy Yard Shooting, Oklahoma tornado, and Cyclone Phailin. We identify bots among top tweeters by having all such accounts manually annotated. We then study their activity and present many important insights. We determine the impact bots have on information diffusion during these events and how they tend to aggregate and broker information from various sources to different users. We also analyze their tweets and list important differentiating features between bots and non-bots (normal or human accounts) during high-impact events. We also show how bots are slowly moving away from traditional API-based posts towards web automation platforms like IFTTT, dlvr.it, etc. Using standard machine learning, we propose a methodology to identify bots and non-bots in real time during high-impact events.
This study also looks into how the bot scenario has changed by comparing data from high-impact events from 2013 against data from similar types of events from 2011. Bots active in high-impact events generally don't spread malicious content. Lastly, we present an in-depth analysis of Twitter bots that were active during the 2013 Boston Marathon Blast. We show how bots, because of their programming structure, don't pick up rumors easily during these events, and even if they do, they do so after a long time.
Text mining, also known as text analytics or text data mining, is a data mining technique that involves extracting meaningful information and knowledge from unstructured textual data. It involves the process of analyzing and deriving insights from large volumes of text data to uncover patterns, trends, and relationships.
Here's an overview of text mining:
Data Collection: The first step in text mining is gathering relevant textual data from various sources, such as documents, web pages, social media, emails, customer reviews, or survey responses.
Text Preprocessing: Text data often requires preprocessing to clean and prepare it for analysis. This involves removing irrelevant information like stopwords (common words like "and" or "the"), punctuation, and special characters. It may also involve stemming or lemmatization to reduce words to their root form.
Tokenization: Tokenization is the process of splitting the text into individual words or tokens. It is a fundamental step that converts the text into a format suitable for analysis.
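The preprocessing and tokenization steps described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the stopword list is a tiny made-up sample, the function names are hypothetical, and stemming or lemmatization is left out because it would normally rely on a dedicated NLP library.

```python
import re

# Tiny illustrative stopword list; real lists contain hundreds of entries.
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}

def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def preprocess(text):
    """Tokenize the text and remove stopwords."""
    return [tok for tok in tokenize(text) if tok not in STOPWORDS]

tokens = preprocess("The storm is moving to the coast, and the roads are closed!")
print(tokens)
```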
Text Mining Techniques:
Sentiment Analysis: Sentiment analysis aims to determine the sentiment or opinion expressed in a piece of text, whether it's positive, negative, or neutral. It is commonly used for analyzing customer feedback, social media sentiment, or online reviews.
Topic Modeling: Topic modeling is a technique used to discover latent topics or themes within a collection of documents. It helps identify the main subjects or areas of discussion in the text data.
Named Entity Recognition (NER): NER identifies and extracts specific entities such as names of people, organizations, locations, dates, or product names mentioned in the text.
Text Classification: Text classification involves categorizing or classifying documents into predefined categories or labels. It can be used for tasks such as spam detection, sentiment classification, or topic classification.
Text Clustering: Text clustering aims to group similar documents together based on their textual content. It helps in discovering patterns or similarities within the text data without predefined categories.
Information Extraction: Information extraction focuses on identifying structured information from unstructured text, such as extracting key phrases, relationships, or events mentioned in the text.
Text Summarization: Text summarization aims to generate a concise summary of a long text or document, capturing the main ideas and important information.
Visualization: Text mining often involves visualizing the results to gain insights and communicate findings effectively. Word clouds, bar charts, network diagrams, or heatmaps are examples of visualizations commonly used in text mining.
Interpretation and Applications: The interpretation of text mining results involves extracting meaningful insights, patterns, or knowledge from the analyzed text data. These insights can be used for various applications such as market research, customer feedback ana
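Of the techniques listed above, text classification is one of the easiest to demonstrate end to end. The sketch below is a small multinomial Naive Bayes classifier with add-one smoothing, written from scratch for illustration; the class name, the whitespace tokenization and the four training sentences are assumptions made for this example, and a real project would more likely use an established machine learning library.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Minimal multinomial Naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, docs, labels):
        for doc, label in zip(docs, labels):
            words = doc.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)

    def predict(self, doc):
        words = doc.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus the sum of smoothed log likelihoods of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for w in words:
                count = self.word_counts[label][w] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Made-up training data for a spam-detection example.
docs = [
    "win free prize now",
    "free money click now",
    "meeting moved to friday",
    "lunch on friday with team",
]
labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayes()
model.fit(docs, labels)
print(model.predict("free prize click"))
print(model.predict("team meeting friday"))
```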
[cb22] From Parroting to Echoing: The Evolution of China’s Bots-Driven Info... - CODE BLUE
Social media is no doubt a critical battlefield for threat actors to launch InfoOps, especially at a critical moment such as wartime or election season. We have seen Bot-Driven Information Operations (InfoOps, aka influence campaigns) attempt to spread disinformation, incite protests in the physical world, and dox journalists.
China's Bots-Driven InfoOps, despite operating on a massive scale, are often considered to have low impact and very little organic engagement. In this talk, we will share our observations on these persistent Bots-Driven InfoOps and dissect their harmful disinformation campaigns circulated in cyberspace.
In the past, most bots-driven operations simply parroted narratives of the Chinese propaganda machine, mechanically disseminating the same propaganda and disinformation artifacts made by Chinese state media. Recently, however, we have seen newly created bots turn to posting artifacts in a livelier manner. They use various tactics, including reposting screenshots of forum posts and disguising themselves as members of the “Milk Tea Alliance,” to create a false appearance that such content is being echoed across cyberspace.
We particularly focus on ongoing Chinese bots-driven InfoOps targeting Taiwan, which we dub "Operation ChinaRoot." Starting in mid-2021, the bots have been disseminating manipulated information about Taiwan's local politics and Covid-19 measures. Our further investigation has also identified the linkage between Operation ChinaRoot and other Chinese state-linked networks such as DRAGONBRIDGE and Spamouflage.
Bitcoin Blockchains on Twitter timelines: A Social Media analysis of cryptocu... - Alexia Maddox
Presentation at Concordia University October 2018
Dr Alexia Maddox, Lecturer in Communications, School of Communication and Creative Arts, Deakin University.
Email: a.maddox@deakin.edu.au
Keywords:
Social media analysis; twitter; cryptocurrencies; social disruption; digital trace data.
Abstract
Cryptocurrencies represent emerging financial technologies engendered through overlapping community values of decentralised peer-to-peer exchange, encryption technologies and an overarching agenda towards the disruption of centralised banking within the fiat economy. This paper will trace the development and shifts in public discourse within social media surrounding cryptocurrencies. The last five years have seen cryptocurrencies move from technological emergence to a broadening range of applications and a history potholed with disputes, divergence, hacks and scams within the community. The accompanying influence of speculation has shifted the focus from social adoption to value volatility and seen the incorporation of associated technologies within banking and other organisational processes. The emphasis within public discourse has also followed a shift from bitcoin to blockchain. The study is grounded in a Twitter analysis of cryptocurrency-related social media discourse within the Australian context. The social media analysis works with archives of the Australian Twittersphere captured between early 2012 and May 2017. Access to this curated archive is through TrISMA, and the timeframe under analysis aligns with the most detailed available dataset. The analysis seeks to characterise the emergence of public dialogue surrounding cryptocurrency use and application over time, focusing on peak engagement events. The key concepts directing the focus and interpretation of the social media analysis include financial inclusion, socio-technical disruption and social change. The whimsical quest of the study is to learn where the digital frontier has shifted to within this community and to point to possible future developments.
From a community studies perspective the case study represents an initial foray into data analytics to explore whether it is possible to detect the shifting shape and form of digital community through its environmental imprint (Maddox 2016). This methodological aspect of the work speaks to an attempt to generate a data recognition practice that can be deployed to search for signatures of social disruption within digital trace data.
Bio:
Alexia Maddox is a digital sociologist whose research interests are community studies, research methods and digital frontiers. Her recent book, Research Methods and Global Online Communities: a case study, with Routledge, combines these areas and forms the basis of her study of emerging communities forming through the internet and cryptography.
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf - GetInData
Recently we have observed the rise of open-source Large Language Models (LLMs) that are community-driven or developed by AI market leaders, such as Meta (Llama3), Databricks (DBRX) and Snowflake (Arctic). At the same time, there is growing interest in specialized, carefully fine-tuned yet relatively small models that can efficiently assist programmers in day-to-day tasks. Finally, Retrieval-Augmented Generation (RAG) architectures have gained a lot of traction as the preferred approach to LLM context and prompt augmentation for building conversational SQL data copilots, code copilots and chatbots.
In this presentation, we will show how we built on these three concepts to create a robust Data Copilot that can help democratize access to company data assets and boost the performance of everyone working with data platforms.
Why do we need yet another (open-source) Copilot?
How can we build one?
Architecture and evaluation
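The retrieval step of a Retrieval-Augmented Generation setup, as mentioned in the description above, can be sketched without any model at all. The example below is not the presenters' architecture: it assumes a hypothetical set of table descriptions, uses plain word-count vectors with cosine similarity as a stand-in for learned embeddings and a vector store, and stops at building the augmented prompt instead of calling an LLM.

```python
import math
from collections import Counter

def vectorize(text):
    """Represent text as a bag-of-words count vector (stand-in for an embedding)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = vectorize(question)
    ranked = sorted(documents, key=lambda d: cosine(q, vectorize(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, documents, k=1):
    """Augment the question with retrieved context before it is sent to a model."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer with a SQL query."

# Hypothetical schema descriptions standing in for company data assets.
documents = [
    "table orders has columns order_id customer_id total and created_at",
    "table customers has columns customer_id name and country",
    "table products has columns product_id title and price",
]
question = "what is the total of orders per customer_id"
prompt = build_prompt(question, documents)
print(prompt)
```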
Legacy 2.0: the democratization of history and the future of storiesTara Hunt
This is the presentation I gave in San Antonio on March 12, 2010 on the role of the crematory, cemetery and funeral association in preserving, navigating and helping us discover our digital legacies.
Twitter for Irish Archives, Archivists & Records Managers.learnaboutarchives
Is twitter for the birds?
A brief look at twitter and its usage by @archivesireland to promote the work of www.learnaboutarchives.ie and the Archives & Records Association Ireland.
News Diffusion on Twitter: Comparing the Dissemination Careers for Mainstream...Axel Bruns
Paper by Axel Bruns and Tobias Keller, presented at the Social Media & Society 2020 conference, 22 July 2020. A video of the presentation is here: https://www.youtube.com/watch?v=pCKpDkC8iqI.
Butterfly Hunt: On Collecting #mla14 Tweets (#mla15 #s398)Dr Ernesto Priego
Presentation for the panel "The MLA and its Data: Remix, Reuse and Research, 5:15 - 6:30pm, Modern Language Association Convention 2015, Vancouver Conference Center, 121, VCC West.
Twitter is now an established and a widely popular news medium. Be it normal banter or a discussion on high impact events like Boston marathon blasts, February 2014 US Icestorm, etc., people use Twitter to get updates and also broadcast their thoughts and views. Twitter bots have today become very common and acceptable. People are using them to get updates about emergencies like natural disasters, terrorist strikes, etc., users also use them for getting updates about different places and events, both local and global. Twitter bots provide these users a means to perform certain tasks on Twitter that are both simple and structurally repetitive, at a much higher rate than what would be possible for a human alone. During high impact events these Twitter bots tend to provide a time critical and a comprehensive information source with information aggregated form various different sources. In this study, we present how these bots participate in discussions and augment them during high impact events. We identify bots in 5 high impact events for 2013: Boston blasts, February 2014 US Icestorm, Washington Navy Yard Shooting, Oklahoma tornado, and Cyclone Phailin. We identify bots among top tweeters by getting all such accounts manually annotated. We then study their activity and present many important insights. We determine the impact bots have on information diffusion during these events and how they tend to aggregate and broker information from various sources to different users. We also analyzed their tweets, list down important differentiating features between bots and non bots (normal or human accounts) during high impact events. We also show how bots are slowly moving away from traditional API based posts towards web automation platforms like IFTTT, dlvr.it, etc. Using standard machine learning, we proposed a methodology to identify bots/non bots in real time during high impact events. 
This study also looks into how the bot scenario has changed by comparing data from high impact events from 2013 against data from similar type of events from 2011. Bots active in high impact events generally don't spread malicious content. Lastly, we also go through an in-depth analysis of Twitter bots who were active during 2013 Boston Marathon Blast. We show how bots because of their programming structure don't pick up rumors easily during these events and even if they do; they do it after a long time.
Text mining, also known as text analytics or text data mining, is a data mining technique that involves extracting meaningful information and knowledge from unstructured textual data. It involves the process of analyzing and deriving insights from large volumes of text data to uncover patterns, trends, and relationships.
Here's an overview of text mining:
Data Collection: The first step in text mining is gathering relevant textual data from various sources, such as documents, web pages, social media, emails, customer reviews, or survey responses.
Text Preprocessing: Text data often requires preprocessing to clean and prepare it for analysis. This involves removing irrelevant information like stopwords (common words like "and" or "the"), punctuation, and special characters. It may also involve stemming or lemmatization to reduce words to their root form.
Tokenization: Tokenization is the process of splitting the text into individual words or tokens. It is a fundamental step that converts the text into a format suitable for analysis.
Text Mining Techniques:
Sentiment Analysis: Sentiment analysis aims to determine the sentiment or opinion expressed in a piece of text, whether it's positive, negative, or neutral. It is commonly used for analyzing customer feedback, social media sentiment, or online reviews.
Topic Modeling: Topic modeling is a technique used to discover latent topics or themes within a collection of documents. It helps identify the main subjects or areas of discussion in the text data.
Named Entity Recognition (NER): NER identifies and extracts specific entities such as names of people, organizations, locations, dates, or product names mentioned in the text.
Text Classification: Text classification involves categorizing or classifying documents into predefined categories or labels. It can be used for tasks such as spam detection, sentiment classification, or topic classification.
Text Clustering: Text clustering aims to group similar documents together based on their textual content. It helps in discovering patterns or similarities within the text data without predefined categories.
Information Extraction: Information extraction focuses on identifying structured information from unstructured text, such as extracting key phrases, relationships, or events mentioned in the text.
Text Summarization: Text summarization aims to generate a concise summary of a long text or document, capturing the main ideas and important information.
Visualization: Text mining often involves visualizing the results to gain insights and communicate findings effectively. Word clouds, bar charts, network diagrams, or heatmaps are examples of visualizations commonly used in text mining.
Interpretation and Applications: The interpretation of text mining results involves extracting meaningful insights, patterns, or knowledge from the analyzed text data. These insights can be used for various applications such as market research, customer feedback ana
[cb22] From Parroting to Echoing: The Evolution of China’s Bots-Driven Info...CODE BLUE
Social media is no doubt a critical battlefield for threat actors to launch InfoOps, especially in a critical moment such as wartime or the election season. We have seen Bot-Driven Information Operations (InfoOps, aka influence campaign) have attempted to spread disinformation, incite protests in the physical world, and doxxing against journalists.
China's Bots-Driven InfoOps, despite operating on a massive scale, are often considered to have low impact and very little organic engagement. In this talk, we will share our observations on these persistent Bots-Driven InfoOps and dissect their harmful disinformation campaigns circulated in cyberspace.
In the past, most bots-driven operations simply parroted narratives of the Chinese propaganda machine, mechanically disseminating the same propaganda and disinformation artifacts made by Chinese state media. However, recently, we saw the newly created bots turn to post artifacts in a livelier manner. They utilized various tactics, including reposting screenshots of forum posts and disguised as members of “Milk Tea Alliance,” to create a false appearance that such content is being echoed across cyberspace.
We particularly focus on an ongoing China's bots-driven InfoOps targeting Taiwan, which we dub "Operation ChinaRoot." Starting in mid-2021, the bots have been disseminating manipulated information about Taiwan's local politics and Covid-19 measures. Our further investigation has also identified the linkage between Operation ChinaRoot and other Chinese state-linked networks such as DRAGONBRIDGE and Spamouflage.
Bitcoin Blockchains on Twitter timelines: A Social Media analysis of cryptocu...Alexia Maddox
Presentation at Concordia University October 2018
Dr Alexia Maddox, Lecturer in Communications, School of Communication and Creative Arts, Deakin University.
Email: a.maddox@deakin.edu.au
Keywords:
Social media analysis; twitter; cryptocurrencies; social disruption; digital trace data.
Abstract
Cryptocurrencies represent emerging financial technologies engendered through overlapping community values of decentralised peer-to-peer exchange, encryption technologies and an overarching agenda towards the disruption of centralised banking within the fiat economy. This paper will trace the development and shifts in public discourse within social media surrounding cryptocurrencies. The last five years have seen cryptocurrencies move from technological emergence to a broadening range of applications and history potholed with disputes, divergence, hacks and scams within the community. The accompanying influence of speculation has shifted the focus from social adoption to value volatility and seen the incorporation of associated technologies within banking and other organisational processes. The emphasis within public discourse has also followed a shift from bitcoin to blockchain. The study is grounded through a Twitter analysis of cryptocurrency-related social media discourse within the Australian context. The social media analysis works with social media archives of the Australian Twittersphere captured between early 2012 to May 2017. Access to this curated archive is through TrISMA and the timeframe under analysis aligns with the most detailed available dataset. The analysis seeks to characterise the emergence of public dialogue surrounding cryptocurrency use and application over time, focusing on peak engagement events. The key concepts directing the focus and interpretation of the social media analysis include financial inclusion, socio-technical disruption and social change. The whimsical quest of the study is to learn where the digital frontier has shifted to within this community and point to possible future developments. 
From a community studies perspective the case study represents an initial foray into data analytics to explore whether it is possible to detect the shifting shape and form of digital community through its environmental imprint (Maddox 2016). This methodological aspect of the work speaks to an attempt to generate a data recognition practice that can be deployed to search for signatures of social disruption within digital trace data.
Bio:
Alexia Maddox is a digital sociologist whose research interests are community studies, research methods and digital frontiers. Her recent book, Research Methods and Global Online Communities: A Case Study, published by Routledge, combines these areas and forms the basis of her study of emerging communities forming through the internet and cryptography.
From Chirps to Whistles - Discovering Event-specific Informative Content from Twitter
1. From Chirps to Whistles
Discovering Event-specific
Informative Content from Twitter
Debanjan Mahata, John R. Talburt
dxmahata@ualr.edu, jrtalburt@ualr.edu
Department of Information Science
University of Arkansas at Little Rock, Little Rock, USA
Vivek Kumar Singh
vivek@cs.sau.ac.in
Department of Computer Science
South Asian University, New Delhi, India
3. Twitter Content for Real-life Events
Example tweets, shown along a time axis:
• “In #Sochi, the Dutch are dominating the overall Olympic medal count http://t.co/jMR1WUqEK4 (Reuters) http://t.co/dAfDhEgTGA.”
• “New post: Sochi Was For Suckers - Laugh Studios/ http://t.co/cWQJCBp3Ow #lol #funny #rofl #funnypic #fail #wtf.”
• “Thanks for the memories Sochi! I've had the time of my life #Sochi2014 #sochiselfie http://t.co/DqkLEaAMpo.”
• “Cooked my first low-fat meal today, officially on a diet #sochi.”
4. Intriguing Questions
• Which are the event-specific informative tweets, and how can they be identified?
• Who are the users producing large amounts of event-specific informative content on Twitter?
• Which are the best hashtags and URLs to follow for high-quality event-specific information?
• Which hashtags and text units are suitable for indexing, for efficient retrieval of event-specific information?
• Can we devise a method that answers the above questions simultaneously?
5. Potential Applications
• Event Monitoring and Analysis
• Event Information Retrieval
• Opinion and Review Mining
• Recommender Systems
• Event Management and Marketing
• Social Media Data Integration
• Digital Journalism
• Many More
6. Challenges
• Volume and Velocity
• Veracity
• Informal Text
• Variety
• Searching the Long Tail
• Sampling Bias
• Sparse Link Structure Between Content in Social Media
• Lack of Evaluation Datasets
Example tweet: “New post: Sochi Was For Suckers - Laugh Studios/ http://t.co/cWQJCBp3Ow #lol #funny #rofl #funnypic #fail #wtf”
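7. Problem Statement
Given an event $E_i$ and a time-ordered stream of $n$ tweets $M_{E_i} = \{m_1, m_2, \ldots, m_n\}$ related to the event and posted in time period $T_{E_i}$, the problem is to find a ranked set of:
• Tweets: $M_{E_i} = \{m_1 \succeq \ldots \succeq m_i \succeq m_j \succeq \ldots \succeq m_n \mid i < j\}$
• Hashtags: $H_{E_i} = \{h_1 \succeq \ldots \succeq h_i \succeq h_j \succeq \ldots \succeq h_p \mid i < j\}$
• Text Units: $W_{E_i} = \{w_1 \succeq \ldots \succeq w_i \succeq w_j \succeq \ldots \succeq w_r \mid i < j\}$
• URLs: $L_{E_i} = \{l_1 \succeq \ldots \succeq l_i \succeq l_j \succeq \ldots \succeq l_t \mid i < j\}$
• Users: $U_{E_i} = \{u_1 \succeq \ldots \succeq u_i \succeq u_j \succeq \ldots \succeq u_s \mid i < j\}$
Each set is ordered by decreasing event-specific informativeness.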
9. Event Reference Preparation
• Parts-of-Speech Tagging
• Special Character Detection
• Data Cleansing
• Duplicate Detection
• Stop Word Detection and Elimination
• Slang Word Extraction
• Feeling Word Extraction
• Tokenization
• Stemming
• Tweet Meta-Data
• Expanded URLs
• User Information
• Verification
• Favorite Count
• Retweet Count
• User Mentions
• Entity Extraction
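A few of the preparation steps listed above (tokenization, stop word elimination, duplicate detection, and pulling out hashtags, URLs and user mentions) can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' pipeline: the function names `preprocess` and `drop_duplicates`, the tiny stop-word list and the regular expressions are all assumptions made for the example, and steps such as stemming, part-of-speech tagging and slang or feeling word extraction are left out.

```python
import re

# Hypothetical, deliberately tiny stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "in", "on", "of", "for", "and", "to", "my"}

# Simplified patterns (assumptions): real tweets need more careful entity handling,
# e.g. this URL pattern also swallows punctuation glued to the end of a link.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def preprocess(text):
    """Split one tweet into URLs, hashtags, mentions and stop-word-free tokens."""
    urls = URL_RE.findall(text)
    hashtags = HASHTAG_RE.findall(text)
    mentions = MENTION_RE.findall(text)
    # Remove the entities before tokenizing so they do not leak into the tokens.
    stripped = URL_RE.sub(" ", text)
    stripped = HASHTAG_RE.sub(" ", stripped)
    stripped = MENTION_RE.sub(" ", stripped)
    tokens = [t for t in TOKEN_RE.findall(stripped.lower()) if t not in STOP_WORDS]
    return {"urls": urls, "hashtags": hashtags, "mentions": mentions, "tokens": tokens}


def drop_duplicates(texts):
    """Keep the first occurrence of each tweet, ignoring case and extra whitespace."""
    seen = set()
    unique = []
    for text in texts:
        key = " ".join(text.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique
```

For instance, `preprocess("Cooked my first low-fat meal today, officially on a diet #sochi.")` keeps `#sochi` as a hashtag and drops `my`, `on` and `a` from the token list.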
10. Tweet Features
No. of Unigram Tokens, No. of Stop Words, No. of Slang
Words, No. of Feeling Words, No. of Hashtags, Has URL,
Is Verified, No. of User Mentions, Length of Post, No. of
Unique Characters, No. of Special Characters, Favorite
Count, Retweet Count, Formality, No. of Nouns, No. of
Adjectives, No. of Verbs, No. of Adverbs.
Logistic Regression Model Performance
                     Precision  Recall  F-1 Score
Non-informative (0)  0.70       0.49    0.57
Informative (1)      0.78       0.90    0.84
Avg/Total            0.76       0.77    0.75
Accuracy = 76.64%
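Several of the surface features named on this slide can be computed directly from the tweet text and its metadata. The sketch below covers only that subset and is an illustration under stated assumptions: the function name `tweet_features`, the dictionary keys, whitespace tokenization and the use of `string.punctuation` as the definition of "special characters" are all choices made for this example. Features that need lexicons or a tagger (slang and feeling words, formality, part-of-speech counts) are omitted.

```python
import re
import string


def tweet_features(text, is_verified=False, favorite_count=0, retweet_count=0):
    """Compute a subset of the listed features for a single tweet.

    Metadata values (verification, favorite and retweet counts) are passed in
    by the caller; they are assumed to come from the tweet's metadata.
    """
    tokens = text.split()  # assumption: whitespace tokenization
    return {
        "num_unigram_tokens": len(tokens),
        "num_hashtags": sum(1 for t in tokens if t.startswith("#")),
        "has_url": int(bool(re.search(r"https?://\S+", text))),
        "is_verified": int(is_verified),
        "num_user_mentions": sum(1 for t in tokens if t.startswith("@")),
        "length_of_post": len(text),
        "num_unique_characters": len(set(text)),
        # Assumption: "special characters" means ASCII punctuation.
        "num_special_characters": sum(1 for ch in text if ch in string.punctuation),
        "favorite_count": favorite_count,
        "retweet_count": retweet_count,
    }
```

The resulting dictionaries could then be turned into numeric vectors and passed to any logistic regression implementation; that step is not shown here.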
Olteanu, Alexandra, et al. "CrisisLex: A lexicon for collecting and filtering microblogged communications in crises." In Proceedings of the 8th International AAAI Conference on Weblogs and Social Media (ICWSM '14). No. EPFL-CONF-203561. 2014.
Event Related Content Analysis
28000 annotated tweets, 26 events
• Related and Informative: “#Media Large wildfire in N. Colorado prompts evacuation : Crews are battling a fast-moving wildfire http://t.co/ju1BGTKH #Politics #News”
• Related but not Informative: “RT @LarimerSheriff: #HighParkFire update http://t.co/hBy5shen”
• Not Related: “#Intern #US #TATTOO #Wisconsin #Ohio #NC #PA #Florida #Colorado #Iowa #Nevada #Virginia #NV #mlb Travel Destinations; http://t.co/TIHBJKF2”
14. Baselines
• SeenRank (http://seen.co/about)
• TextRank (Mihalcea, Rada, and Paul Tarau. "TextRank: Bringing order into texts." Association for Computational Linguistics, 2004.)
• LexRank (Erkan, Günes, and Dragomir R. Radev. "LexRank: graph-based lexical centrality as salience in text summarization." Journal of Artificial Intelligence Research (2004): 457-479.)
• RTRank
• Centroid (Becker, Hila, Mor Naaman, and Luis Gravano. "Selecting Quality Twitter Content for Events." ICWSM 11 (2011).)
• Logistic Regression
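15. Evaluation Metrics
Discounted cumulative gain at rank $p$, where $rel_i$ is the graded relevance of the item at rank $i$:
$$DCG_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log(i + 1)}$$
Normalized discounted cumulative gain, where $IDCG_p$ is the DCG of the ideal ordering:
$$nDCG_p = \frac{DCG_p}{IDCG_p}$$
Precision at $n$:
$$\text{Precision at } n = \frac{\text{Number of relevant references at } n}{n}$$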
Baeza-Yates, Ricardo, and Berthier Ribeiro-Neto. Modern information retrieval. Vol. 463. New York: ACM press, 1999.
Järvelin, Kalervo, and Jaana Kekäläinen. "Cumulated gain-based evaluation of IR techniques." ACM Transactions on Information
Systems (TOIS) 20.4 (2002): 422-446.
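The two metrics on this slide are short enough to write out in full. The sketch below is one possible reading: the function names are hypothetical, the logarithm is taken to base 2 (an assumption, though nDCG does not depend on the base because it is a ratio), and the ideal DCG is computed by sorting the same list of relevance grades, which assumes that all judged items appear in the ranking being evaluated.

```python
import math


def dcg_at(relevances, p):
    """DCG over the top-p items: sum of (2^rel_i - 1) / log2(i + 1), ranks starting at 1."""
    return sum(
        (2 ** rel - 1) / math.log2(i + 1)
        for i, rel in enumerate(relevances[:p], start=1)
    )


def ndcg_at(relevances, p):
    """DCG divided by the DCG of the ideal (descending relevance) ordering."""
    ideal = dcg_at(sorted(relevances, reverse=True), p)
    return dcg_at(relevances, p) / ideal if ideal > 0 else 0.0


def precision_at(is_relevant, n):
    """Fraction of the top-n ranked items that are relevant."""
    return sum(1 for flag in is_relevant[:n] if flag) / n
```

A ranking that already lists items in decreasing relevance, such as grades `[3, 2, 1]`, gets an nDCG of 1.0, and a top-4 list with three relevant items has a precision at 4 of 0.75.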
20. Sample Raw Results for Sydney Siege Crisis
Event Name: Sydney Siege Crisis
Top 10 Event-specific Informative Hashtags:
#sydneysiege, #SydneySiege, #Sydneysiege, #MartinPlace, #9News, #SydneyHostageCrisis, #Sydney, #Lindt, #ISIS, #SYDNEYSIEGE
Top 10 Event-specific Informative Text Units:
police, sydney, reporter, lindt, isis, nsw, commissioner, australia, catherine, martin
Top 5 Event-specific Informative URLs:
1. http://www.cnn.com/2014/12/15/world/asia/australia-sydney-hostage-situation/index.html
2. http://www.bbc.co.uk/news/world-australia-30474089
3. http://edition.cnn.com/2014/12/15/world/asia/australia-sydney-siege-scene/index.html
4. http://rt.com/news/214399-sydney-hostages-islamists-updates/
5. http://www.newsroompost.com/138766/sydney-cafe-siege-ends-gunman-among-two-killed
Top 5 Event-specific Informative Tweets:
1. RT @faithcnn: Hostage taker in Sydney cafe has demanded 2 things: ISIS flag & phone call with Australia PM Tony Abbott #SydneySiege http://t.co/a2vgrn30Xh
2. Aussie grand mufti & Imam Council condemn #Sydneysiege hostage capture http://t.co/ED98YKMxqM - LIVE UPDATES http://t.c...
3. RT @PatDollard: #SydneySiege: Hostages Held By Jihadis In Australian Cafe - WATCH LIVE VIDEO COVERAGE http://t.co/uGxmd7zLpc #tcot #pjnet
4. RT @FoxNews: MORE: Police confirm 3 hostages escape Sydney cafe, unknown number remain inside http://t.co/pcAt91LIdS #Sydneysiege
5. Watch #sydneysiege police conference live as hostages are still being held inside a central Sydney cafe http://t.co/OjulBqM7w2 #c4news
21. Sample Raw Results for Sydney Siege Crisis
Top Five Event-specific Informative Users, with Three Randomly Selected Tweet Excerpts
User 1 (total no. of event-related tweets by the user: 41)
1. RT @cnni: Hostage taker in Sydney cafe demands ISIS flag and call with Australian PM, Sky News reports. http://t.co/a2vgrn30Xh #sydneysiege
2. RT @DR_SHAHID: Hostage taker demands delivery of an #ISIS flag and a conversation with Prime Minister Tony Abbott http://t.co/xTSDMKCPcD
3. RT @SkyNewsBreak: Update - New South Wales police commissioner confirms five hostages have escaped from the Lindt cafe in Sydney #sydneysiege
User 2 (total no. of event-related tweets by the user: 33)
1. RT @smh: NSW Police Deputy Commissioner Catherine Burn will hold a press conference to update on the #SydneySiege at 6.30pm.
2. RT @Y7News: Helpful travel advice for commuters heading out of #Sydney’s CBD this evening - http://t.co/aQx2lvSosm #sydneysiege
3. RT @hughwhitfeld: British PM David Cameron informed of #sydneysiege .. UK Foreign Office is in touch with Aus authorities
User 3 (total no. of event-related tweets by the user: 32)
1. RT @RT_com: #SYDNEY: Gunman tall man in late 40s, dressed in black – eyewitness http://t.co/m51P8dUPhB #SydneySiege http://t.co/NvJzFsGrFN
2. RT @NewsAustralia: 2GB's Ray Hadley claims hostage takers in #SydneySiege "wants to speak to Prime Minister Abbott live on radio."
3. RT @BBCWorld: "Profoundly shocking" -Australia PM Tony Abbott delivers second #sydneysiege statement. MORE: http://t.co/VaKt3ZpRZR
22. Future Directions
• Summarizing Event Content
• Identification of Insightful Opinionated Content
• Event Topic Modeling
• Event-specific Recommendations
• Distributed Processing of TwitterEventInfoGraph
• Ontology for Event Content in Social Media
• Many More
27. Defining Events
An event $E_i$ is defined as a real-world occurrence with an associated time period $T_{E_i} = (t^{start}_{E_i}, t^{end}_{E_i})$ and a time-ordered stream of tweets $M_{E_i} = \{m_1, m_2, \ldots, m_n\}$, of substantial volume, discussing the event and posted in time $T_{E_i}$.
Becker, Hila, Mor Naaman, and Luis Gravano. "Beyond Trending Topics: Real-World Event Identification on Twitter." ICWSM 11 (2011): 438-441.
Tweets are primarily composed of:
• Set of hashtags $H_{E_i} = \{h_1, h_2, \ldots, h_p\}$
• Set of text units $W_{E_i} = \{w_1, w_2, \ldots, w_r\}$
• Set of URLs $L_{E_i} = \{l_1, l_2, \ldots, l_t\}$
• Set of users $U_{E_i} = \{u_1, u_2, \ldots, u_s\}$
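The definition above maps naturally onto a small data structure: a time-ordered stream restricted to the event's time period, plus the four sets derived from it. The sketch below is only an illustration. The `Tweet` class, its field names and the `event_sets` function are hypothetical, and text units are approximated as lowercased unigrams because the slide does not say how they are extracted.

```python
import string
from dataclasses import dataclass


@dataclass
class Tweet:
    text: str
    user: str
    posted_at: float  # hypothetical timestamp, e.g. seconds since the epoch


def event_sets(tweets, t_start, t_end):
    """Return the time-ordered stream within [t_start, t_end] and its component sets."""
    stream = sorted(
        (t for t in tweets if t_start <= t.posted_at <= t_end),
        key=lambda t: t.posted_at,
    )
    hashtags, text_units, urls, users = set(), set(), set(), set()
    for tweet in stream:
        users.add(tweet.user)
        for token in tweet.text.split():
            if token.startswith(("http://", "https://")):
                urls.add(token)
            elif token.startswith("#"):
                tag = token.rstrip(string.punctuation)
                if len(tag) > 1:
                    # Case is preserved, so #Sydney and #sydney stay distinct.
                    hashtags.add(tag)
            elif token.startswith("@"):
                continue  # user mentions are not treated as text units here
            else:
                word = token.strip(string.punctuation).lower()
                if word:
                    text_units.add(word)
    return stream, hashtags, text_units, urls, users
```

Tweets posted outside the time period are dropped before any set is built, so every hashtag, text unit, URL and user in the result comes from the event's own stream.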