"Harvesting Data from Twitter Workshop" presented in collaboration with IWAN Research Group.
Trainers: Dr. Nora AlTwairesh, Ms. Tarfa AlBuhairi, Ms. Mawaheb AlTuwaijri, and Ms. Afnan AlMoammar
-------------------------------------
ASA Research Group
Twitter: @ASA__IU
Email: asa@imamu.edu.sa
Website: http://asa.imamu.edu.sa/
-------------------------------------
IWAN Research Group
Twitter: @IWAN_RG
Email: iwan@ksu.edu.sa
Website: http://iwan.ksu.edu.sa
4. Why Twitter
• Twitter has become a mass information hub that can be used to study the evolution of almost any issue: a revolutionary research instrument.
• Research disciplines that study Twitter data span the domains of computer science, information science, communications, business, economics, education, medicine, political science, and sociology.
5. Why Twitter
• Recent studies show that 60% of daily Arabic tweets are from Saudi Arabia.
Hamdy Mubarak and Kareem Darwish. 2014. Using Twitter to collect a multi-dialectal corpus of Arabic. ANLP 2014:1.
6. Twitter API
• Free access to the tweets posted in the last 7 days, within a certain rate limit.
• Tweets posted earlier than 7 days ago are considered historical tweets and must be purchased through third-party providers.
• The Twitter API provides three interfaces for tweet collection: the Streaming API, the REST API, and the Search API.
7. Streaming API
• The Streaming API provides real-time tweets in a live-poll fashion.
• Requested tweets flow in constantly as they are posted on Twitter. The stream is delivered in three bandwidths: "spritzer" (1%), "gardenhose" (10%), and "firehose" (100% of all tweets posted on Twitter).
• A regular user wanting to collect tweets will be granted spritzer access.
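As a sketch, collecting from the spritzer stream with the `twitter` Python package (installed later in this workshop) might look like the following. The function name, credential plumbing, and `limit` parameter are illustrative; `TwitterStream` and `statuses.sample()` are the package's documented interface.

```python
def sample_stream(consumer_key, consumer_secret, token, token_secret, limit=10):
    """Collect up to `limit` tweet texts from the 1% "spritzer" sample stream."""
    # The `twitter` package (installed later via `pip install twitter`)
    # exposes the spritzer stream as statuses.sample().
    from twitter import OAuth, TwitterStream

    stream = TwitterStream(
        auth=OAuth(token, token_secret, consumer_key, consumer_secret)
    )
    texts = []
    for tweet in stream.statuses.sample():
        # Keep-alive and control messages have no "text" field; skip them.
        if isinstance(tweet, dict) and "text" in tweet:
            texts.append(tweet["text"])
        if len(texts) >= limit:
            break
    return texts
```

Running this requires the OAuth credentials created in the "Create a Twitter App" step below.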
8. REST API
• The REST API was specifically designed for programmatic access to read and write Twitter data.
• Third-party applications that interact with Twitter are given a large set of REST API methods with which to develop these applications.
• Access to the REST API is also rate-limited; the limit is 150 requests per hour.
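Staying under such a cap is easy to enforce on the client side. A minimal sketch follows; the 150 requests/hour figure comes from the slide above, while the class itself is hypothetical:

```python
import time

class RateLimiter:
    """Block just long enough between calls to respect a per-hour request cap."""

    def __init__(self, max_per_hour):
        self.min_interval = 3600.0 / max_per_hour  # seconds between requests
        self.last_call = None

    def wait(self):
        """Call before each API request; sleeps if the last call was too recent."""
        now = time.monotonic()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                time.sleep(remaining)
        self.last_call = time.monotonic()

limiter = RateLimiter(150)  # the REST API limit above: one call every 24 seconds
```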
9. Search API
• Similar to the REST API, the Search API is pull-based. It replicates the search functionality provided on the Twitter website; however, the tweets retrieved are restricted to the past 7 days.
• The Search API is not appropriate for high-throughput, real-time data acquisition. As such, Twitter Inc. discourages its use and plans to discontinue it in the future.
10. Create a Twitter App
• To access the Twitter API you need to create a Twitter app; follow this simple tutorial to do so:
https://iag.me/socialmedia/how-to-create-a-twitter-app-in-8-easy-steps/
• You will use the OAuth settings in both R and Python:
• Consumer Key
• Consumer Secret
• OAuth Access Token
• OAuth Access Token Secret
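The R and Python Twitter libraries consume these four values directly, but for illustration, here is roughly what they do with them under the hood: build an OAuth 1.0a `Authorization` header signed with HMAC-SHA1. This is a stdlib-only sketch of the signing step; real libraries also handle request sending, errors, and edge cases.

```python
import base64
import hashlib
import hmac
import secrets
import time
from urllib.parse import quote

def oauth1_header(method, url, params, consumer_key, consumer_secret,
                  access_token, token_secret):
    """Build an OAuth 1.0a Authorization header (HMAC-SHA1, per RFC 5849)."""
    oauth = {
        "oauth_consumer_key": consumer_key,
        "oauth_nonce": secrets.token_hex(16),
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": str(int(time.time())),
        "oauth_token": access_token,
        "oauth_version": "1.0",
    }
    # All parameters are percent-encoded, sorted, and joined into a base string.
    all_params = {**params, **oauth}
    param_str = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}"
        for k, v in sorted(all_params.items())
    )
    base = "&".join(quote(s, safe="") for s in (method.upper(), url, param_str))
    # The signing key combines the Consumer Secret and the Access Token Secret.
    key = f"{quote(consumer_secret, safe='')}&{quote(token_secret, safe='')}"
    sig = base64.b64encode(
        hmac.new(key.encode(), base.encode(), hashlib.sha1).digest()
    ).decode()
    oauth["oauth_signature"] = sig
    return "OAuth " + ", ".join(
        f'{quote(k, safe="")}="{quote(v, safe="")}"' for k, v in sorted(oauth.items())
    )
```

The resulting string goes into the `Authorization` header of each API request; every request gets a fresh nonce and timestamp, so signatures are never reused.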
11. Tools to Collect Tweets
• Nodexl: https://nodexl.codeplex.com/
• Tweet Archivist : https://www.tweetarchivist.com/
• Twitter Archiving Google Spreadsheet (TAGS):
https://tags.hawksey.info/
24. Python
• Two versions: 2.7 and 3.x
• Twitter packages: twitter and tweepy
• IDE: Anaconda with the IPython (Jupyter) Notebook
25. Installing Python
• Install Anaconda from https://www.continuum.io/downloads and choose the Python 2.7 version (only for this tutorial).
• Install the twitter package: from the command line (terminal), type: pip install twitter
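Once the package is installed, a first Search API call might look like this. The function name, credential placeholders, and query are illustrative; `Twitter(...).search.tweets(...)` is the package's documented interface.

```python
def search_tweets(consumer_key, consumer_secret, token, token_secret,
                  query, count=100):
    """Return the text of up to `count` recent tweets matching `query`."""
    from twitter import Twitter, OAuth  # pip install twitter

    t = Twitter(auth=OAuth(token, token_secret, consumer_key, consumer_secret))
    # The Search API only reaches back 7 days (see slide 9).
    result = t.search.tweets(q=query, count=count)
    return [status["text"] for status in result["statuses"]]
```

For example, `search_tweets(ck, cs, at, ats, "#Riyadh")` would fetch recent tweets mentioning the hashtag, given valid credentials.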
26. Comparison between R and Python
• https://www.datacamp.com/community/tutorials/r-or-python-for-data-analysis#gs.GuXGfAc
• http://blog.udacity.com/2015/01/python-vs-r-learn-first.html
• http://www.dataschool.io/python-or-r-for-data-science/
27. Contact Us
ASA Research Group
Twitter: @ASA__IU
Email: asa@imamu.edu.sa
Website: http://asa.imamu.edu.sa/
IWAN Research Group
Twitter: @IWAN_RG
Email: iwan@ksu.edu.sa
Website: http://iwan.ksu.edu.sa
Gardenhose access is granted on special request from Twitter Inc., and firehose access is granted to third-party business partners of Twitter Inc., which are considered third-party data providers.
An example of these applications is the inclusion of a Tweet share button on some websites that allows the reader of this website to share the link of the website on Twitter by posting it as a tweet; this is an example of writing Twitter data. An example of reading Twitter data is when websites display tweets of a certain hashtag or user account in a widget on their website’s pages.
With the Search API you can only send 180 requests every 15-minute timeframe. With a maximum of 100 tweets per request, this means you can mine 4 x 180 x 100 = 72,000 tweets per hour.
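The throughput figure above follows directly from the window size and per-request cap (both values taken from the note above):

```python
# Search API throughput: 180 requests per 15-minute window,
# up to 100 tweets per request.
windows_per_hour = 60 // 15        # 4 fifteen-minute windows per hour
requests_per_window = 180
tweets_per_request = 100

tweets_per_hour = windows_per_hour * requests_per_window * tweets_per_request
print(tweets_per_hour)  # 72000
```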
The R project was conceived in 1992, with an initial version released in 1994 and a stable beta version in 2000.
R is the leading tool for statistics, data analysis, and machine learning.
R allows you to integrate with other languages (C/C++, Java, Python)