Umm, how did you get that number? Managing Data Integrity throughout the Data... – John Kinmonth
We live at the intersection of data and people. Data integrity is a function of the decisions that people make throughout the data lifecycle.
Dave De Noia, Pointmarc lead solution architect in data management, gives his take on the processes and people that affect data integrity throughout organizations at DRIVE 2014 (Data, Reporting, Intelligence, and Visualization Exchange)
Whether you're a retailer merging web analytics data with offline numbers or a healthcare company adding new data management software, De Noia explains how to avoid logic wobble and establish shared data structures.
About Dave:
Dave De Noia lives in the balance of chaos and order inherent to working with data. Starting his career at Microsoft building analyses in both SQL and big data environments, Dave later moved onto Redfin where he created and managed data infrastructure for analysis and reporting projects. Dave now serves as the senior solution and data architect at Pointmarc, a Bellevue-based digital analytics consultancy, where he helps some of the world’s largest brands get value from their data. Naturally functioning as a bridge between business and technical teams, Dave’s professional passion lies at the intersection of data and people.
About Pointmarc:
Pointmarc is a leading digital analytics agency providing actionable marketing insight and analytics platform instrumentation services for Fortune 500 clients within retail, technology, financial, media and pharmaceutical industries. With offices in Seattle, Boston, San Francisco and Portland, Pointmarc’s immersive approach to analytics empowers businesses to dive deeper into their data.
Email info@pointmarc.com for more information on data management or analytics instrumentation, and follow @pointmarc on Twitter for the latest in analytics.
A Scalable Approach for Efficiently Generating Structured Dataset Topic Profiles – Besnik Fetahu
The increasing adoption of Linked Data principles has led to an abundance of datasets on the Web. However, take-up and reuse are hindered by the lack of descriptive information about the nature of the data, such as their topic coverage, dynamics or evolution. To address this issue, we propose an approach for creating linked dataset profiles. A profile consists of structured dataset metadata describing topics and their relevance. Profiles are generated through the configuration of techniques for resource sampling from datasets, topic extraction from reference datasets and their ranking based on graphical models. To enable a good trade-off between scalability and accuracy of generated profiles, appropriate parameters are determined experimentally. Our evaluation considers topic profiles for all accessible datasets from the Linked Open Data cloud. The results show that our approach generates accurate profiles even with comparably small sample sizes (10%) and outperforms established topic modelling approaches.
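The sampling-plus-ranking pipeline described in the abstract can be sketched in a few lines. This is a toy illustration, not the paper's implementation: `annotate` is a hypothetical callback standing in for topic extraction against a reference dataset, and plain frequency ranking stands in for the paper's graphical-model ranking.

```python
import random
from collections import Counter

def topic_profile(resources, annotate, sample_frac=0.10, seed=0):
    """Sample a fraction of a dataset's resources, extract topics for each
    sampled resource, and rank topics by relative frequency."""
    rng = random.Random(seed)
    k = max(1, int(len(resources) * sample_frac))
    sample = rng.sample(resources, k)
    counts = Counter(t for r in sample for t in annotate(r))
    total = sum(counts.values())
    # "Relevance" here is simply each topic's share of all annotations.
    return [(topic, n / total) for topic, n in counts.most_common()]
```

With a 10% sample, the profile is an estimate of the full dataset's topic distribution, which is the trade-off between scalability and accuracy the abstract refers to.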
Measuring the Speed of the Red Queen's Race: Adaptation and Evasion in Malware – Priyanka Aash
Security is a constant cat-and-mouse game between those trying to keep abreast of and detect novel malware, and the authors attempting to evade detection. The introduction of the statistical methods of machine learning into this arms race allows us to examine an interesting question: how fast is malware being updated in response to the pressure exerted by security practitioners? The ability of machine learning models to detect malware is now well known; we introduce a novel technique that uses trained models to measure "concept drift" in malware samples over time as old campaigns are retired, new campaigns are introduced, and existing campaigns are modified. Through the use of both simple distance-based metrics and Fisher Information measures, we look at the evolution of the threat landscape over time, with some surprising findings. In parallel with this talk, we will also release the PyTorch-based tools we have developed to address this question, allowing attendees to investigate concept drift within their own data.
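As a toy illustration of the "simple distance-based metrics" family mentioned in the abstract (not the speakers' released PyTorch tooling), drift between two time-ordered batches of malware feature vectors can be scored as the distance between their means:

```python
import math

def mean_vector(batch):
    """Mean feature vector of a batch of equal-length numeric vectors."""
    dim = len(batch[0])
    return [sum(vec[i] for vec in batch) / len(batch) for i in range(dim)]

def batch_drift(earlier, later):
    """Euclidean distance between batch means: a crude concept-drift score.
    Larger values suggest the feature distribution shifted between periods."""
    return math.dist(mean_vector(earlier), mean_vector(later))
```

Computed over consecutive months of samples, a rising score would indicate campaigns being modified or replaced faster than the model's training distribution.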
TTO2021: Cross-Lingual Rumour Stance Classification: a First Study with BERT... – Weverify
By Carolina Scarton. Presentation at the Truth and Trust Online Conference (TTO 2021). Link: https://truthandtrustonline.com/wp-content/uploads/2021/10/TTO2021_paper_31.pdf
Demo presentation of the MeVer tools for disinformation detection, consisting of context aggregation and analysis, image forensics, a DeepFake detector, near-duplicate detection, visual location estimation, and network analysis and visualization.
Operation-wise Attention Network for Tampering Localization Fusion – Weverify
In this work, we present a deep learning-based approach for image tampering localization fusion. The approach combines the outcomes of multiple image forensics algorithms into a single fused tampering localization map, which requires no expert knowledge and is easier for end users to interpret. Our fusion framework includes a set of five individual tampering localization methods for splicing localization on JPEG images. The proposed deep learning fusion model is an adapted architecture, initially proposed for the image restoration task, that performs multiple operations in parallel, weighted by an attention mechanism that selects the proper operations depending on the input signals. This weighting process is particularly beneficial when the input signal is diverse, as in our case, where the output signals of multiple image forensics algorithms are combined. Evaluation on three publicly available forensics datasets demonstrates that the performance of the proposed approach is competitive, outperforming the individual forensics techniques as well as another recently proposed fusion framework in the majority of cases.
DETECTING AND VERIFYING ONLINE DISINFORMATION:
HOW NLP AND DATA ANALYSIS CAN HELP.
By Carolina Scarton
Youtube link: https://www.youtube.com/watch?v=JPq3WFhbgsY
LIMITS AND RISKS OF USING AI FOR FACT-CHECKING:
QUESTIONS OF EFFECTIVENESS AND LEGALITY OF AI-DRIVEN DISINFORMATION DETECTION AND MODERATION.
EDMO workshop.
By Kalina Bontcheva
GraphRAG is All You Need? LLM & Knowledge Graph – Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Essentials of Automations: Optimizing FME Workflows with Parameters – Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
The Art of the Pitch: WordPress Relationships and Sales – Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers, without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.
Connector Corner: Automate dynamic content and events by pushing a button – DianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
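The approve/reject branch above is a simple human-in-the-loop dispatch. In this toy sketch, `create_ticket` and `alert_colleagues` are hypothetical stand-ins for the Jira/Zendesk and Slack connector calls:

```python
def handle_button_click(click, create_ticket, alert_colleagues):
    """Route a reviewer's button click to the matching follow-up action."""
    if click == "Approve":
        # Approval opens a ticket for the marketing design team.
        return create_ticket("marketing design review")
    if click == "Reject":
        # Rejection notifies colleagues via a Slack message.
        return alert_colleagues("campaign rejected")
    raise ValueError(f"unexpected button: {click!r}")
```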
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
DevOps and Testing slides at DASA Connect – Kari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30 May 2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps looks like. We closed with a lovely workshop in which participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl... – DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Kubernetes & AI - Beauty and the Beast!?! @KCD Istanbul 2024 – Tobias Schneck
As AI technology pushes into IT, I found myself wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations point of view. Is it possible to apply our lovely cloud native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and lead you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to get them working on our own infrastructure from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I have already got working for real.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... – Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
JMeter webinar - integration with InfluxDB and Grafana – RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
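For reference, JMeter's out-of-the-box InfluxDB integration is configured through a Backend Listener using the `InfluxdbBackendListenerClient`. Below is a sketch of its usual parameters; the parameter names come from the stock listener, while the values are placeholders for a local practice setup, not settings taken from the webinar:

```
# Backend Listener implementation:
# org.apache.jmeter.visualizers.backend.influxdb.InfluxdbBackendListenerClient
influxdbUrl   = http://localhost:8086/write?db=jmeter
application   = practice-web-app
measurement   = jmeter
summaryOnly   = false
samplersRegex = .*
percentiles   = 90;95;99
testTitle     = JMeter webinar demo
```

Grafana then queries the `jmeter` measurement in InfluxDB to build real-time dashboards.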
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Stance classification. Uni Cambridge 22 Jan 2021
1. REVISITING AND RE-EVALUATING RUMOUR STANCE CLASSIFICATION
University of Cambridge, 22nd January 2021
Carolina Scarton
c.scarton@sheffield.ac.uk
carolscarton
3. ONLINE RUMOURS
“circulating story of questionable veracity, which is apparently credible but hard to verify, and produces sufficient skepticism and/or anxiety so as to motivate finding out the actual truth” (Zubiaga et al., 2015)
9. RUMOUR STANCE CLASSIFICATION
➢ Stance of replies can help in predicting veracity (Mendoza et al., 2010; Kumar and Carley, 2019) → especially denies (Zubiaga et al., 2016)
➢ However:
• four-class classification problem: support, deny, query, comment
• highly imbalanced problem: support and denies are the most important classes
• different from the traditional stance classification task
11. RUMOUR STANCE CLASSIFICATION
➢ RumourEval 2017 and 2019 → most used datasets (PHEME project)
• Task A: rumour stance classification
• Current models and official evaluation metrics:
• not robust for four-class imbalanced problems
• not robust for problems where classes have different importance
16. DEALING WITH IMBALANCED DATA FOR STANCE CLASSIFICATION
Yue Li and Carolina Scarton (2020): Revisiting Rumour Stance Classification: Dealing with Imbalanced Data. RDSM 2020.
20. GOING BACK TO BASICS...
➢ RumourEval 2017 data
➢ Feature-based classifier:
• GloVe word embeddings (average, using the Twitter embeddings)
• Features from Twitter metadata (Aker et al., 2017): number of replies, has URL, verified account, number of followers, etc.
• Textual features (Aker et al., 2017): sentiment analysis, emoticon analysis, has slang or curse word, surprise/doubt scores, etc.
macro-F1: 0.486
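The embedding feature above can be sketched as a plain average over token vectors. This is a minimal stand-in for averaging pre-trained GloVe Twitter embeddings; the lookup table here is just a dict:

```python
def average_embedding(tokens, embeddings, dim):
    """Average the embedding vectors of known tokens; unknown tokens are
    skipped, and an all-zero vector is returned if none are known."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
```

The resulting vector is then concatenated with the metadata and textual features to form the classifier input.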
22. … LOOKING INTO SOTA
➢ RumourEval 2017 data
➢ BERT model → fine-tuning BERT for the stance classification task
macro-F1: 0.516 (BERT) vs. 0.486 (feature-based)
26. DEALING WITH IMBALANCED DATA (TRADITIONAL METHODS)
➢ Data-based approaches:
• Random over- and undersampling: ROS and RUS
• Synthetic over-sampling:
• SMOTE: k-nearest neighbours of each observation in the minority class
• ADASYN: level of hardness of learning the data observation
• Hybrid sampling: SMOTEENN → data cleaning
➢ Learning-based approach: threshold moving (TM) → changing the probabilities of predicted classes
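The two simplest techniques above can be sketched directly; this is an illustrative implementation, while SMOTE, ADASYN, and SMOTEENN are usually taken from a library such as imbalanced-learn rather than hand-rolled:

```python
import random
from collections import defaultdict

def random_oversample(X, y, seed=0):
    """ROS: duplicate minority-class examples at random until every class
    matches the majority-class count."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, label in zip(X, y):
        by_class[label].append(x)
    target = max(len(items) for items in by_class.values())
    X_out, y_out = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        X_out.extend(items + extra)
        y_out.extend([label] * target)
    return X_out, y_out

def threshold_move(class_probs, priors):
    """TM: rescale predicted class probabilities by inverse class priors,
    then take the argmax, boosting rare classes at decision time."""
    scores = {c: p / priors[c] for c, p in class_probs.items()}
    return max(scores, key=scores.get)
```

Note that ROS changes the training data while TM leaves training untouched and only adjusts the decision rule, which is why the two can also be combined.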
31. METHODOLOGY - MODEL SELECTION
➢ Training data: RumourEval 2017 training set
➢ Evaluation: RumourEval 2017 test set
➢ Training process: 4-fold cross-validation for hyperparameter tuning, including the parameter in synthetic over-sampling
➢ Each experiment is run 10 times to assess model stability
➢ Evaluation metrics: macro-F1, geometric mean of Recall (GMR)
➢ Feature-based classifiers: LR, RF, MLP
47. CONCLUSIONS
➢ Feature-based approaches can still be competitive
➢ Traditional methods for dealing with imbalanced data improve both feature-based and BERT-based approaches
➢ BERT-based approaches → SOTA
• Still room for improvement → support and denies
➢ Clever ways of using thread information may help
➢ Evaluation needs to be more detailed
58. RUMOUR STANCE CLASSIFICATION EVALUATION
➢ New metrics are needed to reliably evaluate models:
• Deal with data imbalance
• Give higher value to the most important classes: support and deny
➢ Heavily penalises models that achieve a low score for a given class
➢ Weighted version of AUC (ROC → relationship between Recall and FPR)
➢ Weighted version of macro-Fβ:
• β = 1 → precision and recall have the same importance
• β > 1 → recall has more importance
➢ Weights → empirically defined: w_support = 0.40, w_deny = 0.40, w_query = 0.15, w_comment = 0.05
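The weighted macro-Fβ from the slide can be written down directly. This sketch takes per-class precision/recall pairs as input and defaults to the empirically defined weights above:

```python
def fbeta(prec, rec, beta):
    """Fβ score: β > 1 shifts importance from precision towards recall."""
    if prec + rec == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * prec * rec / (b2 * prec + rec)

# Empirically defined class-importance weights from the slide.
DEFAULT_WEIGHTS = {"support": 0.40, "deny": 0.40, "query": 0.15, "comment": 0.05}

def weighted_macro_fbeta(per_class, beta=1.0, weights=DEFAULT_WEIGHTS):
    """Weighted macro-Fβ: per_class maps class -> (precision, recall);
    each class's Fβ is scaled by its importance weight and summed."""
    return sum(weights[c] * fbeta(p, r, beta) for c, (p, r) in per_class.items())
```

Since the weights sum to 1, a perfect classifier scores 1.0, and errors on support/deny cost eight times as much as the same errors on comment.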
68. WEIGHTS DISCUSSION
➢ Weights need to:
• Deal with data imbalance
• Give higher value to the most important classes: support and deny
➢ Weights based only on data distribution:
• Mama Edha: w_support = 0.157, w_deny = 0.396, w_query = 0.399, w_comment = 0.048
• UPV: w_support = 0.200, w_deny = 0.350, w_query = 0.350, w_comment = 0.100
72. CONCLUSION
➢ Evaluation needs to take into account the task purposes:
• Rumour stance classification → improve veracity classification / rumour analysis
• Most informative classes: support and deny
• Highly imbalanced four-class classification problem
➢ Recall-based metrics → higher priority to minority classes
➢ Weighted metrics → higher priority to most important classes
Ideal evaluation: takes into account multiple metrics!
73. THANK YOU FOR YOUR ATTENTION!
www.weverify.eu
@WeVerify
Try yourself: https://cloud.gate.ac.uk/shopfront#tagged=WeVerify
Thanks to Yue Li for a lot of the slides (and work done!)
Collaboration with Kalina Bontcheva and Diego Silva