Experience in collaboration between
academia and industry:
NLP solutions for infodemic management
Ana Meštrović, Faculty of Informatics and Digital Technologies, University of Rijeka
Mladen Fernežir, Velebit AI
Overview
• InfoCoV project
• Project results
• Implementation
InfoCoV project
InfoCoV: Multilayer Framework for the Information Spreading
Characterization in Social Media during the COVID-19 Crisis
• Croatian Science Foundation - HRZZ
• 15 June 2020 – 14 January 2022
• Collaboration with Velebit AI
• Information monitoring
• COVID-19 texts in social media
• Research: NLP & SNA
Can AI help us in infodemic management?
• AI → analysis of large amounts of text
• Machine learning, neural networks, ...
• NLP tasks
• Keyword extraction
• Named entity recognition (NER)
• Topic modelling
• Text classification
• Sentiment analysis
• Fake news detection
• Multilayer framework
• Social network analysis
• Dynamics and spreading
Data
Dataset
Dataset | Description | Size
Cro-CoV-texts | Texts collected from online portals | > 186,738 articles
Cro-CoV-comm | Users' comments on COVID-19 articles in online portals | > 503,325 comments
Cro-CoV-Tweets | COVID-19 related tweets posted by users registered in Croatia | > 1 million tweets; > 200,000 COVID-19 tweets
Senti-Cro-CoV-Tweets | Tweets annotated with sentiment polarity (positive, negative, neutral) | 10,000 annotated tweets
Cro-CoV-netTW | Network of Twitter users | > 40,000 users
Cro-CoV-multilayerTW | Multilayer network of Twitter | 6 layers (multilayer network)
Cro-CoV-Reddit | Posts and comments from the Croatian subreddit | 1,654 posts; 6,466 comments
Cro-CoV-Forum | COVID-19 posts from the Croatian forum | 3,479 posts (* students)
Cro-CoV-YT | COVID-19 posts from YouTube | 4,530 comments (* students)
Language and classification models
cro-CoV-cseBERT, cro-CoV-BERTić – language models
sent-cro-CoV-cseBERT – sentiment classification
multi-cro-CoV-cseBERT – retweet classification
Project results
Keyword extraction
Keyword groups shown in the figure: symptoms and hygiene; medicaments and drugs; vaccine; general terms
Online news portals, Cro-CoV-Texts
• 190,000 COVID-19 related articles
• Croatian language
• First 13 months of the pandemic (2 waves)
Named entity recognition
Topic modelling
Figures: distribution of topics over time; topic spreading via retweeting
Sentiment analysis
Twitter, Cro-CoV-Tweets
• 206,196 COVID-19 related tweets
• Croatian language
• 1 January 2020 – 31 May 2021 (3 waves)
Clustering of Tweets
# Topic
0 Informative facts about COVID-19
1 Education and implementation of the COVID-19 policies
2 Coping with the pandemic
3 Revolt against the COVID-19 policies and behaviour of citizens
4 Public discussion regarding anti-pandemic policies and vaccines
5 Impact of COVID-19 policies on economy and education
6 Public comments on statements of the politicians and scientists
7 Information about new daily COVID-19 cases
8 Ironic comments on COVID-19
9 Short generic messages related to COVID-19
Clustering of Tweets
• Negative attitudes: "Public discussion regarding anti-pandemic policies and vaccines"
• Non-negative attitudes: informative messages and "Coping with the pandemic"
InfoCoV team
• Laboratory for Semantic Technologies
Implementation
NLP classification techniques
● Classifying text is a basic NLP problem, but still often challenging
in practice
● Helpful: large language models pre-trained on large amounts of
data
● Regardless of the exact domain, the typical approach is the same:
○ Pick a pre-trained language model close to your specific
problem
○ Optionally, tune the language model with your unlabeled data
○ Fine-tune the language model to your labeled data (your
specific categories to predict)
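The last step above can be sketched with Hugging Face Transformers. This is a minimal sketch under stated assumptions, not the project's training code: the default checkpoint is one of the base models named in this talk, the label names and their order are hypothetical, and `build_classifier`, `encode_labels`, `LABELS`, `ID2LABEL` and `LABEL2ID` are illustrative names. The library import is deferred into the function so the label helpers work without downloading a model.

```python
# Hypothetical label set and order, following the three sentiment classes in the talk.
LABELS = ["neutral", "negative", "positive"]
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def build_classifier(model_name="classla/bcms-bertic"):
    """Load a pre-trained checkpoint with a fresh classification head.

    Assumes the `transformers` package is installed and that the
    checkpoint can be downloaded from the Hugging Face Hub.
    """
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return tokenizer, model


def encode_labels(names):
    """Map label names to the integer ids expected by the classification head."""
    return [LABEL2ID[name] for name in names]
```

Fine-tuning would then run an ordinary training loop over the tokenized, labeled texts, with `encode_labels` supplying the integer targets.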
Language model tuning
● Available base language models for Croatian:
○ CroSloEngual BERT,
https://huggingface.co/EMBEDDIA/crosloengual-bert
○ BERTić* [bert-ich] /bɜrtitʃ/ - A transformer language model
for Bosnian, Croatian, Montenegrin and Serbian,
https://huggingface.co/classla/bcms-bertic
● Self-supervised tuning on COVID-specific Croatian data
○ Useful to prepare data similar to the final classification task
(e.g. oversampling user comments data)
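Oversampling one text source for the self-supervised tuning corpus can be sketched as below. The talk does not state the ratio that was used, so `comment_factor=3` is an arbitrary placeholder, and `build_tuning_corpus` and the toy texts are hypothetical.

```python
import random


def build_tuning_corpus(articles, comments, comment_factor=3, seed=0):
    """Mix two text sources, repeating each comment `comment_factor` times."""
    corpus = list(articles) + list(comments) * comment_factor
    # Shuffle with a fixed seed so the mixture is reproducible.
    random.Random(seed).shuffle(corpus)
    return corpus


# Toy data standing in for portal articles and user comments.
articles = ["article a", "article b", "article c", "article d"]
comments = ["comment x", "comment y"]

corpus = build_tuning_corpus(articles, comments, comment_factor=3)
comment_share = sum(text.startswith("comment") for text in corpus) / len(corpus)
```

With these toy inputs the comments make up one third of the raw texts but a larger share of the tuning corpus, which moves the language model toward the style of the final classification data.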
BERTić model self-supervised tuning
Sentiment classification
● Croatian Tweets related to COVID
● Classification into 3 sentiment classes:
○ Neutral: 4914
○ Negative: 3730
○ Positive: 475
● Difficulties:
○ Small amount of labeled data
○ Class imbalance
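To make the imbalance concrete, the class counts above can be turned into per-class loss weights. The sketch uses the common inverse-frequency ("balanced") heuristic, total / (number of classes × class count). The talk does not say which weighting the project used, so this is an assumption, and `balanced_class_weights` is a hypothetical helper.

```python
# Class counts as reported on the slide.
counts = {"neutral": 4914, "negative": 3730, "positive": 475}


def balanced_class_weights(class_counts):
    """Inverse-frequency weights: total / (number_of_classes * class_count)."""
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * n) for label, n in class_counts.items()}


weights = balanced_class_weights(counts)

# How many times larger the biggest class is than the smallest one.
imbalance_ratio = max(counts.values()) / min(counts.values())
```

Under this heuristic the positive class gets by far the largest weight, so each positive tweet contributes more to the loss than a neutral or negative one.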
Typical problem: overfitting
Options to prevent overfitting
● Minority class oversampling
● Different class loss weights
● Dropout
● L2 regularization
● Freezing some model parameters
● Early stopping
● NLP data augmentation
Loss weights can depend on specific output combinations:
              true: 0   true: 1   true: 2
predicted: 0    W00       W01       W02
predicted: 1    W10       W11       W12
predicted: 2    W20       W21       W22
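One possible reading of loss weights that depend on the (predicted, true) combination is to scale the usual cross-entropy of each example by W[predicted][true]. The sketch below is an assumption about how such a table could be applied, not the project's loss. The weight values are made up, and the predicted class is taken as the arg-max of the model output, so the weight acts as a per-example constant.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits, true_class, W):
    """Cross-entropy of one example, scaled by W[predicted][true]."""
    probs = softmax(logits)
    predicted = max(range(len(probs)), key=lambda i: probs[i])
    return -W[predicted][true_class] * math.log(probs[true_class])


# Hypothetical weights: penalise predicting class 0 when the true class is 2.
W = [
    [1.0, 1.0, 5.0],
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
]
uniform = [[1.0] * 3 for _ in range(3)]

logits = [2.0, 0.5, 0.1]  # the model prefers class 0
plain = weighted_cross_entropy(logits, 2, uniform)
weighted = weighted_cross_entropy(logits, 2, W)
```

With all weights equal to 1 this reduces to ordinary cross-entropy. A weight vector per true class, as in "different class loss weights", is the special case where every row of W is identical.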
Retweet category classification
A subset of Croatian tweets labeled into two categories
● 0: retweeted only once
● 1: tweets retweeted more than once
Types of features and variants of training
a. Content features extracted from a transformer language model
b. Tabular features representing Twitter users and their
interactions (categorical and numerical)
c. All features joined
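The labeling rule and the joined-feature variant can be sketched as follows. All names and example values are hypothetical. The sketch one-hot encodes categorical features for simplicity, whereas the project also lists category embeddings among its models. It also assumes that every tweet in the labeled subset was retweeted at least once.

```python
def retweet_label(retweet_count):
    """0 for tweets retweeted only once, 1 for tweets retweeted more than once.

    Assumes retweet_count >= 1 for every tweet in the labeled subset.
    """
    return int(retweet_count > 1)


def one_hot(value, vocabulary):
    """Encode a categorical value as a one-hot list over a fixed vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def join_features(content_vector, numeric, categorical, vocabularies):
    """Variant (c): concatenate content features with tabular features."""
    joined = list(content_vector) + [float(x) for x in numeric]
    for value, vocabulary in zip(categorical, vocabularies):
        joined += one_hot(value, vocabulary)
    return joined


# Hypothetical example: a 4-dimensional content vector from a language model,
# two numeric user features and one categorical user feature.
content = [0.12, -0.40, 0.33, 0.05]
row = join_features(
    content,
    numeric=[1500, 230],
    categorical=["no"],
    vocabularies=[["no", "yes"]],
)
```

Variants (a) and (b) correspond to passing only the content vector or only the tabular part to the classifier.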
Investigating different algorithms
Classification algorithms:
• MLP
• Random Forest
• LightGBM
• NODE
• TabNet
• Category Embeddings & MLP
GitHub: https://github.com/InfoCoV/Multi-Cro-CoV-cseBERT
References:
Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data, https://arxiv.org/abs/1909.06312
TabNet: Attentive Interpretable Tabular Learning, https://arxiv.org/abs/1908.07442
Optuna hyper-parameter search
Optuna hyper-parameter importance
Useful Python libraries
● Hugging Face Transformers, https://huggingface.co/docs/transformers/index
● Simple Transformers, https://simpletransformers.ai/
● Sentence Transformers, https://www.sbert.net/
● PyTorch, https://pytorch.org/
● PyTorch Ignite, https://pytorch-ignite.ai/
● PyTorch Tabular, https://github.com/manujosephv/pytorch_tabular/
● LightGBM, https://github.com/microsoft/LightGBM
● CLASSLA, https://github.com/clarinsi/classla
Thank you
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 

[DSC Croatia 22] Experience in collaboration between academia and industry: NLP solutions for infodemic management - Ana Mestrovic & Mladen Fernezir

  • 1. Experience in collaboration between academia and industry: NLP solutions for infodemic management • Ana Meštrović, Faculty of Informatics and Digital Technologies, University of Rijeka • Mladen Fernežir, Velebit AI
  • 2. Overview • InfoCoV project • Project results • Implementation
  • 4. InfoCoV project • InfoCoV: Multilayer Framework for the Information Spreading Characterization in Social Media during the COVID-19 Crisis • Croatian Science Foundation - HRZZ • 15 June 2020 – 14 January 2022 • Collaboration with Velebit AI • Information monitoring • COVID-19 texts in social media • Research: NLP & SNA
  • 5. Can AI help us in infodemic management? • AI → analysis of large amounts of text • Machine learning, neural networks, ... • NLP tasks • Keyword extraction • Named entity recognition (NER) • Topic modelling • Text classification • Sentiment analysis • Fake news detection • Multilayer framework • Social network analysis • Dynamics and spreading
  • 6. Data • Datasets (name: description; size)
      ○ Cro-CoV-texts: texts collected from online portals; > 186,738 articles
      ○ Cro-CoV-comm: users' comments on COVID-19 articles in online portals; > 503,325 comments
      ○ Cro-CoV-Tweets: COVID-19 related tweets posted by users registered in Croatia; > 1 million tweets, > 200,000 COVID-19 tweets
      ○ Senti-Cro-CoV-Tweets: tweets annotated with sentiment polarity (positive, negative, neutral); 10,000 annotated tweets
      ○ Cro-CoV-netTW: network of Twitter users; > 40,000 users
      ○ Cro-CoV-multilayerTW: multilayer network of Twitter; 6 layers (multilayer network)
      ○ Cro-CoV-Reddit: posts and comments from the Croatian subreddit; 1,654 posts, 6,466 comments
      ○ Cro-CoV-Forum: COVID-19 posts from the Croatian forum; 3479 posts (* students)
      ○ Cro-CoV-YT: COVID-19 posts from YouTube; 4530 comments (* students)
  • 7. Language and classification models • cro-CoV-cseBERT, cro-CoV-BERTić: language models • sent-cro-CoV-cseBERT: sentiment classification • multi-cro-CoV-cseBERT: retweet classification
  • 9. Keyword extraction • Online news portals, Cro-CoV-Texts • 190,000 COVID-19 related articles • Croatian language • First 13 months of the pandemic (2 waves) • Keyword groups (figure labels): symptoms and hygiene; medications and drugs; vaccine; general terms
  • 11. Topic modelling • Distribution of topics over time • Topic spreading via retweeting
  • 12. Sentiment analysis • Twitter, Cro-CoV-Tweets • 206,196 COVID-19 related tweets • Croatian language • 1 January 2020 – 31 May 2021 (3 waves)
  • 13. Clustering of Tweets (cluster number: topic)
      ○ 0: Informative facts about COVID-19
      ○ 1: Education and implementation of the COVID-19 policies
      ○ 2: Coping with the pandemic
      ○ 3: Revolt against the COVID-19 policies and behaviour of citizens
      ○ 4: Public discussion regarding anti-pandemic policies and vaccines
      ○ 5: Impact of COVID-19 policies on economy and education
      ○ 6: Public comments on statements of the politicians and scientists
      ○ 7: Information about new daily COVID-19 cases
      ○ 8: Ironic comments on COVID-19
      ○ 9: Short generic messages related to COVID-19
  • 14. Clustering of Tweets • Negative attitudes: "Public discussion regarding anti-pandemic policies and vaccines" • Non-negative attitudes: informative messages and "Coping with the pandemic"
  • 15. InfoCoV team • Laboratory for Semantic Technologies
  • 17. NLP classification techniques
      ● Classifying text is a basic NLP problem, but it is still often challenging in practice
      ● Helpful: large language models pre-trained on large amounts of data
      ● Regardless of the exact domain, the typical approach is the same:
        ○ Pick a pre-trained language model close to your specific problem
        ○ Optionally, tune the language model on your unlabeled data
        ○ Fine-tune the language model on your labeled data (your specific categories to predict)
  • 18. Language model tuning
      ● Available base language models for Croatian:
        ○ CroSloEngual BERT, https://huggingface.co/EMBEDDIA/crosloengual-bert
        ○ BERTić* [bert-ich] /bɜrtitʃ/ - A transformer language model for Bosnian, Croatian, Montenegrin and Serbian, https://huggingface.co/classla/bcms-bertic
      ● Self-supervised tuning on COVID-specific Croatian data
        ○ It is useful to prepare data similar to the final classification task (e.g. by oversampling user comments data)
  • 20. Sentiment classification
      ● Croatian tweets related to COVID
      ● Classification into 3 sentiment classes:
        ○ Neutral: 4914
        ○ Negative: 3730
        ○ Positive: 475
      ● Difficulties:
        ○ Small amount of labeled data
        ○ Class imbalance
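The imbalance on this slide can be quantified directly from the three class counts. The following is a minimal sketch, assuming inverse-frequency weighting; the slides do not state which weighting scheme the project used, and `class_weights` is a hypothetical helper name.

```python
# Class counts are taken from the slide; the weighting scheme is an assumption.
counts = {"neutral": 4914, "negative": 3730, "positive": 475}


def class_weights(counts):
    """Inverse-frequency weights: total / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


weights = class_weights(counts)
total = sum(counts.values())
for label, weight in weights.items():
    share = counts[label] / total
    print(f"{label:>8}: share={share:.3f} weight={weight:.3f}")
```

With this scheme the rare positive class receives the largest weight, so each positive example contributes more to a weighted loss than a neutral one.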
  • 22. Options to prevent overfitting
      ● Minority class oversampling
      ● Different class loss weights
      ● Dropout
      ● L2 regularization
      ● Freezing some model parameters
      ● Early stopping
      ● NLP data augmentation
      Loss weights can depend on specific output combinations:
                        true: 0   true: 1   true: 2
        predicted: 0    W00       W01       W02
        predicted: 1    W10       W11       W12
        predicted: 2    W20       W21       W22
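One way to read the W matrix is that each (predicted, true) pair gets its own penalty, so that some mistakes cost more than others. The sketch below is an illustration under stated assumptions, not the project's implementation: it scales the per-example cross-entropy by W[predicted][true], with the prediction taken as the most probable class. The weight values and the name `weighted_cross_entropy` are hypothetical.

```python
import math

# Hypothetical weights W[predicted][true] for classes 0, 1, 2.
# Column 2 is weighted more heavily, as if class 2 were the minority class.
W = [
    [1.0, 1.0, 4.0],
    [1.0, 1.0, 4.0],
    [1.0, 1.0, 2.0],
]


def weighted_cross_entropy(probs, true_label, weights):
    """Cross-entropy of one example, scaled by weights[predicted][true]."""
    # The predicted class is the index of the highest probability.
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return -weights[predicted][true_label] * math.log(probs[true_label])


probs = [0.7, 0.2, 0.1]  # model output probabilities for one example
loss_true0 = weighted_cross_entropy(probs, 0, W)  # prediction is correct
loss_true2 = weighted_cross_entropy(probs, 2, W)  # class 2 was missed
print(loss_true0, loss_true2)
```

Missing a class-2 example is penalized four times as strongly as it would be under plain cross-entropy, which pushes the model away from always predicting the majority classes.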
  • 23. Retweet category classification
      A subset of Croatian tweets labeled into two categories:
      ● 0: tweets retweeted only once
      ● 1: tweets retweeted more than once
      Types of features and variants of training:
      a. Content features extracted from a transformer language model
      b. Tabular features representing Twitter users and their interactions (categorical and numerical)
      c. All features joined
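The labeling rule and the joined-features variant (c) can be sketched in a few lines. This is a toy illustration only: the function names, the 4-dimensional stand-in for a transformer embedding, and the tabular values are all hypothetical, and real categorical features would first need to be encoded numerically.

```python
def retweet_label(retweet_count):
    """Label from the slide: 0 if retweeted only once, 1 if more than once.

    Assumes the labeled subset only contains tweets retweeted at least once.
    """
    return int(retweet_count > 1)


def join_features(content_embedding, tabular_features):
    """Variant (c): concatenate content and tabular features into one vector."""
    return list(content_embedding) + list(tabular_features)


# Hypothetical toy values; a real embedding would come from a language model.
embedding = [0.12, -0.40, 0.33, 0.05]
tabular = [1520.0, 3.0, 1.0]  # e.g. follower count, mentions, encoded category

x = join_features(embedding, tabular)
y = retweet_label(5)
print(len(x), y)
```

Variants (a) and (b) correspond to training on `embedding` or `tabular` alone, which makes it possible to compare how much each feature group contributes.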
  • 24. Investigating different algorithms
      Classification algorithms: • MLP • Random Forest • LightGBM • NODE • TabNet • Category Embeddings & MLP
      GitHub: https://github.com/InfoCoV/Multi-Cro-CoV-cseBERT
      References:
      ○ Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data, https://arxiv.org/abs/1909.06312
      ○ TabNet: Attentive Interpretable Tabular Learning, https://arxiv.org/abs/1908.07442
  • 27. Useful Python libraries
      ● Hugging Face Transformers, https://huggingface.co/docs/transformers/index
      ● Simple Transformers, https://simpletransformers.ai/
      ● Sentence Transformers, https://www.sbert.net/
      ● PyTorch, https://pytorch.org/
      ● PyTorch Ignite, https://pytorch-ignite.ai/
      ● PyTorch Tabular, https://github.com/manujosephv/pytorch_tabular/
      ● LightGBM, https://github.com/microsoft/LightGBM
      ● CLASSLA, https://github.com/clarinsi/classla