SlideShare a Scribd company logo
Weekly Meeting - XN 1
Informatics Students
Lingyu Hu, Zhaode Ouyang, Ziyan Yan, Zixun Zhou
Feb 16, 2022
Twitter Transparency Project datasets
● Find some similarities
● What language is the content in?
● Tried to translate content into English (Fail)
Twitter Transparency Project datasets
● Used different languages
● Use hashtags to spread content
Filter by country under Covid's topics
Interesting Found
I found out that most of the fake news about
Covid is happening in the first quarter of 2020
Since then there have been very few false news
stories about local outbreaks on Chinese social
media
Data Cleaning - SBS Ready Format
Works:
- Change separator from ‘,’ to ‘|’
- Remove ‘|’ in text column
- Remove line brakes
- 50GB+ data on Python
- .csv .json .zst
- UTF-8 ready
Not works:
- Output exact 30MB size on each file
- Separated file on seperate language
- 1,048,576+ rows on Excel
- 10GB+ data on Excel
SBS - Analysis Can Run, But No Result
SBS - Working
WordCloud
Keyword Extractor
Sentiment Analysis——Preliminary exploration
Tool——Azure Machine Learning
● One English dataset
● Two Chinese datasets——Abnormal results
A dataset of real news about COVID-19——Chinese
A dataset of fake news about COVID-19——Chinese
A dataset of disinformation about COVID-19——English
Questions
1. What is the relationship between sentiment analysis of data and
our project?
2. With the results of the sentiment analysis, what do we do next?
1. Use existing data to train and build judgment models.
Future Planning

More Related Content

Similar to Weekly Meeting 4.pdf

L1-L2.Introduction to Programming and Python Basics.pptx
L1-L2.Introduction to Programming and Python Basics.pptxL1-L2.Introduction to Programming and Python Basics.pptx
L1-L2.Introduction to Programming and Python Basics.pptx
DeepjyotiChoudhury4
 
The Ring programming language version 1.3 book - Part 4 of 88
The Ring programming language version 1.3 book - Part 4 of 88The Ring programming language version 1.3 book - Part 4 of 88
The Ring programming language version 1.3 book - Part 4 of 88
Mahmoud Samir Fayed
 
Big Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTS
Big Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTSBig Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTS
Big Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTS
Matt Stubbs
 
Dictionary project report.docx
Dictionary project report.docxDictionary project report.docx
Dictionary project report.docx
kishoreadhikari2
 
Language Translator using python and google API
Language Translator using python and google APILanguage Translator using python and google API
Language Translator using python and google API
SubhrajitRout6
 
Computer Science Student Final attachment Logbook
Computer Science Student Final attachment Logbook Computer Science Student Final attachment Logbook
Computer Science Student Final attachment Logbook
Paullaster Okoth
 
The Ring programming language version 1.5.2 book - Part 5 of 181
The Ring programming language version 1.5.2 book - Part 5 of 181The Ring programming language version 1.5.2 book - Part 5 of 181
The Ring programming language version 1.5.2 book - Part 5 of 181
Mahmoud Samir Fayed
 
Build your own Language - Why and How?
Build your own Language - Why and How?Build your own Language - Why and How?
Build your own Language - Why and How?
Markus Voelter
 
What is python language and How it works.pdf
What is python language and How it works.pdfWhat is python language and How it works.pdf
What is python language and How it works.pdf
chanduvarma019
 
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSABetter Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
PRBETTER
 
The Ring programming language version 1.10 book - Part 6 of 212
The Ring programming language version 1.10 book - Part 6 of 212The Ring programming language version 1.10 book - Part 6 of 212
The Ring programming language version 1.10 book - Part 6 of 212
Mahmoud Samir Fayed
 
FEC2017-Introduction-to-programming
FEC2017-Introduction-to-programmingFEC2017-Introduction-to-programming
FEC2017-Introduction-to-programming
Henrikki Tenkanen
 
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
ecij
 
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
ecij
 
Snowplow at the heart of Busuu's data & analytics infrastructure
Snowplow at the heart of Busuu's data & analytics infrastructureSnowplow at the heart of Busuu's data & analytics infrastructure
Snowplow at the heart of Busuu's data & analytics infrastructure
Giuseppe Gaviani
 
2023-My AI Experience - Colm Dunphy.pdf
2023-My AI Experience - Colm Dunphy.pdf2023-My AI Experience - Colm Dunphy.pdf
2023-My AI Experience - Colm Dunphy.pdf
Colm Dunphy
 
Free For All: Getting Started in Open Source
Free For All: Getting Started in Open SourceFree For All: Getting Started in Open Source
Free For All: Getting Started in Open Source
Ali King
 
IRJET- Querying Database using Natural Language Interface
IRJET-  	  Querying Database using Natural Language InterfaceIRJET-  	  Querying Database using Natural Language Interface
IRJET- Querying Database using Natural Language Interface
IRJET Journal
 

Similar to Weekly Meeting 4.pdf (20)

L1-L2.Introduction to Programming and Python Basics.pptx
L1-L2.Introduction to Programming and Python Basics.pptxL1-L2.Introduction to Programming and Python Basics.pptx
L1-L2.Introduction to Programming and Python Basics.pptx
 
The Ring programming language version 1.3 book - Part 4 of 88
The Ring programming language version 1.3 book - Part 4 of 88The Ring programming language version 1.3 book - Part 4 of 88
The Ring programming language version 1.3 book - Part 4 of 88
 
Big Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTS
Big Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTSBig Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTS
Big Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTS
 
Dictionary project report.docx
Dictionary project report.docxDictionary project report.docx
Dictionary project report.docx
 
Language Translator using python and google API
Language Translator using python and google APILanguage Translator using python and google API
Language Translator using python and google API
 
Computer Science Student Final attachment Logbook
Computer Science Student Final attachment Logbook Computer Science Student Final attachment Logbook
Computer Science Student Final attachment Logbook
 
The Ring programming language version 1.5.2 book - Part 5 of 181
The Ring programming language version 1.5.2 book - Part 5 of 181The Ring programming language version 1.5.2 book - Part 5 of 181
The Ring programming language version 1.5.2 book - Part 5 of 181
 
Build your own Language - Why and How?
Build your own Language - Why and How?Build your own Language - Why and How?
Build your own Language - Why and How?
 
What is python language and How it works.pdf
What is python language and How it works.pdfWhat is python language and How it works.pdf
What is python language and How it works.pdf
 
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSABetter Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
 
Ice dec04-04-sammy
Ice dec04-04-sammyIce dec04-04-sammy
Ice dec04-04-sammy
 
The Ring programming language version 1.10 book - Part 6 of 212
The Ring programming language version 1.10 book - Part 6 of 212The Ring programming language version 1.10 book - Part 6 of 212
The Ring programming language version 1.10 book - Part 6 of 212
 
FEC2017-Introduction-to-programming
FEC2017-Introduction-to-programmingFEC2017-Introduction-to-programming
FEC2017-Introduction-to-programming
 
CV of Jutheka Lahiry
CV of Jutheka LahiryCV of Jutheka Lahiry
CV of Jutheka Lahiry
 
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
 
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
USING MACHINE LEARNING TO BUILD A SEMI-INTELLIGENT BOT
 
Snowplow at the heart of Busuu's data & analytics infrastructure
Snowplow at the heart of Busuu's data & analytics infrastructureSnowplow at the heart of Busuu's data & analytics infrastructure
Snowplow at the heart of Busuu's data & analytics infrastructure
 
2023-My AI Experience - Colm Dunphy.pdf
2023-My AI Experience - Colm Dunphy.pdf2023-My AI Experience - Colm Dunphy.pdf
2023-My AI Experience - Colm Dunphy.pdf
 
Free For All: Getting Started in Open Source
Free For All: Getting Started in Open SourceFree For All: Getting Started in Open Source
Free For All: Getting Started in Open Source
 
IRJET- Querying Database using Natural Language Interface
IRJET-  	  Querying Database using Natural Language InterfaceIRJET-  	  Querying Database using Natural Language Interface
IRJET- Querying Database using Natural Language Interface
 

More from ZixunZhou

Mid-term presentation.pdf
Mid-term presentation.pdfMid-term presentation.pdf
Mid-term presentation.pdf
ZixunZhou
 
Weekly Meeting 8.pdf
Weekly Meeting 8.pdfWeekly Meeting 8.pdf
Weekly Meeting 8.pdf
ZixunZhou
 
Weekly Meeting 7.pdf
Weekly Meeting 7.pdfWeekly Meeting 7.pdf
Weekly Meeting 7.pdf
ZixunZhou
 
Weekly Meeting 3.pdf
Weekly Meeting 3.pdfWeekly Meeting 3.pdf
Weekly Meeting 3.pdf
ZixunZhou
 
Weekly Meeting 2.pdf
Weekly Meeting 2.pdfWeekly Meeting 2.pdf
Weekly Meeting 2.pdf
ZixunZhou
 
Dashboard Design.pptx
Dashboard Design.pptxDashboard Design.pptx
Dashboard Design.pptx
ZixunZhou
 

More from ZixunZhou (6)

Mid-term presentation.pdf
Mid-term presentation.pdfMid-term presentation.pdf
Mid-term presentation.pdf
 
Weekly Meeting 8.pdf
Weekly Meeting 8.pdfWeekly Meeting 8.pdf
Weekly Meeting 8.pdf
 
Weekly Meeting 7.pdf
Weekly Meeting 7.pdfWeekly Meeting 7.pdf
Weekly Meeting 7.pdf
 
Weekly Meeting 3.pdf
Weekly Meeting 3.pdfWeekly Meeting 3.pdf
Weekly Meeting 3.pdf
 
Weekly Meeting 2.pdf
Weekly Meeting 2.pdfWeekly Meeting 2.pdf
Weekly Meeting 2.pdf
 
Dashboard Design.pptx
Dashboard Design.pptxDashboard Design.pptx
Dashboard Design.pptx
 

Recently uploaded

一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
correoyaya
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
AlejandraGmez176757
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
alex933524
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
Tiktokethiodaily
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 

Recently uploaded (20)

一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 

Weekly Meeting 4.pdf

  • 1. Weekly Meeting - XN 1 Informatics Students Lingyu Hu, Zhaode Ouyang, Ziyan Yan, Zixun Zhou Feb 16, 2022
  • 2. Twitter Transparency Project datasets ● Find some similarities ● What language is the content in? ● Tried to translate content into English (Fail)
  • 3. Twitter Transparency Project datasets ● Used different languages ● Use hashtags to spread content
  • 4. Filter by country under Covid's topics
  • 5. Interesting Found I found out that most of the fake news about Covid is happening in the first quarter of 2020 Since then there have been very few false news stories about local outbreaks on Chinese social media
  • 6. Data Cleaning - SBS Ready Format Works: - Change separator from ‘,’ to ‘|’ - Remove ‘|’ in text column - Remove line brakes - 50GB+ data on Python - .csv .json .zst - UTF-8 ready Not works: - Output exact 30MB size on each file - Separated file on seperate language - 1,048,576+ rows on Excel - 10GB+ data on Excel
  • 7. SBS - Analysis Can Run, But No Result
  • 9. Sentiment Analysis——Preliminary exploration Tool——Azure Machine Learning ● One English dataset ● Two Chinese datasets——Abnormal results
  • 10. A dataset of real news about COVID-19——Chinese
  • 11. A dataset of fake news about COVID-19——Chinese
  • 12. A dataset of disinformation about COVID-19——English
  • 13. Questions 1. What is the relationship between sentiment analysis of data and our project? 2. With the results of the sentiment analysis, what do we do next? 1. Use existing data to train and build judgment models. Future Planning