Slide 1 -- very brief introduction to your project. (Note: this is to help your classmates refresh their memory about your project, so it should be very short, a one- or two-sentence highlight.) (Jerry)
Slide 2 -- all research methods you have used to complete this project (for each method, use just one sentence to justify why it was necessary to adopt it)
Slide 3 -- all data you have collected (a list of the types of data, including number of email correspondences, number of interviews, pages of documents you reviewed, etc.) {Platforms, data repositories (GitHub, Kaggle), and the data provided on Canvas by Oet (COVID, Ukraine, Iran); check whether the platform data is usable} (Ziyan), data scraping methods (Jerry), data cleaning (Jerry), data visualization and analysis (SBS (Lingyu), Power BI ())
Slide 4 ~ x -- final product design process (this should be the focus: tell us how your interaction with the sponsors, users, etc. informed your design thinking, and how you came up with the design ideas). Weekly tasks, though we could also include our dashboard design and how we communicated (Lingyu)
Slide x+1 ~ y -- what does your final product look like? (Note: a list of screenshots would be helpful, or a live demo, but make sure your designed website works properly. You only have about 20 minutes in total, so don't waste time searching for, finding, or fixing website pages.) Introduce the wireframe (Jerry)
Slide y+1 -- takeaway points (what you have learned from doing this project; share your valuable experiences) (Ziyan)
Final slide -- if you were given an opportunity to redo the project, what might you change? (Jerry)
ITC 6040 Capstone
Final Report
Strategies for Identifying
Mis-/Disinformation
Team 2:
Ouyang Zhaode, Lingyu Hu, Ziyan Yan, Zixun Zhou
PART 1
PROJECT INTRODUCTION
Mikhail Oet, PhD
Professor in Commerce and Economic
Development (CED) program
Northeastern University
Our Sponsors:
Mission:
To get the right information to the
right people at the right time
Research
• Platform Research
• Data Repositories Research
• Data Scraping Methods Research
• U.S., China, and Russia Research
Data Visualization
• Dashboard Design
Data Analysis
• Data Cleaning
• Sentiment Analysis
• Word Semantic Analysis
What Are We Doing? Helping to Identify Fake News
How Do We Identify It?
Part 2
Research Method
Research Method---
Qualitative Research
1. Identify Research Questions:
• How to collect data?
• How to use a data repository?
• How to analyze a dataset?
2. Case Study
3. Research Report
Research Method---
Quantitative Research
Specific method: Data Analysis
Analyze objective data---Statistical Data
• Sentiment score
• Information release time
• Amount of information
• Location
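The four indicators listed above could be derived from a set of records roughly as follows. This is only a minimal sketch under stated assumptions: the record fields (`text`, `published`, `location`) and the tiny word lists are hypothetical, and a real sentiment score would come from a dedicated sentiment tool rather than this toy lexicon.

```python
from collections import Counter
from datetime import datetime

# Hypothetical toy lexicons for illustration only; not taken from the project.
POSITIVE = {"good", "safe", "peace"}
NEGATIVE = {"bad", "fake", "war"}


def sentiment_score(text):
    """Return (positive words - negative words) / total words, or 0.0 if empty."""
    words = text.lower().split()
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def summarize(records):
    """Collect sentiment, release time, amount, and location from record dicts."""
    scores = [sentiment_score(r["text"]) for r in records]
    months = [datetime.fromisoformat(r["published"]).strftime("%Y-%m") for r in records]
    return {
        "mean_sentiment": sum(scores) / len(scores) if scores else 0.0,
        "per_month": Counter(months),        # information release time
        "amount": len(records),              # amount of information
        "per_location": Counter(r["location"] for r in records),
    }


# Hypothetical sample records.
records = [
    {"text": "fake news about war", "published": "2022-02-24T10:00:00", "location": "UA"},
    {"text": "peace talks look good", "published": "2022-03-01T09:30:00", "location": "US"},
]
summary = summarize(records)
```

Keeping all four indicators in one summary dictionary makes it straightforward to hand the result to a visualization tool later.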
Part 3
Data Collection
Data Source (1)
GitHub
❑ A code-hosting provider that also hosts research results on fake news
❑ We did not use the research results, but we used the dataset
❑ Searched using keywords
❑ Dataset sourced from Weibo on false information about COVID-19
Kaggle
❖ Dataset website owned by Google
❖ Offers scientific topics
❖ Provides data on fake news about COVID-19
Data Source (2)
01
COVID-19
Source: Weibo, Twitter
Topic: Misinformation of COVID-19
02 Ukraine
Event Registry Ukraine-English Dataset
Event Registry Ukraine-Russian Dataset
From Twitter
03 Iran
Iran-themed data will be collected from tweets by searching keywords
Feasibility of Data
01 Highly Feasible
02 Diverse
03 Visualize
04 Reliable
Data We Collected
Qualitative
Primary: Information gathered from guest speakers and stakeholders (professors, sponsors, and other teams)
Secondary: Articles and reports we read
Quantitative
Primary: Data we scraped from social media and news websites
Secondary: Data our sponsor provided and data repositories we found
COUNT
3 Guest Sessions
10+ Zoom Recordings
15+ Meetings
30+ Emails
30+ Articles, Reports & Videos
3 Experimental Web Scrapings
5+ Data Repositories We Found
Data Scraping Methods Research
• RSS feed to CSV (Online Converting Tools)
• Data Collectors (Octopus, BrightData)
• Web Scraping (Python)

Methods            Use Cases                       Difficulty
RSS feed to CSV    Websites Providing RSS Feeds    Low
Data Collectors    Popular Social Media            Medium
Web Scraping       Static Websites                 High
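The "RSS feed to CSV" row can also be done in a few lines of Python instead of an online converter. The following is a minimal sketch that uses only the standard library and assumes a plain RSS 2.0 feed; the sample feed and its URLs are made up for illustration, and fetching the feed over the network (for example with `urllib.request.urlopen`) is left out.

```python
import csv
import io
import xml.etree.ElementTree as ET

FIELDS = ["title", "link", "pubDate"]


def rss_to_rows(rss_xml):
    """Extract title, link, and pubDate from each <item> of an RSS 2.0 feed."""
    root = ET.fromstring(rss_xml)
    rows = []
    for item in root.iter("item"):
        # Missing child elements become empty strings.
        rows.append({name: item.findtext(name, default="") for name in FIELDS})
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample feed; a real feed would be downloaded first.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example feed</title>
<item><title>First story</title><link>https://example.com/1</link><pubDate>Mon, 07 Mar 2022 10:00:00 GMT</pubDate></item>
<item><title>Second story</title><link>https://example.com/2</link><pubDate>Tue, 08 Mar 2022 11:00:00 GMT</pubDate></item>
</channel></rss>"""

rows = rss_to_rows(SAMPLE_FEED)
csv_text = rows_to_csv(rows)
```

The resulting CSV text can be saved to a file and opened directly in Excel Power Query or Power BI.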
Part 4
Data Cleaning & Analysis
Data Cleaning
MS Excel Power Query
• For CSV format
• Easy to use
• For ad hoc analysis (one-time use)
* Limitations:
• Data should be fewer than 1 million rows
• Data should be less than 1 GB
Python
• For JSON or other data formats
• Can cut a large dataset into many smaller files
• Cleaning at scale
• For data pipeline use (continuous data streaming)
* No such limits, but it takes more time and effort
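Cutting a large dataset into smaller files, as mentioned in the Python bullets above, might look like the sketch below. It assumes the data is stored as JSON Lines (one JSON object per line), which is an assumption rather than something the slides state; the function name, file names, and sample records are hypothetical.

```python
import json
import tempfile
from itertools import islice
from pathlib import Path


def split_json_lines(source, out_dir, rows_per_file):
    """Split a JSON Lines file into numbered part files and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with open(source, encoding="utf-8") as handle:
        while True:
            # Read at most rows_per_file lines without loading the whole file.
            chunk = list(islice(handle, rows_per_file))
            if not chunk:
                break
            cleaned = []
            for line in chunk:
                line = line.strip()
                if not line:
                    continue  # drop blank lines
                try:
                    json.loads(line)
                except json.JSONDecodeError:
                    continue  # drop records that are not valid JSON
                cleaned.append(line)
            part = out_dir / f"part_{len(parts):04d}.jsonl"
            part.write_text("".join(row + "\n" for row in cleaned), encoding="utf-8")
            parts.append(part)
    return parts


# Small demonstration in a temporary directory with made-up records.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "tweets.jsonl"
    source.write_text(
        '{"id": 1}\n{"id": 2}\nnot json\n{"id": 3}\n{"id": 4}\n', encoding="utf-8"
    )
    parts = split_json_lines(source, Path(tmp) / "parts", rows_per_file=2)
    part_names = [p.name for p in parts]
    first_part = parts[0].read_text(encoding="utf-8")
```

Because the file is read in fixed-size chunks, memory use stays small even when the source file is far larger than what Excel Power Query can open.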
We can use Google Sheets to do batch translation
Power BI
1. Data visualization
2. Data query
3. Data Modeling
4. Key data analysis
SBS
• Data visualization tool
• Word data analysis
Part 5
Analysis and findings
Ukraine – Russian
SBS Analysis
• All text content is in Russian
• All records are news items
• “Russian” and “Ukraine” appeared most in the dataset
• Specific words do not appear very often
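A word-frequency check like the one behind these findings can be approximated in plain Python. This is a rough sketch, not how SBS works internally; the sample sentences and the stopword set are invented for illustration. The `\w+` pattern matches Unicode letters in Python 3, so it also works on Cyrillic text.

```python
import re
from collections import Counter


def top_words(texts, n=3, stopwords=frozenset()):
    """Count lowercase word occurrences across texts and return the n most common."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"\w+", text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts.most_common(n)


# Hypothetical sample headlines.
sample = ["Russia and Ukraine talks", "Ukraine news", "Russia Ukraine border"]
result = top_words(sample, n=2, stopwords={"and"})
```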
Ukraine – Russian SBS Analysis
• Ukraine, Russian, and Putin care most about Topic 1
• NATO, Russian, and USA care most about Topic 2
• Xi and Biden care most about Topic 5
Ukraine – Russian SBS Analysis
• T6 has a strong relationship with T2
• T5 has the second-strongest relationship with T2
Conclusions:
• Ukraine, Russian, and Putin care most about the Winter Olympics
• Russian, NATO, and USA care about potential military activities
• Xi and Biden care about relationships with other countries
• Covid has a stronger relationship with potential military activities
• Relationships between countries could influence potential military activities
Twitter Transparency
Project
Power BI Analysis
• [Hanya Kamu]: “Only you”
• Most hashtags are meaningless
• Tweet numbers in 2012
• Most tweets appeared in June and August
• Trend is unstable
Part 6
Final Product
User Input
External Data - RavenPack
News Articles
Dashboard
Fake Score
Sentiment
PolitiFact
ProPublica
Local Check
Based on
Historical Data
External Data - GDI
Contribution by Country
Dashboard - Data Prerequisite
News Article Dataset
External Data
Part 7
Takeaway Point
Take Away (1)
Data visualization
Take Away (2)
01 Communication
02 Diversity
03 Identifying Theme
Part 8
Re-design Project
Data Collection
Plan
1. Develop a Data Collection Plan
2. Us
1. What are we going to solve?
e.g., a list of issues
2. What counts as success?
e.g., a Service Level Agreement (SLA)
3. What data is available?
4. What form does that data come in?
5. Where will the data be collected from?
6. Should we measure a sample or the whole population?
7. In what format will the data be displayed?
We Did
We Missed
Project
Management
Techniques
Tools:
Methods:
• Waterfall
• Agile
• Scrum
…
The End
Appendix
Week 1- 2 : Platform Exploration
Exploring platforms where the data can be crawled.
Platforms in Russian, English and Chinese.
If possible, crawl the data by learning new tools.
Week 3 – 4: Learning New Tools
1. Clean and filter the data.
2. Learn new tools: SBS (and its format), Power BI, sentiment analysis
3. Provide some basic findings.
Week 5 – 6: Data Visualization
• Use data visualization tools to analyze data
• Determine the final tools: SBS, Power BI.
• Provide some findings.
Week 7 – 8: Dashboard Learning and Design
• Keep using SBS to analyze the data
• Design the dashboard by learning from RavenPack
• Combine and provide a sample dashboard design
Sample designed dashboard
Process to Final Product
1. Ask Questions
2. Create
3. Feedback
4. Research
5. Revise
6. Feedback
7. Continue Revising
Cloud Architecture
[Diagram labels: Data Sources; Historical Data; Live Data; Twitter Transparency Datasets; Weibo Datasets; Event Registry Datasets; Web Scraping; Twitter API; Process; Storage; Consume; Data Analysis & Query; Machine Learning]
A Database of Metadata
Machine Learning
• Existing fake news prediction model
• Can train on non-relational data (video, image, audio)
• Low-code
[Diagram labels: Data Sources; Historical Data; Live Data; Twitter Transparency Datasets; Weibo Datasets; Event Registry Datasets; Web Scraping; Twitter API; Transfer; Amazon S3; Process; Cleaning; Clean & Transform; Machine Learning; Fake Score; Sentiment; Consume; Data Visualization]

More Related Content

What's hot

A Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary DefenseA Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary DefenseYongyao Jiang
 
The Materials Data Facility: A Distributed Model for the Materials Data Commu...
The Materials Data Facility: A Distributed Model for the Materials Data Commu...The Materials Data Facility: A Distributed Model for the Materials Data Commu...
The Materials Data Facility: A Distributed Model for the Materials Data Commu...Ben Blaiszik
 
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...Bertram Ludäscher
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Trey Grainger
 
[系列活動] 人工智慧與機器學習在推薦系統上的應用
[系列活動] 人工智慧與機器學習在推薦系統上的應用[系列活動] 人工智慧與機器學習在推薦系統上的應用
[系列活動] 人工智慧與機器學習在推薦系統上的應用台灣資料科學年會
 
Quick tour all handout
Quick tour all handoutQuick tour all handout
Quick tour all handoutYi-Shin Chen
 
Querylog-based Assessment of Retrievability Bias in a Large Newspaper Corpus
Querylog-based Assessment of Retrievability Bias in a  Large Newspaper CorpusQuerylog-based Assessment of Retrievability Bias in a  Large Newspaper Corpus
Querylog-based Assessment of Retrievability Bias in a Large Newspaper CorpusMyriam Traub
 
Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...Nattiya Kanhabua
 
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCES
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCESFINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCES
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCESkevig
 
How to Build a Semantic Search System
How to Build a Semantic Search SystemHow to Build a Semantic Search System
How to Build a Semantic Search SystemTrey Grainger
 
LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013Luis Daniel Ibáñez
 
Pandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data SciencePandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data ScienceKrishna Sankar
 

What's hot (20)

MUDROD - Ranking
MUDROD - RankingMUDROD - Ranking
MUDROD - Ranking
 
A Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary DefenseA Knowledge Discovery Framework for Planetary Defense
A Knowledge Discovery Framework for Planetary Defense
 
Duplicate Detection on Hoaxy Dataset
Duplicate Detection on Hoaxy DatasetDuplicate Detection on Hoaxy Dataset
Duplicate Detection on Hoaxy Dataset
 
The Materials Data Facility: A Distributed Model for the Materials Data Commu...
The Materials Data Facility: A Distributed Model for the Materials Data Commu...The Materials Data Facility: A Distributed Model for the Materials Data Commu...
The Materials Data Facility: A Distributed Model for the Materials Data Commu...
 
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
Introducing the Whole Tale Project: Merging Science and Cyberinfrastructure P...
 
BDACA1617s2 - Lecture3
BDACA1617s2 - Lecture3BDACA1617s2 - Lecture3
BDACA1617s2 - Lecture3
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...
 
[系列活動] 人工智慧與機器學習在推薦系統上的應用
[系列活動] 人工智慧與機器學習在推薦系統上的應用[系列活動] 人工智慧與機器學習在推薦系統上的應用
[系列活動] 人工智慧與機器學習在推薦系統上的應用
 
BDACA1516s2 - Lecture5
BDACA1516s2 - Lecture5BDACA1516s2 - Lecture5
BDACA1516s2 - Lecture5
 
Quick tour all handout
Quick tour all handoutQuick tour all handout
Quick tour all handout
 
Querylog-based Assessment of Retrievability Bias in a Large Newspaper Corpus
Querylog-based Assessment of Retrievability Bias in a  Large Newspaper CorpusQuerylog-based Assessment of Retrievability Bias in a  Large Newspaper Corpus
Querylog-based Assessment of Retrievability Bias in a Large Newspaper Corpus
 
Braintalk cuso nm
Braintalk cuso nmBraintalk cuso nm
Braintalk cuso nm
 
BDACA1516s2 - Lecture7
BDACA1516s2 - Lecture7BDACA1516s2 - Lecture7
BDACA1516s2 - Lecture7
 
Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...Exploiting temporal information in retrieval of archived documents (doctoral ...
Exploiting temporal information in retrieval of archived documents (doctoral ...
 
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCES
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCESFINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCES
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCES
 
How to Build a Semantic Search System
How to Build a Semantic Search SystemHow to Build a Semantic Search System
How to Build a Semantic Search System
 
BDACA1516s2 - Lecture2
BDACA1516s2 - Lecture2BDACA1516s2 - Lecture2
BDACA1516s2 - Lecture2
 
BDACA - Lecture7
BDACA - Lecture7BDACA - Lecture7
BDACA - Lecture7
 
LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013LiveLinkedData - TransWebData - Nantes 2013
LiveLinkedData - TransWebData - Nantes 2013
 
Pandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data SciencePandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data Science
 

Similar to Presentation1.pdf

Mid-term presentation.pdf
Mid-term presentation.pdfMid-term presentation.pdf
Mid-term presentation.pdfZixunZhou
 
ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)Piet J.H. Daas
 
Implementing Archivematica, research data network
Implementing Archivematica, research data networkImplementing Archivematica, research data network
Implementing Archivematica, research data networkJisc RDM
 
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfData+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfneelakandan2001kpm
 
Big Data Certification
Big Data CertificationBig Data Certification
Big Data CertificationExperfy
 
Data science.chapter-1,2,3
Data science.chapter-1,2,3Data science.chapter-1,2,3
Data science.chapter-1,2,3varshakumar21
 
Data management plans
Data management plansData management plans
Data management plansBrad Houston
 
DATA CAPTURING TRAINING_FINAL.pptx
DATA CAPTURING TRAINING_FINAL.pptxDATA CAPTURING TRAINING_FINAL.pptx
DATA CAPTURING TRAINING_FINAL.pptxscokoye
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Thinkful
 
Responsible conduct of research: Data Management
Responsible conduct of research: Data ManagementResponsible conduct of research: Data Management
Responsible conduct of research: Data ManagementC. Tobin Magle
 
Getting Started in Data Science
Getting Started in Data ScienceGetting Started in Data Science
Getting Started in Data ScienceThinkful
 
Data management plans
Data management plansData management plans
Data management plansBrad Houston
 
Department of Commerce App Challenge: Big Data Dashboards
Department of Commerce App Challenge: Big Data DashboardsDepartment of Commerce App Challenge: Big Data Dashboards
Department of Commerce App Challenge: Big Data DashboardsBrand Niemann
 
Citi Global T4I Accelerator Data and Analytics Presentation
Citi Global T4I Accelerator Data and Analytics PresentationCiti Global T4I Accelerator Data and Analytics Presentation
Citi Global T4I Accelerator Data and Analytics PresentationMarquis Cabrera
 
Pemanfaatan Big Data Dalam Riset 2023.pptx
Pemanfaatan Big Data Dalam Riset 2023.pptxPemanfaatan Big Data Dalam Riset 2023.pptx
Pemanfaatan Big Data Dalam Riset 2023.pptxelisarosa29
 
Data council sf amundsen presentation
Data council sf    amundsen presentationData council sf    amundsen presentation
Data council sf amundsen presentationTao Feng
 
Data management plans (dmp) for nsf
Data management plans (dmp) for nsfData management plans (dmp) for nsf
Data management plans (dmp) for nsfBrad Houston
 

Similar to Presentation1.pdf (20)

Mid-term presentation.pdf
Mid-term presentation.pdfMid-term presentation.pdf
Mid-term presentation.pdf
 
ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)ESSnet Big Data WP8 Methodology (+ Quality, +IT)
ESSnet Big Data WP8 Methodology (+ Quality, +IT)
 
Implementing Archivematica, research data network
Implementing Archivematica, research data networkImplementing Archivematica, research data network
Implementing Archivematica, research data network
 
Data science unit1
Data science unit1Data science unit1
Data science unit1
 
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfData+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
 
Big Data Certification
Big Data CertificationBig Data Certification
Big Data Certification
 
Data science.chapter-1,2,3
Data science.chapter-1,2,3Data science.chapter-1,2,3
Data science.chapter-1,2,3
 
Data management plans
Data management plansData management plans
Data management plans
 
DATA CAPTURING TRAINING_FINAL.pptx
DATA CAPTURING TRAINING_FINAL.pptxDATA CAPTURING TRAINING_FINAL.pptx
DATA CAPTURING TRAINING_FINAL.pptx
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)
 
I2DS Project.pdf
I2DS Project.pdfI2DS Project.pdf
I2DS Project.pdf
 
Responsible conduct of research: Data Management
Responsible conduct of research: Data ManagementResponsible conduct of research: Data Management
Responsible conduct of research: Data Management
 
Getting Started in Data Science
Getting Started in Data ScienceGetting Started in Data Science
Getting Started in Data Science
 
Martin Rasmussen: Ensuring availability and quality of research data through ...
Martin Rasmussen: Ensuring availability and quality of research data through ...Martin Rasmussen: Ensuring availability and quality of research data through ...
Martin Rasmussen: Ensuring availability and quality of research data through ...
 
Data management plans
Data management plansData management plans
Data management plans
 
Department of Commerce App Challenge: Big Data Dashboards
Department of Commerce App Challenge: Big Data DashboardsDepartment of Commerce App Challenge: Big Data Dashboards
Department of Commerce App Challenge: Big Data Dashboards
 
Citi Global T4I Accelerator Data and Analytics Presentation
Citi Global T4I Accelerator Data and Analytics PresentationCiti Global T4I Accelerator Data and Analytics Presentation
Citi Global T4I Accelerator Data and Analytics Presentation
 
Pemanfaatan Big Data Dalam Riset 2023.pptx
Pemanfaatan Big Data Dalam Riset 2023.pptxPemanfaatan Big Data Dalam Riset 2023.pptx
Pemanfaatan Big Data Dalam Riset 2023.pptx
 
Data council sf amundsen presentation
Data council sf    amundsen presentationData council sf    amundsen presentation
Data council sf amundsen presentation
 
Data management plans (dmp) for nsf
Data management plans (dmp) for nsfData management plans (dmp) for nsf
Data management plans (dmp) for nsf
 

More from ZixunZhou

Weekly Meeting 8.pdf
Weekly Meeting 8.pdfWeekly Meeting 8.pdf
Weekly Meeting 8.pdfZixunZhou
 
Weekly Meeting 7.pdf
Weekly Meeting 7.pdfWeekly Meeting 7.pdf
Weekly Meeting 7.pdfZixunZhou
 
Weekly Meeting 6.pdf
Weekly Meeting 6.pdfWeekly Meeting 6.pdf
Weekly Meeting 6.pdfZixunZhou
 
Weekly Meeting 4.pdf
Weekly Meeting 4.pdfWeekly Meeting 4.pdf
Weekly Meeting 4.pdfZixunZhou
 
Weekly Meeting 3.pdf
Weekly Meeting 3.pdfWeekly Meeting 3.pdf
Weekly Meeting 3.pdfZixunZhou
 
Weekly Meeting 2.pdf
Weekly Meeting 2.pdfWeekly Meeting 2.pdf
Weekly Meeting 2.pdfZixunZhou
 
Dashboard Design.pptx
Dashboard Design.pptxDashboard Design.pptx
Dashboard Design.pptxZixunZhou
 

More from ZixunZhou (7)

Weekly Meeting 8.pdf
Weekly Meeting 8.pdfWeekly Meeting 8.pdf
Weekly Meeting 8.pdf
 
Weekly Meeting 7.pdf
Weekly Meeting 7.pdfWeekly Meeting 7.pdf
Weekly Meeting 7.pdf
 
Weekly Meeting 6.pdf
Weekly Meeting 6.pdfWeekly Meeting 6.pdf
Weekly Meeting 6.pdf
 
Weekly Meeting 4.pdf
Weekly Meeting 4.pdfWeekly Meeting 4.pdf
Weekly Meeting 4.pdf
 
Weekly Meeting 3.pdf
Weekly Meeting 3.pdfWeekly Meeting 3.pdf
Weekly Meeting 3.pdf
 
Weekly Meeting 2.pdf
Weekly Meeting 2.pdfWeekly Meeting 2.pdf
Weekly Meeting 2.pdf
 
Dashboard Design.pptx
Dashboard Design.pptxDashboard Design.pptx
Dashboard Design.pptx
 

Recently uploaded

Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 

Recently uploaded (20)

Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 

Presentation1.pdf

  • 1. Slide 1 -- very brief introduction to your project. (Note, this is to help your classmates refresh their memory about your project, which should be very short, one or two sentences highlight) Jerry Slide 2 -- all research methods you have used to complete this project(for each method, use justone sentence to justify why it's necessary to adopt this method) Slide 3 -- all data you havecollected (a list of types of the data, including number of the email correspondence, number of the interviews, pages of documents you reviewed, etc.) {平台,数据仓库(github、kaggle)和canvasoet里提供的数据 (covid,ukrane,伊朗),调查平台数据是否可 用}(Ziyan),爬数据方法(jerry) 清理数据 (Jerry),数据可视化分析(SBS(lingyu),power bi()) Slide 4 ~ x -- final productdesign process (This should be the focus, tell us how your interaction with the sponsors, users, etc. informed your design thinking, and how you came up with the design ideas) 每周任务,不过也可以放我们的dashboard设计,如何沟通 (lingyu) Slide x+1 ~ y -- how your final productlooks like? (Note, a list of screen shots would be helpful, or a live demo but make sure your designed websiteworks properly. You only have about 20 minutes in total, so don't wastetime on searching, finding, or fixing websitepages) 介绍wireframe(Jerry) Slide y+1 -- Take away points (Whatyou havelearned from doing this project, shareyour valuable experiences)(ziyan) Final slide -- if you are given an opportunity to re-do the project, whatmay you change??? (Jerry)
  • 2. ITC 6040 Capstone Final Report Strategies for Identifying Mis-/Disinformation Team 2: Ouyang Zhaode, Lingyu Hu, Ziyan Yan, Zixun Zhou
  • 4. Our Sponsors: Mikhail Oet, PhD, Professor in Commerce and Economic Development (CED) program, Northeastern University
      Mission: To get the right information to the right people at the right time
      What Are We Doing? Helping Identify Fake News
      How Do We Identify It?
      Research • Platform Research • Data Repositories Research • Data Scraping Methods Research • U.S., China, Russian Research
      Data Analysis • Data Cleaning • Sentiment Analysis • Word Semantic Analysis
      Data Visualization • Dashboard Design
  • 6. Research Method: Qualitative Research 1. Identify Research Questions: • How to collect data? • How to use a data repository? • How to analyze a dataset? 2. Case Study 3. Research Report
  • 7. Research Method: Quantitative Research Specific method: Data Analysis. Analyze objective statistical data: • Sentiment score • Information release time • Amount of information • Location
  • 9. Data Source (1)
      GitHub ❑ Code-hosting platform where fake-news research results are also published ❑ We did not use those results, but we did use the dataset ❑ Found by keyword search ❑ Dataset sourced from Weibo, covering false information about COVID-19
      Kaggle ❖ Dataset website owned by Google ❖ Offers datasets on scientific topics ❖ Provides data on fake news about COVID-19
  • 10. Data Source (2)
      01 COVID-19: Source: Weibo, Twitter. Topic: Misinformation about COVID-19
      02 Ukraine: Event Registry Ukraine-English Dataset; Event Registry Ukraine-Russian Dataset; from Twitter
      03 Iran: Iran-themed data will come from tweets retrieved by keyword search
  • 11.
  • 12. Feasibility of Data 01 Highly Feasible 02 Diverse 03 Visualize 04 Reliable
  • 13. Data We Collected
      Qualitative, primary: Information gathered from the guest speakers and stakeholders (professors, sponsors, and other teams)
      Qualitative, secondary: Articles and reports we read
      Quantitative, primary: Data we scraped from social media and news websites
      Quantitative, secondary: Data our sponsor provided and data repositories we found
      Count: 3 guest sessions, 10+ Zoom recordings, 15+ meetings, 30+ emails, 30+ articles, reports, and videos, 3 experimental web scrapings, 5+ data repositories found
  • 14. Data Scraping Methods Research
      Method: RSS feed to CSV (online converting tools). Use cases: websites providing RSS feeds. Difficulty: Low
      Method: Data collectors (Octopus, BrightData). Use cases: popular social media. Difficulty: Medium
      Method: Web scraping (Python). Use cases: static websites. Difficulty: High
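The RSS-feed-to-CSV route can also be reproduced in a few lines of Python instead of an online converter. The following is a minimal sketch, not the team's actual tooling: the sample feed, the `FIELDS` list, and the function names `rss_to_rows` and `rows_to_csv` are all hypothetical, and a real feed would be downloaded from a news site rather than embedded as a string.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample feed standing in for a downloaded RSS document.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First headline</title>
      <link>https://example.com/a</link>
      <pubDate>Mon, 07 Mar 2022 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second headline</title>
      <link>https://example.com/b</link>
      <pubDate>Tue, 08 Mar 2022 12:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

# Assumed set of columns to keep; real feeds may carry more fields.
FIELDS = ["title", "link", "pubDate"]


def rss_to_rows(rss_text):
    """Parse RSS 2.0 text and return one dict per <item> element."""
    root = ET.fromstring(rss_text)
    rows = []
    for item in root.iter("item"):
        # findtext returns None when a child element is missing.
        rows.append({f: (item.findtext(f) or "").strip() for f in FIELDS})
    return rows


def rows_to_csv(rows):
    """Serialize the row dicts to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = rss_to_rows(SAMPLE_RSS)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Because `pubDate` values contain a comma, the `csv` module quotes them automatically, which is one reason to prefer it over joining strings by hand.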
  • 15. Part 4 Data Cleaning & Analysis
  • 16. Data Cleaning
      MS Excel Power Query • For CSV format • Easy to use • For ad hoc analysis (one-time use) * Limitations: • Data should be fewer than 1 million rows • Data should be smaller than 1 GB
      Python • For JSON or other data formats • Can cut a large dataset into many smaller files • Cleaning at scale • For data pipeline use (continuous data streaming) * No limitation, but takes more time and effort
      We can use Google Sheets to do batch translation
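The point that Python can cut a large dataset into many smaller files can be illustrated with a short sketch. It assumes the records are already loaded as a list of dictionaries; the function name `split_records`, the `part_NNNN.json` naming scheme, and the sample records are hypothetical and only serve the illustration.

```python
import json
import tempfile
from pathlib import Path


def split_records(records, chunk_size, out_dir):
    """Write records to numbered JSON files of at most chunk_size records each."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(records), chunk_size):
        chunk = records[start:start + chunk_size]
        # Hypothetical naming scheme: part_0000.json, part_0001.json, ...
        path = out_dir / f"part_{start // chunk_size:04d}.json"
        with path.open("w", encoding="utf-8") as handle:
            # ensure_ascii=False keeps Chinese or Russian text readable.
            json.dump(chunk, handle, ensure_ascii=False)
        paths.append(path)
    return paths


# Hypothetical records standing in for a large scraped dataset.
records = [{"id": i, "text": f"post {i}"} for i in range(10)]
out_dir = tempfile.mkdtemp()
paths = split_records(records, chunk_size=4, out_dir=out_dir)
sizes = [len(json.loads(p.read_text(encoding="utf-8"))) for p in paths]
print([p.name for p in paths], sizes)
```

For a file too large to load at once, the same idea would need a streaming reader (for example, reading a JSON Lines file line by line) rather than an in-memory list.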
  • 17. Power BI 1. Data visualization 2. Data query 3. Data Modeling 4. Key data analysis
  • 20. Ukraine – Russian SBS Analysis • All text content is in Russian • All records are news items • “Russian” and “Ukraine” appeared most often in the dataset • Specific words do not appear very often
  • 21. Ukraine – Russian SBS Analysis • Ukraine, Russian, and Putin care most about Topic 1 • NATO, Russian, and USA care most about Topic 2 • Xi and Biden care most about Topic 5
  • 22. Ukraine – Russian SBS Analysis • T6 has a strong relationship with T2 • T5 has the second-strongest relationship with T2
      Conclusions: • Ukraine, Russian, and Putin care most about the Winter Olympics • Russian, NATO, and USA care about potential military activities • Xi and Biden care about relationships with other countries • Covid has a stronger relationship with potential military activities • Relationships between countries could influence potential military activities
  • 23. Twitter Transparency Project Power BI Analysis • “Hanya Kamu” means “Only you” • Most hashtags are meaningless • Tweet counts in 2012 • Most tweets appeared in June and August • The trend is unstable
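The tweets-per-month view from the Power BI analysis can be approximated in Python for a quick cross-check. This is only a sketch of the idea, not the team's Power BI model: the timestamps below are hypothetical sample values, and the timestamp format and the function name `tweets_per_month` are assumptions.

```python
from collections import Counter
from datetime import datetime

# Hypothetical timestamps; a real export would supply one per tweet.
tweet_times = [
    "2012-06-03 14:20", "2012-06-17 09:05", "2012-06-29 22:41",
    "2012-08-02 11:00", "2012-08-19 18:30", "2012-03-11 07:15",
]


def tweets_per_month(timestamps, fmt="%Y-%m-%d %H:%M"):
    """Count tweets per calendar month, keyed as 'YYYY-MM'."""
    counts = Counter()
    for ts in timestamps:
        month_key = datetime.strptime(ts, fmt).strftime("%Y-%m")
        counts[month_key] += 1
    return counts


monthly = tweets_per_month(tweet_times)
for month, n in sorted(monthly.items()):
    print(month, n)
```

Sorting the keys works chronologically because the `YYYY-MM` format orders the same way as the dates themselves.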
  • 25. User Input; External Data - Ravenpack; News Articles; Dashboard: Fake Score, Sentiment; polyfact; ProPublica; Local Check Based on Historical Data; External Data - GDI; Contribution by Country
  • 26. Dashboard - Data Prerequisite News Article Dataset External Data
  • 28. Take Away (1) Data visualization
  • 29. Take Away (2) 01 Communication 02 Diversity 03 Identifying Theme
  • 31. Data Collection Plan 1. Develop a Data Collection Plan 2. Us
      1. What are we going to solve? e.g., a list of issues 2. What counts as success? e.g., a Service Level Agreement (SLA) 3. What data is available? 4. What form does that data come in? 5. Where will the data be collected from? 6. Should we measure a sample or the whole population? 7. In what format will the data be displayed?
      We Did / We Missed
  • 35. Week 1 – 2: Platform Exploration. Explore platforms from which data can be crawled: platforms in Russian, English, and Chinese. If possible, crawl the data by learning new tools.
  • 36. Week 3 – 4: Learning New Tools 1. Clean and filter the data. 2. Learn new tools: SBS (and its format), Power BI, sentiment analysis. 3. Provide some basic findings.
  • 37. Week 5 – 6: Data Visualization • Use data visualization tools to analyze data • Determine the final tools: SBS, Power BI • Provide some findings
  • 38. Week 7 – 8: Dashboard Learning and Design • Keep using SBS to analyze the data • Design the dashboard by learning from Ravenpack • Combine and provide a sample dashboard design (figure: sample designed dashboard)
  • 39. Process to Final Product 1. Ask Questions 2. Create 3. Feedback 4. Research 5. Revise 6. Feedback 7. Continue Revising
  • 40. Cloud Architecture. Data Sources: Historical Data (Twitter Transparency Datasets, Weibo Datasets, Event Registry Datasets); Live Data (Web Scraping, Twitter API). Process. Storage. Consume: Data Analysis & Query, Machine Learning.
  • 42. Cloud Architecture. Data Sources: Historical Data (Twitter Transparency Datasets, Weibo Datasets, Event Registry Datasets); Live Data (Web Scraping, Twitter API). Process. Storage. Consume: Data Analysis & Query, Machine Learning.
  • 43. Process Storage Data Analysis & Query Machine Learning A Database of Metadata
  • 44. Machine Learning • Existing fake news prediction model • Can train on non-relational data (video, image, audio) • Low-code
  • 45. Data Sources: Historical Data (Twitter Transparency Datasets, Weibo Datasets, Event Registry Datasets); Live Data (Web Scraping, Twitter API). Transfer. Process: Cleaning; Clean & Transform. Machine Learning: Fake Score, Sentiment. Consume: Data Visualization.