DATA
JOURNALISM
Challenges and Case Studies
by Miodrag Marković
ABOUT ME
Data Journalist. Graduated in Journalism and
Communication. Self-taught coder with
programming skills in Python and JavaScript,
required for data mining and data visualisation.
ABOUT BIRN SRBIJA
Small independent non-profit newsroom
oriented to investigative journalism.
DATA JOURNALISM
THE CONTEXT
Journalism + small set of tools used in data and computer
science
Data Journalists - Journalists with an additional set of technical
skills
The Purpose - Find the story in data
TOOLS WE USE
(MY EXPERIENCE, BY
INVESTED TIME)
FLOURISH - DATA VISUALISATION
PYTHON - DATA MINING
SOMETHING ELSE
30% 60% 10%
PYTHON FOR DATA MINING
JUPYTER NOTEBOOK
(GEO)PANDAS
BEAUTIFUL SOUP,
SELENIUM...
Key frameworks and libraries for data mining
FLOURISH FOR DATA VISUALISATION
“Flourish was to enable
everyone to tell stories with
data. Launched in 2018, the
tool is used by a huge
community of creators”.
website: https://flourish.studio
PREPARE DATA SET
CHOOSE TEMPLATE
EMBED ON WEBSITE
DATA JOURNALISM WORKFLOW
1. Find the data
Find the potential sources of data
that can be useful
2. Get the data
Find a way to get the data. The options are
a simple download (rare), a request via
a REST API, or web scraping
3. Clean the data
Bring the data to a state that is
useful for further data analysis
4. Data Analysis
Find the story in data
5. Data Visualisation
Find the most appropriate charts to
represent findings
6. Tell the story with the data
Find the best ways and tools for
data storytelling
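As an illustration of the "Clean the data" step, here is a minimal sketch in Python using only the standard library. The field names, the sample rows and the number format (dot as thousands separator, comma as decimal mark) are assumptions made for this example, not details taken from the talk.

```python
import re

def clean_name(raw: str) -> str:
    """Collapse repeated whitespace and upper-case the name so variants match."""
    return re.sub(r"\s+", " ", raw).strip().upper()

def parse_amount(raw: str) -> float:
    """Parse '1.250.000,50' style amounts (assumed format) into a float."""
    return float(raw.strip().replace(".", "").replace(",", "."))

# Hypothetical raw rows, as they might arrive from a scraped page or a typed-up paper reply.
raw_rows = [
    {"company": "  Example  d.o.o. ", "debt": "1.250.000,50"},
    {"company": "EXAMPLE D.O.O.", "debt": "300.000,00"},
]

clean_rows = [
    {"company": clean_name(row["company"]), "debt": parse_amount(row["debt"])}
    for row in raw_rows
]

for row in clean_rows:
    print(row)
```

After cleaning, both rows carry the same company name, so they can be grouped or joined reliably in the analysis step.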
OK, THAT IS THEORY
HOW THINGS LOOK
IN REALITY
KEY
CHALLENGES
DATA SOURCES
Lack of sources of data produced by the government in
machine-readable formats and REST API services
Lack of sources useful for the purpose of investigative
journalism (mostly statistics)
Data on demand sent on paper
Some sources of data are not free
Institutions refuse to give requested data
ENGAGING DATA STORYTELLING
INTERESTING
UNDERSTANDABLE
LESS “DASHBOARDISH”
We produce stories for a broader audience,
so they should be told in a different and
simpler way than in the data
science approach. We are still working
towards a definitive answer.
WHAT COULD BE
BETTER
MORE FLEXIBLE CMS
Most news websites run on WordPress (PHP),
so they need a lot of tweaking for data
storytelling purposes
MORE AFFORDABLE DATA STORYTELLING PLATFORMS
There are specialized platforms for storytelling
in digital environments, but they are too
expensive for small newsrooms
TECHNICAL STAFF
It is hard to budget for dedicated
teams with skills in programming and web
development (from the perspective of small
newsrooms)
WHAT COULD BE
IMPROVED
GOVERNMENT
Open data platforms exist; however, the data should be more
systematically formatted
It would be nice to see more REST API services
Data on demand should be delivered in machine-readable
formats
Some data sources and services should be free of charge
for journalists
MEDIA INDUSTRY
Media stakeholders should be more aware of the potential of
data for journalism purposes
Newsrooms should have dedicated teams capable of finding
stories in data
Newsrooms should incorporate data storytelling platforms
inside of their content management systems (CMS) to
deliver more engaging stories for their audiences
EDUCATION
Universities should offer programs for data journalism
Newsrooms should give opportunities to journalists
interested in data journalism to learn skills
The IT and media industries could be more connected for the
purpose of delivering better data-driven journalism
CASE STUDIES
TAX FRAUD SCHEME
The story in short: An organized group of people takes ownership
of companies with debts. Example: You have a company with
huge debts. You pay me to become the new owner. How I resolve
the debts is up to me, and you do not need to worry. This “business
model” was applied in hundreds and hundreds of cases. As a
consequence, creditors couldn’t collect debts.
HOW WE COLLECTED DATA
METHODS:
1. WEB SCRAPING
2. DATA ON DEMAND
We had only basic pieces of information
about individuals, so we needed to get more data
from government data sources such as the National
Bank, Tax Administration and Trade Register.
WHAT WE DISCOVERED
After a lot of work on data cleaning, we had
some findings.
The total debt of all companies in this scheme (money that
the government and creditors lost).
Who the new fictitious owners are and how many companies are
registered in their names (in some cases more than 200 companies).
Where those companies are now (fictitious addresses in
residential areas, in some cases 300 companies per address).
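Findings of this kind come down to simple aggregations over a cleaned register. The sketch below uses Python's standard library and invented records; the field names, the values and the flagging threshold are hypothetical and only show the shape of the calculation, not the newsroom's actual data or code.

```python
from collections import Counter

# Hypothetical cleaned records: one row per company taken over in the scheme.
records = [
    {"company": "A", "owner": "Owner 1", "address": "Street 1", "debt": 100.0},
    {"company": "B", "owner": "Owner 1", "address": "Street 1", "debt": 50.0},
    {"company": "C", "owner": "Owner 1", "address": "Street 2", "debt": 25.0},
    {"company": "D", "owner": "Owner 2", "address": "Street 1", "debt": 10.0},
]

# Total debt across all companies in the data set.
total_debt = sum(r["debt"] for r in records)

# How many companies are registered to each owner and at each address.
owner_counts = Counter(r["owner"] for r in records)
address_counts = Counter(r["address"] for r in records)

# Flag owners with an unusually high number of companies (threshold is arbitrary here).
THRESHOLD = 3
flagged_owners = {o: n for o, n in owner_counts.items() if n >= THRESHOLD}

print(total_debt)
print(owner_counts.most_common())
print(address_counts.most_common())
print(flagged_owners)
```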
KILLING THE
COMPETITION
The story in short: How two private companies with strong political
connections became exclusive contractors of the government-owned
electric power company (EPS) over a period of 10 years
HOW WE COLLECTED DATA
METHODS:
WEB SCRAPING
The Public Procurement Office has data about
all public procurements (tenders). At that time,
the data were not downloadable, so we needed to
scrape them.
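To give a feel for what scraping a results table involves, here is a minimal sketch that turns an HTML table into a list of records. It uses Python's built-in `html.parser` so that it runs without extra packages; the deck names Beautiful Soup and Selenium as the tools actually used. The HTML snippet and column names are invented for illustration and do not reflect the real procurement portal.

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Hypothetical page fragment; a real scraper would download this first.
SAMPLE_HTML = """
<table>
  <tr><th>Tender</th><th>Winner</th><th>Bidders</th></tr>
  <tr><td>T-001</td><td>Company X</td><td>3</td></tr>
  <tr><td>T-002</td><td>Company Y</td><td>1</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)

# First row is the header; the rest become dictionaries keyed by column name.
header, *body = parser.rows
tenders = [dict(zip(header, row)) for row in body]
print(tenders)
```

All values come out as strings, so numeric columns such as the bidder count still need converting in the cleaning step.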
WHAT WE DISCOVERED
After a lot of work on data cleaning, we had
some findings.
These two companies drastically increased their income from
public procurements after the change of government.
The number of bidders on tenders where these companies won
contracts decreased. In the last few years, they were the only
bidders.
During this period, the average number of bidders for all EPS
public procurements decreased from 3.14 to 1.77.
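An average number of bidders per year can be computed from scraped tender records along these lines. This is a sketch with made-up rows: the years and bidder counts below are placeholders and are not the figures behind the 3.14 and 1.77 averages.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical tender records after scraping and cleaning.
tenders = [
    {"year": 2012, "bidders": 4},
    {"year": 2012, "bidders": 2},
    {"year": 2020, "bidders": 1},
    {"year": 2020, "bidders": 2},
    {"year": 2020, "bidders": 1},
]

# Group bidder counts by year.
by_year = defaultdict(list)
for tender in tenders:
    by_year[tender["year"]].append(tender["bidders"])

# Average number of bidders per tender, per year.
avg_bidders = {year: round(mean(counts), 2) for year, counts in sorted(by_year.items())}

# Share of tenders that attracted a single bidder, per year.
single_bid_share = {
    year: sum(1 for c in counts if c == 1) / len(counts)
    for year, counts in sorted(by_year.items())
}

print(avg_bidders)
print(single_bid_share)
```

The share of single-bidder tenders is a useful companion to the average, because a falling average alone does not show how often there was no competition at all.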
HIDING THE RIVER
The story in short: The number of floating objects (splav) on
Belgrade rivers increases year by year. We have all witnessed that;
however, nobody knows the exact figures. We tried to
calculate the increase in the area under floating objects and
in their number over a period of five years.
HOW WE COLLECTED DATA
METHODS:
DRAWING POLYGONS ON SATELLITE IMAGES, CALCULATING AREAS
Primary data sources were satellite images
taken in 2015 and in 2021.
WHAT WE DISCOVERED
After a lot of work on data cleaning and
transforming, we had some findings.
The total area under commercial floating objects on Belgrade
rivers increased by more than 25% from 2015 to 2021.
The total number of floating objects increased by more than
20% in the same period.
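The area of a hand-drawn polygon can be computed with the shoelace formula once its vertices are in a projected coordinate system measured in metres. The sketch below assumes such coordinates and uses invented rectangles; it is one possible way to do the calculation, not necessarily the method or tooling used for the story, and the resulting percentages are not the published ones.

```python
def polygon_area(points):
    """Area of a simple polygon from its (x, y) vertices via the shoelace formula."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) * 100.0 / old

# Hypothetical outlines in metres (projected coordinates, not latitude/longitude).
objects_2015 = [
    [(0, 0), (20, 0), (20, 10), (0, 10)],
]
objects_2021 = [
    [(0, 0), (20, 0), (20, 10), (0, 10)],
    [(30, 0), (40, 0), (40, 6), (30, 6)],
]

area_2015 = sum(polygon_area(p) for p in objects_2015)
area_2021 = sum(polygon_area(p) for p in objects_2021)

print(area_2015, area_2021)
print(percent_change(area_2015, area_2021))
print(percent_change(len(objects_2015), len(objects_2021)))
```

Polygons drawn directly in latitude and longitude would need to be reprojected first, since the shoelace formula treats coordinates as planar distances.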
THANK YOU

[DSC Europe 23] Miodrag Markovic - Data Journalism.pdf

  • 1. DATA JOURNALISM Challenges and Case Studies by Miodrag Marković
  • 2. ABOUT ME: Data journalist. Graduated in Journalism and Communication. Self-taught coder with programming skills in Python and JavaScript, required for data mining and data visualisation. ABOUT BIRN SRBIJA: Small independent non-profit newsroom oriented to investigative journalism.
  • 3. DATA JOURNALISM: THE CONTEXT. Journalism + a small set of tools used in data and computer science. Data journalists: journalists with an additional set of technical skills. The purpose: find the story in data.
  • 4. TOOLS WE USE (MY EXPERIENCE, BY INVESTED TIME). Flourish (data visualisation): 30%. Python (data mining): 60%. Something else: 10%.
  • 5. PYTHON FOR DATA MINING. Key frameworks and libraries for data mining: Jupyter Notebook, (Geo)pandas, Beautiful Soup, Selenium...
  • 6. FLOURISH FOR DATA VISUALISATION. “Flourish was to enable everyone to tell stories with data. Launched in 2018, the tool is used by a huge community of creators”. (Website: https://flourish.studio) Steps: prepare data set, choose template, embed on website.
  • 7. DATA JOURNALISM WORKFLOW. (1) Find the data: find the potential sources of data that can be useful. (2) Get the data: find a way to get the data, either by simple download (rare cases), by request via a REST API, or by web scraping. (3) Clean the data: bring the data to a state that is useful for further analysis. (4) Data analysis: find the story in the data. (5) Data visualisation: find the most appropriate charts to represent the findings. (6) Tell the story with the data: find the best ways and tools for data storytelling.
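The "get the data" and "clean the data" steps of the workflow above can be illustrated with a minimal sketch. It is not the author's code: the deck names Beautiful Soup and Selenium, while this example swaps in Python's standard-library `html.parser` so that it runs without extra packages. The HTML fragment, the column names and the `1.250.000,50 RSD` number format are assumptions made up for illustration.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row being built, or None outside a <tr>
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def clean_amount(text):
    """Turn a string such as '1.250.000,50 RSD' into a float (assumed format)."""
    digits = text.replace("RSD", "").strip().replace(".", "").replace(",", ".")
    return float(digits)


# Hypothetical page fragment, standing in for a downloaded HTML page.
SAMPLE_HTML = """
<table>
  <tr><th>Company</th><th>Debt</th></tr>
  <tr><td> Example d.o.o. </td><td>1.250.000,50 RSD</td></tr>
  <tr><td>Sample a.d.</td><td>300.000,00 RSD</td></tr>
</table>
"""

parser = TableRows()
parser.feed(SAMPLE_HTML)
header, *body = parser.rows
records = [{"company": name, "debt": clean_amount(debt)} for name, debt in body]
```

On a real site the HTML would come from an HTTP request or a browser driver, and the cleaning rules would depend on how that source formats its numbers.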
  • 8. OK, THAT IS THEORY. HOW THINGS LOOK IN REALITY
  • 10. DATA SOURCES. Lack of government data sources in machine-readable formats and as REST API services. Lack of sources useful for investigative journalism (mostly statistics). Data on demand is sent on paper. Some data sources are not free. Institutions refuse to provide requested data.
  • 11. ENGAGING DATA STORYTELLING: interesting, understandable, less “dashboardish”. We produce stories for a broader audience, so they should be told in a different and simpler way than in the data science approach. A definite answer is still in progress.
  • 13. MORE FLEXIBLE CMS: News websites mostly run on WordPress (PHP), so they need a lot of tweaking for data storytelling purposes. MORE AFFORDABLE DATA STORYTELLING PLATFORMS: There are specialized platforms for storytelling in digital environments, but they are too expensive for small newsrooms. TECHNICAL STAFF: It is hard to budget for dedicated teams with programming and web development skills (from the perspective of small newsrooms).
  • 15. GOVERNMENT. Open data platforms exist; however, the data should be formatted more systematically. It would be nice to see more REST API services. Data on demand should be delivered in machine-readable formats. Some data sources and services should be free of charge for journalists.
  • 16. MEDIA INDUSTRY. Media stakeholders should be more aware of the potential of data for journalism. Newsrooms should have dedicated teams capable of finding stories in data. Newsrooms should incorporate data storytelling platforms into their content management systems (CMS) to deliver more engaging stories to their audiences.
  • 17. EDUCATION. Universities should offer programs in data journalism. Newsrooms should give journalists interested in data journalism opportunities to learn the skills. The IT and media industries could be better connected in order to deliver better data-driven journalism.
  • 19. TAX FRAUD SCHEME. The story in short: An organized group of people takes ownership of companies with debts. Example: You have a company with huge debts. You pay me to be the new owner. How I resolve that is up to me, and you do not need to worry. This “business model” was applied in hundreds and hundreds of cases. As a consequence, creditors couldn’t collect their debts.
  • 20. HOW WE COLLECTED DATA. METHODS: (1) web scraping, (2) data on demand. We had only basic pieces of information about individuals, so we needed to get more data from government sources such as the National Bank, the Tax Administration and the Trade Register.
  • 21. WHAT WE DISCOVERED. After a lot of work on data cleaning, we had some findings. The total debt of all companies in this scheme (money that the government and creditors lost). Who the new fictitious owners are and how many companies are registered in their names (in some cases more than 200 companies). Where those companies are now (fictitious addresses in residential areas, in some cases 300 companies per address).
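The three findings listed above (total debt, companies per owner, companies per address) come down to a sum and two group counts. A minimal sketch of that aggregation follows, using only the standard library. The records and field names are invented for illustration and are not the investigation's data; with pandas, which the deck mentions, the same counts could be obtained with a group-by.

```python
from collections import Counter

# Hypothetical, made-up records; field names are assumptions.
companies = [
    {"name": "Company A", "owner": "Owner 1", "address": "Street 1", "debt": 100.0},
    {"name": "Company B", "owner": "Owner 1", "address": "Street 1", "debt": 250.0},
    {"name": "Company C", "owner": "Owner 2", "address": "Street 1", "debt": 50.0},
    {"name": "Company D", "owner": "Owner 1", "address": "Street 2", "debt": 0.0},
]

# Total debt across all companies in the data set.
total_debt = sum(c["debt"] for c in companies)

# How many companies are registered to each owner and at each address.
per_owner = Counter(c["owner"] for c in companies)
per_address = Counter(c["address"] for c in companies)

# The owner and the address with the most registered companies.
top_owner, top_owner_count = per_owner.most_common(1)[0]
top_address, top_address_count = per_address.most_common(1)[0]
```

Counts like these only become reliable after cleaning, because the same owner or address is often spelled in several ways in the source registers.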
  • 22. KILLING THE COMPETITION. The story in short: How two private companies with strong political connections became the exclusive contractors of the government-owned electric power company (EPS) over a period of 10 years.
  • 23. HOW WE COLLECTED DATA. METHOD: web scraping. The Public Procurement Office has data on all public procurements (tenders). At that time, the data were not downloadable, so we needed to scrape them.
  • 24. WHAT WE DISCOVERED. After a lot of work on data cleaning, we had some findings. These two companies drastically increased their income from public procurement after the change of government. The number of bidders in tenders where these companies won contracts decreased. In the last few years, they were the only bidders. During this period, the average number of bidders across all EPS public procurements decreased from 3.14 to 1.77.
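A figure such as the average number of bidders per tender can be computed by grouping tenders by year and taking the mean. The sketch below shows one way to do that with the standard library. The years and bidder counts are invented for illustration, so the results do not reproduce the 3.14 and 1.77 reported above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical tenders: one record per tender with its number of bidders.
tenders = [
    {"year": 2012, "bidders": 4},
    {"year": 2012, "bidders": 2},
    {"year": 2020, "bidders": 1},
    {"year": 2020, "bidders": 2},
    {"year": 2020, "bidders": 1},
]

# Group the bidder counts by year.
by_year = defaultdict(list)
for tender in tenders:
    by_year[tender["year"]].append(tender["bidders"])

# Average number of bidders per tender, per year.
avg_bidders = {
    year: round(mean(counts), 2) for year, counts in sorted(by_year.items())
}

# Share of tenders with a single bidder, per year.
single_bid_share = {
    year: sum(1 for n in counts if n == 1) / len(counts)
    for year, counts in sorted(by_year.items())
}
```

The single-bidder share is included because it is a direct measure of the "only bidders" finding, which an average alone can hide.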
  • 25. HIDING THE RIVER. The story in short: The number of floating objects (splav) on Belgrade rivers increases year by year. We have all witnessed that; however, nobody knows the exact figures. We tried to calculate the increase in the number of floating objects and in the area they cover over a period of five years.
  • 26. HOW WE COLLECTED DATA. METHODS: drawing polygons on satellite images, calculating areas. The primary data sources were satellite images taken in 2015 and in 2021.
  • 27. WHAT WE DISCOVERED. After a lot of work on data cleaning and transforming, we had some findings. The total area under commercial floating objects on Belgrade rivers increased by more than 25% from 2015 to 2021. The total number of floating objects increased by more than 20% in the same period.
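The polygon method described above can be sketched with the shoelace formula, which gives the area of a polygon from its ordered vertices. This is an illustration, not the author's code. It assumes the vertices are already in a projected coordinate system measured in metres; polygons traced in latitude and longitude would need to be projected first, for example with GeoPandas. The two sets of polygons are invented, so the percentages they produce are not the reported findings.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    points: list of (x, y) vertices in drawing order, in metric coordinates.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) * 100.0 / old


# Hypothetical polygons traced on the two sets of images (metres).
objects_2015 = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],
]
objects_2021 = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    [(20, 0), (25, 0), (25, 6), (20, 6)],
]

area_2015 = sum(polygon_area(p) for p in objects_2015)
area_2021 = sum(polygon_area(p) for p in objects_2021)

area_growth = percent_change(area_2015, area_2021)
count_growth = percent_change(len(objects_2015), len(objects_2021))
```

The formula works for any simple (non-self-intersecting) polygon, which is why the vertices must be kept in the order they were drawn.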