EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
Where to Find Data Sets
Academic Torrents
Link: http://academictorrents.com/browse.php?cat=6
Description: Shares data sets pulled from scientific papers. These data sets can be hard to put
into context without reading the original paper.
Awesome-Public-Datasets on Github
Link: https://github.com/awesomedata/awesome-public-datasets
Description: A collection of public datasets sorted by topic with links to the hosting website.
Bureau of Economic Analysis
Link: https://www.bea.gov/
Description: Economic datasets, such as GDP, personal income, international transactions, etc.
CDC Data & Statistics
Link: https://www.cdc.gov/datastatistics/index.html
Description: The CDC compiles data sets by different types of health-related topics.
Child Care & Early Education Research Connections
Link: https://www.researchconnections.org/childcare/datasets-instruments.jsp
Description: Datasets on child care and early education that allow you to compare variables.
Child & Family Data Archive
Link: https://www.childandfamilydataarchive.org/cfda/pages/cfda/index.html
Description: Datasets on children, families, and their communities.
The Data And Story Library (DASL)
Link: https://dasl.datadescription.com/
Description: A library of datasets that allows you to search by keyword or by statistical method.
Data Hub
Link: https://datahub.io/
Description: A large consolidation of datasets on a large range of topics.
Data.gov
Link: https://www.data.gov/
Description: A U.S. government website with access to public government information on all
levels, from federal to local. The datasets are machine-readable.
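Machine-readable here means a program can parse the file directly, without manual cleanup. A
minimal sketch in Python, assuming a dataset has been downloaded as CSV. The column names and
rows below are made-up placeholders, not real Data.gov data:

```python
import csv
import io

# Placeholder CSV text standing in for a downloaded file (not real data).
SAMPLE_CSV = "city,year,value\nSpringfield,2020,10\nSpringfield,2021,12\n"

def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))

rows = load_rows(SAMPLE_CSV)
# csv returns every field as a string, so convert before doing arithmetic.
total = sum(int(row["value"]) for row in rows)
```

For a real file, the same function could be fed the contents of the downloaded CSV instead of
the sample string.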
Data.gov.uk
Link: https://data.gov.uk/
Description: Makes non-personal UK government data open. It contains over 30,000 data sets
from the UK government.
Data.world
Link: https://data.world/
Description: Has a broad variety of datasets. You can also upload your own data to collaborate
with other users to analyze and share insights.
Earthdata
Link: https://earthdata.nasa.gov/
Description: The public can access NASA’s data through Earthdata. Different categories of data
sets include atmosphere, solar radiance, cryosphere, human dimensions, land, and the ocean.
EU Open Data Portal
Link: https://data.europa.eu/euodp/en/data/
Description: It contains all the open data from EU institutions.
Eurostat
Link: https://ec.europa.eu/eurostat/
Description: Datasets on a wide range of topics about the EU. Brexit affects some datasets;
details are available through the Eurostat site.
FBI Crime Data
Link: https://www.fbi.gov/services/cjis/ucr
Description: The FBI compiles lists about different types of crimes across different years.
Datasets can be broken up into categories based on the type of crime.
Fiscally Standardized Cities Database: Lincoln Institute of Land Policy
Link: https://www.lincolninst.edu/research-data/data-toolkits/fiscally-standardized-cities
Description: Database that allows you to create custom tables to compare data between cities.
There is also an option to download the entire dataset without creating the custom table.
FiveThirtyEight
Link: https://data.fivethirtyeight.com/
Description: Geared towards opinion polls, politics, economics, and sports. It was originally
created to aggregate poll data and is named after the 538 electors in the Electoral College.
Gapminder
Link: https://www.gapminder.org/data/
Description: Compiles world medical, social, and economic data. Draws on other sources in this
list, such as the World Health Organization and the World Bank.
General Social Survey
Link: http://gss.norc.org/
Description: A survey to gain insight into how Americans feel about government issues and
decisions.
The Global Health Observatory
Link: https://www.who.int/data/gho
Description: Datasets about universal health coverage, health emergencies, and health and
well-being.
Google Public Data Explorer
Link: https://www.google.com/publicdata/directory
Description: Datasets about human and world development and how they relate to economic
data.
Google’s Dataset Search
Link: https://datasetsearch.research.google.com/
Description: A search engine to find datasets using keywords.
Google Public Datasets
Link: https://cloud.google.com/bigquery/public-data/
Description: Datasets hosted on the Google Cloud Platform (GCP) that can be queried through
BigQuery. The first 1 TB of data processed by queries each month is free.
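The free allowance is easiest to reason about as a byte budget. A rough sketch in Python,
assuming the allowance is counted in bytes scanned by queries and that 1 TB means 10**12 bytes;
the function name and the query sizes are hypothetical:

```python
FREE_BYTES = 10**12  # assumption: 1 TB counted as 10**12 bytes

def free_quota_left(bytes_scanned_per_query):
    """Return the bytes of free allowance left after the given queries (never below 0)."""
    used = sum(bytes_scanned_per_query)
    return max(FREE_BYTES - used, 0)

# Hypothetical usage: three queries scanning 200 GB, 350 GB, and 50 GB.
scans = [200 * 10**9, 350 * 10**9, 50 * 10**9]
remaining = free_quota_left(scans)
```

With these made-up sizes, 600 GB of the budget is used and 400 GB remains.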
Google Trends
Link: https://trends.google.com/trends/explore
Description: Allows the user to choose different search terms and compare them against one
another. Shows how often each term is searched on Google relative to the other terms.
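The idea of relative rather than absolute numbers can be sketched in Python. This is only an
illustration that scales made-up counts so the most-searched term scores 100; it is not
Google's actual method, and the function name is hypothetical:

```python
def relative_interest(counts):
    """Scale raw counts so the largest becomes 100, rounding the rest."""
    peak = max(counts.values(), default=0)
    if peak == 0:
        # No searches at all: every term gets a score of 0.
        return {term: 0 for term in counts}
    return {term: round(100 * n / peak) for term, n in counts.items()}

# Made-up search counts for illustration only.
scores = relative_interest({"python": 500, "java": 250, "rust": 50})
```

Here "python" scores 100, "java" 50, and "rust" 10, so only the ratios between terms carry
meaning.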
Healthdata.gov
Link: https://healthdata.gov/
Description: It contains 125 years’ worth of healthcare data from the United States.
Kaggle
Link: https://www.kaggle.com/
Description: Kaggle offers data sets, models to build and explore, and a collaborative
environment for working with data scientists and machine-learning engineers. There are also
options to compete to solve data science challenges.
National Climatic Data Center
Link: https://www.ncdc.noaa.gov/cdo-web/datasets
Description: The National Climatic Data Center contains environment and weather data from
around the world.
NHS Health and Social Care Information Centre
Link: https://digital.nhs.uk/
Description: A collection of processed data from health and social care systems in England.
Pew Research Center
Link: https://www.pewresearch.org/internet/datasets/
Description: Pew Research Center has data sets organized by survey name. The survey data are
released two years after research reports on the data set are issued.
Quandl
Link: https://www.quandl.com/search?filters=%5B%22Free%22%5D
Description: It contains economic and financial data geared toward investment professionals. Not
all the data sets are free, but the link filters to the free ones.
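The filter is carried in the percent-encoded query string of the link. A short Python sketch
decodes it with the standard library; treating the decoded value as JSON is an assumption based
on its shape:

```python
import json
from urllib.parse import urlparse, parse_qs

URL = "https://www.quandl.com/search?filters=%5B%22Free%22%5D"

# parse_qs splits the query string and percent-decodes each value.
query = parse_qs(urlparse(URL).query)

# Assumption: the decoded value is a JSON list of filter names.
filters = json.loads(query["filters"][0])
```

The decoded parameter is `["Free"]`, which is what restricts the search to free data sets.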
Reserve Bank of India
Link: https://www.rbi.org.in/Scripts/Statistics.aspx
Description: Datasets about India’s economy, finances, and banking from India’s central bank.
SEER*Explorer
Link: https://seer.cancer.gov/explorer/
Description: The National Cancer Institute's data on cancer statistics based on age, gender, stage,
etc.
UCI Machine Learning Repository
Link: http://archive.ics.uci.edu/ml/index.php
Description: Used by the machine learning community, this repository collects databases, domain
theories, and data generators for analyzing machine learning algorithms.
UNICEF
Link: https://www.unicef.org/research-and-reports
Description: Has datasets about children around the world that focus on their wellbeing.
United States Census Bureau
Link: https://www.census.gov/data.html
Description: Data from the U.S. Census including population, geographic, and education data.
The U.S. Bureau of Labor Statistics
Link: https://www.bls.gov/data/
Description: Databases containing labor-related statistics such as inflation, employment,
workplace injuries, resources, and various others.
The World Bank
Link: https://data.worldbank.org/
Description: An international organization that provides data on the success of the programs it
implements in developing countries.
YouTube Labeled Video Dataset
Link: https://research.google.com/youtube8m/
Description: A dataset of over 8 million video IDs and 4800 visual entities from YouTube.