1. Dataset Sources
Data Repositories
Supervised by
Asst. Prof. Dr. Ahmed Aljanabi
Department of Computer Science
Faculty of CS and Mathematics
University of Kufa
Muntathar Manhar Muhsin
PhD in Computer Science
2. Introduction
For academic research, you can rely on scientific
sources and websites that provide reliable data sets.
The best way to publish and share research data is
with a research data repository. A repository is an
online database that allows research data to be
preserved across time and helps others find it.
3. Introduction
Here are some sites and search engines you can use to find datasets for
academic research purposes:
1. Figshare
2. Lincoln Centre for Autonomous Systems (LCAS)
3. Mendeley Data
4. Nasdaq Data Link
5. Harvard Dataverse
6. Dryad Digital Repository
7. Network Repository
When using these resources, be sure to read and understand the
terms of use of the data sets and ensure compliance with any
applicable laws or licenses.
4. Figshare
Figshare is a repository with a web-based interface designed
for academic research data management and research data
dissemination.
It accepts all file types, with in-browser viewing.
The Figshare team now spans three continents.
Its characteristics:
1. Online open access repository
2. Data security
3. Contains figures, datasets, images, and videos.
The website: https://figshare.com/
7. Lincoln Centre for Autonomous Systems (LCAS)
A research center based at the University of Lincoln in the UK.
It conducts cross-disciplinary robotics research.
It specializes in technologies for perception, learning, decision-making,
control, and interaction in autonomous systems, especially mobile robots
and robotic manipulators,
and in the integration of these capabilities in application domains including:
Agri-food,
Healthcare,
Intelligent transportation,
Logistics,
Nuclear robotics,
Service robotics, and
Space robotics.
The website: https://lcas.lincoln.ac.uk/wp/
9. Mendeley Data
Mendeley Data is a free and secure cloud-based
communal repository.
It hosts research data from many fields, including:
Natural sciences and mathematics
Engineering
Life sciences
Medical and health sciences
Social sciences
Humanities
The website: https://data.mendeley.com/
12. Nasdaq Data Link
Nasdaq Data Link is a premier source for financial, economic,
and alternative datasets.
Data types:
Prices & Volumes
Estimates
Fundamentals
Corporate Actions
Sentiment
Derived Metrics
National Statistics
Technical Analysis
Others
The website :
14. Figshare
An open access data repository.
Datasets include images and videos.
Figshare allows researchers to upload any file format and
assigns a digital object identifier (DOI) for citations.
Free accounts on Figshare can upload files of up to 5 GB and
get 20 GB of free storage.
Fields:
Software and Code
Models and Simulations
Machine Learning and Data Science
The website: https://figshare.com/
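Because each upload receives a DOI, a dataset can be cited with a stable link. The following is a minimal sketch of turning a DOI into a resolver link, assuming the common https://doi.org/ prefix. The helper name `doi_to_url` is hypothetical (it is not part of any Figshare API), and the DOI shown is a made-up placeholder.

```python
def doi_to_url(doi: str) -> str:
    """Turn a DOI written in a common form into a resolver URL."""
    cleaned = doi.strip()
    # Drop prefixes that people often include when copying a DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # DOIs begin with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + cleaned


# Placeholder DOI for illustration only; it does not refer to a real dataset.
print(doi_to_url("doi:10.1234/example.5678"))
```

Normalizing the DOI first means the same link is produced whether the citation was copied as a bare DOI, with a `doi:` prefix, or as a full URL.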
16. Harvard Dataverse
The Harvard Dataverse Repository is a free data repository open
to all researchers from any discipline, both inside and outside of
the Harvard community.
Researchers can share, archive, cite, access, and explore research data.
You can open your data to the general public, or restrict access
and define customizable terms of use. When you publish your
data, you automatically get a standard data citation with a
digital object identifier (DOI).
It is powered by the open-source web application Dataverse,
developed by the Institute for Quantitative Social Science at
Harvard.
It is free and has a limit of 2.5 GB per file and 10 GB per dataset.
Website source: https://dataverse.harvard.edu/
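The per-file and per-dataset limits quoted above can be checked before starting an upload. This is a minimal sketch under stated assumptions: it treats 1 GB as 1000^3 bytes (the slide does not say whether the limits are decimal or binary), and the function name, file names, and sizes are hypothetical.

```python
GB = 1000 ** 3  # assumption: decimal gigabytes

MAX_FILE_BYTES = int(2.5 * GB)  # per-file limit quoted above
MAX_DATASET_BYTES = 10 * GB     # per-dataset limit quoted above


def check_dataset(file_sizes: dict[str, int]) -> list[str]:
    """Return a list of problems; an empty list means the dataset fits."""
    problems = []
    for name, size in file_sizes.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"{name} exceeds the per-file limit")
    if sum(file_sizes.values()) > MAX_DATASET_BYTES:
        problems.append("total size exceeds the per-dataset limit")
    return problems


# Hypothetical file names and sizes, for illustration only.
sizes = {"survey.csv": 1 * GB, "images.zip": 3 * GB}
print(check_dataset(sizes))
```

In this example the archive is over the per-file limit even though the dataset as a whole is under 10 GB, so it would need to be split before upload.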
18. Dryad Digital Repository
Dryad is a curated general-purpose repository that
makes data discoverable, freely reusable, and citable.
Most types of files can be submitted (e.g., text,
spreadsheets, video, photographs, software code)
including compressed archives of multiple files.
Since a guiding principle of Dryad is to make its
contents freely available for research and educational
use, there are no access costs for individual users or
institutions. Instead, Dryad supports its operation by
charging a fee of US$120 each time data is published.
Website source: https://datadryad.org/stash
20. Network Repository
Network Repository describes itself as the first interactive data and
network data repository with real-time visual analytics, and as the
largest network repository, with thousands of donated data sets in
30+ domains (from biological to social network data). This
comprehensive collection of network graph data is useful both for
research and as benchmark data sets for a wide variety of
applications and domains (e.g., network science, bioinformatics,
machine learning, data mining, physics, and social science). It
includes relational, attributed, heterogeneous, streaming, spatial,
and time series network data, as well as non-relational machine
learning data. All graph data sets can be downloaded in a standard,
consistent format. The site also provides a multi-level interactive
graph analytics engine that lets users visualize the structure of the
network data, macro-level graph statistics, and important
micro-level network properties of the nodes and edges.
Check out GraphVis: the interactive visual network mining and
machine learning tool.
Website source: https://networkrepository.com/
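To make the distinction between macro-level graph statistics and micro-level node properties concrete, here is a minimal sketch that computes both for a small undirected graph. It assumes the data is a whitespace-separated edge list with optional comment lines starting with `%` or `#`; that layout is an assumption for illustration, not a statement about the repository's file format, and `graph_stats` is a hypothetical helper.

```python
from collections import defaultdict


def graph_stats(edge_lines):
    """Compute simple macro- and micro-level statistics for an undirected graph."""
    neighbors = defaultdict(set)
    for line in edge_lines:
        line = line.strip()
        if not line or line.startswith(("%", "#")):
            continue  # skip blank lines and comment lines
        u, v = line.split()[:2]
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)
        neighbors[v].add(u)

    n = len(neighbors)                                     # macro: node count
    m = sum(len(adj) for adj in neighbors.values()) // 2   # macro: edge count
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0      # macro: density
    degrees = {node: len(adj) for node, adj in neighbors.items()}  # micro: degree per node
    return {"nodes": n, "edges": m, "density": density, "degrees": degrees}


# Toy edge list for illustration only.
sample = ["% toy graph", "1 2", "2 3", "3 1", "3 4"]
stats = graph_stats(sample)
print(stats["nodes"], stats["edges"], round(stats["density"], 3))
print(stats["degrees"])
```

Storing neighbors in sets means a repeated edge is counted once, which keeps the edge count and degrees consistent for simple graphs.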